Anthology of Computers and the Humanities · Volume 3

Embedded in the Labyrinth: Investigating Latin Word Senses through Transformer-Based Contextual Embeddings and Attention

Vojtěch Kaše1, Sarah Lang2 and Petr Pavlas1

  • 1 Institute of Philosophy, Czech Academy of Sciences, Prague, Czech Republic
  • 2 Max Planck Institute for the History of Science, Berlin, Germany

Permanent Link: https://doi.org/10.63744/FuaAvdPMdtwW

Published: 21 November 2025

Keywords: labyrinth, keyword-in-context, computational Latin philology, contextual word embeddings, automatic word sense disambiguation, word sense induction, semantic change detection, metaphor detection

Abstract

This paper explores how transformer-based models can enhance historical keyword-in-context studies through automatic word sense disambiguation (WSD). Using the Latin term labyrinthus as a case study, we analyze its contextual meanings across time and genre within the GreLa corpus. A large language model provides preliminary sense labels, which we use to evaluate 64 embedding variants (contextual, attention-based, and co-occurrence-based) derived from XLM-R and Latin BERT. Our results show that combining embedding types yields the best performance. We also illustrate how attention-based embeddings capture meaningful diachronic patterns, offering promising directions for future research on semantic change and metaphor in historical texts.
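The abstract's core idea, combining several embedding variants for the same token occurrences and grouping those occurrences into senses, can be sketched in a minimal toy form. This is not the authors' code: the vectors below are random stand-ins for contextual and attention-based embeddings of occurrences of labyrinthus, and the dimensions and the simple k-means routine are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    """Scale each row to unit length so variants contribute comparably."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Hypothetical data: 6 occurrences of the target word, with a
# contextual embedding (dim 4) and an attention-based one (dim 3) each.
contextual = rng.normal(size=(6, 4))
attention = rng.normal(size=(6, 3))

# Combine variants by normalising and concatenating per occurrence.
combined = np.hstack([l2_normalize(contextual), l2_normalize(attention)])

def kmeans(x, k=2, iters=20, seed=0):
    """Tiny k-means: assign each occurrence to its nearest centroid."""
    r = np.random.default_rng(seed)
    centroids = x[r.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None] - centroids[None], axis=2)
        labels = np.argmin(dists, axis=1)
        centroids = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels

# Induced sense labels (0 or 1) for the 6 occurrences.
senses = kmeans(combined, k=2)
print(senses)
```

In the paper's setting the random matrices would instead hold vectors extracted from XLM-R or Latin BERT for each keyword-in-context hit, and the induced clusters would be compared against the LLM-provided sense labels.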