The Mixtec codices are precolonial and early colonial Mesoamerican manuscripts that use a semasiographic writing system to record historical, genealogical, and cosmological information. Decoding these manuscripts is challenging because of their non-linear structure and symbolic complexity. In this work, we apply Vision Transformers (ViTs) to classify key elements of Mixtec writing. Our classification pipeline has three stages: a type classifier that distinguishes year symbols from name-date symbols, a symbol classifier that identifies calendrical signs, and a bead counter that reads the numerical component. The pipeline addresses the high intra-class variability of hand-drawn symbols and separates the complex, often overlapping symbolic glyphs from their numerical components. Our results show a surprising dichotomy: ViTs achieve an F1 score greater than 0.9 in symbol classification but struggle with the counting task, where the F1 score is approximately 0.22. This contrast points to an architectural trade-off in Vision Transformers: their global attention mechanism supports holistic pattern recognition but hinders fine-grained spatial localization. This insight clarifies both the potential and the limitations of using ViTs to decode semasiographic texts and can help guide more targeted applications in cultural heritage preservation. Our code is available at https://github.com/ufdatastudio/mixtec-namedate-classifiers
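
The three-stage pipeline described above can be pictured as a simple cascade of models applied to one glyph image. The following is a minimal sketch only, not the authors' implementation: the names `GlyphReading` and `read_glyph`, the label strings, and the choice to run each stage independently on the same image are all assumptions made for illustration. In practice each stage would be a fine-tuned ViT; here the stages are arbitrary callables so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class GlyphReading:
    """Hypothetical container for the output of the three stages."""
    glyph_type: str  # e.g. "year" or "name-date" (assumed label strings)
    symbol: str      # calendrical sign label
    count: int       # number of beads in the numerical component


def read_glyph(
    image: Any,
    type_clf: Callable[[Any], str],
    symbol_clf: Callable[[Any], str],
    bead_counter: Callable[[Any], int],
) -> GlyphReading:
    """Run the three stages in order on a single glyph image.

    Assumption: each stage sees the same input image and does not
    condition on the previous stage's prediction.
    """
    glyph_type = type_clf(image)   # stage 1: year vs. name-date
    symbol = symbol_clf(image)     # stage 2: calendrical sign
    count = bead_counter(image)    # stage 3: bead count
    return GlyphReading(glyph_type, symbol, count)


# Usage with stand-in stages (real stages would wrap trained models).
reading = read_glyph(
    image="placeholder-image",
    type_clf=lambda img: "name-date",
    symbol_clf=lambda img: "Deer",
    bead_counter=lambda img: 8,
)
print(reading)
```

Keeping the stages as separate callables mirrors the separation of symbolic and numerical components described in the abstract, and it would allow the weakest stage (counting) to be swapped for a different model without touching the other two.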
