Anthology of Computers and the Humanities · Volume 3

Mind the Language Gap in Digital Humanities: LLM-Aided Translation of SKOS Thesauri

Felix Kraus1, Nicolas Blumenröhr1, Danah Tonne1 and Achim Streit1

  • 1 Scientific Computing Center, Karlsruhe Institute of Technology, Karlsruhe, Germany

Permanent Link: https://doi.org/10.63744/a98mEv2X4ugw

Published: 21 November 2025

Keywords: Translation System, Large Language Models, SKOS, Thesaurus, Multilingual

Abstract

We introduce WOKIE, an open-source, modular, and ready-to-use pipeline for the automated translation of SKOS thesauri. This work addresses a critical need in the Digital Humanities (DH), where language diversity can limit access, reuse, and semantic interoperability of knowledge resources. WOKIE combines external translation services with targeted refinement using Large Language Models (LLMs), balancing translation quality, scalability, and cost. Designed to run on everyday hardware and to be easily extended, the application requires no prior expertise in machine translation or LLMs. We evaluate WOKIE on several DH thesauri in 15 languages, most of them European, varying parameters, translation services, and LLMs, and systematically analyse translation quality, performance, and ontology matching improvements. Our results show that WOKIE can enhance the accessibility, reuse, and cross-lingual interoperability of thesauri through low-barrier automated translation and improved ontology matching performance, supporting more inclusive and multilingual research infrastructures.