Anthology of Computers and the Humanities · Volume 3

Classification of Script Types and Modes for Medieval Hebrew Manuscripts

Daria Vasyutinsky-Shapira1, Irina Rabaev2, Jihad El-Sana3 and Ophir Münz-Manor4

  • 1 School of Computer Science and AI, Tel Aviv University, Ramat Aviv, Israel
  • 2 Department of Software Engineering, Shamoon Academic College of Engineering, Beer-Sheva, Israel
  • 3 Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
  • 4 Department of History, Philosophy and Judaic Studies, Open University of Israel, Raanana, Israel

Permanent Link: https://doi.org/10.63744/kHC6jQFdCkNz

Published: 21 November 2025

Keywords: digital paleography, deep-learning models, medieval Hebrew manuscripts

Abstract

This paper presents an overview of several years of work on the automatic classification of script types and modes in medieval Hebrew manuscripts, carried out at the VML Lab at Ben Gurion University. We introduce a new type of paper: the story of how a multidisciplinary team of researchers and students, some leaving along the way and others joining, worked together to solve a challenging problem that is both interesting as a Computer Science project and essential for research in the Humanities. The research is pioneering and took years of trial and improvement. The paper is addressed to Digital Humanists interested in following the latest advances in digital Hebrew paleography, and it references the more technical parts of this work. The resulting algorithms, and the datasets produced in the process, are an essential contribution to automatic layout detection, segmentation, and, eventually, automatic transcription of Hebrew manuscripts.