Anthology of Computers and the Humanities · Volume 4

Aligning Historical Method and RAG: Turning a Conversational Assistant into an Auditable and Discussable Chain of Evidence

Marie Puren (1, 2), Donghan Bian (1, 2), Aurélien Pellet (1), Julien Perez (1) and Florian Cafiero (1)

  • 1 Laboratoire de recherche d’Epita (LRE), EPITA, Le Kremlin-Bicêtre, France
  • 2 Centre Jean-Mabillon, Ecole nationale des chartes, Paris, France

Permanent Link: https://doi.org/10.63744/G14mF7UizFzS

Published: 21 May 2025

Keywords: history, retrieval augmented generation


Abstract

This article examines the challenges of deploying Retrieval-Augmented Generation (RAG) systems for exploring digitized historical sources. Starting from the observation that acceptance of such systems within the discipline remains fragile, it asks: how can a RAG system applied to noisy and heterogeneous archives guarantee conditions of verification and critique that are compatible with the historical method? As a position paper, it offers a conceptual framing and preliminary directions to guide the development of RAG systems that meet these requirements. It argues for restoring control over the interpretive chain through three conditions: traceability (the ability to locate documents and passages precisely), auditability (the ability to inspect transformations and parameters throughout the pipeline), and discussability (the ability to open statements to debate by distinguishing evidence from interpretation). Its main contribution is an auditability framework that translates historians’ requirements into instrumented conditions: (1) documentary anchoring (provenance and integrity), (2) explicit separation of quotation, paraphrase, and inference, (3) restoration of context and plurality of sources, (4) traceability of execution conditions and error diagnosis (retrieval vs. generation), and (5) abstention mechanisms when the evidence is insufficient.
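
As an illustration only, some of the instrumented conditions listed above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the framework proposed in the article: the names `Passage`, `Claim`, `ClaimKind`, `make_passage`, `audit`, and the `min_sources` threshold are all hypothetical. It touches on documentary anchoring (a digest recorded per passage), the separation of quotation, paraphrase, and inference, plurality of sources, and abstention. It does not cover the logging of execution conditions or error diagnosis.

```python
from dataclasses import dataclass
from enum import Enum
import hashlib


class ClaimKind(Enum):
    # Hypothetical labels for the quotation / paraphrase / inference split.
    QUOTATION = "quotation"
    PARAPHRASE = "paraphrase"
    INFERENCE = "inference"


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Passage:
    doc_id: str   # hypothetical archive identifier (provenance)
    locator: str  # hypothetical page or line locator (traceability)
    text: str     # passage text as retrieved
    sha256: str   # digest recorded when the passage was indexed (integrity)

    def is_intact(self) -> bool:
        # The passage is intact if its text still matches the recorded digest.
        return _digest(self.text) == self.sha256


@dataclass(frozen=True)
class Claim:
    text: str
    kind: ClaimKind
    evidence: tuple[Passage, ...] = ()


def make_passage(doc_id: str, locator: str, text: str) -> Passage:
    """Build a passage and record its digest, as an indexing step might."""
    return Passage(doc_id, locator, text, _digest(text))


def audit(claims, min_sources=1):
    """Split claims into (accepted, abstained).

    A claim is accepted only if it is backed by intact passages from at
    least `min_sources` distinct documents; a quotation must additionally
    appear verbatim in one of those passages. Everything else is set aside
    so that the system can abstain rather than assert it.
    """
    accepted, abstained = [], []
    for claim in claims:
        intact = [p for p in claim.evidence if p.is_intact()]
        distinct_docs = {p.doc_id for p in intact}
        verbatim_ok = (
            claim.kind is not ClaimKind.QUOTATION
            or any(claim.text in p.text for p in intact)
        )
        if len(distinct_docs) >= min_sources and verbatim_ok:
            accepted.append(claim)
        else:
            abstained.append(claim)
    return accepted, abstained
```

In such a design, raising `min_sources` would be one simple way to require plurality of sources before a statement is presented, at the cost of more frequent abstention.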