Call for Applications CLARIN-EHRI Workshop | "Natural Language Processing Meets Holocaust Archives"

Monday, 4 December, 2023

Hands-on Workshop CLARIN & EHRI | Call for applicants

27-28 March 2024 | Location: Prague, Czech Republic | Deadline: 15 January 2024

Language is central to the study of the Holocaust. The documentation of the genocide of Jews and Roma, building on war-time clandestine efforts, has grown since the liberation and today includes large corpora of testimonies, trial proceedings, letters and diaries, government documents, files of aid organisations and much more.

The rich textual resources accumulated by Holocaust archives translate into an increasing amount of research data which can be analysed with natural language processing techniques.

At the same time, the transnational character of the Holocaust, the fragmentation of its archival record, and its multilinguality present both opportunities and challenges for the application of computational linguistics.


This interactive, hands-on workshop is a cooperation between two transnational European research infrastructures, bringing language technology together with Holocaust archives and data. The Common Language Resources and Technology Infrastructure (CLARIN) develops and supports state-of-the-art language tools and datasets as well as expertise on language technologies. The European Holocaust Research Infrastructure (EHRI) provides access to information about dispersed Holocaust archives and supports the Holocaust research community. Both infrastructures aim to empower researchers who tackle the challenges of the digital turn in the humanities and social sciences. Since the Czech node of EHRI is part of the broader LINDAT/CLARIAH-CZ research infrastructure, Prague is a natural venue for discussing connections and cooperation.


The organisers invite proposals from projects and researchers who want to engage in an interdisciplinary exchange of experience and methods. Proposals will be grouped by common topic or method into presentations, hands-on demos, tutorials, and panel discussions, as appropriate. We welcome experts in computational linguistics as well as Holocaust researchers, projects and archives keen to test new approaches on their data and against their research questions.

Further information about the event is available on the CLARIN website.

Possible themes:

  • Optical character recognition, especially handwriting recognition; working with multilingual and imperfect output typical for archival holdings
  • Information extraction techniques, including named entity recognition (identifying persons, events, places and other spatial entities in texts) and entity linking (linking named entities to unique identifiers in controlled vocabularies and knowledge bases); metadata indexing with the help of controlled vocabularies using large language models
  • Curation and encoding of transcribed oral testimonies, and of textual resources for the field of Holocaust Studies: corpus building, annotation, standards, interoperability and automation
  • Machine translation, summarisation, and new approaches using AI

If you are interested in attending this workshop, please apply using the application form by 15 January 2024. Invited attendees and successful applicants will have the opportunity to apply for support with travel and accommodation costs. We will inform you of the outcome of your application by 15 February.

Image by CLARIN