The Flow
From Deep-Learning to Digital Analysis and their Role in the Humanities: Creating, Evaluating, and Critiquing Workflows for Historical Corpora
Historical research increasingly makes use of digitization, benefiting from advancements in Handwritten Text Recognition and Natural Language Processing. This is where The Flow comes in: it aims to promote the use of digital methods by developing a workflow that can be used by historians without expertise in information science and coding.
The Flow, running from 2023 to 2026, strives to create standardized digital workflows using existing technology, facilitating easier digital work with premodern historical sources.
The Flow is a joint project of the DH departments of the Universities of Bern and Bielefeld and the Research Centre for Hanse and Baltic History in Lübeck. The project is funded by the Swiss National Science Foundation (SNF) and the German Research Foundation (DFG).
The Project
The Flow - Advancing Historical Text Recognition and Analysis
Digitization is now an integral part of historical research. Historians and the humanities in general have much to gain from advances in machine-learning approaches that can transform text processing capabilities. However, their full potential remains largely untapped within the humanities. The tools and workflows to make use of new technologies are often only available to those who have a deep understanding of information science. Our project aims to bridge this gap by facilitating the widespread adoption and critical use of machine-learning technologies.
The Flow seeks to develop user-friendly digital workflows, making these powerful tools accessible to researchers beyond the realm of data scientists and coding experts. The project “The Flow” will develop more standardized digital workflows based on existing technology, making it easier for researchers to work with historical sources digitally. Through semi-automated processes and workflows established in the project, historians will be able to study longer time periods and gain a layered reading-based understanding of larger corpora of pre-modern manuscripts.
Sub-Projects
Four sub-projects contribute to the project’s overall goal by analysing legal and administrative sources spanning different periods and regions, including England (13th-14th centuries), the northern European Hanse area (14th-17th centuries), Switzerland (16th-18th centuries), and Ottoman Jerusalem (16th-17th centuries).
The sources will be studied with digital methods, namely the use of (newly created) Handwritten Text Recognition and Natural Language Processing models for historical languages. Through a praxeological and institutional framework, our project tackles complex research questions surrounding societal processes, the practice of law, and its impact on everyday life.
A Digital Analysis of Common Law: Social Dynamics in Early English Justice
Essoins – Medieval English Court Rolls
The team at Bielefeld University is particularly interested in analysing the court rolls of the early English Common Law. There are two main aspects to the sub-project: 1. Digitization of different types of court rolls of the 13th and 14th centuries using HTR and OCR models; 2. Analysis of the specific role of essoins within the Common Law focusing on the impact of social categorization on legal procedure using NLP tools.
As part of The Flow, Bielefeld will contribute ground truth data to improve the language models used in HTR procedures for the Anglicana script found in the court rolls. We evaluate existing tools and standards for medieval HTR to make it easier for historians to integrate digital methods into their workflow.
On this basis, the Bielefeld team will try to enhance the output of NLP tools for the detection of named entities and employ this technology in the analysis of the social relations and structures immanent at court. A special focus will lie on questions of intersectionality and group formation as they appear in the court proceedings.
Navigating the Hanse: A Long-Term Analysis of Hanse Diets and Policy Making
Hanserecesse – Protocols and Resolutions of Urban Meetings (1358-1669)
The project carried out at the Research Centre for Hanse and Baltic History (FGHO) in Lübeck delves into the extensive sources on the inner workings of the German Hanse, a voluntary association of merchants and towns shaping economic and political landscapes for centuries. The Hanserecesse, spanning from 1358 to 1669, document the common voluntary decision-making of towns located in the vast area from today’s Netherlands to Estonia. These records, covering about three centuries, form the central source corpus on the history and development of the German Hanse.
Within The Flow, we examine the Hanserecesse with the help of digital applications and thus present a long-term study of the development of the German Hanse for the first time. Alongside this study, we are creating suitable HTR and NER models for Low German manuscripts so that other researchers have easier access to the extensive source material of Hanse towns and their collaboration.
The Construction of the Sulṭān's Authority in the Arab Provinces: Sharia, Law Practices and Institutionalization in Jerusalem
sijillāt maḥkama sharʿīyya – Sijills. Shari‘a Court Records (16th and 17th centuries)
This sub-project focuses on the records (sijills) of the Shari‘a courts of Ottoman Jerusalem. These records, spanning about three centuries, constitute a major corpus of sources for the legal history of the Ottoman periphery. The sub-project analyses criminal cases, particularly those based on customary law, contributing to the debate on how law was understood and applied by various actors (judges, litigants, court officials) and how the Ottoman Sultan intervened in the lawmaking process. It also explores what law can reveal about social practices in contexts considered marginal to the central domains of the Ottoman Empire, especially the Arab province of Palestine, and contributes to the development of machine-learning-based workflows for sources in Arabic and Ottoman Turkish.
The project will construct a data set for the judicial corpora of Ottoman Jerusalem. Based on this data set, it will explore whether the frequency of particular kinds of cases or features of cases can be contextualized, providing insights into the underlying dynamics of Jerusalem’s pluralistic legal system. After an initial phase, in which text recognition using HTR models for the Arabic and Ottoman Turkish script used in the sijills is developed, the project will focus on the large-scale comparison of single criminal cases using NLP tools. This will advance our knowledge of a lesser-known chapter of Jerusalem’s legal history, offering us a glimpse of the transformation of judicial practices, and their complex interactions with social reality, over an extended period.
Analyzing Delinquency Cases in the Tower of Bern
Tower Books – Bernese Interrogation Protocols (1545-1745)
The sub-project focuses on questions of categorization of different cases brought to trial (and, in most cases, involving “peinliche Befragung,” that is, torture) in the tower of Bern. The project inquires whether, from a textual perspective, frequency and type of delinquency can be contextualized, leading to insight into the inner workings of the legal system of one of the major city-states north of the Alps. The corpus-level perspective is deliberately chosen in order to avoid modern categorizations and to bring into focus instead the commonalities of single cases recorded in the tower books. This perspective takes seriously those discussions of alterity that call for an understanding of pre-modern pasts not biased by modern language. At the same time, it is coupled with recent approaches to the linguistic analysis of written documents.
Technically, after an initial phase involving text recognition, we will focus on large-scale comparison of single cases. The tower books will be made accessible in full text and by topic occurrence by adapting existing methods. Topic modeling and comparisons of vectorized text parts (understood as an NLP task), in particular, have so far only infrequently been used for pre-modern textual data that has not been normalized. The labeling of similar cases, yielding categorization, will be based on the close reading of single, “typical” (statistically determined) cases. This will show which cases are highly correlated, and thus could potentially involve similar offenses based on what accusations were brought.
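To make the idea of comparing vectorized text parts more concrete, the following is a minimal sketch and not the project's actual pipeline. It assumes character trigrams with TF-IDF weighting and cosine similarity, a common choice when spelling has not been normalized. All function names are hypothetical, and the three sample strings are invented toy examples, not quotations from the tower books.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Split a text into overlapping character n-grams (lowercased, whitespace collapsed)."""
    cleaned = " ".join(text.lower().split())
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]

def tfidf_vectors(texts, n=3):
    """Return one sparse TF-IDF vector (a dict of n-gram -> weight) per text."""
    counts = [Counter(char_ngrams(t, n)) for t in texts]
    # Document frequency: in how many texts does each n-gram occur?
    doc_freq = Counter(gram for c in counts for gram in c)
    total = len(texts)
    return [
        {
            gram: tf * (math.log((1 + total) / (1 + doc_freq[gram])) + 1.0)
            for gram, tf in c.items()
        }
        for c in counts
    ]

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(gram, 0.0) for gram, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(texts, index, n=3):
    """Rank all other texts by similarity to texts[index], most similar first."""
    vectors = tfidf_vectors(texts, n)
    scores = [
        (cosine(vectors[index], v), i)
        for i, v in enumerate(vectors)
        if i != index
    ]
    return sorted(scores, reverse=True)

# Invented toy strings with deliberately inconsistent spelling (assumption, not source data).
cases = [
    "hat bekannt er habe ein ross gestolen",
    "hatt bekhannt er habe ein roß gestollen",
    "ist wegen zouberey gefangen worden",
]

for score, i in most_similar(cases, 0):
    print(f"case {i}: similarity {score:.2f}")
```

Because the vectors are built from character sequences rather than whole words, the two differently spelled variants of the first toy case still share most of their trigrams, which is one way such a comparison can tolerate unnormalized orthography. Cases that rank highly against a statistically "typical" case would then be candidates for close reading.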
Activities
Our past and upcoming activities
8 - 10 October 2025
Panel ‘Arabic Pasts: Histories and Historiography’ and Digital Method Workshop
Sefer Korkmaz, Ilyes Mechentel
London, AKU-ISMC
09 - 12 September 2025
Summer School: Building Workflows in the Digital Humanities
The Flow Team
01 - 04 September 2025
Summer School: Digital Humanities for Islamic Studies – Introduction, Exchange, and Hands-on
Sefer Korkmaz
Bern, University of Bern
28 July 2025
Hanse goes Digital: Ein Mixed Methods Ansatz zur digitalen Analyse hansischer Privilegienpolitik in HTR-transkribierten Hanserezessen
Angela Huang, Inga Lange
Sommerschule Digital History. Werkzeuge für Quellenerschließung und Textanalyse, Kiel University, online
07 - 10 July 2025
Session 544 ‘Digital Data Flows: Processing Medieval Documents from England, Berne, and the Hanse’
Tobias Hodel, Silke Schwandt, Christopher Kuhlmann, Inga Lange, Dominic Weber
IMC Leeds, Leeds University
30 June 2025
Panel: Text, Technology, and Tradition: Leveraging Digital Methods in Islamic Research, BRAIS 2025: Annual Conference of the British Association of Islamic Studies
Sefer Korkmaz
Cambridge, University of Cambridge
12 - 13 June 2025
Paper Presentation “Go with the Flow: Wie Historiker:innen serielles Schriftgut digital auswerten (können)”
Angela Huang & Silke Schwandt
Workshop “Künstliche Intelligenz und historische (Justiz-)-Forschung”, Justus-Liebig-Universität Gießen
3 - 6 June 2025
Microservice-Based Data Management and Processing: A Modular Workflow for Automatic Text Recognition and Beyond
Dana Meyer & Jonas Widmer
15 May 2025
Hanse goes digital: (halb)automatisierte Handschriftenerkennung und digitale Auswertung von Hansequellen - Erfahrungen und Perspektiven
Angela Huang
University of Bamberg (online seminar)
14 May 2025
Hanse goes digital: Potentiale für die Hanseforschung durch digitale Workflows und Methoden
Angela Huang
Historisches Kolloquium der Technischen Universität Braunschweig
27 - 28 March 2025
Enhancing Named Entity Recognition for Digital History: Towards Explainable and Trustworthy Models
Dominic Weber
Conference “Words in Numbers. Data-Driven Approaches to Texts in the Humanities and Social Sciences”, Ruhr-Universität Bochum
26 - 29 March 2025
The Hanse and its privileges: A Long-Term Perspective through Digital Methods
Inga Lange
European Social Science History Congress, Universiteit Leiden
11 - 12 February 2025
PhD Workshop: Method Criticism and Reflection
Mathew Barber (Aga Khan University, ISMC London), Moritz Feichtinger (University of Basel), Andreas Kuczera (Technische Hochschule Mittelhessen), Ina Serif (University of Basel), Sefer Korkmaz, Christopher Kuhlmann, Inga Lange, Dominic Weber
Online
05 February 2025
Workshop: “Archive im digitalen Zeitalter: Wie (halb)automatisierte Handschriftenerkennung unseren Zugang zu Quellen verändern kann”
Bart Holterman
Schleswig, Kreis- & Stadtarchiv Kulturstiftung des Kreises Schleswig-Flensburg
23 - 24 January 2025
Workshop “BeNASch: Praktische Einblicke und Diskussion. Ein Workshop zur nachhaltigen Annotation von Entitäten und Ereignissen in historischen Texten”
Ismail Prada Ziegler (Economies of Space) & Dominic Weber
University of Bern
3 - 4 October 2024
Workshop “Identifying Textual Reuse in Ottoman Fatwas: Applying Novel Methodologies in Ottoman Legal Historiography”
Sefer Korkmaz
Arabic Pasts 2024: Histories and Historiography, London, Aga Khan University, Institute for the Study of Muslim Civilisations (AKU-ISMC) and SOAS, University of London
20 September 2024
Paper Presentation “Faktizität und historiographische Autorität von Machine Learning Output”
Dominic Weber
Digital History and Citizen Science Conference, Martin-Luther-Universität Halle-Wittenberg
12 September 2024
Paper Presentation “On the Historiographic Authority of Machine Learning Systems”
Dominic Weber
Digital History Switzerland 2024, University of Basel
26 - 28 August 2024
Summer School “A digital Workflow for Historical Corpora - from HTR to NER”
The Flow Team
Online / Research Centre for Hanse and Baltic History, European Hansemuseum Lübeck
24 April 2024
Paper Presentation “Archive im digitalen Zeitalter: Wie (halb)automatisierte Handschriftenerkennung unseren Zugang zu Quellen verändern kann”
Angela Huang & Vivien Popken
Schleswig-Holsteinischer Archivtag 2024, Nordkolleg Rendsburg
22 April 2024
Poster Presentation “Go with the Flow. Towards Reusable Workflows”
Dominic Weber & Jonas Widmer
Phil.-hist. Forschungstag 2024, University of Bern
26 February 2024
Workshop “not opaque flow - Workflows zur Aufbereitung und Auswertung historischer Dokumente”
The Flow Team, Patrick Jentsch & Inga Kirschnick (both nopaque)
DHd2024 Conference, University of Passau
28 - 30 August 2023
Workshop “The Flow - Kick-off”
The Flow Team
University of Bern
The Team
Meet the team behind The Flow
Tobias Hodel
Project Lead, University of Bern
Tobias has been a tenure-track assistant professor in digital humanities at the University of Bern since 2019. He researches and teaches machine learning methods in and for the humanities. This includes the automated recognition of historical manuscripts, the extraction of information and the development of specific language models. Hodel holds a doctorate in history and leads research projects at the University of Bern, including on the tower books of the city of Bern from the early modern period, chat systems for university didactics in the 21st century and the historical telephone directories of Switzerland.
Angela Huang
Project Lead, Research Centre for Hanse and Baltic History, European Hansemuseum Lübeck
Angela leads the Research Centre for Hanse and Baltic History at the European Hansemuseum in Lübeck. Her research has long focused on the history of the German Hanse. With the Lübeck sub-project, she hopes to be able to provide other (Hanseatic) historians with digital tools and methods for their work.
Silke Schwandt
Project Lead, Bielefeld University
Silke is professor of Digital History at Bielefeld University. Her research focus lies on quantitative text analysis with a specialization in medieval history. She also works on the impact of digitality on the methodologies and theories of history as a humanities discipline.
Serena Tolino
Project Lead, University of Bern
Serena has been an associate professor of Islamic and Middle Eastern Studies at the University of Bern since 2020. She researches and teaches on the history of the Middle East, Islamic law, history of gender and sexuality, slavery, strong asymmetrical dependencies and LGBTQI+ rights in the Middle East. She leads several research projects at the University of Bern, including a project on slavery in the Islamic legal sources and one on the history of labour.
Sefer Korkmaz
PhD Researcher, University of Bern
Sefer has been part of The Flow since September 2024. After working extensively with Ottoman legal sources in different phases of his academic journey, he is now developing a novel perspective on the study of Ottoman legal historiography. As part of “The Flow” project, Sefer uses digital tools and machine learning to analyze Arabic & Ottoman Turkish records (sijills) of the Shari‘a courts of Ottoman Jerusalem. He ultimately aims to deepen his interdisciplinary knowledge, believing that integrating multiple disciplines will help him better understand the institutional framework of Ottoman Jerusalem and its complex historical, legal, social, and cultural dynamics.
Christopher Kuhlmann
PhD Researcher, Bielefeld University
Christopher has been part of The Flow since July 2023 and is pursuing his PhD within the project. His main research focus is on ‘essoins’, excuses for not appearing in court in English common law. He analyses court rolls of the 13th and 14th centuries with a focus on group influences and affiliations, as well as intersectionality, using customised HTR and NLP models.
Inga Lange
PhD Researcher, Research Centre for Hanse and Baltic History, European Hansemuseum Lübeck
Inga is in charge of the Lübeck sub-project of The Flow. During her studies, she already worked extensively with manuscript sources from the pre-modern period. In the project, she is interested in analysing the Hanserezesse in detail as one of the most important sources of Hanse history. By using digital methods, she hopes to conduct a long-term study on the development of the German Hanse for the first time.
Dominic Weber
PhD Researcher, University of Bern
Dominic has been part of The Flow since July 2023. In his PhD project he works with the Bernese Tower Books (interrogation protocols from the early modern period). He employs various machine learning applications for handwritten text recognition, information extraction and document clustering. Simultaneously, he theorises the epistemological consequences of conducting historical research based on data generated by machines and humans alike.
Dana Meyer
Developer, Bielefeld University
Dana studies Cognitive Informatics at Bielefeld University. She has been part of The Flow project since July 2023. She is interested in machine learning in the context of natural language processing.
Jonas Widmer
Developer, University of Bern
Jonas has been working for the Digital Humanities at the University of Bern since February 2021. As a data scientist, he has supported The Flow since July 2023, with a focus on Natural Language Processing and Handwritten Text Recognition. He develops and provides digital services and tools for working with digital sources.
Anna Funk
Student Assistant, University of Bern
Anna is a student assistant for the Digital Humanities in Bern and has been working on The Flow since July 2023. She is responsible for data generation and pre-processing. Based on the source corpus of The Flow - the interrogation records of the Bernese Tower Books - she investigated witch trials in early modern Bern in her bachelor’s thesis.
Bart Holterman
Researcher, Research Centre for Hanse and Baltic History, European Hansemuseum Lübeck
Bart is a historian of the late Middle Ages and early modern period, with a specialization in economic history of the Hanseatic region and the application of digital methods. Employed at the FGHO since June 2024, he will assist in transcribing the Hanserezesse for training HTR models.
Mohamed Ilyes Mechentel
Researcher, University of Bern
Ilyes holds a Master’s degree in Digital Technologies Applied to History from the École des Chartes, where he focused on the scientific challenges of processing historical sources through digital technologies. During his studies, he completed an internship at the Bibliothèque universitaire des langues et civilisations (BULAC), working with digital tools applied to Arabic texts. He also interned at the Louvre Museum, where he explored the application of HTR technologies to 19th-century sources. Additionally, he contributed to the CallFront project at the Institut National d’Histoire de l’Art, which aims to collect and analyze scriptural data from the Islamic world. He is currently a Research Associate and Digital Humanities expert in Islamic Studies at the University of Bern.
Melvin Wilde
Student Assistant, Bielefeld University
Melvin works for Bielefeld University as a student assistant in The Flow. There he mostly handles the technical side of historical work, such as annotations, guidelines, transcriptions and data management. He specializes in medieval history with a focus on gender/masculinity studies.
Alumni
Former team members of The Flow. Thank you for your contribution <3