Raiders of the Lost Corpus


Coptic represents the last phase of the Egyptian language and is pivotal for a wide range of disciplines, such as linguistics, biblical studies, the history of Christianity, Egyptology, and ancient history. It was also essential for “cracking the code” of the Egyptian hieroglyphs. Although digital humanities has been hailed as distinctly interdisciplinary, enabling new forms of knowledge by combining multiple forms of disciplinary investigation, technical obstacles stand in the way of creating a resource useful to both linguists and historians, for example. The nature of the language (outside of the Indo-European family) also requires its own approach. This paper will present some of the challenges, both digital and material, in creating an online, open source platform with a database and tools for digital research in Coptic. It will also propose standards and methodologies to move forward through those challenges. This paper should be of interest not only to scholars in Coptic but also to others working on what are traditionally considered more “marginal” language groups in the pre-modern world, and to researchers working with corpora that have been removed from their original ancient or medieval repositories and fragmented or dispersed.
The dry desert of Egypt has preserved for centuries the parchment and papyri that provide us with a glimpse into the economy, literature, religion, and daily life of ancient Egyptians. During the Roman period of Egyptian history, many texts were written in the Coptic language. Coptic is the last phase of the ancient Egyptian language family and is derived ultimately from the ancient Egyptian hieroglyphs of the pharaonic era. Digital and computational methods hold promise for research in the many disciplines that use Coptic literature as primary sources: biblical studies, church history, Egyptology, and linguistics, to name a few. Yet few digital resources exist to enable such research.
This essay outlines the challenges to developing a digital corpus of Coptic texts for interdisciplinary research. These challenges are both material (arising from the history and politics of the physical corpus itself) and theoretical (arising from recent efforts to digitize the corpus). We also sketch out some solutions and possibilities, which we are developing in our project Coptic SCRIPTORIUM.
Digital Humanities has defined itself as a field that can enable research on a new scale, whether through distant reading of large text corpora, aggregation of large visual media collections, or discovery through future querying and algorithmic research [Moretti 2013] [Greenhalgh 2008] [Witmore 2012]. Critical Digital Humanities scholars remind us that digitization initiatives sometimes replicate the Western canon rather than expand it, and that digitization is not in and of itself a more equitable mode of scholarship existing outside of politics [Wernimont 2013] [Wilkins 2012]. Digital tools and corpora for Coptic language and literature, we argue, can expand humanistic research not merely in terms of scale but also in scope, especially in ancient studies and literature. Large English, Greek, and Latin corpora, as well as the tools to create, curate, and query them, have been foundational for work in the Digital Humanities. Computational studies on the documents from late antique Egypt can facilitate academic inquiry across traditional disciplines as well as transform our canon of Digital Classics and Digital Humanities scholarship.
Part I: Shenoute of Atripe and the Scriptorium of Doom
Of the several dialects of Coptic that developed in late antiquity, the Sahidic dialect is considered the early classical dialect. Much of the surviving Coptic literature in Sahidic comes from one important late antique repository: the White
DHQ: Digital Humanities Quarterly: