Thank you to Amir and the staff at Georgetown University. Most of our public applications, such as the ANNIS database and the Coptic Dictionary Online, are back in service.
We are sorry to report that the server that hosts the Coptic Dictionary Online and Coptic Scriptorium’s ANNIS database is down. (Likewise, some of the NLP tools and internal tools like GitDox are down.)
We are working on fixing the problem, but for now we do not have a timeline for when they will be up and running.
In the meantime, reading and browsing texts at http://data.copticscriptorium.org still work.
Thank you for your patience! We will let you know when the systems are up again.
It is our pleasure to announce release 4.3.0 of Coptic Scriptorium corpora, which currently cover over 1,175,000 tokens of searchable, linguistically analyzed Coptic data from dozens of ancient Coptic works. New in this release:
Improvements and error corrections to a variety of works (including Because of You Too O Prince of Evil, Dormition of John, Book of Ruth and Homilies of Proclus)
The newly released material encompasses over 57,000 tokens of semi-automatically annotated data. We would like to give special thanks to the Marcion Project for making much of the underlying digitized text available, and to the annotators whose hard work has made this release possible. As with all releases, raw machine-readable data for all corpora, including morphological and syntactic analysis as well as named entity recognition and entity linking, can be found on our GitHub repository in a variety of popular formats:
We are always excited to see what kind of work people are doing with our project. Please get in touch if you’ve been using the dictionary or any of Coptic Scriptorium’s tools, corpora, annotations, etc., in your work!
The online dictionary is part of the KELLIA collaboration between Coptic Scriptorium (Georgetown University and the University of Oklahoma), the Berlin-Brandenburg Academy, the Goettingen Academy, the Free University in Berlin, and Goettingen University.
It is our pleasure to announce the latest data release from Coptic Scriptorium, version 4.2.0. This release contains both new Coptic material and additions to older datasets, and expands our entity annotations and named-entity linking to all of our data, including the semi-automatically annotated Old Testament. This also means automatic updates to all of our interfaces, such as the recently added example usage functionality in the Coptic Dictionary Online, which is linked to the corpora.
The new material, including more digitized data courtesy of the Marcion project as well as manually digitized and corrected OCR data from out-of-print editions, includes:
More Apophthegmata Patrum (work by Christine Luckritz Marquis, So Miyagawa, Caroline T. Schroeder and Amir Zeldes)
Further material from Shenoute’s works:
God Says Through Those Who Are His (including parallel witnesses and new material, data courtesy of David Brakke, annotations by Rebecca Krawiec, Lance Martin, Dana Robinson, Caroline T. Schroeder)
Acephalous Work 22 (data courtesy of David Brakke, annotations by Elizabeth Davidson, Rebecca Krawiec, Elizabeth Platte, Caroline T. Schroeder, Amir Zeldes)
More syntactically annotated gold treebanked data in the Coptic Treebank
Completely re-annotated Old Testament corpus, based on the base text courtesy of the Digital Edition of the Coptic Old Testament (CoptOT) project – with improved segmentation and parsing, now complete with semi-automatic entity recognition and linking to Wikipedia entries for people and places
With this new release, the semi-automatically annotated data (excluding automatically processed Bible materials) in the project covers close to 300,000 words of Sahidic Coptic annotated for entities.
This release represents a tremendous amount of work over the past few months by the Coptic Scriptorium team. We would also like to thank individual contributors (whom you can always find in the ‘annotation’ metadata for each document), and specifically So Miyagawa for help with Coptic OCR models, as well as the Marcion and CoptOT projects for sharing their data with us, and the National Endowment for the Humanities for supporting us. We are continuing to work on more data, links to other resources and new kinds of annotations and tools. Please let us know if you have any feedback!
We are pleased to announce the latest release of data from Coptic Scriptorium, version 4.1.0. The new release adds new Coptic texts and annotations, highlighted by the application of named and non-named entity annotation to our New Testament corpus.
In total, we released approximately 40,000 tokens of manually edited text in 17 documents from new works, as well as adding material to already existing works. The new material, including more digitized data courtesy of the Marcion project, the Kyprianos Magical Text Database, and other scholars, includes:
Life of John the Kalybites, parts 1 and 2 (annotations by Lance Martin, Tamara Siuda, and Caroline T. Schroeder)
We are especially excited to announce the first release of several magical papyri and an ostracon on the Coptic Scriptorium platform in collaboration with the Kyprianos team at the University of Würzburg:
Magical Texts (Korshi Dosoo, Edward O. D. Love, Markéta Preininger, Lance Martin, Caroline T. Schroeder, and Amir Zeldes)
Expansions and improvements of existing corpora:
Apa Johannes Canons (Diliana Atanassova, Caroline T. Schroeder, Lance Martin, and Amir Zeldes)
Apophthegmata Patrum (Marina Ghaly, Christine Luckritz Marquis, Caroline T. Schroeder)
We have extended our semi-automatic entity annotation coverage to encompass our New Testament material (over 248,000 tokens). Entity annotations, like our other annotations, were added to these specific corpora automatically and include:
The classification of all non-pronominal references to people, places and other entities into 10 entity categories
Entity linking:
Linking of all named entities which have corresponding Wikipedia articles to their respective Wikipedia entries, including geo-location information where available
This addition complements the existing named and non-named entity annotations of our entire collection of Coptic corpora.
We would also like to thank individual contributors (which you can always find in the ‘annotation’ metadata for each document), each of whom put in a colossal amount of work, and the Marcion and Kyprianos projects who shared their data with us, as well as the National Endowment for the Humanities for supporting us. We are continuing to create more data and tools. Please let us know if you have any feedback!
The “Thesaurus Linguae Aegyptiae” project (“Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache”, BBAW), the “Database and Dictionary of Greek Loanwords in Coptic” (DDGLC, Freie Universität Berlin), and “Coptic Scriptorium: Digital Research in Coptic Language and Literature” are pleased to announce the latest release of the “Comprehensive Coptic Lexicon”: Version 1.2. The raw data can be downloaded from:
D. Burns, F. Feder, K. John, M. Kupreyev, et al. 2020-07-24. Comprehensive Coptic Lexicon: Including Loanwords from Ancient Greek, Berlin: Freie Universität Berlin, http://dx.doi.org/10.17169/refubium-27566
The processed data has been published by the
Coptic Dictionary Online, ed. Koptische/Coptic Electronic Language and Literature International Alliance (KELLIA), https://coptic-dictionary.org/
The major new features include:
Standardized use of parentheses “( )” in word forms.
Optimized data structure (e.g., <sense/> element now contains a unique ID, facilitating the ongoing work on linking CCL to the databases of semantic relations such as Coptic WordNet).
Correction of orthographic, grammatical and semantic information of the existing entries and addition of new entries.
Linking to the Perseus Greek morphology tool via the Greek head words. DDGLC lemma IDs are now displayed in the entry view of Coptic Dictionary Online.
Improved usability of the section of Greek loanwords due to exclusion or change of a number of senses.
Link to attestation search for nouns filtered by entity-type (e.g., search for ⲟⲩⲟⲛ standing for a person, an animal, or an inanimate object) in Coptic Scriptorium.
Phrase network visualization of most common word sequences containing nouns, verbs and prepositions.
The Comprehensive Coptic Lexicon V 1.2 now contains 11263 entries and 31847 forms of Egyptian-Coptic and Greek-Coptic datasets. TLA, DDGLC and Coptic Scriptorium invite you to take a look at the new data and would welcome your feedback.
It is our great pleasure to announce the latest release of data from Coptic Scriptorium, version 4.0.0. This release contains both new Coptic material and extensive additions to our suite of tools and annotations, focusing on the addition of support for entity annotation and named-entity linking across our new and old datasets. The new material, including more digitized data courtesy of the Marcion project and other scholars, includes:
John of Constantinople, on Penitence and Abstinence (annotations by Mitchell Abrams, Lance Martin and Amir Zeldes)
Pseudo-Chrysostom: (Elizabeth Davidson, Mitchell Abrams, Lance Martin, Amir Zeldes)
More Apophthegmata Patrum (Hayley Curtis, Elizabeth Davidson, Duncan Feiges, Elizabeth Platte, Caroline T. Schroeder, Amir Zeldes)
Further material from Shenoute’s works:
God Says Through Those Who Are His (including parallel witnesses and new material, data courtesy of David Brakke, annotations by Rebecca Krawiec, Lance Martin, Dana Robinson, Caroline T. Schroeder)
Some Kinds of People Sift Dirt (data courtesy of David Brakke, annotations by Christine Luckritz Marquis, Caroline T. Schroeder, Amir Zeldes)
With this new release, the semi-automatically annotated data (excluding automatically processed Bible materials) in the project covers close to 260,000 words of Sahidic Coptic annotated for entities, including 50,000 words of gold-standard treebanked data with manual syntactic analyses.
In addition to new texts, new tools and analyses have been added to the project:
Complete entity annotation, classifying all non-pronominal references to people, places and other entities into 10 entity categories
Entity linking:
Linking of all named entities which have corresponding Wikipedia articles to their respective Wikipedia entries, including geo-location information where available
A browseable index of people and places mentioned in the texts, also linked to Wikipedia and Google Maps and including both real and fictional entities
A new neural parser adapted for Coptic with higher accuracy syntactic analyses, which are deployed in ANNIS (work by Luke Gessler)
This release represents a tremendous amount of work over the past few months by the entire Coptic Scriptorium team. We would also like to thank individual contributors (which you can always find in the ‘annotation’ metadata for each document) and the Marcion and PAThs projects who shared their data with us, and the National Endowment for the Humanities for supporting us. We are continuing to work on more data, links to other resources and new kinds of annotations and tools. Please let us know if you have any feedback!
The program for the third edition of Digital Coptic is now online. Check out the workshop website for the list of projects, talks and presenters.
Please join us for the workshop on July 12 and 13 – participants will receive a Zoom link and password for interactive presentations and discussion, and the workshop will also be cast to YouTube for larger audiences and offline viewing after the workshop. We look forward to hearing all the talks!