Author: ctschroeder (page 1 of 4)

New links for tools and services

After our recent server outage, we’ve been re-installing our tools and software. Some of our services are now available at new URLs.

The ANNIS database is now at https://annis.copticscriptorium.org/annis/scriptorium

Our Sahidic Coptic natural language processing tools are at https://tools.copticscriptorium.org/coptic-nlp

Our GitDox annotation tool is at https://tools.copticscriptorium.org/gitdox/scriptorium

The Coptic Dictionary online is still at https://coptic-dictionary.org, and our tool for browsing and reading texts is still at https://data.copticscriptorium.org

Thanks for your patience!

Coptic Scriptorium services are back online!

Thank you to Amir and the staff at Georgetown University. Most of our public applications, such as the ANNIS database and the Coptic Dictionary Online, are back in service.

Coptic Dictionary and ANNIS database down

We are sorry to report that the server that hosts the Coptic Dictionary Online and Coptic Scriptorium’s ANNIS database is down. (Some of the NLP tools and internal tools like GitDox are down as well.)

We are working on fixing the problem, but for now we do not have a timeline for when they will be up and running.

In the meantime, reading and browsing texts at http://data.copticscriptorium.org still work.

Thank you for your patience! We will let you know when the systems are up again.

Example of research using the online Coptic Dictionary: standalone G Thomas transcription

Martijn Linssen, an independent researcher, has been working on the Gospel of Thomas for some time and recently published a stand-alone “interactive Coptic-English translation” of the Gospel of Thomas on his Academia.edu site. The Coptic is linked to entries in the online Coptic Dictionary! We invite you to check it out!

We are always excited to see what kind of work people are doing with our project. Please get in touch if you’ve been using the dictionary or any of Coptic Scriptorium’s tools, corpora, annotations, etc., in your work!

The online dictionary is part of the KELLIA collaboration between Coptic Scriptorium (Georgetown University and the University of Oklahoma), the Berlin-Brandenburg Academy, the Goettingen Academy, the Free University in Berlin, and Goettingen University.

New release of Natural Language Processing Tools

Amir Zeldes and Luke Gessler have spent much of the past summer improving Coptic Scriptorium’s Natural Language Processing tools, and are now happy to announce the release of Coptic-NLP V3.0.0. You can read more about what we’ve been doing and the impact on performance in our three-part blog post (part 1, part 2, part 3). Some of the new improvements include:

  • A new three-step normalization framework, which allows us to hypothetically normalize bound groups before deciding how to segment them, then normalize each segment again
  • A smart rebinding module which can handle deciding to merge split bound groups based on context (useful for processing messy texts with line-breaks mid word, or other segmentation anomalies)
  • A re-implemented segmentation algorithm which is especially better at handling ambiguous groups in context (e.g. “nau” in “peja|f na|u” vs. “nau ero|f”) and spelling variation
  • A brand new, more accurate part of speech tagger
  • Higher accuracy across tools thanks to hyperparameter optimization
  • More robust test suite to ensure new errors don’t creep in
  • Various data/lexicon/ruleset improvements and bugfixes
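
For readers curious about what "normalize, segment, normalize again" can look like in practice, here is a minimal sketch in Python. It is a toy illustration of the three-step idea only, not the Coptic-NLP implementation: the lookup table, the function names, and the cleanup rule (lowercasing and removing hyphens) are all assumptions made for this example, and the forms are the transliterated ones used above.

```python
# Toy illustration only. The table, the function names, and the cleanup
# rule are hypothetical and are not taken from the Coptic-NLP code base.

# Hypothetical lookup: how a normalized bound group splits into segments.
SEGMENTATIONS = {
    "pejaf": ["peja", "f"],
    "erof": ["ero", "f"],
}


def normalize_group(group):
    """Step 1: build a normalized hypothesis for the whole bound group."""
    return group.strip().lower().replace("-", "")


def segment(group):
    """Step 2: segment the normalized group; unknown groups stay whole."""
    return SEGMENTATIONS.get(group, [group])


def process(group, segment_variants=None):
    """Run all three steps on one bound group.

    segment_variants is an optional, caller-supplied mapping from a
    segment spelling to its normalized form (step 3).
    """
    variants = segment_variants or {}
    hypothesis = normalize_group(group)
    segments = segment(hypothesis)
    return [variants.get(s, s) for s in segments]
```

With these toy tables, process("Peja-f") returns ["peja", "f"]. A group such as "nau" is simply left whole here; choosing between "na|u" and "nau" depends on context, which this sketch does not attempt.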

You can download the latest version of the tools here:

https://github.com/CopticScriptorium/coptic-nlp/

Or use our web interface, which has been updated with the latest version:

https://corpling.uis.georgetown.edu/coptic-nlp/

We appreciate your feedback and comments, and hope to release more data processed with these tools very soon!

On the Road Summer 2019

Coptic Scriptorium is busy this summer conference season.

I had the privilege of teaching one of the Sunoikisis Digital Classicist summer sessions earlier in July.

The UCLA-St Shenouda Society conference participants, 2019

I also presented some research on girls and girlhood using the Coptic Scriptorium Corpora and the Online Coptic Dictionary at the annual UCLA-St. Shenouda Society Coptic Studies Conference.  This year was the 20th anniversary conference, and the theme was Shenoute and the White Monastery.

C. Schroeder presenting at ACH 2019; photo courtesy Melissa Dollman via Twitter

This week, the American Digital Humanities organization, the Association for Computers and the Humanities, held a conference in Pittsburgh.  There I talked about colonialism, Coptic manuscripts, and resisting continuing colonialist tendencies in digitizing these manuscripts.

Meanwhile we’ve also been working on digitizing and annotating more texts, which we hope to release in the fall.

Happy summer everyone!

Congratulating our colleagues!

Two pillars in the fields of Digital Humanities, cultural heritage, and the manuscripts of the Eastern Mediterranean world received honors this month, and we at Coptic Scriptorium wish to congratulate them both.

Tito Orlandi at DH2019, photo by Fabio Ciotti via Twitter

Dr. Tito Orlandi was awarded the Busa Prize for lifetime achievement by the Alliance of Digital Humanities Organizations at the annual Digital Humanities Conference in Utrecht.  This honor is bestowed only every three years and thus is quite a distinguished award. Tito’s work in text encoding, developing stable identifiers for manuscripts, digital lexica, and digitization has been foundational for Coptic Studies.  He founded the Corpus dei Manoscritti Copti Letterari (CMCL) project.

Columba Stewart, OSB, D.Phil, photograph from the HMML site

Dr. Columba Stewart, Director of the Hill Museum and Manuscript Library at St. John’s University, has been named the Jefferson Lecturer by the National Endowment for the Humanities for 2019.  Luminaries who have previously received this honor include Toni Morrison and John Updike.  Columba’s scholarship on early monasticism, especially Evagrius and Cassian, is well-known, widely respected, and oft-cited.  He is being honored by the NEH in particular for his work at HMML collaborating with communities in the Middle East to photograph and preserve manuscripts from both Christian and Muslim communities and traditions that are endangered for various political, cultural, and geographic reasons.

On a personal note, Tito has been a supportive colleague since long before Coptic Scriptorium existed.  At my first Congress of the International Association of Coptic Studies in Leiden in 2000, Tito chaired the session in which I gave my paper.  I will never forget that when my slides first went up on the screen with one of my photographs of the White Monastery Church, he warmly remarked how happy it made him to see the White Monastery.  This sounds like a small thing, but for a grad student at this international conference for the first time, it was a reassuring way to start my paper.  When we began Coptic Scriptorium, Tito shared with us his digital lexica, which allowed us to shave at least a year off of our labors. Conversations with Tito over the years have enriched our work.

Likewise, Columba has been a kind and generous colleague and mentor since we first met in 1999 at the Oxford Patristics Conference.  Columba’s research on early monasticism has inspired me for a long time, and his work at HMML and the vHMML online reading room is a model for public-facing cultural heritage preservation and collaborations between American scholars and heritage communities in the Middle East.  Columba’s work is sometimes framed as saving manuscripts from ISIS, but Columba himself talks about the American role in the loss of cultural heritage in the Middle East and is, in my opinion, open about the geopolitical and colonialist power dynamics at work. As I said more informally to some friends on social media, Columba is 100% the real deal.

Additionally, for those of us who work on Christianity in the ancient Eastern Mediterranean and the languages and manuscripts of these communities, these two awards cast a warm glow over the whole field.  Thank you Columba and Tito for your work, and thank you to the ADHO and the NEH for honoring them and by extension their areas of work.

A warm, sincere congratulations to you both!

Comprehensive Coptic Lexicon v1 on Coptic Dictionary Online

The “Database and Dictionary of Greek Loanwords in Coptic” (DDGLC, Freie Universität Berlin), the research project “Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache” (“Thesaurus Linguae Aegyptiae”, TLA, Berlin-Brandenburgische Akademie der Wissenschaften), and “Coptic Scriptorium: Digital Research in Coptic Language and Literature” are happy to announce the release of version 1 of the “Comprehensive Coptic Lexicon”. The processed data has been published by the Coptic Dictionary Online:

  • Coptic Dictionary Online, ed. by the Koptische/Coptic Electronic Language and Literature International Alliance (KELLIA), https://coptic-dictionary.org/

The raw data can be downloaded at:

  • D. Burns, F. Feder, K. John, M. Kupreyev, et al. 12.5.2019. Comprehensive Coptic Lexicon: Including Loanwords from Ancient Greek, Berlin: Freie Universität Berlin, https://doi.org/10.17169/refubium-2333

The Comprehensive Coptic Lexicon includes ca. 8,000 Egyptian-Coptic lemmata with ca. 10,000 word forms, as well as ca. 3,250 Greek-Coptic lemmata with ca. 10,000 forms.

DDGLC, TLA and Coptic Scriptorium invite you to take a look at the new data and would welcome your feedback.

We’re hiring!

We are hiring a Digital Humanities specialist!  The full job ad is on the Georgetown U HR site. In short:

  • this is a part-time (12-15 hrs/week) paid DH Specialist position (so paid but no benefits);
  • knowledge of Coptic “strongly preferred”;
  • needs to be able to work in the United States legally;
  • living in Washington, DC, preferred but not required; remote OK if willing to attend virtual meetings and travel to DC on occasion for team meetings;
  • digital skills a plus but not required

Perfect for a grad student or other researcher in Classics, Linguistics, Near Eastern Studies, Ancient History, Papyrology, Religious Studies, etc., looking to add part-time work to their schedule AND expand their digital skill set.  Apply via the site! We are looking to hire soon.

Spring 2019 Corpora Release 2.7.0

We at Coptic Scriptorium are pleased to release version 2.7.0 of our corpora.  The release includes several new documents:

  • several more sayings in the Coptic Apophthegmata Patrum (edited & annotated by Marina Ghaly)
  • additional fragments of Shenoute’s sermon Some Kinds of People Sift Dirt (edited & annotated by Christine Luckritz Marquis, editions provided by David Brakke)
  • Besa’s letter On Vigilance (edited and annotated by So Miyagawa and others)
  • several more fragments of the monastic canons of Apa Johannes (annotated by Elizabeth Platte and Caroline T. Schroeder, digital edition provided by Diliana Atanassova)

All documents have metadata for word segmentation, tagging, and parsing to indicate whether those annotations are machine annotations only (automatic), checked for accuracy by an expert in Coptic (checked), or closely reviewed for accuracy, usually as a result of manual parsing (gold).
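
As a rough illustration of how these status labels can be used when picking data for a study, here is a minimal Python sketch. The document records, the field names, and the ordering automatic < checked < gold are assumptions made for this example; they do not reflect the actual metadata schema of the corpora.

```python
# Hypothetical sketch: the records and field names below are invented
# for illustration and do not mirror the real corpus metadata schema.

# Assumed ordering of annotation status, from least to most reviewed.
LEVELS = {"automatic": 0, "checked": 1, "gold": 2}

# Made-up document records with one status per annotation layer.
DOCS = [
    {"title": "doc_a", "segmentation": "gold", "tagging": "gold", "parsing": "gold"},
    {"title": "doc_b", "segmentation": "checked", "tagging": "checked", "parsing": "automatic"},
    {"title": "doc_c", "segmentation": "automatic", "tagging": "automatic", "parsing": "automatic"},
]


def select(docs, layer, minimum):
    """Return titles of documents whose `layer` status is at least `minimum`."""
    threshold = LEVELS[minimum]
    return [d["title"] for d in docs if LEVELS[d[layer]] >= threshold]
```

For example, select(DOCS, "tagging", "checked") keeps doc_a and doc_b, while select(DOCS, "parsing", "gold") keeps only doc_a.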

You can search all corpora at https://corpling.uis.georgetown.edu/annis/scriptorium and download the data in 4 formats (relANNIS database files, PAULA XML files, TEI XML files, and SGML files in Tree-tagger format).

Our total annotated corpora are now at over 780,000 words; corpora that have human editors who reviewed the machine annotations amount to over 100,000 words.

Enjoy!
