New Corpora Release 4.5.0

Image: a search in ANNIS for “people” in Shenoute’s *God Who Alone is True*

We are pleased to announce release 4.5.0 of Coptic Scriptorium! Our data now includes over 1,278,500 tokens of searchable, linguistically analyzed Coptic data from dozens of ancient Coptic works (an increase of over 11,500 tokens from the previous release).

This release corrects a large number of consistency errors identified in our existing data, and also adds some new documents:

- Revisions to five works of Besa
- Sections of three works by Shenoute of Atripe
- New documents added to existing works:
  - Acephalous Work 22
  - Apophthegmata Patrum
- Newly treebanked data with syntactic gold standard annotations for 1 Corinthians 7

We are very grateful to all of our collaborators and contributors, without whom this project could not function. We welcome Christine Ayad, Lydia Bremer-McCollum, Adeline Harrington, and Nina Speranskaja.

As with all releases, raw machine-readable data for all corpora, including morphological and syntactic analysis as well as named entity recognition and entity linking, is available in our GitHub repository in a variety of popular formats:

https://github.com/copticscriptorium/corpora

You can also search for complex linguistic annotations in the data using our ANNIS server. To get started, please see our new tutorial, which includes query tips and a helpful cheat sheet:

https://copticscriptorium.org/ANNIS_tutorial

We hope this release will be useful and look forward to the next one as always!

Coptic SCRIPTORIUM Blog
