Author: David Sriboonreuang

2015 Society of Biblical Literature (SBL) Annual Meeting

The 2015 SBL conference is the largest gathering of scholars interested in the study of religion in the world. The meeting showcases the latest research and fosters collegial contacts. The 2015 meeting is taking place in Atlanta, GA, and members of the Coptic SCRIPTORIUM team will be presiding and presenting at the conference.

Digital Humanities in Biblical, Early Jewish, and Christian Studies; Papyrology and Early Christian Backgrounds will have Caroline T. Schroeder chairing, taking place in International A (International Level) – Marriott on 11/23/2015 from 9:00 AM to 11:30 AM.

Also,

“A Wild Patience Has Taken Me This Far”: Future Avenues of Feminist Scholarship will also have Caroline T. Schroeder presenting, taking place in Room A602 (Atrium Level) – Marriott from 4:00 PM to 6:30 PM.

and

Art and Religions of Antiquity; Violence and Representations of Violence among Jews and Christians will have Christine Luckritz Marquis chairing, taking place in Room 209 (Level 2) – Hilton on 11/23/2015 from 9:00 AM to 11:30 AM.

We are looking forward to their presentations and hope others are as well!

Coptic NLP pipeline Part 2

With the creation of the Coptic NLP (Natural Language Processing) pipeline by Amir Zeldes, it is now possible to run all our NLP tools at once, without downloading and running each one individually. The web application tokenizes bound groups into words and normalizes the spelling of words and diacritics. It also tags for part of speech, lemmatizes, and tags the language of origin for borrowed (foreign) words. The interface is XML tolerant (it preserves tags in the input), and the output is tagged in SGML. One of the options is to encode the line breaks within a word or sentence, which is useful for encoding manuscripts. However, remember to double-check the results, because the interface is still in beta.

As an example, the screenshot below is a snippet from I See Your Eagerness from manuscript MONB.GL29.

 

Screenshot: input snippet from I See Your Eagerness

Notice that it contains an XML tag encoding a letter as “large ekthetic”. Here the tag marks the letter alpha as a large character set in the left margin of the manuscript’s column of text. This tag will be preserved in the output.

Screenshot: NLP pipeline output for the snippet

The results are shown above. Bound groups are shown along with the part-of-speech tag, abbreviated as “pos”. The snippet from I See Your Eagerness has also been lemmatized, shown as “lemma”. Near the bottom of the screenshot, the language of origin of borrowed (foreign) words in the snippet has been identified as “Greek”. These tags correspond to the annotation layers you see in our multi-layer search and visualization tool, ANNIS.
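To make the idea of an "XML tolerant" interface more concrete, here is a minimal sketch of how text can be processed while tags pass through untouched. It is not the pipeline's actual code: the `hi` element, the `rend` attribute, and both function names are assumptions made for illustration, and stripping combining marks is only a stand-in for real normalization.

```python
import re
import unicodedata

# Matches any XML-style tag; the capturing group makes re.split keep the tags.
TAG_PATTERN = re.compile(r"(<[^>]+>)")


def strip_combining_marks(text):
    """Stand-in for normalization: drop combining marks such as the overline."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


def process_preserving_tags(marked_up, transform):
    """Apply transform to the text segments only, leaving tags untouched."""
    pieces = TAG_PATTERN.split(marked_up)
    return "".join(
        piece if TAG_PATTERN.fullmatch(piece) else transform(piece)
        for piece in pieces
    )


# Hypothetical input: the element and attribute names are assumptions.
sample = '<hi rend="large ekthetic">ⲁ</hi>ⲩⲱ ⲛ\u0305'
result = process_preserving_tags(sample, strip_combining_marks)
print(result)
```

The tag survives unchanged while the combining overline after the final letter is removed, which mirrors in miniature what "preserves tags in the input" means.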

We hope the NLP service serves you well.

 

Introducing the Lemmatizer Tool

A new tool available at the Coptic SCRIPTORIUM webpage is the lemmatizer. The lemmatizer annotates words with their dictionary head word. The purpose of lemmatization is to group together the different inflected forms of a word so they can be analyzed as a single item.

For example, in English, the verb ‘to walk’ may appear as ‘walk’, ‘walked’, ‘walks’, and ‘walking’. The base form, ‘walk’, is the word you would look up in a dictionary, and it is called the lemma of the word.

In Coptic, some nouns have distinct plural forms, and verbs appear in several different forms. A lemmatized corpus lets you search for all the forms of a word at once, and it also makes it possible to link every form of a word to an online dictionary in the future.
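As a rough illustration of the grouping that lemmatization makes possible, the sketch below maps the English forms of ‘walk’ to a shared lemma. The lexicon and the function name are hypothetical and are not part of the Coptic SCRIPTORIUM lemmatizer; a real lemmatizer relies on a much larger lexicon and on part-of-speech information.

```python
from collections import defaultdict

# Hypothetical toy lexicon mapping inflected forms to their lemma.
LEMMA_LEXICON = {
    "walk": "walk",
    "walked": "walk",
    "walks": "walk",
    "walking": "walk",
}


def group_by_lemma(tokens, lexicon):
    """Group tokens under their lemma; unknown tokens fall back to themselves."""
    groups = defaultdict(list)
    for token in tokens:
        lemma = lexicon.get(token.lower(), token.lower())
        groups[lemma].append(token)
    return dict(groups)


tokens = ["She", "walks", "where", "he", "walked", "while", "walking"]
groups = group_by_lemma(tokens, LEMMA_LEXICON)
print(groups["walk"])  # ['walks', 'walked', 'walking']
```

Searching by lemma then amounts to looking up one key instead of listing every inflected form by hand.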

Two of our corpora are annotated with lemmas: Not Because a Fox Barks (Shenoute) and the Apophthegmata. As illustrated in the image below, I have searched for ⲟⲩⲱϩ, to live or dwell.

Screenshot: ANNIS search for the lemma ⲟⲩⲱϩ

Also note that in the corpus list, I have chosen to look in the corpus ‘Not Because a Fox Barks’, as indicated by the highlighted blue selection.

Screenshot: Coptic SCRIPTORIUM ANNIS corpus search

Notice that the word forms corresponding to the lemma I searched for are highlighted in the chosen corpus. Two forms of the verb ⲟⲩⲱϩ appear in the results: ⲟⲩⲱϩ and ⲟⲩⲏϩ. In addition, there is an annotation grid.

Screenshot: desktop view of the search results

Clicking on the annotation grid reveals a wealth of information, including the translation of the text and its parts of speech. Hovering over the grid with your mouse also shows which parts are related. For example, below, the POS (part of speech) is V (verb), and when the mouse hovers over V, a highlight indicates which word in the text the tag refers to.

Screenshots: annotation grid with part-of-speech highlighting

The tool is a feature in our part-of-speech tagger, so you can lemmatize at the same time you annotate a corpus for parts of speech.  See https://github.com/CopticScriptorium/tagger-part-of-speech/.

Additional guidelines are available here:  https://github.com/CopticScriptorium/tagger-part-of-speech/blob/master/Coptic%20SCRIPTORIUM%20lemmatization%20guidelines.pdf

Wishing the NEH a happy 50th Anniversary!

The Coptic SCRIPTORIUM team would like to wish the National Endowment for the Humanities (NEH) a happy 50th anniversary! We would like to thank the NEH for supporting Coptic SCRIPTORIUM. Cheers to the NEH!

Image: NEH 50th anniversary

Releasing new translation of section of Shenoute’s Acephalous Work 22

An English translation of part of Shenoute’s Acephalous Work 22 is now available. Anthony Alcock of the University of Kassel has contributed a translation of White Monastery Manuscript YA (MONB.YA) pages 421-28. This section corresponds to Leipoldt’s vol. 4, pp. 124-29. Coptic, English, and various annotations are available. Many thanks to Dr. Alcock for the contribution! We are in the process of making a major addition to our website functionality to enable you to read and find these texts more easily. In the meantime, you can access the text via our ANNIS search and visualization tool. Click on the little page icon next to the shenoute.a22 corpus listing to see the visualizations.

Screenshot: list of corpora in ANNIS

Read the English translation directly in the linguistic analysis view; read it as a pop-up when you hover over the Coptic in the normalized view.

Screenshot: list of visualizations in ANNIS

Or search the English in ANNIS using a search string; to search for the word “work” in the English translations of Acephalous Work 22, use translation=/.*work.*/.
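The leading and trailing `.*` in that search string matter because, as we understand ANNIS, a regular expression has to match the whole annotation value rather than just part of it. The sketch below imitates that behavior with Python's `re.fullmatch`. The sample translations are invented for illustration, the function name is hypothetical, and Python's regex dialect is assumed to be close enough to the one ANNIS uses for this simple pattern.

```python
import re

# Hypothetical translation annotation values, invented for illustration.
translations = [
    "He did the work of the Lord",
    "They were working in the field",
    "Peace be with you",
]


def search_annotation(values, pattern):
    """Return the values that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value)]


hits_wild = search_annotation(translations, r".*work.*")
hits_bare = search_annotation(translations, r"work")
print(hits_wild)  # the two values containing "work"
print(hits_bare)  # [] because no value consists of "work" alone
```

Note that `.*work.*` also matches longer words such as “working”, so results may need a second look if only the exact word is wanted.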

(Originally posted in March 2015 at http://copticscriptorium.org/)

Entire Sahidica New Testament now available

The entire Sahidica New Testament (machine-annotated) is now available. It has been tokenized and tagged for part of speech entirely automatically, using our tools. There has been no manual editing or correction. Visit our corpora for more information, or just jump in and search it in ANNIS.

 

(Originally posted in March 2015 at http://copticscriptorium.org/)

Previous Digital Coptic 2 Symposium and Workshop

On March 12-13, 2015, we hosted the Digital Coptic 2 Symposium and Workshop at Georgetown University. The full program is online. Day 1 featured presentations from scholars around the world working in Coptic and/or Digital Humanities. Day 2 provided tutorials on Coptic SCRIPTORIUM along with discussions about future research. Watch the videos of the presentations on our DC 2 YouTube channel, and read the Twitter backchannel at #copticdh.

(Originally posted in March 2015 at http://copticscriptorium.org/)

Corpora and how to use ANNIS

Coptic SCRIPTORIUM provides Coptic texts for reading, analysis, and complex searches. For a full list of our text corpora, please click here. We have also added explanations of key people and terms on our main site. A video tutorial by Amir Zeldes and Caroline T. Schroeder on how to search our database using ANNIS is also available.

 

(Originally posted in December of 2014 at http://copticscriptorium.org/)

Digital Coptic workshop and symposium at Georgetown University

Coptic SCRIPTORIUM is hosting a second workshop and symposium on Digital Humanities and Coptic Studies on March 12-13, 2015, at Georgetown University in Washington, DC. Program and registration information is available.

 

(Previously posted in December 2014 on http://copticscriptorium.org/index.html)

Fall 2014 Release Notes

A new release of material has been added to http://www.copticscriptorium.org:  more Sayings from the Coptic Apophthegmata Patrum, chapters 1 of Corinthians and additional chapters of the Gospel of Mark. Other release notes can be found here.

(Originally posted in Fall 2014 on http://www.copticscriptorium.org.)
