The tokenizer has been updated! Version 3.0 is now on GitHub. It introduces a training data component that learns from our annotators’ most common tokenization and correction practices. The tokenizer takes Coptic text that has been segmented into bound groups and breaks it into morphemes for analysis and annotation.

(Originally posted on copticscriptorium.org on 5/22/15.)