A Java library for analyzing the morphology and part-of-speech information of Latvian words. Accurate analysis relies on lexeme data that is periodically updated from the Tezaurs.lv database. The library also generates all inflections of a word and provides crude statistical disambiguation of analysis results.
Analyzer analyzer = new Analyzer();

// analysis
Word result = analyzer.analyze("roku");
for (Wordform wf : result.wordforms) {
    wf.describe();
}

// generation of inflections
List<Wordform> wordforms = analyzer.generateInflections("rakt");
for (Wordform wf : wordforms) {
    wf.describe();
}
Review unit tests for more examples.
Use Maven to build and deploy. Published releases should be available at Maven Central: https://central.sonatype.com/artifact/lv.ailab.morphology/morphology
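As a minimal sketch, a project that consumes a published release could declare the dependency in its pom.xml roughly as follows. The groupId and artifactId are read off the Maven Central URL above; the version is a placeholder, not a real release number, and should be replaced with one listed on Maven Central.

```xml
<dependency>
  <groupId>lv.ailab.morphology</groupId>
  <artifactId>morphology</artifactId>
  <!-- Placeholder (assumption): replace with a release version listed on Maven Central -->
  <version>VERSION</version>
</dependency>
```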
Packaging instructions are in docs/deployment.md.
(c) Institute of Mathematics and Computer Science, University of Latvia, 2005-2026
This software is licensed under the GNU General Public License. Commercial licensing is available if necessary; contact us at [email protected].
Current usage is described at http://www.ep.liu.se/ecp_article/index.en.aspx?issue=085;article=024
The initial core algorithm is published at http://www.semti-kamols.lv/doc_upl/Kamols-Kaunas-paper-3.pdf
P.S. Since 2026-02, development of the morphology library has continued here.