rozac/morfologik-stemming
MORFOLOGIK
==========
FSA (automata), stemming, dictionaries and tools. Tools quickstart:
java -jar lib/morfologik-tools-${version}-standalone.jar
MODULES
=======
This project provides:
morfologik-fsa:
- Creation of byte-based, efficient finite state automata in Java, including
custom, efficient data storage formats.
- Compatibility with FSA5, the binary format of finite state automata produced
by Jan Daciuk's "fsa" package.
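To illustrate what "byte-based" means here, the following is a minimal
conceptual sketch, not the morfologik-fsa API and not the FSA5 format. It
assumes words are encoded as UTF-8 and stored as paths of byte-labeled
transitions in a plain, unminimized trie. The class name ByteAutomatonSketch
and its methods are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; it is NOT part of morfologik-fsa.
final class ByteAutomatonSketch {
    // One automaton state: outgoing transitions keyed by a byte label.
    private static final class State {
        final Map<Byte, State> next = new HashMap<>();
        boolean accepting;
    }

    private final State root = new State();

    // Adds a word as a path of UTF-8 bytes (plain trie, no minimization).
    void add(String word) {
        State s = root;
        for (byte b : word.getBytes(StandardCharsets.UTF_8)) {
            s = s.next.computeIfAbsent(b, k -> new State());
        }
        s.accepting = true;
    }

    // Follows the byte path of the word; true only if it ends in an accepting state.
    boolean accepts(String word) {
        State s = root;
        for (byte b : word.getBytes(StandardCharsets.UTF_8)) {
            s = s.next.get(b);
            if (s == null) {
                return false;
            }
        }
        return s.accepting;
    }

    public static void main(String[] args) {
        ByteAutomatonSketch fsa = new ByteAutomatonSketch();
        fsa.add("kot");
        fsa.add("żółw"); // non-ASCII letters span several byte transitions
        System.out.println(fsa.accepts("kot"));
        System.out.println(fsa.accepts("ko"));
        System.out.println(fsa.accepts("żółw"));
    }
}
```

A real automaton would additionally merge equivalent states and use a compact
serialized layout; the sketch only shows the byte-labeled transition idea.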
morfologik-stemming:
- FSA-based stemming interfaces and dictionary metadata.
morfologik-polish:
- Precompiled dictionary of inflected forms, stems and tags for the Polish
language built on top of a large dictionary.
morfologik-tools:
- Command-line tools to preprocess, build and dump finite state automata and
dictionaries.
- For an up-to-date list of all available tools, type:
java -jar lib/morfologik-tools-${version}.jar
morfologik-speller:
- Simplistic automaton-based spelling correction (suggester).
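As a rough idea of how automaton-based suggestion can work, the sketch below
walks a character trie while maintaining one row of the Levenshtein distance
table per visited state, and prunes any branch whose best possible distance
already exceeds the limit. This is a generic technique shown under stated
assumptions, not the morfologik-speller implementation or its API; the class
SpellerSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; it is NOT part of morfologik-speller.
final class SpellerSketch {
    private static final class Node {
        // TreeMap keeps children sorted, so suggestions come out in lexicographic order.
        final Map<Character, Node> next = new TreeMap<>();
        boolean accepting;
    }

    private final Node root = new Node();

    // Adds a dictionary word to the trie.
    void add(String word) {
        Node n = root;
        for (char c : word.toCharArray()) {
            n = n.next.computeIfAbsent(c, k -> new Node());
        }
        n.accepting = true;
    }

    // Returns all dictionary words within maxDist edits of the given word.
    List<String> suggest(String word, int maxDist) {
        List<String> out = new ArrayList<>();
        int[] firstRow = new int[word.length() + 1];
        for (int i = 0; i < firstRow.length; i++) {
            firstRow[i] = i; // distance from the empty prefix to word[0..i)
        }
        walk(root, new StringBuilder(), word, firstRow, maxDist, out);
        return out;
    }

    private void walk(Node node, StringBuilder prefix, String word,
                      int[] prev, int maxDist, List<String> out) {
        if (node.accepting && prev[word.length()] <= maxDist) {
            out.add(prefix.toString());
        }
        for (Map.Entry<Character, Node> e : node.next.entrySet()) {
            char c = e.getKey();
            // Compute the next Levenshtein row for prefix + c.
            int[] row = new int[prev.length];
            row[0] = prev[0] + 1;
            int min = row[0];
            for (int i = 1; i < row.length; i++) {
                int cost = word.charAt(i - 1) == c ? 0 : 1;
                row[i] = Math.min(Math.min(row[i - 1] + 1, prev[i] + 1), prev[i - 1] + cost);
                min = Math.min(min, row[i]);
            }
            // Descend only if some continuation can still stay within maxDist.
            if (min <= maxDist) {
                prefix.append(c);
                walk(e.getValue(), prefix, word, row, maxDist, out);
                prefix.setLength(prefix.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        SpellerSketch speller = new SpellerSketch();
        for (String w : new String[] {"kot", "kod", "koty", "dom"}) {
            speller.add(w);
        }
        System.out.println(speller.suggest("kor", 1)); // prints [kod, kot]
    }
}
```

The pruning step is what makes the automaton useful here: whole subtrees of
the dictionary are skipped as soon as no word below them can match.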
AUTHORS
=======
Marcin Miłkowski (http://marcinmilkowski.pl) [linguistic data lead, code]
Dawid Weiss (http://www.dawidweiss.com) [fsa lead, code]
CONTRIBUTORS
============
Grzegorz Słowikowski [initial maven configs]
QUESTIONS, COMMENTS
===================
Web site: http://www.morfologik.blogspot.com
Mailing list: [email protected]