Free natural language morphology

This page provides Majka, a free and very fast (approx. 1M words per second) morphological analyzer, together with morphological databases for Czech, Slovak, Polish, Swedish, German, French, Italian, English, Portuguese, Catalan, Welsh, Spanish, Galician, Asturian and Russian.

Free morphological analyzer Majka

Binaries

Download Majka binaries for Linux / for Windows.

On Windows, the required encoding is Windows-1250.

$ chcp 1250

Usage: the program expects one entry per line on its standard input (a word, a lemma, or a lemma:tag string, according to the data file in use) and prints the requested information on its standard output. An example of usage (for other options, see majka -h):

$ echo test | majka -f majka.w-lt
test:k1gInSc1
test:k1gInSc4
test:k1gMnSc1
testa:k1gFnPc2

Source code

References

When using majka for research purposes, please cite:

Pavel Šmerk. Fast Morphological Analysis of Czech. In Petr Sojka and Aleš Horák (eds.), Proceedings of the Third Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2009. Brno: Masaryk University, 2009. pp. 13–16. ISBN 978-80-210-5048-8.

Free morphological databases for Majka

Contact

Pavel Šmerk, Ph.D.
ma@nlp.fi.muni.cz
Natural Language Processing Centre
Faculty of Informatics, Masaryk University
Botanická 68a, 602 00, Brno, Czech Republic

Developer information (in Czech)