Version 3 (modified by xnovot32@fi.muni.cz, 20 months ago)

--

A Human-Annotated Dataset for Language Modeling and Named Entity Recognition in Medieval Documents

This is an open dataset of sentences from 19th- and 20th-century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for named entity recognition (NER).

You can download the dataset from the LINDAT/CLARIAH-CZ repository.

Contents

Citing

If you use our dataset in your work, please cite the following article:

TODO

If you use LaTeX, you can use the following BibTeX entry:

TODO

Acknowledgements

This work was funded by TAČR Éta, project number TL03000365.