A Human-Annotated Dataset for Language Modeling and Named Entity Recognition in Medieval Documents
This is an open dataset of sentences from 19th- and 20th-century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for named entity recognition (NER).
You can download the dataset from the LINDAT/CLARIAH-CZ repository.
Citing
If you use our dataset in your work, please cite the following article:
TODO
If you use LaTeX, you can use the following BibTeX entry:
TODO
Acknowledgements
This work was funded by TAČR Éta, project number TL03000365.