Named Entity Recognition in Medieval Documents
This page lists resources for named entity recognition in medieval documents that were developed in the AHISTO project.
- A human-annotated dataset for language modeling and named entity recognition in medieval documents
- Large language models for named entity recognition in medieval documents:
  - Model L: the largest and most accurate of the AHISTO NER models
  - Model S: a smaller and more efficient variant of Model L, used in the AHISTO named entity recognition tool
  - Model TDS1 and Model TDS2: two smaller models trained on datasets of different sizes as part of our ablation study
  - Model CI: a smaller model trained, as part of our ablation study, with a loss function that does not address class imbalance
- Named entity experiments: experimental code for named entity search and recognition in medieval documents
- Named entity recognition tool: a command-line tool for named entity recognition in medieval documents; see also the online demo (in Czech)
- Named entity postprocessing and declension: command-line tools for the postprocessing of named entities before they are displayed on the AHISTO portal
Citing
An article describing our dataset is currently under review. A preprint is available on arXiv.
Last modified 29 May 2023, 9:13:40