Named Entity Recognition in Medieval Documents
This page lists the resources for named entity recognition in medieval documents that were developed in the AHISTO project.
- A human-annotated dataset for language modeling and named entity recognition in medieval documents
- Large language models for named entity recognition in medieval documents:
  - Model L: our largest and most accurate model
  - Model S: a smaller and more efficient variant of Model L, used in our named entity recognition tool
  - Model TDS1 and Model TDS2: two smaller models trained on datasets of different sizes as part of our ablation study
  - Model CI: a smaller model trained with a loss function that does not address class imbalance, as part of our ablation study
- Named entity experiments: experimental code for named entity search and recognition in medieval documents
- Named entity recognition tool: a command-line tool for named entity recognition in medieval documents; see also the online demo (in Czech)
Citing
An article describing our dataset is currently under review. A preprint is available on request.