Named Entity Recognition in Medieval Documents
This page lists the resources for named entity recognition in medieval documents that were developed in the AHISTO project.
- A human-annotated dataset for language modeling and named entity recognition in medieval documents
- Large language models for named entity recognition in medieval documents:
  - Model L: our largest and most accurate model
  - Model S: a smaller and more efficient variant of Model L, used in our named entity recognition tool
  - Model TDS1 and Model TDS2: two smaller models trained on different amounts of data as part of our ablation study
  - Model CI: a smaller model trained with a loss function that does not address class imbalance, as part of our ablation study
- Named entity experiments: experimental code for named entity search and recognition in medieval documents
- Named entity recognition tool: a command-line tool for named entity recognition in medieval documents
Citing
An article describing our dataset is currently under review. A preprint is available on request.