LINDAT/CLARIAH-CZ
LINDAT/CLARIAH-CZ is a Czech data centre providing certified storage and natural language processing services. It unifies two research infrastructures, LINDAT/CLARIN and DARIAH-CZ. It deals primarily with language data, but also with other digital resources and with tools for their exploitation, maintenance, and enhancement. These are offered to the research community, to industry for application development and, in specific cases such as language culture, directly to the general public.
More information about the infrastructure can be found at https://lindat.cz/. You can find the repository here.
If you are a member of the NLP Centre and want to publish your results, you can find the instructions in AddToInfrastructure. Do not forget to AcknowledgeLindat.
LINDAT/CLARIAH-CZ Project Info
Masaryk University provides more information about the current project at https://www.fi.muni.cz/app/projects?project=69919
FI MU and LINDAT
We contribute services and data resources to LINDAT. In addition, LINDAT serves as a sustainability platform for projects carried out under various grant calls.
Services and Tools
- CzAccent - diacritics restoration service
- CharedTool - tool for detecting the character encoding of a text document
- JusText - boilerplate removal tool useful for cleaning documents in large textual corpora
- OniOn - tool for removing duplicate parts from a corpus
Data resources
- SqadDatabase - Simple question answering database version 3.2
- MedievalNamedEntities - A Human-Annotated Dataset for Language Modeling and Named Entity Recognition in Medieval Documents
- BulkyCorpus - subcorpus of the czTenTen12 corpus with interlingual homographs
- CzechMwes - list of Czech multi-word expressions
- DigitalHumanities - 10-week online course
- AGREE
- Czes
- SHOLVA
- UJC
- Integrated lexicographic platform for Russian??
Projects' sustainability
- HaBiT - Harvesting big text data for under-resourced languages (2014-2017)
Publications