



The czaccent system adds diacritics to Czech text written without them; it uses statistical evaluation of all possible variants. Its working data was trained on a very large Czech corpus. The system can be used as a command-line tool or as a web service. It is also available as an API, see <a href="">.
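
To illustrate the general idea of choosing among possible accented variants by their corpus frequency, here is a minimal Python sketch. It is not the CzAccent implementation: the function names (strip_diacritics, train, restore), the toy corpus, and the word-by-word "most frequent variant wins" rule are all assumptions made for illustration only.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(word: str) -> str:
    """Remove combining marks from a word, e.g. 'řeč' -> 'rec'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def train(corpus_words):
    """Count how often each accented form occurs for each stripped form."""
    variants = defaultdict(Counter)
    for word in corpus_words:
        variants[strip_diacritics(word)][word] += 1
    return variants


def restore(text: str, variants) -> str:
    """Replace each token with its most frequent known accented variant.

    Tokens that never appeared in the training data are left unchanged.
    """
    restored = []
    for token in text.split():
        candidates = variants.get(token)
        restored.append(candidates.most_common(1)[0][0] if candidates else token)
    return " ".join(restored)


# Hypothetical toy corpus; a real system would be trained on a very large one.
toy_corpus = "žluťoučký kůň úpěl ďábelské ódy kůň úpěl".split()
model = train(toy_corpus)
print(restore("zlutoucky kun upel dabelske ody", model))
```

This sketch ignores context, capitalization, and punctuation, which a practical tool would need to handle.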

More information about the system can be found in RYCHLÝ, Pavel. CzAccent - Simple Tool for Restoring Accents in Czech Texts. In Aleš Horák, Pavel Rychlý (eds.). 6th Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU, 2012. pp. 15-22. ISBN 978-80-263-0313-8.

How to use the tool

You can enter text up to tens of kB in length into the input field. If the conversion does not work for longer texts, the problem lies in your browser or somewhere on the way to the server (firewall, proxy).
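
If a longer text does not get through, one possible workaround is to split it into smaller pieces and convert them one at a time. The following Python sketch shows one way to do that; the function name split_text and the 10,000-byte default are assumptions chosen for illustration, not limits documented for the service.

```python
def split_text(text: str, max_bytes: int = 10_000):
    """Split text into chunks of at most max_bytes UTF-8 bytes.

    Chunks break only at whitespace, and runs of whitespace (including
    newlines) are collapsed to single spaces. A single word longer than
    max_bytes is kept whole, so that chunk may exceed the limit.
    """
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        # One extra byte for the joining space when the chunk is non-empty.
        extra = word_size + (1 if current else 0)
        if current and size + extra > max_bytes:
            chunks.append(" ".join(current))
            current, size = [word], word_size
        else:
            current.append(word)
            size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Sizes are measured in UTF-8 bytes rather than characters because accented Czech letters take two bytes each in that encoding.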

The data from the form can be handled in two ways. By default, to help improve the service, we save the entered plain text (and no other data related to the query) as test data. If you choose the 'Neukládat text' ("Do not save text") option, we save only the IP address of the request so that we know how often the service is used. If necessary, you can request that your data be deleted by emailing the contact below. The request must specify the text or the IP address, as we are otherwise unable to identify your data.

[a working form should be here (see the original page)]


This software was developed within the projects LC536 and 2C06009 and is owned by Masaryk University, Faculty of Informatics, NLP Centre.

If you use the system, please cite the related publication as well as the LINDAT/CLARIAH infrastructure: [link to the repository (handle of the given submission)]

author = {Rychlý, Pavel},
address = {Brno},
booktitle = {6th Workshop on Recent Advances in Slavonic Natural Language Processing},
editor = {Aleš Horák and Pavel Rychlý},
location = {Brno},
isbn = {978-80-263-0313-8},
pages = {15-22},
publisher = {Tribun EU},
title = {CzAccent - Simple Tool for Restoring Accents in Czech Texts},
year = {2012}


License terms can be found here.
