= Automatic language correction =

[[https://is.muni.cz/auth/predmet/fi/ia161|IA161 Advanced NLP Course]], Course Guarantor: Aleš Horák

Prepared by: Ján Švec

== State of the Art ==

Automatic language correction (spell checking) is the process of detecting incorrectly spelled words in a text and, in some cases, suggesting corrections for them. Language correction has many potential applications to the large amounts of informal, unedited text generated online, such as web forums, tweets, blogs, and email.

In the theoretical lesson we will introduce and compare various methods for automatically proposing and choosing a correction for an incorrectly written word. The lesson will also address the question "How difficult is it to develop a spell-checker?" and describe a system that performs spellchecking and autocorrection. It will conclude with a brief overview of various applications (computer software) for automatic language correction.

=== References ===

 1. CHOUDHURY, Monojit, et al. "How Difficult is it to Develop a Perfect Spell-checker? A Cross-linguistic Analysis through Complex Network Approach." TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, 2007. [[http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=52A3B869596656C9DA285DCE83A0339F?doi=10.1.1.146.4390&rep=rep1&type=pdf|Source]]
 1. WHITELAW, Casey, et al. "Using the Web for Language Independent Spellchecking and Autocorrection." Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 890–899, Singapore, 2009. [[http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36180.pdf|Source]]
 1. GUPTA, Neha, MATHUR, Pratistha. "Spell Checking Techniques in NLP: A Survey." International Journal of Advanced Research in Computer Science and Software Engineering, volume 2, issue 12, pages 217–221, 2012. [[http://www.ijarcsse.com/docs/papers/12_December2012/Volume_2_issue_12_December2012/V2I12-0164.pdf|Source]]
 1. HLADEK, Daniel, STAS, Jan, JUHAR, Jozef. "Unsupervised Spelling Correction for the Slovak Text." Advances in Electrical and Electronic Engineering 11 (5), pages 392–397, 2013. [[http://advances.utc.sk/index.php/AEEE/article/view/898|Source]]

== Practical Session ==

The session starts with a short overview of [[https://www.languagetool.org/|LanguageTool]], a style and grammar checker. Students can test the complete algorithm and evaluate it on real data.

Once students are acquainted with how a spelling corrector works, we will write a simple spelling corrector in Python. The corrector will be trained on a large text file compiled from Project Gutenberg (free ebooks). The example will be based on Peter Norvig's [[http://norvig.com/spell-correct.html|Spelling Corrector]] in Python.

Students who finish early can take on an additional task: enhancing the spelling corrector's functionality.
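
As a rough starting point for the practical session, the idea of a simple frequency-based corrector can be sketched as follows. This is a minimal illustration in the spirit of Peter Norvig's Spelling Corrector, not a copy of his code. The names `SAMPLE_TEXT`, `train`, `edits1`, `edits2`, `known`, and `correct` are chosen for this sketch, and the tiny inline corpus is an assumption that stands in for the large Project Gutenberg file used in the session.

```python
import re
from collections import Counter

# Tiny stand-in corpus (an assumption for illustration only); the session
# would train on a large Project Gutenberg text file instead.
SAMPLE_TEXT = """
the quick brown fox jumps over the lazy dog
spelling correction is the process of fixing a spelling error
the corrector proposes a correction for each misspelled word
"""

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def words(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(text):
    """Count how often each word occurs in the training text."""
    return Counter(words(text))


def edits1(word):
    """All strings one edit away: delete, transpose, replace, insert."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings two edits away from the word."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}


def known(candidates, counts):
    """Keep only candidates that were seen in the training text."""
    return {w for w in candidates if w in counts}


def correct(word, counts):
    """Return the most frequent known word within two edits of `word`.

    Candidates at a smaller edit distance are preferred; if nothing is
    known, the word is returned unchanged. Ties are broken alphabetically
    so that the result is deterministic.
    """
    word = word.lower()
    candidates = (known([word], counts)
                  or known(edits1(word), counts)
                  or known(edits2(word), counts)
                  or {word})
    return min(candidates, key=lambda w: (-counts[w], w))


COUNTS = train(SAMPLE_TEXT)

if __name__ == "__main__":
    for typo in ["speling", "teh", "corection"]:
        print(typo, "->", correct(typo, COUNTS))
```

Picking the most frequent known candidate is about the simplest possible ranking. Possible enhancements for the additional task include using the surrounding context of a word or weighting the individual edit types differently.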