Changes between Version 2 and Version 3 of private/NlpInPracticeCourse/AutomaticCorrection

Jun 2, 2015, 3:11:06 PM (8 years ago)



  • private/NlpInPracticeCourse/AutomaticCorrection

    v2 v3  
    33[[|IA161 Advanced NLP Course]], Course Guarantee: Aleš Horák
    5 Prepared by: Ján Švec
    7 == TODO til 31.5.2015 ==
    9  1. choose particular papers for [[#References|References]] below (that will serve as input for the lecture later on)
    10  1. prepare the [[#PracticalSession|Practical Session]]
     5Prepared by: Ján Švec 
    127== State of the Art ==
     8Automatic language correction (spell checking) is the process of detecting incorrectly spelled words in a text and, in some cases, suggesting corrections for them. Language correction has many potential applications to the large amounts of informal, unedited text generated online, such as web forums, tweets, blogs, and email.
    14 === References ===
     10In the theoretical lesson we will introduce and compare various methods for automatically proposing and choosing a correction for an incorrectly written word. The lesson will also address the question "How difficult is it to develop a spell-checker?" and describe a system that performs spellchecking and autocorrection.
    16 Approx 3 current papers (preferably from best NLP conferences/journals, eg. [[|ACL Anthology]]) that will be used as a source for the one-hour lecture:
     12The lesson will conclude with a brief overview of various applications (computer software) for automatic language correction.
    18  1. HLADEK, D, STAS, J, JUHAR, J. "Unsupervised Spelling Correction for the Slovak Text." Advances in Electrical and Electronic Engineering, 2013.
    19  1. paper2
    20  1. paper3
     14=== References ===   
     16  1. CHOUDHURY, Monojit, et al. "How Difficult is it to Develop a Perfect Spell-checker? A Cross-linguistic Analysis through Complex Network Approach" TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, 2007. [[;jsessionid=52A3B869596656C9DA285DCE83A0339F?doi=|Source]]         
     17  1. WHITELAW, Casey, et al. "Using the Web for Language Independent Spellchecking and Autocorrection" Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 890–899, Singapore, 2009. [[|Source]]   
     18  1. GUPTA, Neha, MATHUR, Pratistha. "Spell Checking Techniques in NLP: A Survey" International Journal of Advanced Research in Computer Science and Software Engineering, volume 2, issue 12, pages 217-221, 2012. [[|Source]] 
     19  1. HLADEK, Daniel, STAS, Jan, JUHAR, Jozef. "Unsupervised Spelling Correction for the Slovak Text." Advances in Electrical and Electronic Engineering 11 (5), pages 392-397, 2013.  [[|Source]]
    2221== Practical Session ==
    24 Concrete description of work assignment for students for the second one-hour part of the lecture. The work will consist of tasks connected with practical implementations of algorithms connected with the current topic (probably not the state-of-the-art algorithms mentioned in the first part) and with real data. Students can test the algorithms, evaluate them and possibly try some short adaptations for various subtasks.
    26 Students can also be required to generate some results of their work and hand them in to prove completing the tasks.
     22There will be a short overview of [[|LanguageTool]], a style and grammar checker. Students can test the complete algorithm and evaluate it on real data. Once they are acquainted with how a spelling corrector works, we will write a simple spelling corrector in Python. The spelling corrector will be trained on a large text file compiled from Project Gutenberg (free ebooks). The example will be based on Peter Norvig's [[|Spelling Corrector]] in Python. Students who finish early will have the additional task of enhancing the spelling corrector's functionality.
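
As a rough illustration of the kind of corrector the session builds, here is a minimal sketch in the spirit of Peter Norvig's approach: count word frequencies in a training text, generate candidates within one or two edits of the input, and pick the most frequent known candidate. It is not the course's actual solution. The tiny inline `CORPUS` stands in for the Project Gutenberg file, and all names (`train`, `edits1`, `known`, `correct`, `CORPUS`, `COUNTS`) are assumptions made for this example.

```python
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Assumption: a toy corpus replaces the large Project Gutenberg training file.
CORPUS = (
    "the quick brown fox jumps over the lazy dog "
    "the spelling corrector proposes a correction for each misspelled word"
)


def train(text):
    """Count how often each lowercase word occurs in the training text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def edits1(word):
    """Return all strings one delete, transpose, replace, or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def known(candidates, counts):
    """Keep only the candidates that were seen in the training text."""
    return {w for w in candidates if w in counts}


def correct(word, counts):
    """Pick the most frequent known word within two edits, else the input."""
    word = word.lower()
    candidates = (
        known([word], counts)
        or known(edits1(word), counts)
        or known((e2 for e1 in edits1(word) for e2 in edits1(e1)), counts)
        or {word}
    )
    # Ties on frequency are broken alphabetically to keep results repeatable.
    return max(candidates, key=lambda w: (counts[w], w))


COUNTS = train(CORPUS)

if __name__ == "__main__":
    for typo in ["speling", "teh", "korrector"]:
        print(typo, "->", correct(typo, COUNTS))
```

With a realistic training file, the same `correct` function could be evaluated against a list of known misspellings, which is one possible way to approach the evaluation and enhancement tasks.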