Changes between Version 2 and Version 3 of private/AdvancedNlpCourse/AutomaticCorrection


Timestamp:
Jun 2, 2015, 3:11:06 PM (6 years ago)
Author:
xsvec3
Comment:

--

  • private/AdvancedNlpCourse/AutomaticCorrection

    v2 v3  
    33[[https://is.muni.cz/auth/predmet/fi/ia161|IA161 Advanced NLP Course]], Course Guarantor: Aleš Horák
    44
    5 Prepared by: Ján Švec
    6 
    7 == TODO til 31.5.2015 ==
    8 
    9  1. choose particular papers for [[#References|References]] below (that will serve as input for the lecture later on)
    10  1. prepare the [[#PracticalSession|Practical Session]]
     5Prepared by: Ján Švec 
    116
    127== State of the Art ==
     8Automatic language correction (spell checking) is the process of detecting incorrectly spelled words in a text and, in some cases, suggesting corrections for them. Language correction now has many potential applications to the large amounts of informal, unedited text generated online, such as web forums, tweets, blogs, and email.
    139
    14 === References ===
     10In the theoretical lesson we will introduce and compare various methods to automatically propose and choose a correction for an incorrectly written word. The lesson will also address the question "How difficult is it to develop a spell-checker?" and describe a system that performs spellchecking and autocorrection.
    1511
    16 Approx 3 current papers (preferably from best NLP conferences/journals, eg. [[https://www.aclweb.org/anthology/|ACL Anthology]]) that will be used as a source for the one-hour lecture:
     12At the end there will be a brief overview of various applications (computer software) for automatic language correction.
    1713
    18  1. HLADEK, D, STAS, J, JUHAR, J. "Unsupervised Spelling Correction for the Slovak Text." Advances in Electrical and Electronic Engineering, 2013.
    19  1. paper2
    20  1. paper3
     14=== References ===   
     15
     16  1. CHOUDHURY, Monojit, et al. "How Difficult is it to Develop a Perfect Spell-checker? A Cross-linguistic Analysis through Complex Network Approach" TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, 2007. [[http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=52A3B869596656C9DA285DCE83A0339F?doi=10.1.1.146.4390&rep=rep1&type=pdf|Source]]         
     17  1. WHITELAW, Casey, et al. "Using the Web for Language Independent Spellchecking and Autocorrection" Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 890–899, Singapore, 2009. [[http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36180.pdf|Source]]   
     18  1. GUPTA, Neha, MATHUR, Pratistha. "Spell Checking Techniques in NLP: A Survey" International Journal of Advanced Research in Computer Science and Software Engineering, volume 2, issue 12, pages 217-221, 2012. [[http://www.ijarcsse.com/docs/papers/12_December2012/Volume_2_issue_12_December2012/V2I12-0164.pdf|Source]] 
     19  1. HLADEK, Daniel, STAS, Jan, JUHAR, Jozef. "Unsupervised Spelling Correction for the Slovak Text." Advances in Electrical and Electronic Engineering 11 (5), pages 392-397, 2013.  [[http://advances.utc.sk/index.php/AEEE/article/view/898|Source]]
    2120
    2221== Practical Session ==
    23 
    24 Concrete description of work assignment for students for the second one-hour part of the lecture. The work will consist of tasks connected with practical implementations of algorithms connected with the current topic (probably not the state-of-the-art algorithms mentioned in the first part) and with real data. Students can test the algorithms, evaluate them and possibly try some short adaptations for various subtasks.
    25 
    26 Students can also be required to generate some results of their work and hand them in to prove completing the tasks.
     22There will be a short overview of [[https://www.languagetool.org/|LanguageTool]], a style and grammar checker. Students can test the complete algorithm and evaluate it on real data. After they become acquainted with how a spelling corrector works, we will write a simple spelling corrector in Python. The spelling corrector will be trained on a large text file compiled from Project Gutenberg (free ebooks). The example will be based on Peter Norvig's [[http://norvig.com/spell-correct.html|Spelling Corrector]] in Python. If a student finishes early, the additional task is to enhance the spelling corrector's functionality.
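
A minimal sketch of the kind of corrector the practical session describes, in the spirit of the approach it names: generate candidate words within one or two simple edits of the input, keep those seen in the training text, and pick the most frequent. The tiny inline `CORPUS`, the function names, and the tie-breaking rule are assumptions made for illustration; the session itself would train on a large Project Gutenberg text file instead.

```python
import re
from collections import Counter

# Assumption: a tiny stand-in corpus so the sketch is self-contained.
# The session would read a large text file compiled from Project Gutenberg.
CORPUS = """
the quick brown fox jumps over the lazy dog
spelling correction is the process of detecting incorrectly spelled words
the spelling corrector proposes a correction for each incorrectly written word
"""

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def words(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Word frequencies serve as a crude language model.
WORD_COUNTS = Counter(words(CORPUS))


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings two simple edits away from word."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}


def known(candidates):
    """Keep only the candidates that occur in the training text."""
    return {w for w in candidates if w in WORD_COUNTS}


def correct(word):
    """Return the most frequent known candidate, preferring fewer edits."""
    word = word.lower()
    candidates = (known([word]) or known(edits1(word))
                  or known(edits2(word)) or {word})
    # Ties are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (WORD_COUNTS[w], w))


if __name__ == "__main__":
    for typo in ["speling", "corection", "teh"]:
        print(typo, "->", correct(typo))
```

Possible enhancements for students who finish early include weighting candidates by edit type or taking the neighbouring words into account; both are left out of this sketch.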