Changes between Version 42 and Version 43 of private/NlpInPracticeCourse/AutomaticCorrection


Ignore:
Timestamp:
Dec 7, 2023, 9:19:02 AM (5 months ago)
Author:
Ales Horak
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • private/NlpInPracticeCourse/AutomaticCorrection

    v42 v43  
    6262   return set(deletes + transposes + replaces + inserts)
    6363}}}
    64 1. '''Edit distance 2'''(`edits2`) - applies `edits1()` to all the results of `edits1()`. Example: `len(edits2('something')) = 114 324` words, which is a high number. To enhance speed we can only keep the candidates that are actually known words (`known_edits2()`). Now `known_edits2('something')` is a set of just 4 words: `{'smoothing', 'seething', 'something', 'soothing'}`.
     641. '''Edit distance 2''' (`edits2`) - applies `edits1()` to all the results of `edits1()`. Example: `len(edits2('something')) = 114 324` words, which is a high number. To enhance speed we can only keep the candidates that are actually known words (`known_edits2()`). Now `known_edits2('something')` is a set of just 4 words: `{'smoothing', 'seething', 'something', 'soothing'}`.
    65651. The function `correct()` chooses as the set of candidate words the set with the '''shortest edit distance''' to the original word.
    6666 {{{
     
    8484 4. Modify the code of `spell.py` to increase accuracy (`pct`) at `tests2` by 10 %. You may take an inspiration from the ''Future work'' section of [http://norvig.com/spell-correct.html the Norvig's article]. Describe your changes and write your new accuracy results to `<YOUR_FILE>`.
    8585
     86 5. Upload `<YOUR_FILE>` and the edited `spell.py` to the [/en/NlpInPracticeCourse homework vault (odevzdávárna)].
    8687
    87 === Upload `<YOUR_FILE>` and the edited `spell.py` ===
    8888
    8989== Task 2: Rule based grammar checker (punctuation) for Czech == #task2