LanguageResourcesFromWeb

-                      v39
+                      v40
     * This script implements a bag-of-words approach with cosine similarity of word vectors. //(For simplicity, a plagiarism cannot have more than one source here.)//
     * You can modify the script to
-      * use input attributes other than the word, or a combination of attributes, e.g. the lemma or the morphological tag,
-      * or implement another lexical/syntactic detection approach, e.g. word n-grams or Levenshtein distance,
-      * or implement another semantic detection approach, e.g. the similarity of {{{word2vec}}} vectors.
+      * use input attributes other than the word, or a combination of attributes, e.g. the lemma or the morphological tag
+      * or implement another lexical/syntactic detection approach, e.g. word n-grams or Levenshtein distance
+      * or implement another semantic detection approach, e.g. the similarity of {{{word2vec}}} vectors
+      * or do it another way; be creative and describe how it works in comments in the code.
   * Input format: A 3-column vertical, see above. [https://nlp.fi.muni.cz/trac/research/raw-attachment/wiki/en/NlpInPracticeCourse/LanguageResourcesFromWeb/training_data.vert training_data.vert]
   * Output: One plagiarism per line: id TAB detected source id TAB real source id. Evaluation line: precision, recall, F1 measure.
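
The bag-of-words + cosine similarity idea and the precision/recall/F1 evaluation described above can be sketched as follows. This is only an illustration, not the course script: the function names, the toy documents, the `0.5` threshold and the exact way precision and recall are counted are all assumptions, and the documents are given as in-memory token lists instead of being parsed from {{{training_data.vert}}}.

```python
import math
from collections import Counter


def cosine(bag_a, bag_b):
    """Cosine similarity of two bag-of-words count vectors (Counters)."""
    dot = sum(bag_a[w] * bag_b[w] for w in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(v * v for v in bag_a.values()))
    norm_b = math.sqrt(sum(v * v for v in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect(originals, suspicious, threshold=0.5):
    """Map each suspicious document id to its most similar original id.

    Returns None for a document whose best similarity is below the
    threshold. The threshold value is an arbitrary assumption.
    At most one source is reported per document.
    """
    original_bags = {oid: Counter(toks) for oid, toks in originals.items()}
    detected = {}
    for sid, tokens in suspicious.items():
        bag = Counter(tokens)
        best_id, best_sim = None, 0.0
        for oid, obag in original_bags.items():
            sim = cosine(bag, obag)
            if sim > best_sim:
                best_id, best_sim = oid, sim
        detected[sid] = best_id if best_sim >= threshold else None
    return detected


def evaluate(detected, real):
    """Precision, recall and F1 over detected vs. real source ids.

    Assumed definitions: a true positive is a document whose detected
    source equals its real source; None means "not a plagiarism".
    """
    true_pos = sum(1 for d, s in detected.items()
                   if s is not None and s == real.get(d))
    n_detected = sum(1 for s in detected.values() if s is not None)
    n_real = sum(1 for s in real.values() if s is not None)
    precision = true_pos / n_detected if n_detected else 0.0
    recall = true_pos / n_real if n_real else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (hypothetical, not taken from training_data.vert).
originals = {
    "o1": "the cat sat on the mat".split(),
    "o2": "stock prices fell sharply today".split(),
}
suspicious = {
    "s1": "the cat sat on a mat".split(),
    "s2": "we went hiking in the mountains".split(),
    "s3": "stock prices rose sharply today".split(),
}
real_sources = {"s1": "o1", "s2": None, "s3": "o2"}

detected_sources = detect(originals, suspicious)

if __name__ == "__main__":
    # One line per document: id TAB detected source id TAB real source id.
    for sid in suspicious:
        print(f"{sid}\t{detected_sources[sid]}\t{real_sources[sid]}")
    print("precision, recall, F1:", evaluate(detected_sources, real_sources))
```

To try another input attribute, pass lists of lemmas or tags instead of word forms; to try n-grams, build the `Counter` from tuples of consecutive tokens rather than single tokens.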