= Semantic Analysis =

== Presentation outline ==

== Will computers ever understand us? Understanding of ''understanding'' ==

=== Aims: detection of inappropriate, naughty, vulgar and silly posts ===

    Use case: discussion forum, automatic detection of inappropriate posts[[BR]]
    Common solution: word list[[BR]]
    But: users use obfuscated words that are difficult to detect (f*king, f.u.c.k, f..k)[[BR]]
    Better solution: word list + obfuscation rules[[BR]]
    But: users invent new obfuscation patterns[[BR]]
    Even better solution: word list + automatically generated thesaurus + obfuscation rules + naughty language patterns (e.g. you <adjective> <noun>!!!)
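
The "word list + obfuscation rules" idea can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the mild placeholder `WORD_LIST`, the `LOOKALIKES` substitution table, the `MASKS` characters and the separator set. Each listed word is expanded into a regular expression that tolerates look-alike characters, masked inner letters and separators between letters.

```python
import re

# Hypothetical word list (mild placeholders); a real forum would load its own.
WORD_LIST = ["idiot", "stupid"]

# Assumed obfuscation rules: look-alike characters for some letters.
LOOKALIKES = {"i": "1!", "o": "0", "s": "$5", "t": "7", "u": "v"}
# Assumed mask characters that may stand in for an inner letter.
MASKS = "*#"
# Assumed separators that users may put between letters.
SEPARATORS = r"[\s._\-]*"


def obfuscation_pattern(word):
    """Build a regex matching `word` and simple obfuscated variants of it."""
    last = len(word) - 1
    parts = []
    for pos, ch in enumerate(word):
        allowed = ch + LOOKALIKES.get(ch, "")
        if 0 < pos < last:
            # Only inner letters may be masked, to limit false positives.
            allowed += MASKS
        parts.append("[" + re.escape(allowed) + "]+")
    # The lookarounds keep the match from starting or ending inside a longer word.
    return re.compile(r"(?<!\w)" + SEPARATORS.join(parts) + r"(?!\w)", re.IGNORECASE)


def find_inappropriate(post, words=WORD_LIST):
    """Return the listed words that occur in `post`, plain or obfuscated."""
    return [w for w in words if obfuscation_pattern(w).search(post)]


print(find_inappropriate("You are so s.t.u.p.i.d!!!"))
print(find_inappropriate("A stupendous idea"))
```

The sketch also shows the limitation noted above: a new pattern, such as one mask character standing for several letters, is missed until a matching rule is added.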

=== Aims: text summarization ===

    Use case: automatic abstract generation, multiple document digest, are these documents stating similar or opposite theses?[[BR]]
    Common solution: take the first sentence of every paragraph or take every sentence containing a keyword[[BR]]
    But: works worse on Slavic languages, is not really scalable, and makes it almost impossible to detect the main thesis[[BR]]
    Better solution: analyse text on several levels
        * as a whole discourse (sections, paragraphs, references)
        * as a sequence of sentences (each having a structure)
        * as a bag of words and keywords (in different forms, synonyms, abbreviations etc.)
        * main theses detection
        * text generation
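
For comparison, the common solution mentioned above (first sentence of every paragraph, or every sentence containing a keyword) fits in a few lines. This is a minimal sketch only: the function names and the naive punctuation-based sentence splitter are assumptions, not part of any particular tool.

```python
import re


def split_sentences(paragraph):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    pieces = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p.strip() for p in pieces if p.strip()]


def split_paragraphs(text):
    # Paragraphs are assumed to be separated by blank lines.
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]


def first_sentences(text):
    """Baseline 1: the first sentence of every paragraph."""
    summary = []
    for paragraph in split_paragraphs(text):
        sentences = split_sentences(paragraph)
        if sentences:
            summary.append(sentences[0])
    return summary


def keyword_sentences(text, keyword):
    """Baseline 2: every sentence containing the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        sentence
        for paragraph in split_paragraphs(text)
        for sentence in split_sentences(paragraph)
        if needle in sentence.lower()
    ]


text = "Cats sleep a lot. They also hunt.\n\nDogs bark. Dogs like walks."
print(first_sentences(text))
print(keyword_sentences(text, "dogs"))
```

Neither baseline looks at sentence structure or word forms, which is why the keyword match breaks down on inflected forms and why the main thesis cannot be identified this way.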

=== Aims: opinion mining ===
(this part may be replaced by ''content targeting'')

    Use case: what are people thinking about a particular product/company/idea X?[[BR]]
    Solution: search X[[BR]]
    But: what other names are people giving to X? what are people saying about X?[[BR]]
    Better solution:
        * find synonyms for X
        * extract useful attributes of X (noise, weight, price, appearance)
        * generate thesauri of opinion words (weird rattle in iPhone5?)
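
The three steps above can be combined in a small sketch. The synonym set and the opinion lexicon with its scores are hypothetical, hand-written stand-ins for resources that would be generated automatically; the attribute list is the one given above.

```python
import re

# Hypothetical names people use for product X.
SYNONYMS = {"iphone 5", "iphone5"}
# Attributes of X, taken from the list above.
ATTRIBUTES = {"noise", "weight", "price", "appearance"}
# Hypothetical opinion thesaurus: word -> polarity score.
OPINION_WORDS = {"weird": -1, "heavy": -1, "great": 1, "cheap": 1}


def mine_opinions(posts):
    """Sum opinion scores per attribute over the sentences that mention X."""
    results = {}
    for post in posts:
        for sentence in re.split(r"(?<=[.!?])\s+", post):
            lowered = sentence.lower()
            # Step 1: keep only sentences that mention X under any known name.
            if not any(name in lowered for name in SYNONYMS):
                continue
            tokens = re.findall(r"[a-z0-9]+", lowered)
            # Step 3: score the sentence with the opinion thesaurus.
            score = sum(OPINION_WORDS.get(token, 0) for token in tokens)
            # Step 2: attach the score to every attribute named in the sentence.
            for attribute in ATTRIBUTES & set(tokens):
                results[attribute] = results.get(attribute, 0) + score
    return results


posts = [
    "The price of the iPhone5 is great.",
    "My iPhone 5 makes a weird noise.",
    "Nice weather today.",
]
print(mine_opinions(posts))
```

A plain search for X would have returned the first two posts without saying which attribute each one is about or whether the opinion is positive.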

=== Aims: question answering ===

    Use case: chatbot providing basic support (do you have a phone similar to Sony Xperia Z but cheaper? what is the shipping cost?)[[BR]]
    Solution: patterns, keyword detection (Sony Xperia Z, shipping), then searching[[BR]]
    But: no real dialogue, no real answers, just searching[[BR]]
    Better solution: sentence structure analysis, keyword detection, coreference resolution, dialogue strategy
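
The pattern and keyword approach can be sketched as follows. The intent names and the two regular expressions are hypothetical, written only to cover the two example questions above; the search step that would follow is left out.

```python
import re

# Hypothetical patterns: each maps a question shape to an intent name.
INTENT_PATTERNS = [
    ("similar_cheaper",
     re.compile(r"similar to (?P<product>.+?) but cheaper", re.IGNORECASE)),
    ("shipping_cost",
     re.compile(r"\bshipping\b", re.IGNORECASE)),
]


def detect_intent(question):
    """Return (intent, extracted keywords) for the first pattern that matches."""
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.search(question)
        if match:
            return intent, match.groupdict()
    return "unknown", {}


print(detect_intent("Do you have a phone similar to Sony Xperia Z but cheaper?"))
print(detect_intent("What is the shipping cost?"))
print(detect_intent("And how much is it in black?"))
```

The last question shows the weakness: without coreference resolution or a dialogue strategy, "it" cannot be linked to the phone from the earlier question, so each question is handled in isolation.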

Is this real understanding? Will computers understand us? No. We don’t know what understanding is, but we know what ''it looks like'' when someone understands. Computer programs that can detect vulgar text, summarize a text, answer questions or “feel” emotions look like they understand our language... (in fact this is a ''behaviorist approach'').