| 1 | = Semantic Analysis = |
| 2 | |
| 3 | == Presentation outline == |
| 4 | |
| 5 | == Will computers ever understand us? Understanding of ''understanding'' == |
| 6 | |
| 7 | === Aims: detection of inappropriate, naughty, vulgar, or silly posts === |
| 8 | |
| 9 | Use case: discussion forum, automatic detection of inappropriate posts[[BR]] |
| 10 | Common solution: word list[[BR]] |
| 11 | But: users obfuscate words in ways that are difficult to detect (f*king, f.u.c.k, f..k)[[BR]] |
| 12 | Better solution: word list + obfuscation rules[[BR]] |
| 13 | But: users invent new obfuscation patterns[[BR]] |
| 14 | Even better solution: word list + automatically generated thesaurus + obfuscation rules + naughty language patterns (e.g. you <adjective> <noun>!!!) |
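The "word list + obfuscation rules" step could be sketched roughly as follows. This is a minimal illustration, not a production filter: the word list, the function names `matches_word_list` and `flag_post`, and the two rules (drop separators; treat a run of non-letters as a mask for one to three letters) are all assumptions made for the example. Mild placeholder words stand in for a real lexicon.

```python
import re

# Hypothetical word list; a real deployment would use a curated lexicon.
WORD_LIST = {"darn", "stupid"}

def matches_word_list(token, word_list=WORD_LIST):
    """Return the listed word that `token` appears to obfuscate, or None."""
    token = token.lower().strip(".,!?;:\"'()")
    # Rule 1: drop separators, so "s.t.u.p.i.d" becomes "stupid".
    stripped = re.sub(r"[^a-z]", "", token)
    if stripped in word_list:
        return stripped
    # Rule 2: treat each run of non-letters as a mask hiding 1-3 letters,
    # so "d*rn" and "st..id" become patterns matched against the list.
    if stripped and stripped != token:
        parts = re.split(r"[^a-z]+", token)
        pattern = re.compile("[a-z]{1,3}".join(re.escape(p) for p in parts))
        for word in sorted(word_list):
            if pattern.fullmatch(word):
                return word
    return None

def flag_post(text):
    """Return (token, matched word) pairs for every suspicious token."""
    hits = []
    for token in text.split():
        word = matches_word_list(token)
        if word is not None:
            hits.append((token, word))
    return hits

if __name__ == "__main__":
    print(flag_post("You d*rn s.t.u.p.i.d duck!"))
```

Fixed rules like these only cover the obfuscation patterns someone has already thought of, which is why the outline adds a generated thesaurus and language patterns on top.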
| 15 | |
| 16 | === Aims: text summarization === |
| 17 | |
| 18 | Use case: automatic abstract generation, multiple-document digest, are these documents stating similar or opposite theses?[[BR]] |
| 19 | Common solution: take every first sentence in a paragraph or take every sentence containing a keyword[[BR]] |
| 20 | But: works worse on Slavic languages, does not scale well, and makes it almost impossible to detect the main thesis[[BR]] |
| 21 | Better solution: analyse text on several levels |
| 22 | * as a whole discourse (sections, paragraphs, references) |
| 23 | * as a sequence of sentences (each having a structure) |
| 24 | * as a bag of words and keywords (in different forms, synonyms, abbreviations etc.) |
| 25 | * main theses detection |
| 26 | * text generation |
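For reference, the two "common solution" baselines mentioned above (first sentence of each paragraph, and sentences containing a keyword) can be sketched in a few lines. The function names and the naive punctuation-based sentence splitter are assumptions made for this illustration. Note that the keyword variant matches exact word forms only, so inflected forms of a keyword are missed.

```python
import re

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

def first_sentences(text):
    """Baseline 1: the first sentence of every paragraph."""
    summary = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        sentences = SENTENCE_BOUNDARY.split(paragraph.strip())
        if sentences[0]:
            summary.append(sentences[0])
    return summary

def keyword_sentences(text, keywords):
    """Baseline 2: every sentence containing one of the keywords."""
    wanted = {k.lower() for k in keywords}
    sentences = SENTENCE_BOUNDARY.split(text.strip())
    # Exact-form matching: "hunt" will not match "hunted" or "hunting".
    return [s for s in sentences
            if wanted & set(re.findall(r"\w+", s.lower()))]

if __name__ == "__main__":
    sample = "Cats sleep a lot. They also hunt.\n\nDogs bark. Dogs fetch sticks."
    print(first_sentences(sample))
    print(keyword_sentences(sample, ["hunt"]))
```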
| 27 | |
| 28 | === Aims: opinion mining === |
| 29 | (this part may be replaced by ''content targeting'') |
| 30 | |
| 31 | Use case: what are people thinking about a particular product/company/idea X?[[BR]] |
| 32 | Solution: search X[[BR]] |
| 33 | But: what other names are people giving to X? what are people saying about X?[[BR]] |
| 34 | Better solution: |
| 35 |  * find synonyms for X |
| 36 | * extract useful attributes of X (noise, weight, price, appearance) |
| 37 | * generate thesauri of opinion words (weird rattle in iPhone5?) |
| 38 | |
| 39 | === Aims: question answering === |
| 40 | Use case: chatbot providing basic support (do you have a phone similar to Sony Xperia Z but cheaper? what is the shipping cost?)[[BR]] |
| 41 | Solution: patterns, keyword detection (Sony Xperia Z, shipping), then searching[[BR]] |
| 42 | But: no real dialogue, no real answers, just searching[[BR]] |
| 43 | Better solution: sentence structure analysis, keyword detection, coreference resolution, dialogue strategy[[BR]] |
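The simple "patterns + keyword detection" solution could look roughly like the sketch below. The product catalogue, the intent names, and the function `analyse_question` are all hypothetical and exist only for this example. It covers just the pattern-matching baseline; coreference resolution and dialogue strategy are not attempted here.

```python
import re

# Hypothetical catalogue and intent patterns, for illustration only.
PRODUCTS = ["Sony Xperia Z"]
INTENT_PATTERNS = {
    "shipping_cost": re.compile(r"\bshipping\b", re.IGNORECASE),
    "similar_cheaper": re.compile(r"\bsimilar to\b.*\bcheaper\b", re.IGNORECASE),
}

def analyse_question(question):
    """Detect known product names and coarse intents in a user question."""
    intents = [name for name, pattern in INTENT_PATTERNS.items()
               if pattern.search(question)]
    lowered = question.lower()
    products = [p for p in PRODUCTS if p.lower() in lowered]
    return {"intents": intents, "products": products}

if __name__ == "__main__":
    print(analyse_question(
        "Do you have a phone similar to Sony Xperia Z but cheaper?"))
    print(analyse_question("What is the shipping cost?"))
```

The result would then be handed to a search backend, which is exactly the limitation noted above: the program looks things up but holds no real dialogue.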
| 44 | |
| 45 | Is this real understanding? Will computers understand us? No. We don’t know what understanding is, but we know what ''it looks like'' when someone understands. Computer programs that can detect vulgar text, summarize a text, answer questions, or “feel” emotions look as if they understand our language (in fact this is a ''behaviorist approach''). |