| 2 | |
| 3 | [[Image(/trac/research/raw-attachment/wiki/en/WillComputerUnderstand/turing.png)]] |
| 4 | |
| 5 | == Computer “understanding”: Use cases == |
| 6 | |
| 7 | * inappropriate discussion posts detection |
| 8 | * text summarization |
| 9 | * opinion mining |
| 10 | * content targeting |
| 11 | * question answering |
| 12 | |
| 13 | |
| 14 | == Inappropriate discussion posts detection == |
| 15 | |
| 16 | That BOY aint done growing and fcuking so she would be stooopid to tie HERSELF down wit a BABY and a tattoo is just as worse!!! |
| 17 | |
| 18 | => |
| 19 | |
| 20 | That BOY aint done growing and '''fcuking''' so she would be '''stooopid''' to tie HERSELF down wit a BABY and a tattoo is just as worse!!! |
| 21 | |
| 22 | [[Image(/trac/research/raw-attachment/wiki/en/WillComputerUnderstand/fcuking.png, align=right)]] |
| 23 | |
| 24 | '''Use case:''' discussion forum, automatic detection of inappropriate posts |
| 25 | |
| 26 | '''Common solution:''' word list |
| 27 | |
| 28 | '''But:''' users disguise words in ways that are difficult to detect (f*king, f.u.c.k, f..k, fcuking) |
| 29 | |
| 30 | '''Better solution:''' word list + concealing rules |
| 31 | |
| 32 | '''But:''' users invent new words and concealing patterns |
| 33 | |
| 34 | '''Even better solution:''' word list + automatically generated thesaurus + concealing rules + metarules |
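A word list combined with concealing rules could be sketched roughly as follows. This is only an illustration: the word list `BLOCKED`, the helper names and the four rules (separators, repeated letters, swapped neighbours, asterisks) are assumptions made for this example, not the rules of any particular system, and a harmless word stands in for the vulgar ones.

```python
import re

# Hypothetical word list; a real deployment would use a curated one.
BLOCKED = {"stupid", "idiot"}


def collapse(word):
    """Collapse runs of a repeated character: 'stuuupid' -> 'stupid'."""
    return re.sub(r"(.)\1+", r"\1", word)


def candidates(token):
    """Yield normalized forms of a token under a few concealing rules."""
    # Rule 1: drop separators placed between letters (s.t.u.p.i.d).
    stripped = re.sub(r"[^a-z*]", "", token.lower())
    # Rule 2: collapse repeated letters.
    base = collapse(stripped)
    yield base
    # Rule 3: undo one swap of two adjacent letters (stpuid).
    for i in range(len(base) - 1):
        yield base[:i] + base[i + 1] + base[i] + base[i + 2:]


def is_concealed(token):
    """True if some normalized form of the token matches a blocked word."""
    targets = {collapse(word) for word in BLOCKED}
    for form in candidates(token):
        if "*" in form:
            # Rule 4 (assumption): an asterisk hides one or two letters.
            pattern = re.escape(form).replace(r"\*", "[a-z]{1,2}")
            if any(re.fullmatch(pattern, target) for target in targets):
                return True
        elif form in targets:
            return True
    return False
```

Calling `is_concealed` on each whitespace-separated token of a post flags the disguised forms these rules cover. A form such as "stooopid" still slips through, which is the kind of gap an automatically generated thesaurus and metarules are meant to close.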
| 35 | |
| 36 | |
| 37 | |
| 38 | == Text summarization == |
| 39 | |
| 40 | [[Image(/trac/research/raw-attachment/wiki/en/WillComputerUnderstand/text_sum.png)]] |
| 41 | |
| 42 | A decade ago a girl spent money on ringtones. |
| 43 | |
| 44 | |
| 45 | '''Use case:''' automatic abstract generation, multi-document digests, detecting whether documents make similar or opposing arguments |
| 46 | |
| 47 | '''Naive solution:''' take the first sentence of every paragraph |
| 48 | |
| 49 | '''Common solution:''' take every sentence containing a keyword |
| 50 | |
| 51 | '''But:''' does not scale well, and the main message is difficult to detect |
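The naive and the common solution can be sketched in a few lines. The function names and the crude punctuation-based sentence splitter are assumptions made for this illustration; paragraphs are assumed to be separated by blank lines.

```python
import re


def split_sentences(paragraph):
    """Crude splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph)
    return [part.strip() for part in parts if part.strip()]


def first_sentences(text):
    """Naive baseline: the first sentence of every paragraph."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [split_sentences(p)[0] for p in paragraphs]


def keyword_sentences(text, keywords):
    """Common baseline: every sentence that mentions one of the keywords."""
    wanted = {keyword.lower() for keyword in keywords}
    picked = []
    for paragraph in text.split("\n\n"):
        for sentence in split_sentences(paragraph):
            words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
            if words & wanted:
                picked.append(sentence)
    return picked
```

Both functions only select existing sentences. Neither looks at document structure, sentence structure or synonyms, so neither can tell which sentence carries the main message.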
| 52 | |
| 53 | '''Better approach:''' |
| 54 | 1. analyse text on several levels |
| 55 | * whole document (sections, paragraphs, consistency) |
| 56 | * sequence of sentences (each having a structure) |
| 57 | * bag of words and keywords (in different forms, synonyms, abbreviations etc.) |
| 58 | 1. generate a summary |
| 59 | |
| 60 | |
| 61 | == Opinion mining == |
| 62 | |
| 63 | The ''iPhone 5'' price was predictably high and continues to be so, so consumers will need to bear that in mind too when looking for their next smartphone. |
| 64 | |
| 65 | ... |
| 66 | |
| 67 | Well, all of those picking up the iPhone 5 will have the same reaction: this thing is amazingly light. You’ve probably heard the numbers by now (20 per cent lighter than the predecessor, as well as beating most of the opposition too at 112g.) |
| 69 | |
| 70 | |
| 71 | => |
| 72 | |
| 73 | |
| 74 | The ''iPhone 5'' '''price''' was predictably '''high''' and continues to be so, so consumers will need to bear that in mind too when looking for their next smartphone. |
| 75 | |
| 76 | ... |
| 77 | |
| 78 | Well, all of those picking up the '''iPhone 5''' will have the same reaction: this thing is '''amazingly light'''. You’ve probably heard the numbers by now ('''20 per cent lighter''' than the predecessor, as well as beating most of the opposition too at 112g.) |
| 79 | |
| 80 | |
| 81 | [[Image(/trac/research/raw-attachment/wiki/en/WillComputerUnderstand/okko.png)]] |
| 82 | |
| 83 | |
| 84 | '''Use case:''' what do people think about a particular product/company/idea X? |
| 85 | |
| 86 | '''Solution:''' search X, find evaluative words |
| 87 | |
| 88 | '''But:''' opinions are often expressed by words that are not evaluative in themselves |
| 89 | |
| 90 | '''Better solution:''' |
| 91 | * extract useful attributes of X (noise, weight, price, appearance) |
| 92 | * generate domain-specific thesauri of evaluative words, since the same word can be praise or criticism: thin iPhone 5 × thin tasteless burger |
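The two steps above could be combined roughly as follows. The lexicons `ATTRIBUTES` and `POLARITY` are hand-written, hypothetical stand-ins for the attribute list and the generated thesauri, and the domain names "phone" and "food" are assumptions made for this example.

```python
import re

# Hypothetical lexicons; a real system would generate them from corpora.
ATTRIBUTES = {"price", "weight", "noise", "appearance"}
POLARITY = {
    "phone": {"high": -1, "light": +1, "thin": +1},
    "food": {"thin": -1, "tasteless": -1},
}


def opinions(text, domain):
    """Return (attribute, evaluative word, polarity) triples per sentence."""
    lexicon = POLARITY[domain]
    found = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = re.findall(r"[a-z]+", sentence.lower())
        attributes = [word for word in words if word in ATTRIBUTES]
        for word in words:
            if word in lexicon:
                # Pair the word with the first attribute named in the
                # sentence, or None if the attribute is only implied.
                attribute = attributes[0] if attributes else None
                found.append((attribute, word, lexicon[word]))
    return found
```

With these lexicons, "thin" counts as praise in the phone domain and as criticism in the food domain. The sketch returns `None` when the attribute is only implied, as in "this thing is amazingly light", where the weight is never named.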
| 93 | |
| 94 | |
| 95 | == Question answering == |
| 96 | |
| 97 | Do you have a bike for a 4-year-old girl? |
| 98 | '''Search results for “bike”, “girl”''' |
| 99 | |
| 100 | ... |
| 101 | |
| 102 | --- |
| 103 | |
| 104 | Do you have a bike for a 4-year-old girl? |
| 105 | |
| 106 | '''If she is under 110 cm tall I would recommend Maggie, Princess or Misty. If she is taller I would recommend Miss B or Kellie. If she does not insist on a girls' bike I would also recommend Racer or Mr. Lightning. How tall is she?''' |
| 107 | |
| 108 | About 105 cm. |
| 109 | |
| 110 | '''Do you have some other constraints?''' |
| 111 | |
| 112 | I am looking for something cheaper. |
| 113 | |
| 114 | '''Then I would recommend Princess. It is a popular bike.''' |
| 115 | |
| 116 | --- |
| 117 | |
| 118 | '''Use case:''' chatbot providing basic support |
| 119 | |
| 120 | '''Solution:''' patterns, keyword detection, searching |
| 121 | |
| 122 | '''But:''' no real dialogue, no real answers, just searching |
| 123 | |
| 124 | '''Better solution:''' sentence structure analysis, keyword detection, coreference resolution, dialogue strategy |
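A very small part of such a dialogue strategy can be sketched as slot filling: detect keywords in each reply, ask for whatever is still missing, and recommend once enough is known. The bike names and the 110 cm threshold are taken from the example dialogue; the prices, the slot names and both functions are made up for this illustration. Sentence structure analysis and coreference resolution are not covered.

```python
import re

# Names and size classes follow the example dialogue; prices are invented.
CATALOGUE = [
    {"name": "Maggie", "small": True, "price": 130},
    {"name": "Princess", "small": True, "price": 100},
    {"name": "Misty", "small": True, "price": 150},
    {"name": "Miss B", "small": False, "price": 160},
    {"name": "Kellie", "small": False, "price": 140},
]


def update_slots(slots, utterance):
    """Keyword detection: pull a height or a price constraint from a reply."""
    text = utterance.lower()
    height = re.search(r"(\d+)\s*cm", text)
    if height:
        slots["height_cm"] = int(height.group(1))
    if "cheap" in text:
        slots["cheap"] = True
    return slots


def next_move(slots):
    """Dialogue strategy: ask for the missing slot, otherwise recommend."""
    if "height_cm" not in slots:
        return "How tall is she?"
    if not slots.get("cheap"):
        return "Do you have some other constraints?"
    is_small = slots["height_cm"] < 110
    fitting = [bike for bike in CATALOGUE if bike["small"] == is_small]
    best = min(fitting, key=lambda bike: bike["price"])
    return f"Then I would recommend {best['name']}."
```

Feeding the customer's replies from the example through `update_slots` and calling `next_move` after each one reproduces the system's side of the conversation, given the invented prices.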
| 125 | |
| 126 | [[Image(/trac/research/raw-attachment/wiki/en/WillComputerUnderstand/bike.png)]] |
| 127 | |
| 128 | |
| 129 | == Conclusions: Understanding of ''understanding'' == |
| 130 | |
| 131 | Is this real understanding? |
| 132 | |
| 133 | Probably not. |
| 134 | |
| 135 | We do not know what understanding is, but we know what it looks like when someone understands. |
| 136 | |
| 137 | Computer programs that can detect vulgar text, summarize a document, recognize someone’s opinions or answer questions |
| 138 | look as if they understand our language... |