= Natural Language Inference =

== Annotation manual ==

[[Image(annotation.png,width=50%,right)]]

The annotated sentences are descriptions of photographs. You cannot see the photos; however, you can imagine their content. The individual sentences are machine-translated from English to Czech.

The aim is to decide, for each pair of sentences (premise and hypothesis) belonging to a single photo, whether:
 * the hypothesis **follows** from the premise (**entailment**)
 * the hypothesis **cannot follow** from the premise (**contradiction**)
 * the hypothesis **may or may not follow** from the premise (**neutral**)
 * one or both sentences from the pair **do not make sense** (**bad translation**)

The annotation is made via the [https://nlp.fi.muni.cz/projekty/labelstudio LabelStudio] tool. The procedure requires a double login process. First, log into the tool using the credentials obtained by email (`student@localhost`). Next, log into the assigned data packages using your faculty credentials (`x....` and your password).

=== How to annotate one sample ===

!LabelStudio limitation: Do not simultaneously open one labeling task in two tabs or windows.

More details about the labeling process: https://labelstud.io/guide/labeling.html

The annotation steps are the following.

==== 1. Rule out bad translations ====

Read the premise and the hypothesis and decide whether they make sense. If not, select **bad translation** and continue to another pair. You can check your understanding by revealing the original texts.

==== 2. Decide the sentence relationship ====

If the sentences make sense, decide whether the relation between them is **entailment**, **contradiction**, or **neutral**.

Please assume that the hypothesis and the premise are connected by the photograph content. For example, if the premise says something about a man and the hypothesis mentions a human, assume they are **the same entity** in the picture.

Also, use **common sense** when judging the impossibility of ''following''.
For example, if the premise says "''a man rides a bicycle''" and the hypothesis says "''the man is washing his hands''", annotate the pair as **contradiction**, even though you ''can'' imagine someone washing his hands while riding a bike (a very uncommon situation).

Typically:
 1. If the hypothesis is a more general statement than the premise, the result is **entailment**.
 1. If the hypothesis adds new information or refers to an element of the premise by a more specific name, it is **neutral** (cannot say).
 1. If the hypothesis and the premise describe unrelated actions with different actors, or mutually exclusive actions, it is a **contradiction**.

==== 3. Select relevant parts ====

[[Image(entailment1.png,width=50%,right)]]

After deciding the sentence relationship, support your decision by selecting the relevant part in the premise, the hypothesis, or both. Click on the **relevant** button under the premise or hypothesis. Mark one or more words (the tool selects whole words and removes trailing spaces from the selection).

In case of **entailment**, mark the elements that are the same thing but are described with different words.

In case of **contradiction**, mark the elements that are mutually exclusive (e.g., man-woman) in the premise and hypothesis. If the hypothesis and the premise are completely unrelated, do not mark anything.

In case of **neutral**, there is often some new information in the hypothesis, for instance, a more specific term (human-woman) or some independent term. In a neutral pair, the hypothesis may have one element with no relation to the premise.

==== 4. Create a supporting relation ====

Set up the relation between the elements in the premise and hypothesis:
 1. Click the relevant element in the premise (1).
 1. Select the **Info** tab and click the **Create relation** button, a chain icon (2).
 1. Next, click the relevant part in the hypothesis (3).
[[Image(entailment2.png,width=50%,center)]]

[[Image(entailment3.png,width=50%,center)]]

Then select the relation type: click on the **Relations** tab and open the relation by clicking the triple dot button. Next, click into the **Select labels** box and choose a relation type:
 * In the case of **entailment**, the relation would be green, i.e., generalization or similar.
 * In the case of **contradiction**, the relation would be red, i.e., exclusion.
 * In the case of **neutral**, the relation would be orange, i.e., independence or specification.

It does not matter whether you mark the whole prepositional phrase (e.g., "''v obýváku''", in the living room) or only the noun phrase (e.g., "''obýváku''").

[[Image(entailment4.png,width=15%,center)]]

=== Examples of annotations ===

==== Example 1 ====

**Premise:** ''Vousatý muž v malířské čepici hraje na xylofon na straně rušné lávky.''

**Hypothesis:** ''Muž s plnovousem hraje na nástroj na chodníku.''

The information in the hypothesis is more general, so mark it as **entailment**. To support the decision, mark "xylofon" and "nástroj" and connect them with the **generalization** relation, and connect "Vousatý muž" and "Muž s plnovousem" with the **similar** relation.

==== Example 2 ====

**Premise:** ''Chlapec je zajištěn mezi dvěma bungee lany a visí ve vzduchu.''

**Hypothesis:** ''Chlapec se prochází v parku.''

The boy cannot do both things. Mark as **contradiction**. To support the decision, mark "visí" (is suspended) and "se prochází" (is walking) as **exclusion**.

==== Example 3 ====

**Premise:** ''Surfař v zeleném pruhovaném neoprenu jede na vlně.''

**Hypothesis:** ''Muž, který surfuje, má zelený pruhovaný neopren se svým psem.''

The dog is new information that the premise does not mention, therefore mark as **neutral**. To support the decision, mark "se svým psem" without any relation to the premise.
==== Example 4 ====

**Premise:** ''Dvě dámy se střetnou se třemi osly.''

**Hypothesis:** ''Dvě dámy půjdou nakrmit tři osly do zoo.''

We cannot tell whether the ladies will feed the donkeys if we only know they meet the donkeys. Mark as **neutral**. To support the decision, mark "se střetnou" (meet) and "půjdou nakrmit" (will feed) as **independence**.

==== Example 5 ====

**Premise:** ''Starší muž hrající na podivný nástroj podobný kytaře na něčem, co vypadá jako park nebo otevřené prostranství.''

**Hypothesis:** ''Žena hraje v obýváku na klavír.''

Since a man is not a woman, mark it as **contradiction**. To support the decision, mark "Starší muž" and "Žena" with the **exclusion** relation. Also mark "park nebo otevřené prostranství" and "obýváku" with the **exclusion** relation.

==== Example 6 ====

**Premise:** ''Kůň vlevo má bílou hřívu.''

**Hypothesis:** ''Je tam víc než jeden kůň.''

The premise implies that there is more than one horse. Mark the relation as **entailment** and connect "Kůň vlevo" (the horse on the left) and "víc než jeden kůň" (more than one horse) with the **generalization** relation.

==== Example 7 ====

**Premise:** ''Několik lidí pohybujících nějakou stavbou.''

**Hypothesis:** ''Několik lidí stěhuje kůlnu na nářadí do zadní části dvora.''

We don't know the nature of the structure, so mark it as **neutral**. To support the decision, mark "nějakou stavbou" and "kůlnu na nářadí" with the **specification** relation.

==== Example 8 ====

**Premise:** ''Ta mladá žena s šátkem dává někomu dárek.''

**Hypothesis:** ''Současnost je malá.''

The two sentences appear to have no relation; moreover, the hypothesis is suspicious. Check the English original (The young woman with the scarf is giving a present to someone. The present is small.) and mark it as **bad translation**.

==== Example 9 ====

**Premise:** ''Žena a dvě malé holčičky slaví narozeniny s mužem přes webovou kameru.''

**Hypothesis:** ''Muž běží po ulici''

These are completely unrelated stories.
Annotate the pair as **contradiction** and do not mark any relations.

=== Submit ===

Finally, **Submit** your annotation.

[[Image(annotation_submit.png,width=50%,center)]]

=== Further information ===

You can reopen the instructions at any time with the **Instructions** button at the top.

[[Image(annotation_instruction.png,width=50%,center)]]

You can try a short [https://nlp.fi.muni.cz/trac/research/wiki/AnnotationManual/CZSNLI/training training] on your own.
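
The pairing of labels and relation types used in the examples of this manual can be summarised as a small consistency check. The following is only a minimal sketch and is not part of LabelStudio or of the project's tooling: the names `ALLOWED_RELATIONS` and `check_annotation` and the record layout are hypothetical, the mapping is read off the examples above, and the rule that a bad translation carries no relations is an assumption.

```python
# Hypothetical mapping from the sentence-pair label to the relation
# types that the manual's examples use to support it.
ALLOWED_RELATIONS = {
    "entailment": {"generalization", "similar"},
    "contradiction": {"exclusion"},
    "neutral": {"independence", "specification"},
    "bad translation": set(),  # assumption: no relations are marked
}


def check_annotation(label, relations):
    """Return a list of problems found in one annotated pair.

    `relations` is a list of relation-type names. It may be empty,
    e.g. for completely unrelated sentences marked as contradiction.
    """
    if label not in ALLOWED_RELATIONS:
        return [f"unknown label: {label}"]
    allowed = ALLOWED_RELATIONS[label]
    return [
        f"relation '{r}' does not support label '{label}'"
        for r in relations
        if r not in allowed
    ]


# Example 1: entailment supported by generalization and similar.
print(check_annotation("entailment", ["generalization", "similar"]))
# Example 9: contradiction with no relations marked.
print(check_annotation("contradiction", []))
# A mismatch: an exclusion relation under an entailment label.
print(check_annotation("entailment", ["exclusion"]))
```

An empty list means the relations are consistent with the chosen label. The check says nothing about whether the label itself is correct; that remains the annotator's judgment.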