wiki:AnnotationManual/CZSNLI

Natural Language Inference

Annotation manual

The annotated sentences are descriptions of photographs. You cannot see the photos; however, you can imagine their content. The individual sentences are machine-translated from English to Czech.

The aim is to decide, for each pair of sentences (premise and hypothesis) belonging to a single photo, whether:

  • the hypothesis follows from the premise (entailment)
  • the hypothesis cannot follow from the premise (contradiction)
  • the hypothesis may or may not follow from the premise (neutral)
  • one or both sentences from the pair do not make sense (bad translation)
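For anyone who later processes the annotations in a script, the four outcomes above can be modelled as a small enumeration. This is only a sketch: the names `Label` and `AnnotatedPair` are hypothetical and are not part of LabelStudio or of any export format of this project.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The four possible outcomes for a premise-hypothesis pair."""
    ENTAILMENT = "entailment"            # hypothesis follows from the premise
    CONTRADICTION = "contradiction"      # hypothesis cannot follow from the premise
    NEUTRAL = "neutral"                  # hypothesis may or may not follow
    BAD_TRANSLATION = "bad translation"  # one or both sentences do not make sense


@dataclass(frozen=True)
class AnnotatedPair:
    """One annotated sample (hypothetical structure, for illustration only)."""
    premise: str
    hypothesis: str
    label: Label


# A pair taken from the examples in this manual, encoded as a record.
pair = AnnotatedPair(
    premise="Chlapec je zajištěn mezi dvěma bungee lany a visí ve vzduchu.",
    hypothesis="Chlapec se prochází v parku.",
    label=Label.CONTRADICTION,
)
print(pair.label.value)
```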

Annotation is done in the LabelStudio tool. You need to log in twice.

First, log into the tool using the credentials obtained by email (student@localhost).

Next, log into the assigned data packages using your faculty credentials (x.... and your password). One data package contains 1500 sentence pairs.

How to annotate one sample

LabelStudio limitation: Do not simultaneously open one labeling task in two tabs or windows.

The annotation steps are the following.

1. Rule out bad translations

Read the premise and the hypothesis and decide whether they make sense. If not, select bad translation and continue to the next pair. You can check your understanding by revealing the original English texts.

2. Decide the sentence relationship

If the sentences make sense, decide whether the relation between them is entailment, contradiction, or neutral.

Please assume that the hypothesis and the premise are connected by the photograph content. For example, if the premise says something about a man and the hypothesis mentions a human, assume they are the same entity in the picture.

Also, use common sense when judging whether the hypothesis is impossible given the premise. For example, if the premise says "a man rides a bicycle" and the hypothesis says "the man is washing his hands", annotate the pair as contradiction, even though you can imagine someone washing his hands while riding a bike (a very uncommon situation).

Typically:

  1. If the hypothesis is a more general statement than the premise, the result is entailment.
  2. If the hypothesis adds new information or denotes an element of the premise by a more specific name, it is neutral (cannot say).
  3. If the hypothesis and the premise describe unrelated actions with different actors, or they describe mutually exclusive actions, it is a contradiction.
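The order of the decisions in steps 1 and 2 can be summarised as a short function. This is a sketch only: `decide_label` and its three boolean parameters are hypothetical names for the annotator's own judgments, not something the tool computes.

```python
def decide_label(makes_sense: bool, more_general: bool, exclusive_or_unrelated: bool) -> str:
    """Combine an annotator's judgments into a single label.

    makes_sense            -- both sentences are understandable (step 1)
    more_general           -- the hypothesis only generalizes the premise
    exclusive_or_unrelated -- the sentences describe mutually exclusive
                              or unrelated actions
    """
    if not makes_sense:
        # Step 1 comes first: a bad translation is never judged further.
        return "bad translation"
    if more_general:
        return "entailment"
    if exclusive_or_unrelated:
        return "contradiction"
    # Anything else (new or more specific information) cannot be decided.
    return "neutral"


# "a man rides a bicycle" vs. "the man is washing his hands"
print(decide_label(makes_sense=True, more_general=False, exclusive_or_unrelated=True))
```

The check for a bad translation deliberately comes before everything else, mirroring the order of the annotation steps.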

3. Select relevant parts

After deciding the sentence relationship, support your selection by selecting the relevant part in the premise, hypothesis, or both.

Click on the relevant button under the premise or hypothesis.

Mark one or more words (the tool selects whole words and removes trailing spaces from the selection).
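To illustrate what "selects whole words and removes trailing spaces" means, the sketch below snaps a raw character range to word boundaries. It is an assumption about the observable behaviour, not LabelStudio's actual implementation; `snap_to_words` is a hypothetical helper, and it treats adjacent punctuation as part of the word.

```python
def snap_to_words(text: str, start: int, end: int) -> tuple[int, int]:
    """Expand the range [start, end) to whole words and drop surrounding spaces."""
    # Drop whitespace at both ends of the raw selection first.
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    # Grow the selection outwards until a space or the text boundary is reached.
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return start, end


sentence = "Muž s plnovousem hraje na nástroj na chodníku."
# A sloppy selection: starts inside "plnovousem" and ends after the following space.
start, end = snap_to_words(sentence, 7, 17)
print(sentence[start:end])
```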

In case of entailment, mark the elements that are the same thing but are described with different words. If all elements are expressed with the same words, no relevant parts need to be marked.

In case of contradiction, mark the elements that are mutually exclusive (e.g., man-woman) in the premise and hypothesis. If the hypothesis and the premise are completely unrelated, do not mark anything.

In case of neutral, the hypothesis often contains new information, for instance a more specific term (e.g., "woman" in the hypothesis for "human" in the premise) or some independent term. Mark this new information. In a neutral pair, the hypothesis may contain an element with no relation to the premise.

You can remove a selection by selecting it and clicking the red garbage bin in the Info Selection details tab, or just by pressing Backspace.

4. Create a supporting relation

Set up the relation between the elements in the premise and hypothesis:

  1. Click the relevant element in the premise (1).
  2. Select the Info tab and click the Create relation button, a chain icon (2).
  3. Next, click the relevant part in the hypothesis (3).

Then select the relation type: click on the Relations tab, and select the relation by clicking the triple dot button. Next, click into the Select labels box and choose a relation type:

  • In the case of entailment, the relation would be green, i.e., generalization or similar.
  • In the case of contradiction, the relation would be red, i.e., exclusion.
  • In the case of neutral, the relation would be orange, i.e., specification or independence.
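The correspondence between pair labels and relation types can be written down as a lookup table, which is handy for checking finished annotations. This is a sketch under two assumptions: the relation names are those used in the examples of this manual, and each relation type supports exactly one label (the manual implies this but does not state it). `RELATIONS_BY_LABEL` and `inconsistent_relations` are hypothetical names.

```python
# Relation types named in this manual, grouped by the pair label they support.
RELATIONS_BY_LABEL = {
    "entailment": {"generalization", "similar"},    # green
    "contradiction": {"exclusion"},                 # red
    "neutral": {"specification", "independence"},   # orange
}


def inconsistent_relations(label: str, relations: list[str]) -> list[str]:
    """Return the relation types that do not match the chosen pair label."""
    allowed = RELATIONS_BY_LABEL.get(label, set())
    return [r for r in relations if r not in allowed]


# An entailment pair supported by one generalization and one similar relation.
print(inconsistent_relations("entailment", ["generalization", "similar"]))
# A contradiction pair that was mistakenly given a green relation.
print(inconsistent_relations("contradiction", ["exclusion", "similar"]))
```

A bad translation pair has no entry in the table, so any relation attached to it is reported as inconsistent.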

It does not matter if you mark prepositional phrases (e.g., "v obýváku") or noun phrases (e.g., "obýváku").

You can remove a relation by clicking a red garbage bin in the Relations tab.

Examples of annotations

Example 1

Premise: Vousatý muž v malířské čepici hraje na xylofon na straně rušné lávky.

Hypothesis: Muž s plnovousem hraje na nástroj na chodníku.

The information in the hypothesis is more general, so mark it as entailment. To support the decision, mark "xylofon" and "nástroj" and connect them with the generalization relation, and connect "Vousatý muž" and "Muž s plnovousem" with the similar relation.

Example 2

Premise: Chlapec je zajištěn mezi dvěma bungee lany a visí ve vzduchu.

Hypothesis: Chlapec se prochází v parku.

The boy cannot do both things. Mark as contradiction. To support the decision, mark "visí" (is suspended) and "se prochází" (is walking) as exclusion.

Example 3

Premise: Surfař v zeleném pruhovaném neoprenu jede na vlně.

Hypothesis: Muž, který surfuje, má zelený pruhovaný neopren se svým psem.

The information about the dog is new and does not appear in the premise; therefore, mark the pair as neutral. To support the decision, mark "se svým psem" (with his dog) without any relation to the premise.

Example 4

Premise: Dvě dámy se střetnou se třemi osly.

Hypothesis: Dvě dámy půjdou nakrmit tři osly do zoo.

We cannot tell whether the ladies will feed the donkeys if we only know they meet the donkeys.

Mark as neutral. To support the decision, mark "se střetnou" (meet) and "půjdou nakrmit" (will go to feed) and connect them with the independence relation.

Example 5

Premise: Starší muž hrající na podivný nástroj podobný kytaře na něčem, co vypadá jako park nebo otevřené prostranství.

Hypothesis: Žena hraje v obýváku na klavír.

Since a man is not a woman, mark it as contradiction. To support the decision, mark "Starší muž" and "Žena" with the exclusion relation. Mark also "park nebo otevřené prostranství" and "obýváku" with the exclusion relation.

Example 6

Premise: Kůň vlevo má bílou hřívu.

Hypothesis: Je tam víc než jeden kůň.

From the premise, it follows that there is more than one horse. Mark the pair as entailment and connect "Kůň vlevo" (the horse on the left) and "víc než jeden kůň" (more than one horse) with the generalization relation.

Example 7

Premise: Několik lidí pohybujících nějakou stavbou.

Hypothesis: Několik lidí stěhuje kůlnu na nářadí do zadní části dvora.

We don't know the nature of the structure. So, mark it as neutral. To support the decision, mark "nějakou stavbou" and "kůlnu na nářadí" with the specification relation.

Example 8

Premise: Ta mladá žena s šátkem dává někomu dárek.

Hypothesis: Současnost je malá.

Apparently, the two sentences have no relation; moreover, the hypothesis is suspicious.

Check the English original (The young woman with the scarf is giving a present to someone. The present is small.) and mark the pair as bad translation.

Example 9

Premise: Žena a dvě malé holčičky slaví narozeniny s mužem přes webovou kameru.

Hypothesis: Muž běží po ulici

The two sentences describe completely unrelated stories. Annotate the pair as contradiction and do not mark any relations.

Submit

Finally, Submit your annotation.

Further information

You can check the instructions at any time by using the Instructions button at the top.

You can try a short training on your own.

Last modified on Apr 3, 2024, 7:01:57 PM
