Changes between Version 7 and Version 8 of ast

Timestamp:: Mar 15, 2019, 11:52:46 AM (6 years ago)
Author:: Ales Horak
Comment:: --

ast

-                      v7
+                      v8
 = Automatic Semantic Tool (AST) =
+Full semantic analysis of natural language (NL) texts is an open
+problem. The most comprehensive semantic systems build upon a mathematically
+sound formalism of a selected logical system. Mostly for reasons of computability
+and efficiency, current systems work with first-order logic (or a variant of it).
+However, such low-order logic is not appropriate for capturing higher-order
+phenomena that occur in natural language, such as belief attitudes, direct
+speech, or verb tenses. In our project, we develop a new tool for automatic semantic analysis
+(AST) that emerged from (a module of) the Czech syntactic parser [https://nlp.fi.muni.cz/trac/synt SYNT].
+AST is now a standalone tool based on Transparent Intensional Logic (TIL).
+It works with the same input files (lexicons, semantic rules, ...) that were designed and developed in SYNT.
+AST can provide a semantic analysis in the form of TIL constructions independently of the input syntactic parser and language.
+Adapting AST to a new language consists of specifying four language-dependent files that describe lexical items, verb valencies, prepositional valencies and a semantic grammar.
+== Input ==
+To create a semantic structure of a sentence, AST needs the output of a
+preceding syntactic analysis. This output usually takes the form of a syntactic tree.
+'''Textual form of a syntactic tree:'''
+{{{
+<tree>
+{##start##
+  {start
+    {ss
+      {clause
+        {VL<leaf><idx>0</idx><w>Jedl</w>
+         <l>jíst</l><c>k5eAaIgMnS</c></leaf>}
+        {intr
+          {adjp
+            {ADJ<leaf><idx>1</idx><w>pečené</w>
+             <l>pečený</l><c>k2eAgNnSc4</c></leaf>}
+          }
+          {np
+            {N<leaf><idx>2</idx><w>kuře</w>
+             <l>kuře</l><c>k1gNnSc4</c></leaf>}
+          }
+        }
+      }
+    }
+    {ends
+      {'.'<leaf><idx>3</idx><w>.</w><l>.</l><c>kX</c></leaf> }
+    }
+  }
+}
+</tree>
+}}}
+'''Corresponding graphical representation:'''
+[[Image(tree.png, 700px)]]
+Besides the tree nodes and edges, the tree contains morphological information about each word: a lemma and a PoS tag, which AST uses for
+deriving implicit type information for out-of-vocabulary words.
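As a rough illustration of how the morphological information can be read from the textual tree, the following Python 3 sketch extracts the `<leaf>` elements with a regular expression. It is not AST's own code (AST is written in Python 2.7); the names `extract_leaves` and `split_tag` are hypothetical, and `split_tag` assumes that a tag is a plain sequence of two-character attribute/value pairs, which holds for the tags in the SYNT example above.

```python
import re

# Matches one <leaf> element of the textual tree; \s* tolerates the line
# break between </w> and <l> seen in the SYNT example.
LEAF_RE = re.compile(
    r"<leaf><idx>(\d+)</idx><w>(.*?)</w>\s*"
    r"<l>(.*?)</l><c>(.*?)</c></leaf>",
    re.DOTALL,
)


def extract_leaves(tree_text):
    """Return a list of (index, word, lemma, tag) tuples, in tree order."""
    return [(int(idx), word, lemma, tag)
            for idx, word, lemma, tag in LEAF_RE.findall(tree_text)]


def split_tag(tag):
    """Split a tag such as 'k1gNnSc4' into attribute/value pairs.

    Assumption: the tag consists only of two-character pairs.
    """
    return {tag[i]: tag[i + 1] for i in range(0, len(tag) - 1, 2)}


sample = """{N<leaf><idx>2</idx><w>kuře</w>
             <l>kuře</l><c>k1gNnSc4</c></leaf>}"""
leaves = extract_leaves(sample)
print(leaves)
print(split_tag(leaves[0][3]))
```

Tags that carry additional material after the attribute/value pairs would need extra handling that this sketch does not attempt.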
+The current AST implementation is able to process input from the [https://nlp.fi.muni.cz/trac/synt SYNT] and [https://nlp.fi.muni.cz/trac/set SET] parsers. The preceding syntactic tree example comes from the output of the SYNT parser.
+An example of a SET tree in textual form for the sentence "Tom wants to buy a new car but he will not buy it.":
+{{{
+id word:nterm lemma tag pid til schema
+N:Tom Tom k1gMnSc1;ca14 p
+V:chce chtít k5eAaImIp3nS 15
+V:koupit koupit k5eAaPmF 16
+ADJ:nové nový k2eAgNnSc4d1 17
+N:auto auto k1gNnSc4 17
+PUNCT:, , kIx 10
+CONJ:ale ale k8xC 10
+V:nekoupí koupit k5eNaPmIp3nS 13
+PRON:je on k3xPp3gNnSc4 13
+PUNCT:. . kIx. 10
+<CLAUSE> k5eNaPmIp3nS 12 vrule_sch ( $$ $@ )
+<CLAUSE> k5eAaImIp3nS 12 vrule_sch ( $$ $@ )
+<SENTENCE> -1
+<VP> koupit k5eNaPmIp3nS 10 vrule_sch_add ( $$ $@ "#1H (#2)" )
+<VP> chtít k5eAaImIp3nS 11 vrule_sch_add ( $$ $@ "#2H (#1)" )
+<VP> chtít k5eAaImIp3nS 14 vrule_sch_add ( $$ $@ "#1H (#2)" )
+<VP> koupit k5eAaPmF 15 vrule_sch_add ( $$ $@ "#1H (#2)" )
+<NP> auto k1gNnSc4 16 rule_sch ( $$ $@ "[#1,#2]" )
+}}}
+Visual representation of the SET structural tree:
+[[Image(set_tree.png, 700px)]]
+== Language Dependent Files ==
+The core of the AST system is universal and can be used for semantic analysis of any
+language. Besides this core, the system also uses input files that are language
+dependent and that need to be modified for each new language.
+'''The Semantic Grammar''': the resulting semantic construction is built by
+bottom-up analysis based on the input syntactic tree provided by the syntactic
+parser and on a semantic extension of the actual grammar used in the parsing
+process. To know which rule was used by the parser, AST needs the semantic
+grammar file. This file contains the specification of semantic actions that need
+to be performed before the constructions of a particular node are propagated to the higher
+level in the syntactic tree. The semantic actions define which logical functions
+correspond to each particular syntactic rule. For instance, the <np> node in
+the graphical representation corresponds to the rule and action:
+{{{
+np -> left_modif np
+rule_schema ( "[#1,#2]" )
+}}}
+which says that the resulting logical construction of the left-hand side np is
+obtained as a (logical) application of the left_modif (sub)construction to the
+right-hand side np (sub)construction. The following figure shows a construction being built from two subconstructions:
+[[Image(analysis.png, 700px)]]
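The way a schema such as `[#1,#2]` combines sub-constructions can be sketched in Python 3 as plain placeholder substitution. This is only an illustration under assumptions: `fill_schema` is a hypothetical name, `#N` is taken to refer to the N-th item on the right-hand side of the rule, and the sub-constructions are represented by bare strings rather than real TIL constructions.

```python
import re


def fill_schema(schema, parts):
    """Replace each #N placeholder in `schema` with parts[N - 1].

    Assumption: placeholders are numbered from 1, in the order of the
    right-hand side of the grammar rule.
    """
    def substitute(match):
        return parts[int(match.group(1)) - 1]

    return re.sub(r"#(\d+)", substitute, schema)


# Placeholder strings standing in for the two sub-constructions of
# the rule "np -> left_modif np".
left_modif = "pečený"
np_head = "kuře"
print(fill_schema("[#1,#2]", [left_modif, np_head]))
```

Because the analysis runs bottom-up, the result of one such substitution would in turn serve as a sub-construction for the rule applied at the parent node.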
+'''TIL Types of Lexical Items''': the second language dependent file defines lexical
+items and their TIL types. The types are hierarchically built from four simple
+TIL types:
+* o: representing the truth-values,
+* ι: class of individuals,
+* τ: class of time moments, and
+* ω: class of possible worlds.
+AST contains rules for deriving implicit types based on the PoS tags of the input
+words, so the lexicons need to prescribe the type only for cases that differ from
+the implicit definition. A lexical item example for the verb "jíst" (eat) is:
+[[Image(jist.png, 300px)]]
+The exact format of the lexical item in the input file is as follows: the lemma
+stands on a separate line. After the lemma there is a list of lines where an
+(optional) PoS tag filter precedes the resulting object schema (here otriv, i.e.
+o-trivialisation) and the TIL type (here a verbal object with one ι-argument).
+'''Verb Valencies''': the next language-dependent file defines verb
+valencies together with the schema and type information for building the resulting construction from the corresponding valency frame. An example for the verb "jíst" (eat)
+is as follows:
+{{{
+jíst
+hPTc4 :exists:V(v):V(v):and:V(v)=[[#0,try(#1)],V(w)]
+}}}
+This record defines the valency of <somebody> eats <something>, given by the
+brief valency frame hPTc4 of the object (an animate or inanimate noun phrase in
+accusative), and the resulting construction of the verbal object (V(v)) derived as
+an application of the verb (!#0) to its argument (the sentence object) with possible
+extensification (try(!#1)) and the appropriate possible world variable (V(w)).
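A record of this shape could be read with a few lines of Python 3. The sketch below is not taken from AST; `parse_valency_record` is a hypothetical helper, and it assumes that the first line holds the lemma and that, on every following line, the first run of whitespace separates the brief valency frame from the schema.

```python
def parse_valency_record(text):
    """Parse one verb valency record into (lemma, [(frame, schema), ...]).

    Assumptions: the lemma is on the first non-empty line; each further
    line is "<frame> <schema>", split at the first whitespace.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    lemma = lines[0]
    entries = []
    for line in lines[1:]:
        frame, schema = line.split(None, 1)
        entries.append((frame, schema))
    return lemma, entries


record = """jíst
hPTc4 :exists:V(v):V(v):and:V(v)=[[#0,try(#1)],V(w)]
"""
lemma, entries = parse_valency_record(record)
print(lemma)
print(entries)
```

Keeping the schema as an uninterpreted string leaves its evaluation to a later step, in line with the separate schema parser listed among the system parts.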
+'''Prepositional Valency Expressions''': the last file that has to be specified for
+each language is a list of semantic mappings of prepositional phrases to
+valency expressions based on the head preposition. The file contains for each
+combination of a preposition and a grammatical case of the included noun
+phrase all possible valency slots corresponding to the prepositional phrase. For
+instance, the record for the preposition "k" (to) is displayed as
+{{{
+k
+hA hH
+}}}
+saying that "k" can introduce a prepositional phrase of a where-to direction hA
+(e.g. "k lesu" – "to a forest"), or a modal how/what specification hH (e.g. "k večeři"
+– "to a dinner").
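A minimal Python 3 sketch of loading such records into a lookup table follows. It assumes, based only on the single record shown above, that the file alternates between a key line (the preposition) and a line of whitespace-separated valency slots; the name `parse_prep_valencies` is hypothetical, and how the grammatical case is encoded in the key is not covered here.

```python
def parse_prep_valencies(text):
    """Map each key line to the list of valency slots on the next line.

    Assumption: non-empty lines strictly alternate between a key line
    and a slot line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    table = {}
    for key, slots in zip(lines[0::2], lines[1::2]):
        table[key] = slots.split()
    return table


records = """k
hA hH
"""
table = parse_prep_valencies(records)
print(table)
```

With such a table, the slots offered by a prepositional phrase can be looked up by its head preposition and matched against the valency frame of the verb.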
+= System Parts =
+The AST system is implemented in the Python 2.7 programming language and
+consists of seven main parts:
+* the input parser: reads standard input, extracts tree structures and creates a tree object for each tree in the input,
+* the grammar parser: reads the grammar file and assigns a grammar rule and appropriate actions to each node inside the tree,
+* the lexical item parser: reads the file with lexical item schemata and TIL types and assigns the type to each leaf in the tree structure,
+* the schema parser: according to a logical construction schema coming with a semantic action, this module creates a construction from sub-constructions,
+* the verb valency parser: picks the correct valency for the given sentence and triggers the schema parser on sub-constructions according to the schema coming with the valency,
+* the prepositional valency expression parser: reads the possible valency expressions assigned to prepositional phrases used as (optional) valency slots in the actual sentence valency frame, and
+* the sentence schema processor: if the sentence structure contains subordinate or coordinate clauses, the sentence schema processor is triggered. The sentence schemata are classified by the conjunctions used between clauses.
+= Download =
+You can download the AST tool [[attachment:ast_til.tar.xz|here]].
+Move to the [https://nlp.fi.muni.cz/trac/synt/wiki/Ast synt trac].