Changes between Version 3 and Version 4 of ast


Timestamp:
Mar 14, 2019, 4:05:41 PM (5 years ago)
Author:
xmedved1
Comment:

--

  • ast

    v3 v4  
    5858
    5959Besides the tree nodes and edges, the tree contains morphological information about each word: a lemma and a PoS tag, which are used by AST for
    60 deriving implicit out-of-vocabulary type information
     60deriving implicit out-of-vocabulary type information.
     61
     62The current AST implementation is able to process input from the [https://nlp.fi.muni.cz/trac/synt SYNT] and [https://nlp.fi.muni.cz/trac/set SET] parsers. The previous example of a syntactic tree comes from the output of the SYNT parser.
     63
     64An example of a SET tree in textual form for the sentence "Tom wants to buy a new car but he will not buy it.":
     65
     66{{{
     67id word:nterm lemma tag pid til schema
     680 N:Tom Tom k1gMnSc1;ca14 p
     691 V:chce chtít k5eAaImIp3nS 15
     702 V:koupit koupit k5eAaPmF 16
     713 ADJ:nové nový k2eAgNnSc4d1 17
     724 N:auto auto k1gNnSc4 17
     735 PUNCT:, , kIx 10
     746 CONJ:ale ale k8xC 10
     757 V:nekoupí koupit k5eNaPmIp3nS 13
     768 PRON:je on k3xPp3gNnSc4 13
     779 PUNCT:. . kIx. 10
     7810 <CLAUSE> k5eNaPmIp3nS 12 vrule_sch ( $$ $@ )
     7911 <CLAUSE> k5eAaImIp3nS 12 vrule_sch ( $$ $@ )
     8012 <SENTENCE> -1
     8113 <VP> koupit k5eNaPmIp3nS 10 vrule_sch_add ( $$ $@ "#1H (#2)" )
     8214 <VP> chtít k5eAaImIp3nS 11 vrule_sch_add ( $$ $@ "#2H (#1)" )
     8315 <VP> chtít k5eAaImIp3nS 14 vrule_sch_add ( $$ $@ "#1H (#2)" )
     8416 <VP> koupit k5eAaPmF 15 vrule_sch_add ( $$ $@ "#1H (#2)" )
     8517 <NP> auto k1gNnSc4 16 rule_sch ( $$ $@ "[#1,#2]" )
     86}}}
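
The listing above can be read back into a tree with a few lines of code. The following is only a minimal sketch, not the AST input parser itself: it assumes whitespace-separated columns, takes the first integer-looking token after the node label as the parent id (`pid`), and ignores the remaining columns. The names `parse_set_tree` and `SAMPLE` are hypothetical, and the sample uses only a subset of the rows shown above.

```python
import re
from collections import defaultdict

# A subset of the SET listing above (assumption: columns are whitespace-separated).
SAMPLE = """\
1 V:chce chtít k5eAaImIp3nS 15
2 V:koupit koupit k5eAaPmF 16
3 ADJ:nové nový k2eAgNnSc4d1 17
4 N:auto auto k1gNnSc4 17
11 <CLAUSE> k5eAaImIp3nS 12 vrule_sch ( $$ $@ )
12 <SENTENCE> -1
14 <VP> chtít k5eAaImIp3nS 11 vrule_sch_add ( $$ $@ "#2H (#1)" )
15 <VP> chtít k5eAaImIp3nS 14 vrule_sch_add ( $$ $@ "#1H (#2)" )
16 <VP> koupit k5eAaPmF 15 vrule_sch_add ( $$ $@ "#1H (#2)" )
17 <NP> auto k1gNnSc4 16 rule_sch ( $$ $@ "[#1,#2]" )
"""

INT_RE = re.compile(r"-?\d+")


def parse_set_tree(text):
    """Return (nodes, children, root) for a SET tree in textual form.

    nodes maps node id -> (label, parent id or None);
    children maps parent id -> list of child ids in input order;
    root is the id of the node whose parent id is -1 (or None if absent).
    """
    nodes = {}
    children = defaultdict(list)
    root = None
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or not INT_RE.fullmatch(tokens[0]):
            continue  # skip the header row and blank lines
        node_id, label = int(tokens[0]), tokens[1]
        # Heuristic: the parent id is the first integer token after the label.
        parent = next((int(t) for t in tokens[2:] if INT_RE.fullmatch(t)), None)
        nodes[node_id] = (label, parent)
        if parent == -1:
            root = node_id
        elif parent is not None:
            children[parent].append(node_id)
    return nodes, dict(children), root
```

Calling `parse_set_tree(SAMPLE)` yields the `<SENTENCE>` node as the root and lists the adjective and the noun as the two children of the `<NP>` node. The integer heuristic would misfire on a lemma that is itself a number, so a real reader should rely on the actual column separator instead.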
     87
     88Visual representation of the SET structural tree:
     89
     90[[Image(set_tree.png, 700px)]]
     91
    6192
    6293
     
    101132the implicit definition. A lexical item example for the verb "jíst" (eat) is:
    102133
    103 [[Image(jist.png, 700px)]]
     134[[Image(jist.png, 300px)]]
    104135
    105136The exact format of the lexical item in the input file is as follows: the lemma
     
    107138(optional) POS tag filter precedes the resulting object schema (here otriv, i.e.
    108139o-trivialisation) and TIL type (here verbal object with one ι-argument).
     140
     141'''Verb Valencies''': the next language-dependent file defines verb
     142valencies together with the schema and type information for building the resulting construction from the corresponding valency frame. An example for the verb “jíst” (eat)
     143is as follows:
     144
     145{{{
     146jíst
     147hPTc4 :exists:V(v):V(v):and:V(v)=[[#0,try(#1)],V(w)]
     148}}}
     149
     150This record defines the valency of <somebody> eats <something>, given by the
     151brief valency frame hPTc4 of the object (an animate or inanimate noun phrase in
     152accusative), and the resulting construction of the verbal object (V(v)) derived as
     153an application of the verb (!#0) to its argument (the sentence object) with possible
     154extensification (try(!#1)) and the appropriate possible world variable (V(w)).
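
To make the record layout concrete, here is a small sketch that splits such a record into the verb lemma, the brief valency frame and the schema string, and collects the `#n` placeholders used in the schema. It assumes the first line holds the lemma and each following line holds one frame followed by its schema, separated by whitespace; `parse_valency_record` is a hypothetical helper, not a function of AST.

```python
import re


def parse_valency_record(record):
    """Split a verb valency record into (verb, list of frame dicts).

    Assumption: the first non-empty line is the verb lemma and every
    further line is '<frame> <schema>'.
    """
    lines = [ln.strip() for ln in record.splitlines() if ln.strip()]
    verb = lines[0]
    frames = []
    for ln in lines[1:]:
        frame, _, schema = ln.partition(" ")
        schema = schema.strip()
        # Collect the numbered placeholders (#0, #1, ...) used by the schema.
        placeholders = sorted({int(n) for n in re.findall(r"#(\d+)", schema)})
        frames.append(
            {"frame": frame, "schema": schema, "placeholders": placeholders}
        )
    return verb, frames


RECORD = """\
jíst
hPTc4 :exists:V(v):V(v):and:V(v)=[[#0,try(#1)],V(w)]
"""
```

For the record above, `parse_valency_record(RECORD)` returns the lemma `jíst` with one frame `hPTc4` whose schema refers to the placeholders 0 (the verb) and 1 (its argument). The internal colon-separated fields of the schema are left unparsed here.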
     155
     156'''Prepositional Valency Expressions''': the last file that has to be specified for
     157each language is a list of semantic mappings of prepositional phrases to
     158valency expressions based on the head preposition. The file contains for each
     159combination of a preposition and a grammatical case of the included noun
     160phrase all possible valency slots corresponding to the prepositional phrase. For
     161instance, the record for the preposition "k" (to) is displayed as
     162
     163{{{
     164k
     1653 hA hH
     166}}}
     167
     168saying that "k" can introduce a prepositional phrase of a where-to direction hA
     169(e.g. "k lesu" – "to a forest"), or a modal how/what specification hH (e.g. "k večeři"
     170– "to a dinner").
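
A lookup table for such records could be built roughly as follows. This is a sketch under assumptions: a line starting with a digit is taken to be a grammatical case followed by its valency slots, any other non-empty line is taken to name a preposition, and one preposition may be followed by several case lines. The function name `parse_prep_valencies` is hypothetical.

```python
def parse_prep_valencies(text):
    """Map (preposition, case) -> list of possible valency slots."""
    table = {}
    preposition = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0].isdigit():
            # Case line: '<case> <slot> <slot> ...'
            if preposition is None:
                raise ValueError("case line before any preposition: %r" % raw)
            table[(preposition, int(tokens[0]))] = tokens[1:]
        else:
            # Any other line starts a new preposition entry.
            preposition = tokens[0]
    return table


PREP_RECORD = """\
k
3 hA hH
"""
```

With the record for "k", `parse_prep_valencies(PREP_RECORD)` gives `{("k", 3): ["hA", "hH"]}`, so a prepositional phrase headed by "k" with a dative noun phrase can be offered to the valency matcher as either an hA or an hH slot.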
     171
     172= System Parts =
     173The AST system is implemented in the Python 2.7 programming language and
     174consists of six main parts:
     175* the input parser: reads standard input, extracts tree structures and creates a tree object for each tree in the input,
     176* the grammar parser: reads the grammar file and assigns a grammar rule and appropriate actions to each node inside the tree,
     177* the lexical item parser: reads the file with lexical item schemata and TIL types and assigns the type to each leaf in the tree structure,
     178* the schema parser: according to a logical construction schema coming with a semantic action, this module creates a construction from sub-constructions,
     179* the verb valency parser: picks the correct valency for the given sentence and triggers the schema parser on sub-constructions according to the schema coming with the valency, and
     180* the prepositional valency expression parser: reads the possible valency expressions assigned to prepositional phrases used as (optional) valency slots in the actual sentence valency frame.
     181
     182