Changes between Initial Version and Version 1 of WikiStart


Timestamp:
Nov 4, 2013, 3:00:44 PM
Author:
xmedved1

== The Synt parser ==
'''Synt''' is a tool for automatic syntactic analysis of Czech and Slovak. The parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic, agenda-based, head-driven chart analysis. Its input is a morphologically annotated sentence in vertical format. Its output consists of a phrase-structure tree, a dependency graph and a set of syntactic structures.


== Synt grammar specification ==
Synt uses a meta-grammar concept with three grammar forms.

'''Synt meta-grammar'''[[BR]]
The three grammar forms are denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (such as word order constraints). G1 is the basis for the G2 grammar form, in which the meta-grammar rules are expanded.[[BR]]
The G2 grammar form consists of context-free rules with feature agreement tests and other contextual actions.[[BR]]
The G3 grammar form consists of standard rules of the expanded grammar, together with the actions that remain necessary to guarantee the contextual requirements.


'''The G1 meta-grammar form'''[[BR]]
The meta-grammar consists of global order constraints that govern the succession of given terminals. It also contains special flags that impose partial restrictions on given non-terminals and terminals on the right-hand side of a rule.
Grammar rules use different arrow marks (->, -->, ==>, ===>) that specify the rule type. The meaning of the arrow form is: "the thicker and longer the arrow, the more actions are performed in rule translation". The '->' arrow denotes an ordinary context-free grammar transcription, whereas '===>' inserts a possible intersegment between the right-hand side constituents, checks the correct order of enclitics and supplies several forms of the rule to make the verb phrase into a full sentence.[[BR]]
[[BR]]

G1 combining constructs (these generate variants of given terminals and non-terminals):
 • order()[[BR]]
 • rhs()[[BR]]
 • first()[[BR]]
'''Example'''
{{{
/* I will ask */
  clause ===> order(VBU,R,VRI)
}}}

''order()'': generates all possible permutations of its components.
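
The permutation step can be illustrated with a short Python sketch. It is not part of Synt: the function name `expand_order` and the tuple representation of rules are assumptions made for illustration, and the extra work done by the '===>' arrow (intersegments, enclitic checks) is deliberately left out.

```python
from itertools import permutations

def expand_order(lhs, components):
    """Return one plain context-free rule per permutation of the components."""
    return [(lhs, list(perm)) for perm in permutations(components)]

# order(VBU,R,VRI) from the example yields 3! = 6 right-hand sides.
rules = expand_order("clause", ["VBU", "R", "VRI"])
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs))
```

With three components the sketch produces six rules, which shows why a single meta-grammar rule can stand for many expanded rules.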

''first()'' and ''rhs()'': splice the content of the whole right-hand side of a specified non-terminal into the rule. ''rhs(N)'' inserts all possible rewritings of the non-terminal N. The resulting terms are then subject to the standard constraints, enclitic checking and intersegment insertion. ''first(N)'' ensures that N is firmly tied to the beginning.
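
One possible reading of these two constructs, sketched in Python. The toy grammar `GRAMMAR`, the symbols in it and both helper names are hypothetical; the real expansion also applies constraints and enclitic checks that are not modelled here.

```python
from itertools import permutations

# Toy grammar (hypothetical): non-terminal -> list of right-hand sides.
GRAMMAR = {"np": [["N"], ["ADJ", "N"]]}

def expand_rhs(nonterminal, grammar):
    """rhs(N): return every rewriting of N, ready to be spliced in place of N."""
    return [list(alternative) for alternative in grammar[nonterminal]]

def expand_order_with_first(first_item, others):
    """order() containing first(N): permute the rest, keep N in first position."""
    return [[first_item, *perm] for perm in permutations(others)]

# Splice each rewriting of np after a verb symbol V.
spliced = [["V", *alternative] for alternative in expand_rhs("np", GRAMMAR)]
# Pin CONJ to the beginning while V and np are permuted.
pinned = expand_order_with_first("CONJ", ["V", "np"])
```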

The grammar contains several generative constructs starting with a %list_* expression. These constructs define rule templates, which automatically produce new rules for a list of given non-terminals.

A significant portion of the grammar is made up of verb group rules, which contain frequently repeated constructions within a given verb group.

'''Example:'''

{{{
%group verbP={
  V:     verb_rule_schema($@,"(#1)")
         groupflag($1,"head"),
  VR R:  verb_rule_schema($@,"(#1 #2)")
         groupflag($1,"head"),
}
/* ctu/ptam se - I am reading/I am asking */
  clause ====> order(group(verbP), vi_list)
  verb_rule_schema($@,"#2")
  depends(getgroupflag($1,"head"), $2)
}}}

Here, the group verbP denotes two sets of non-terminals with their corresponding actions; each set is substituted in turn for the expression group(verbP) on the right-hand side of the clause rule.
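
The substitution can be sketched in Python as follows. This is an illustration only: `GROUPS` and `expand_group_order` are hypothetical names, the actions attached to each group member are omitted, and the sketch assumes that each group alternative is permuted as a single unit, which may differ from what Synt actually does.

```python
from itertools import permutations

# The two alternatives of the verbP group from the example (actions omitted).
GROUPS = {"verbP": [["V"], ["VR", "R"]]}

def expand_group_order(lhs, group_name, other, groups):
    """Substitute each group alternative, then permute it with the other item."""
    rules = []
    for members in groups[group_name]:
        # Assumption: a group alternative moves as one unit inside order().
        for perm in permutations([tuple(members), (other,)]):
            rhs = [symbol for unit in perm for symbol in unit]
            rules.append((lhs, rhs))
    return rules

rules = expand_group_order("clause", "verbP", "vi_list", GROUPS)
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs))
```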

''flag(any string)'': refers to verb group members in rules.[[BR]]
''verb_rule_schema'':
 • defines the part of the verb group that forms a verbal object in the subsequent logical analysis[[BR]]
 • appears in groups and on rule right-hand sides
''%merge_actions={verb_rule_schema}'': gathers and merges the arguments of actions from verb_rule_schema into one resulting action.

Rule levels: express how frequently a grammatical phenomenon occurs. The higher the level, the less frequent the corresponding grammatical phenomenon is.

'''Example:'''

{{{
3: np -> adj_group
   propagate_case_number_gender($1)
}}}

This rule is of level 3. When the grammar level is set to at least 3, adjective groups are allowed to form a separate intersegment.
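
Level-based filtering can be pictured with the following Python sketch. The `Rule` type, the sample rules and the convention that a rule without an explicit level has level 0 are all assumptions made for illustration.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    level: int   # assumed to be 0 when the rule has no explicit level prefix
    lhs: str
    rhs: tuple

# Hypothetical rule set: a common np rule and the level-3 rule from the example.
RULES = [
    Rule(0, "np", ("N",)),
    Rule(3, "np", ("adj_group",)),
]

def active_rules(rules, grammar_level):
    """Keep only the rules whose level does not exceed the chosen grammar level."""
    return [rule for rule in rules if rule.level <= grammar_level]

print(len(active_rules(RULES, 2)))  # the level-3 rule is switched off
print(len(active_rules(RULES, 3)))  # the level-3 rule is switched on
```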

''head()'' and ''depends()'': express the dependency links between rule items. For example, depends(A,B,C) means that B and C depend on A.
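
As a minimal sketch, the links created by a depends() action can be written as (dependent, head) pairs. The Python function below is a hypothetical stand-in, not the Synt action itself.

```python
def depends(head, *dependents):
    """Return one (dependent, head) edge for every listed dependent."""
    return [(dependent, head) for dependent in dependents]

# depends(A,B,C): both B and C depend on A.
edges = depends("A", "B", "C")
print(edges)
```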


'''Second grammar form (G2)'''[[BR]]
As mentioned earlier, several predefined grammatical tests and procedures are used to describe the contextual actions associated with each grammar rule of the system.

The pruning actions include:
 • grammatical case test for particular words and noun groups
 • agreement test of case in prepositional constructions
 • agreement test of number and gender for relative pronouns
 • agreement test of case, number and gender for noun groups
 • type checking of logical constructions

'''Example:'''

{{{
np -> adj_group np
      rule_schema($@, "lwtx(awtx(#1) and awtx(#2))")
      rule_schema($@, "lwtx([[awt(#1),#2],x])")
}}}

The rule_schema action gives a prescription for building a logical construction out of the sub-constructions from the right-hand side.[[BR]]
''propagate_all'' and ''agree_*_and_propagate'': compute and propagate all relevant grammatical information from the selected non-terminals on the right-hand side to the non-terminal on the left-hand side of the rule.
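
The idea behind an agreement test with propagation can be sketched in Python. The sketch assumes that every constituent carries a set of possible (case, number, gender) readings; the function name, the data layout and the sample readings are illustrative and do not describe how Synt stores this information.

```python
def agree_and_propagate(*constituents):
    """Intersect the (case, number, gender) readings of all constituents.

    Returns the shared readings, to be passed to the left-hand side,
    or None when nothing is shared and the rule should be pruned.
    """
    common = set.intersection(*map(set, constituents))
    return common or None

# Hypothetical readings of an adjective group and a noun group.
adj_group = {("nom", "sg", "n"), ("acc", "sg", "n")}
noun_group = {("acc", "sg", "n"), ("acc", "pl", "n")}

print(agree_and_propagate(adj_group, noun_group))
print(agree_and_propagate({("nom", "sg", "m")}, {("gen", "pl", "f")}))
```

In the first call one reading survives and would be propagated to the np on the left-hand side; in the second call the readings are disjoint, so the analysis would be pruned.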


'''The Expanded Grammar Form (G3)'''[[BR]]
The G3 form is produced by transforming the G2 form, together with its contextual actions, into expanded rules.

[[Image(http://www.fi.muni.cz/~xmedved1/synt.jpg)]]


== Possible Synt output ==
'''phrase-structure tree'''[[BR]]

[[Image(http://www.fi.muni.cz/~xmedved1/strom_synt_zac.png)]][[BR]][[BR]]
[[BR]]

'''dependency graph'''[[BR]]

[[Image(http://www.fi.muni.cz/~xmedved1/graph.png)]][[BR]]
[[BR]]
[[BR]]
'''syntactic structure'''[[BR]]

{{{
[0-7) : Tlačil      auto,
        tlačiť - V  auto - N
                    Tlačil auto,
[2-4) : ktoré          sa           pokazilo
        ktorý - PRON   byť - PRON   pokaziť - V
                 ktoré sa pokazilo.
}}}


== Commands ==
The input for the Synt parser is a morphologically annotated sentence in vertical format (annotated with majka for Czech and RFTagger for Slovak).
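
For orientation, a vertical file holds one token per line. The Python sketch below assumes three tab-separated columns (word, lemma, tag); the column order is an assumption, and the tags "V" and "N" are coarse placeholders borrowed from the syntactic structure output, not real majka or RFTagger tags.

```python
def to_vertical(tokens):
    """Serialise (word, lemma, tag) triples as one tab-separated token per line."""
    return "".join(f"{word}\t{lemma}\t{tag}\n" for word, lemma, tag in tokens)

# Placeholder annotation for the first two words of the Slovak example sentence.
tokens = [("Tlačil", "tlačiť", "V"), ("auto", "auto", "N")]
vertical = to_vertical(tokens)
print(vertical, end="")
```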
To run the basic syntactic analysis:[[BR]]

{{{
# Slovak
cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical
# Czech
cat sentence.vert | /nlp/synt/synt/synt -i vertical
}}}
To run the syntactic analysis with phrase-structure tree output:[[BR]]

{{{
# Slovak
cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py
# Czech
cat sentence.vert | /nlp/synt/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py
}}}
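
The same invocations can be driven from Python. The sketch below only reuses the paths and options shown in the commands above; the helper names `synt_command` and `run_synt` and the language keys "cs" and "sk" are made up for illustration, and piping the result into TreeViewer.py is left out.

```python
import subprocess

# Binary locations copied from the shell commands above.
SYNT_BIN = {
    "cs": "/nlp/synt/synt/synt",
    "sk": "/nlp/projekty/syntax_sk/synt_sk/synt/synt",
}

def synt_command(language, tree_output=False):
    """Build the argument list for one synt invocation."""
    command = [SYNT_BIN[language], "-i", "vertical"]
    if tree_output:
        command.append("-tt-")
    return command

def run_synt(vertical_text, language, tree_output=False):
    """Feed vertical text to synt on standard input and return its output."""
    result = subprocess.run(
        synt_command(language, tree_output),
        input=vertical_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_synt(open("sentence.vert").read(), "sk")` would then correspond to the first Slovak command, provided the binary is available on the machine.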