Changes between Version 4 and Version 5 of WikiStart


Timestamp:
Feb 19, 2014, 12:01:26 PM
Author:
Miloš Jakubíček

  • WikiStart

    v4 v5  
    1616
    1717== Synt grammar specification ==
    18 The Synt parser uses a meta-grammar concept with three grammar forms. The Synt parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis.
     18The Synt parser uses a meta-grammar concept: to ease the maintenance and development of a wide-coverage grammar, the full grammar is automatically generated from a hand-written meta-grammar. The Synt parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis.
    1919
    20 '''Synt meta-grammar'''[[BR]]
    21 The Synt parser uses three grammar forms, denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (like word order constraints). The meta-grammar form is the basis for the G2 grammar form, in which the meta-grammar rules are expanded.[[BR]]
    22 The G2 grammar form consists of context free rules with feature agreement tests and other contextual actions.[[BR]]
    23 The G3 grammar form consists of standard rules of the expanded grammar, with the remaining actions guaranteeing the contextual requirements.
     20See [MetaGrammar full meta-grammar documentation]
    2421
     22== Synt output ==
    2523
    26 '''The G1 meta-grammar form'''[[BR]]
    27 The meta-grammar consists of global order constraints that provide succession of given terminals. The meta-grammar contains special flags that impose partial restrictions on given non-terminals and terminals on the right side of the rule.
    28 Grammar rules use different arrow marks (->, -->, ==>, ===>) that specify the rule type. The arrow form means: "the thicker and longer the arrow, the more actions can be performed in rule translation". The '->' arrow denotes an ordinary context-free grammar transcription, while '===>' inserts a possible integer_segment between right-hand-side constituents, checks the correct order of enclitics, and supplies several forms of the rule to make the verb phrase into a full sentence.[[BR]]
    29 [[BR]]
     24=== Phrase-structure tree ===
    3025
    31 G1 combining constructs (generates variants of given terminals and non-terminals):
    32  • order()[[BR]]
    33  • rhs()[[BR]]
    34  • first()[[BR]]
    35 '''Example'''
    36 {{{
    37 I will ask:  clause ===> order(VBU,R,VRI)
    38 }}}
     26[[Image(strom_synt_zac.png)]]
    3927
    40 ''order()'': generates all possible permutations of its components
     28=== Dependency graph ===
    4129
    42 ''first()'' and ''rhs()'': are employed to implant the content of the whole right-hand side of a specified non-terminal into the rule. The ''rhs(N)'' construct inserts all possible rewritings of non-terminal N. The resulting terms are then subject to standard constraints, enclitic checking and inter-segmentation. The ''first(N)'' construct ensures that N is firmly tied to the beginning.
     30[[Image(graph.png)]]
    4331
    44 The grammar contains several generative constructs starting with the %list_* expression. These constructs define rule templates, which automatically produce new rules for a list of the given non-terminals.
    45 
    46 A significant portion of the grammar is made up of verb group rules, which contain frequent repetitive constructions in a given verb group.
    47 
    48 '''Example:'''
    49 
    50 {{{
    51 %group verbP={
    52   V:     verb_rule_schema($@,"(#1)")
    53          groupflag($1,"head"),
    54   VR R:  verb_rule_schema($@,"(#1 #2)")
    55          groupflag($1,"head"),
    56 }
    57 /* ctu/ptam se - I am reading/I am asking */
    58   clause ====> order(group(verbP), vi_list)
    59   verb_rule_schema($@,"#2")
    60   depends(getgroupflag($1,"head"), $2)
    61 }}}
    62 
    63 Here, the group verbP denotes two sets of non-terminals with the corresponding actions that are substituted for the expression group(verbP) on the RHS of the clause non-terminal.
    64 
    65 ''flag(any string)'': refers to verb group members in rules
    66 ''verb_rule_schema'':
    67  • defines the part of the verb group that forms a verbal object in the successive logical analysis[[BR]]
    68  • appears in the group and on the rule right-hand side
    69 ''%marge_actions={verb_rule_schema}'': gathers and merges arguments of actions from verb_rule_schema into one resulting action
    70 
    71 rule levels: express the occurrence of grammatical phenomena. The higher the level, the less frequent the corresponding grammatical phenomenon is.
    72 
    73 '''Example:'''
    74 
    75 {{{
    76 3: np -> adj_group
    77    propagate_case_number_gender($1)
    78 }}}
    79 
    80 This rule is of level 3. Setting the grammar level to at least 3 allows adjective groups to form a separate intersegment.
    81 
    82 ''head()'' and ''depends()'': allow expressing the dependency links between rule items. For example, depends(A,B,C) means that B and C depend on A.
    83 
    84 
    85 '''Second grammar form (G2)'''[[BR]]
    86 As we have mentioned earlier, several pre-defined grammatical tests and procedures are used in the description of context actions associated with each grammatical rule of the system.
    87 
    88 The pruning actions include:
    89  • grammatical case test for particular words and noun groups
    90  • agreement test of case in prepositional construction
    91  • agreement test of number and gender for relative pronouns
    92  • agreement test of case, number and gender for noun groups
    93  • type checking of logical constructions
    94 
    95 '''Example:'''
    96 
    97 {{{
    98 np -> adj_group np
    99       rule_schema($@, "lwtx(awtx(#1) and awtx(#2))")
    100       rule_schema($@, "lwtx([[awt(#1),#2],x])")
    101 }}}
    102 
    103 The rule schema action presents a prescription for building a logical construction out of the sub-constructions from the right hand side.
    104 ''propagate_all'' and ''agree_*_and_propagate'': compute and propagate all relevant grammatical information from the selected non-terminals on the right hand side to the one on the left hand side of the rule.
    105 
    106 
    107 '''The Expanded Grammar Form (G3)'''[[BR]]
    108 Transform G2 form with the contextual actions into the rules.
    109 
    110 [[Image(synt.jpg)]]
    111 
    112 
    113 == Possible Synt output ==
    114 '''phrase-structure tree'''[[BR]]
    115 
    116 [[Image(strom_synt_zac.png)]][[BR]][[BR]]
    117 [[BR]]
    118 
    119 '''dependency graph'''[[BR]]
    120 
    121 [[Image(graph.png)]][[BR]]
    122 [[BR]]
    123 [[BR]]
    124 '''syntactic structure'''[[BR]]
     32=== Syntactic structure ===
    12533
    12634{{{
     
    13442
    13543
    136 == Commands ==
    137 The input for the Synt parser is a morphologically annotated sentence (majka for Czech and RFTagger for Slovak) in vertical format.
    138 To provide basic syntactic analysis: [[BR]]
     44== Usage ==
     45
     46The input for the Synt parser is a morphologically annotated sentence in either vertical (word-per-line) or brief format (the latter allows ambiguous morphological input).
     47Multiple sentences are separated by a blank line.
     48
     49To provide basic syntactic analysis:
    13950
    14051{{{
    141 cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical (Slovak)
    142 cat sentence.vert | /nlp/synt/synt/synt -i vertical (Czech)
     52cat sentence.vert | synt -i vertical
     53cat sentence.brief | synt -i brief
    14354}}}
    144 To provide syntactic analysis with phrase-structure tree output: [[BR]]
     55
     56To provide syntactic analysis with phrase-structure tree output:
    14557
    14658{{{
    147 cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py (Slovak)
    148 cat sentence.vert | /nlp/synt/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py (Czech)
     59cat sentence.vert | synt -i vertical -tt-
    14960}}}
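
The invocations in the new Usage section can also be scripted. The following is a minimal Python sketch, not part of the Synt distribution: it assumes that a `synt` executable is on the PATH and that it reads its input from standard input, as the pipelines above suggest. The helper names (`split_sentences`, `synt_command`, `run_synt`) are hypothetical, and only the options that appear above (`-i vertical`, `-i brief`, `-tt-`) are used.

```python
import shutil
import subprocess
from typing import List, Optional


def split_sentences(text: str) -> List[str]:
    """Split parser input into sentences; sentences are separated by a blank line."""
    sentences: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the sentence collected so far.
            sentences.append("\n".join(current))
            current = []
    if current:
        sentences.append("\n".join(current))
    return sentences


def synt_command(input_format: str = "vertical", tree: bool = False) -> List[str]:
    """Build the argument list for the invocations shown in the Usage section."""
    if input_format not in ("vertical", "brief"):
        raise ValueError("input_format must be 'vertical' or 'brief'")
    cmd = ["synt", "-i", input_format]
    if tree:
        # The Usage section shows -tt- only together with vertical input;
        # combining it with brief input is an assumption.
        cmd.append("-tt-")
    return cmd


def run_synt(text: str, input_format: str = "vertical", tree: bool = False) -> Optional[str]:
    """Pipe text to synt on stdin; return its stdout, or None if synt is not installed."""
    cmd = synt_command(input_format, tree)
    if shutil.which(cmd[0]) is None:
        return None
    # Assumption: the input and output use the locale's default text encoding.
    result = subprocess.run(cmd, input=text, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_synt(open("sentence.vert").read(), tree=True)` corresponds to `cat sentence.vert | synt -i vertical -tt-`, and `split_sentences` can be used beforehand to count or filter the sentences in a multi-sentence file.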