Changes between Version 4 and Version 5 of WikiStart
Timestamp: Feb 19, 2014, 12:01:26 PM (10 years ago)
WikiStart
--- v4
+++ v5

  == Synt grammar specification ==
- In the Synt parser we use a meta-grammar concept with three grammar forms. The Synt parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis.
+ The Synt parser uses a meta-grammar concept: to address the maintenance and development issues of a wide-coverage grammar, the full grammar is automatically generated from a (hand-written) meta-grammar. The Synt parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis.

- '''Synt meta-grammar'''[[BR]]
- In the Synt parser we use three grammar forms, denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (such as word order constraints). It is the basis for the G2 grammar form, in which the meta-grammar rules are expanded.[[BR]]
- The G2 grammar form consists of context-free rules with feature agreement tests and other contextual actions.[[BR]]
- The G3 grammar form consists of standard rules of the expanded grammar, with the actions that remain to guarantee the contextual requirements.
+ See [MetaGrammar full meta-grammar documentation]

+ == Synt output ==

- '''The G1 meta-grammar form'''[[BR]]
- The meta-grammar consists of global order constraints that enforce the succession of the given terminals. It also contains special flags that impose partial restrictions on given non-terminals and terminals on the right-hand side of a rule.
- Grammar rules use different arrow marks (->, -->, ==>, ===>) that specify the rule type. The meaning of the arrow form is: "the thicker and longer the arrow, the more actions can be performed in rule translation".
- The '->' arrow denotes an ordinary context-free grammar transcription, whereas '===>' inserts a possible integer_segment between right-hand side constituents, checks the correct order of enclitics and supplies several forms of the rule to make the verb phrase into a full sentence.[[BR]]
- [[BR]]
+ === Phrase-structure tree ===

- G1 combining constructs (generate variants of the given terminals and non-terminals):
- • order()[[BR]]
- • rhs()[[BR]]
- • first()[[BR]]
- '''Example'''
- {{{
- I will ask: clause ===> order(VBU,R,VRI)
- }}}
+ [[Image(strom_synt_zac.png)]]

- ''order()'': generates all possible permutations of its components
+ === Dependency graph ===

- ''first()'' and ''rhs()'': are used to insert the content of the whole right-hand side of a specified non-terminal into the rule. ''rhs(N)'' inserts all possible rewritings of the non-terminal N. The resulting terms are then subject to standard constraints, enclitic checking and inter-segmentation. ''first(N)'' ensures that N is firmly tied to the beginning.
+ [[Image(graph.png)]]

- The grammar contains several generative constructs starting with the %list_* expression. These constructs define rule templates, which automatically produce new rules for a list of the given non-terminals.
-
- A significant portion of the grammar is made up of verb group rules, which contain frequently repeated constructions within a given verb group.
-
- '''Example:'''
-
- {{{
- %group verbP={
-     V: verb_rule_schema($@,"(#1)")
-        groupflag($1,"head"),
-     VR R: verb_rule_schema($@,"(#1 #2)")
-        groupflag($1,"head"),
- }
- /* ctu/ptam se - I am reading/I am asking */
- clause ====> order(group(verbP), vi_list)
-     verb_rule_schema($@,"#2")
-     depends(getgroupflag($1,"head"), $2)
- }}}
-
- Here, the group verbP denotes two sets of non-terminals with the corresponding actions, which are substituted for the expression group(verbP) on the RHS of the clause non-terminal.
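As a rough illustration of what ''order()'' does in the removed v4 text, the following Python sketch enumerates the permutations for the rule `clause ===> order(VBU,R,VRI)`. The function name `expand_order` and the plain `->` output form are assumptions made for illustration only. The actual translation of a `===>` rule also checks enclitic order and inserts intersegments, which this sketch ignores.

```python
from itertools import permutations


def expand_order(lhs, components):
    """Return one plain context-free rule per permutation of the components.

    Hypothetical helper: it only mimics the permutation step of order(),
    not the enclitic checks or intersegment insertion done by '===>'.
    """
    return [f"{lhs} -> {' '.join(p)}" for p in permutations(components)]


rules = expand_order("clause", ["VBU", "R", "VRI"])
for rule in rules:
    print(rule)
```

With three components the sketch yields six rules, one per ordering.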
-
- ''flag(any string)'': refers to verb group members in rules
- ''verb_rule_schema'':
- • defines the part of the verb group that forms a verbal object in the subsequent logical analysis[[BR]]
- • appears in groups and on rule right-hand sides
- ''%marge_actions={verb_rule_schema}'': gathers and merges the arguments of actions from verb_rule_schema into one resulting action
-
- Rule levels: express how frequent a grammatical phenomenon is. The higher the level, the less frequent the corresponding grammatical phenomenon.
-
- '''Example:'''
-
- {{{
- 3: np -> adj_group
-     propagate_case_number_gender($1)
- }}}
-
- This rule is of level 3. When the grammar level is set to at least 3, adjective groups are allowed to form a separate intersegment.
-
- ''head()'' and ''depends()'': express the dependency links between rule items. For example, depends(A,B,C) means that B and C depend on A.
-
-
- '''Second grammar form (G2)'''[[BR]]
- As mentioned earlier, several pre-defined grammatical tests and procedures are used to describe the contextual actions associated with each grammatical rule of the system.
-
- The pruning actions include:
- • grammatical case test for particular words and noun groups
- • agreement test of case in prepositional constructions
- • agreement test of number and gender for relative pronouns
- • agreement test of case, number and gender for noun groups
- • type checking of logical constructions
-
- '''Example:'''
-
- {{{
- np -> adj_group np
-     rule_schema($@, "lwtx(awtx(#1) and awtx(#2))")
-     rule_schema($@, "lwtx([[awt(#1),#2],x])")
- }}}
-
- The rule schema action gives a prescription for building a logical construction out of the sub-constructions from the right-hand side.
- ''propagate_all'' and ''agree_*_and_propagate'': compute and propagate all relevant grammatical information from the selected non-terminals on the right-hand side to the one on the left-hand side of the rule.
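The agreement tests and the ''agree_*_and_propagate'' actions listed above can be pictured as a set intersection over the readings of the right-hand side constituents. The Python sketch below is only an illustration under assumptions: the function name `agree_and_propagate`, the representation of readings as (case, number, gender) tuples and the sample values are hypothetical and are not taken from the Synt implementation.

```python
def agree_and_propagate(*constituents):
    """Intersect the (case, number, gender) readings of all constituents.

    Returns the shared readings, which would be propagated to the left-hand
    side, or None when nothing agrees and the rule application is pruned.
    """
    common = set.intersection(*(set(c) for c in constituents))
    return common or None


# Hypothetical ambiguous readings for "np -> adj_group np".
adj_group = {("nom", "sg", "f"), ("acc", "sg", "f")}
noun_phrase = {("acc", "sg", "f"), ("acc", "pl", "n")}

agreed = agree_and_propagate(adj_group, noun_phrase)
print(agreed)

# Two constituents with no shared reading: the analysis is pruned.
pruned = agree_and_propagate({("nom", "sg", "m")}, {("gen", "pl", "f")})
print(pruned)
```

Only the readings shared by every constituent survive, which is how an agreement test both prunes impossible analyses and narrows the information passed up to the left-hand side.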
-
-
- '''The Expanded Grammar Form (G3)'''[[BR]]
- The G2 form, together with its contextual actions, is transformed into standard rules.
-
- [[Image(synt.jpg)]]
-
-
- == Possible Synt output ==
- '''phrase-structure tree'''[[BR]]
-
- [[Image(strom_synt_zac.png)]][[BR]][[BR]]
- [[BR]]
-
- '''dependency graph'''[[BR]]
-
- [[Image(graph.png)]][[BR]]
- [[BR]]
- [[BR]]
- '''syntactic structure'''[[BR]]
+ === Syntactic structure ===

  {{{
…


- == Commands ==
- The input for the Synt parser is a morphologically annotated sentence (majka for Czech and RFTagger for Slovak) in vertical format.
- To run a basic syntactic analysis: [[BR]]
+ == Usage ==
+
+ The input for the Synt parser is a morphologically annotated sentence in either vertical (word-per-line) or brief format (the latter allows ambiguous morphological input).
+ Multiple sentences are separated by a blank line.
+
+ To run a basic syntactic analysis:

  {{{
- cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical (Slovak)
- cat sentence.vert | /nlp/synt/synt/synt -i vertical (Czech)
+ cat sentence.vert | synt -i vertical
+ cat sentence.brief | synt -i brief
  }}}
- To run a syntactic analysis with phrase-structure tree output: [[BR]]
+
+ To run a syntactic analysis with phrase-structure tree output:

  {{{
- cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py (Slovak)
- cat sentence.vert | /nlp/synt/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py (Czech)
+ cat sentence.vert | synt -i vertical -tt-
  }}}
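Since the v5 text states that multiple sentences in the input are separated by a blank line, a file can be split into per-sentence chunks before it is passed to the parser. The Python sketch below shows one way to do that. The function name `split_sentences` and the tab-separated placeholder tokens are assumptions, and the real column layout produced by majka or RFTagger is not specified here.

```python
def split_sentences(vertical_text):
    """Split word-per-line input into sentences at blank lines."""
    sentences, current = [], []
    for line in vertical_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the sentence collected so far.
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


# Placeholder tokens only; real input carries morphological annotation.
sample = "word1\ttag1\nword2\ttag2\n\nword3\ttag3\n"
sentences = split_sentences(sample)
print(len(sentences))
```

Consecutive blank lines are treated as a single separator, so stray empty lines do not produce empty sentences.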