Changes between Initial Version and Version 1 of WikiStart

Timestamp: Nov 4, 2013, 3:00:44 PM
Author: xmedved1

+== The Synt parser ==
+'''Synt''' is a tool for automatic syntactic analysis of Czech and Slovak. The parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis. Its input is a morphologically annotated sentence in vertical format. Its outputs are a phrase-structure tree, a dependency graph and a set of syntactic structures.
+== Synt grammar specification ==
+The Synt parser uses a meta-grammar concept with three grammar forms.
+'''Synt meta-grammar'''[[BR]]
+The Synt parser uses three grammar forms denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (such as word order constraints). The meta-grammar form is the basis for the G2 grammar form, in which the meta-grammar rules are expanded.[[BR]]
+The G2 grammar form consists of context-free rules with feature agreement tests and other contextual actions.[[BR]]
+The G3 grammar form consists of standard rules of the expanded grammar, with the actions that remain guaranteeing the contextual requirements.
+'''The G1 meta-grammar form'''[[BR]]
+The meta-grammar consists of global order constraints that safeguard the succession of given terminals. It also contains special flags that impose partial restrictions on given non-terminals and terminals on the right-hand side of a rule.
+Grammar rules use different arrow marks (->, -->, ==>, ===>) that specify the rule type. The meaning of the arrow form is: "the thicker and longer the arrow, the more actions can be performed in rule translation". The '->' arrow denotes an ordinary context-free grammar transcription, whereas '===>' inserts possible intersegments between right-hand side constituents, checks the correct order of enclitics and supplies several forms of the rule to make the verb phrase into a full sentence.[[BR]]
+[[BR]]
+G1 combining constructs (they generate variants of the given terminals and non-terminals):
+ • order()[[BR]]
+ • rhs()[[BR]]
+ • first()[[BR]]
+'''Example'''
+{{{
+I will ask:  clause ===> order(VBU,R,VRI)
+}}}
+''order()'': generates all possible permutations of its components
+''first()'' and ''rhs()'': insert the content of the right-hand sides of a specified non-terminal into the rule. ''rhs(N)'' inserts all possible rewritings of the non-terminal N. The resulting terms are then subject to the standard constraints, enclitic checking and inter-segmentation. ''first(N)'' ensures that N is firmly tied to the beginning.
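The permutation behaviour of ''order()'' can be illustrated with a minimal Python sketch. This is not Synt's implementation: the function name `expand_order` and the (lhs, rhs) tuple representation are assumptions made for illustration, and the extra work done by the '===>' arrow (intersegment insertion, enclitic order checking) is deliberately not modelled.

```python
from itertools import permutations

def expand_order(lhs, components):
    """Return one plain (lhs, rhs) rule per permutation of the components.

    Hypothetical helper: it only mirrors the 'all permutations' part of order().
    """
    return [(lhs, rhs) for rhs in permutations(components)]

# The example rule: clause ===> order(VBU,R,VRI)
rules = expand_order("clause", ["VBU", "R", "VRI"])
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs))
```

Three components yield six plain rules, which is why a single ''order()'' call can stand in for a whole family of word-order variants.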
+The grammar contains several generative constructs starting with a %list_* expression. These constructs define rule templates, which automatically produce new rules for a list of given non-terminals.
+A significant portion of the grammar is made up of verb group rules, which contain constructions that are frequently repeated within a given verb group.
+'''Example:'''
+{{{
+%group verbP={
+  V:     verb_rule_schema($@,"(#1)")
+         groupflag($1,"head"),
+  VR R:  verb_rule_schema($@,"(#1 #2)")
+         groupflag($1,"head"),
+}
+/* ctu/ptam se - I am reading/I am asking */
+  clause ====> order(group(verbP), vi_list)
+  verb_rule_schema($@,"#2")
+  depends(getgroupflag($1,"head"), $2)
+}}}
+Here, the group verbP denotes two sets of non-terminals with the corresponding actions that are substituted for the expression group(verbP) on the RHS of the clause non-terminal.
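A rough Python sketch of this substitution, under stated assumptions: the `GROUPS` dictionary and `expand_group_order` are hypothetical names, actions are omitted, and each group alternative is assumed to be kept together as one unit when ''order()'' permutes the right-hand side. The source does not confirm that last detail.

```python
from itertools import permutations

# Hypothetical in-memory form of the verbP group: each alternative is a
# sequence of non-terminals (the attached actions are left out).
GROUPS = {"verbP": [("V",), ("VR", "R")]}

def expand_group_order(lhs, group_name, others):
    """Substitute each group alternative, then permute it with the other items.

    Assumption: a group alternative stays contiguous and in its own order.
    """
    rules = []
    for alternative in GROUPS[group_name]:
        units = [alternative] + [(other,) for other in others]
        for perm in permutations(units):
            rhs = tuple(symbol for unit in perm for symbol in unit)
            rules.append((lhs, rhs))
    return rules

# clause ====> order(group(verbP), vi_list)
rules = expand_group_order("clause", "verbP", ["vi_list"])
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs))
```

Under these assumptions the single meta-rule stands for four plain rules, two per group alternative.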
+''flag(any string)'': refers to verb group members in rules
+''verb_rule_schema'':
+ • defines the part of the verb group that forms a verbal object in the subsequent logical analysis[[BR]]
+ • appears in groups and on rule right-hand sides
+''%merge_actions={verb_rule_schema}'': gathers and merges the arguments of the verb_rule_schema actions into one resulting action
+''rule levels'': express how often a grammatical phenomenon occurs. The higher the level, the less frequent the corresponding grammatical phenomenon is.
+'''Example:'''
+{{{
+3: np -> adj_group
+   propagate_case_number_gender($1)
+}}}
+This rule is of level 3. When the grammar level is set to at least 3, adjective groups are allowed to form a separate intersegment.
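The effect of the grammar level can be sketched as a simple filter. The `Rule` type, the `active_rules` function and the level 0 given to the unmarked rule are all assumptions for illustration; only the level-3 rule comes from the example.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    level: int   # rule level; higher means a rarer phenomenon
    lhs: str
    rhs: tuple

def active_rules(rules, grammar_level):
    """Keep only the rules whose level does not exceed the chosen grammar level."""
    return [rule for rule in rules if rule.level <= grammar_level]

grammar = [
    Rule(0, "np", ("adj_group", "np")),  # level 0 is an assumed default
    Rule(3, "np", ("adj_group",)),       # the level-3 rule from the example
]

print(len(active_rules(grammar, 2)))  # the level-3 rule is switched off
print(len(active_rules(grammar, 3)))  # the level-3 rule is now available
```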
+''head()'' and ''depends()'': express the dependency links between rule items. For example, depends(A,B,C) means that B and C depend on A.
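The reading of depends(A,B,C) can be made concrete with a tiny Python sketch. The function below is a hypothetical stand-in that returns (dependent, head) pairs; how Synt stores dependency links internally is not described here.

```python
def depends(head, *dependents):
    """Return (dependent, head) edges: every listed dependent depends on head."""
    return [(dependent, head) for dependent in dependents]

edges = depends("A", "B", "C")
print(edges)  # [('B', 'A'), ('C', 'A')]
```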
+'''Second grammar form (G2)'''[[BR]]
+As mentioned earlier, several pre-defined grammatical tests and procedures are used to describe the contextual actions associated with each grammatical rule of the system.
+The pruning actions include:
+ • grammatical case test for particular words and noun groups
+ • agreement test of case in prepositional constructions
+ • agreement test of number and gender for relative pronouns
+ • agreement test of case, number and gender for noun groups
+ • type checking of logical constructions
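An agreement test of the kind listed above can be sketched in Python as follows. The dictionary-based feature representation and the `agree` function are assumptions for illustration; real morphological tags are often ambiguous, which this sketch ignores.

```python
def agree(features, *items):
    """Return True if all items share one value for every listed feature."""
    return all(len({item[feature] for item in items}) == 1 for feature in features)

# Hypothetical, already disambiguated feature sets for an adjective and two nouns.
adjective = {"case": "nom", "number": "sg", "gender": "neut"}
noun_ok = {"case": "nom", "number": "sg", "gender": "neut"}
noun_bad = {"case": "acc", "number": "sg", "gender": "neut"}

print(agree(["case", "number", "gender"], adjective, noun_ok))
print(agree(["case", "number", "gender"], adjective, noun_bad))
```

A rule whose test fails is pruned, so only analyses in which the constituents agree survive.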
+'''Example:'''
+{{{
+np -> adj_group np
+      rule_schema($@, "lwtx(awtx(#1) and awtx(#2))")
+      rule_schema($@, "lwtx([[awt(#1),#2],x])")
+}}}
+The rule_schema action gives a prescription for building a logical construction out of the sub-constructions from the right-hand side.
+''propagate_all'' and ''agree_*_and_propagate'': compute and propagate all relevant grammatical information from the selected non-terminals on the right hand side to the one on the left hand side of the rule.
+'''The Expanded Grammar Form (G3)'''[[BR]]
+The G3 form is produced by transforming the G2 form: its contextual actions are expanded into standard rules.
+[[Image(http://www.fi.muni.cz/~xmedved1/synt.jpg)]]
+== Possible Synt output ==
+'''phrase-structure tree'''[[BR]]
+[[Image(http://www.fi.muni.cz/~xmedved1/strom_synt_zac.png)]][[BR]][[BR]]
+[[BR]]
+'''dependency graph'''[[BR]]
+[[Image(http://www.fi.muni.cz/~xmedved1/graph.png)]][[BR]]
+[[BR]]
+[[BR]]
+'''syntactic structure'''[[BR]]
+{{{
+[0-7) : Tlačil      auto,
+        tlačiť - V  auto - N
+                    Tlačil auto,
+[2-4) : ktoré          sa           pokazilo
+        ktorý - PRON   byť - PRON   pokaziť - V
+                 ktoré sa pokazilo.
+}}}
+== Commands ==
+The input for the Synt parser is a morphologically annotated sentence (annotated by majka for Czech and by RFTagger for Slovak) in vertical format.
+To run a basic syntactic analysis: [[BR]]
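As a rough illustration of vertical input, the Python sketch below reads one token per line. The column layout (word, lemma, tag separated by tabs) is an assumption, and the tags "V" and "N" are placeholders rather than real majka or RFTagger tags; check the actual tagger output before relying on it.

```python
from typing import NamedTuple

class Token(NamedTuple):
    word: str
    lemma: str
    tag: str

def read_vertical(text):
    """Parse vertical text: one token per line, tab-separated columns (assumed)."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        word, lemma, tag = line.split("\t")
        tokens.append(Token(word, lemma, tag))
    return tokens

# Placeholder sample; the tags are not real tagger output.
SAMPLE = "Tlačil\ttlačiť\tV\nauto\tauto\tN\n"
tokens = read_vertical(SAMPLE)
for token in tokens:
    print(token.word, token.lemma, token.tag)
```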
+{{{
+cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical  # Slovak
+cat sentence.vert | /nlp/synt/synt/synt -i vertical  # Czech
+}}}
+To run a syntactic analysis with phrase-structure tree output: [[BR]]
+{{{
+cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py  # Slovak
+cat sentence.vert | /nlp/synt/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py  # Czech
+}}}