== The Synt parser ==

'''Synt''' is a tool for automatic syntactic analysis designed for morphologically rich languages, primarily Czech and Slovak. The parser is based on a context-free backbone enhanced with contextual actions and performs a stochastic agenda-based head-driven chart analysis. The input for the Synt parser is a morphologically annotated sentence in vertical format. The outputs are a phrase-structure tree, a dependency graph and a set of syntactic structures.

== Downloads ==

A development version can be downloaded here: http://nlp.fi.muni.cz/projekty/synt/synt.tar.xz

== Synt grammar specification ==

Synt uses a meta-grammar concept with three grammar forms.

'''Synt meta-grammar'''[[BR]]
The three grammar forms are denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (such as word order constraints). It is the basis for the G2 grammar form, in which the meta-grammar rules are expanded.[[BR]]
The G2 grammar form consists of context-free rules with feature agreement tests and other contextual actions.[[BR]]
The G3 grammar form consists of standard rules of the expanded grammar, together with the actions that remain to guarantee the contextual requirements.

'''The G1 meta-grammar form'''[[BR]]
The meta-grammar consists of global order constraints that constrain the succession of given terminals. It contains special flags that impose partial restrictions on given non-terminals and terminals on the right-hand side of a rule. Grammar rules use different arrow marks (->, -->, ==>, ===>) that specify the rule type. The meaning of the arrow form is: "the thicker and longer the arrow, the more actions can be performed in rule translation".
The '->' arrow denotes an ordinary context-free grammar transcription, while '===>' inserts a possible intersegment between right-hand side constituents, checks the correct order of enclitics and supplies several forms of the rule that turn the verb phrase into a full sentence.[[BR]]
[[BR]]
G1 combining constructs (which generate variants of the given terminals and non-terminals):
 * order()
 * rhs()
 * first()

'''Example'''
{{{
/* I will ask */
clause ===> order(VBU,R,VRI)
}}}

''order()'': generates all possible permutations of its components.

''first()'' and ''rhs()'': are used to implant the content of the whole right-hand side of a specified non-terminal into the rule. ''rhs(N)'' inserts all possible rewritings of the non-terminal N. The resulting terms are then subject to the standard constraints, enclitic checking and inter-segmentation. ''first(N)'' ensures that N is firmly tied to the beginning.

The grammar contains several generative constructs that start with a %list_* expression. These constructs define rule templates, which automatically produce new rules for a list of given non-terminals.

A significant portion of the grammar is made up of verb group rules, which contain frequently repeated constructions within a given verb group.

'''Example:'''
{{{
%group verbP={
    V:    verb_rule_schema($@,"(#1)")
          groupflag($1,"head"),
    VR R: verb_rule_schema($@,"(#1 #2)")
          groupflag($1,"head"),
}

/* ctu/ptam se - I am reading/I am asking */
clause ====> order(group(verbP), vi_list)
    verb_rule_schema($@,"#2")
    depends(getgroupflag($1,"head"), $2)
}}}

Here, the group verbP denotes two sets of non-terminals, with their corresponding actions, that are substituted for the expression group(verbP) on the right-hand side of the clause non-terminal.

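
The effect of ''order()'' described above can be illustrated outside the parser. The following Python sketch is not part of Synt; the function name `expand_order` and the plain-string rule representation are assumptions made for illustration only. It expands the example rule into one ordinary rule per permutation of the listed components, and it deliberately ignores the extra work that '===>' performs (intersegment insertion and enclitic checking).

```python
from itertools import permutations


def expand_order(lhs, arrow, components):
    """Return one rule string per permutation of the order() components.

    Hypothetical helper: it only models the permutation step and does not
    reproduce the intersegment insertion or enclitic checks of '===>'.
    """
    return [f"{lhs} {arrow} {' '.join(p)}" for p in permutations(components)]


# clause ===> order(VBU,R,VRI) from the example above
rules = expand_order("clause", "===>", ["VBU", "R", "VRI"])
for rule in rules:
    print(rule)
```

Three components give six rules, which shows why a single meta-grammar rule in G1 can stand for a much larger set of expanded rules in G2.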
''flag(any string)'': refers to verb group members in rules.

''verb_rule_schema'':
 * defines the part of the verb group that forms a verbal object in the subsequent logical analysis
 * appears in groups and on the right-hand side of rules

''%marge_actions={verb_rule_schema}'': gathers and merges the arguments of the verb_rule_schema actions into one resulting action.

''rule levels'': express how often a grammatical phenomenon occurs. The higher the level, the less frequent the corresponding grammatical phenomenon is.

'''Example:'''
{{{
3: np -> adj_group
      propagate_case_number_gender($1)
}}}

This rule is of level 3. When the grammar level is set to at least 3, adjective groups are allowed to form a separate intersegment.

''head()'' and ''depends()'': make it possible to express dependency links between rule items. For example, depends(A,B,C) means that B and C depend on A.

'''Second grammar form (G2)'''[[BR]]
As mentioned earlier, several pre-defined grammatical tests and procedures are used in the description of the contextual actions associated with each grammatical rule of the system. The pruning actions include:
 * grammatical case test for particular words and noun groups
 * agreement test of case in prepositional constructions
 * agreement test of number and gender for relative pronouns
 * agreement test of case, number and gender for noun groups
 * type checking of logical constructions

'''Example:'''
{{{
np -> adj_group np
      rule_schema($@, "lwtx(awtx(#1) and awtx(#2))")
      rule_schema($@, "lwtx([[awt(#1),#2],x])")
}}}

The rule_schema action gives a prescription for building a logical construction out of the sub-constructions from the right-hand side.

''propagate_all'' and ''agree_*_and_propagate'': compute and propagate all relevant grammatical information from the selected non-terminals on the right-hand side to the non-terminal on the left-hand side of the rule.

'''The Expanded Grammar Form (G3)'''[[BR]]
The G3 form is obtained by transforming the G2 form, together with its contextual actions, into rules of the expanded grammar.

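
The agreement tests listed for the G2 form can be pictured as set intersections over ambiguous morphological readings. The Python sketch below is only an illustration under stated assumptions: the encoding of a reading as a (case, number, gender) tuple, the function name `agree_case_number_gender` and the sample readings are all invented here and are not taken from the Synt sources. A rule application is pruned when no reading is shared; otherwise the shared readings are what would be propagated to the left-hand side.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

# Hypothetical encoding of one morphological reading: (case, number, gender).
Reading = Tuple[str, str, str]


def agree_case_number_gender(
    left: Iterable[Reading], right: Iterable[Reading]
) -> Optional[FrozenSet[Reading]]:
    """Return the readings shared by both constituents, or None to prune."""
    common = frozenset(left) & frozenset(right)
    return common or None


# Invented readings for an adjective group and two candidate noun phrases.
adj_group = {("nom", "sg", "fem"), ("acc", "pl", "neut")}
np_agreeing = {("nom", "sg", "fem"), ("gen", "sg", "fem")}
np_clashing = {("dat", "pl", "masc")}

print(agree_case_number_gender(adj_group, np_agreeing))
print(agree_case_number_gender(adj_group, np_clashing))
```

In this model the first call keeps the single shared reading, while the second returns None, which corresponds to the pruning role that the text assigns to the contextual actions.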
[[Image(synt.jpg)]]

== Possible Synt output ==

'''phrase-structure tree'''[[BR]]
[[Image(strom_synt_zac.png)]][[BR]][[BR]]
[[BR]]
'''dependency graph'''[[BR]]
[[Image(graph.png)]][[BR]]
[[BR]]
[[BR]]
'''syntactic structure'''[[BR]]
{{{
[0-7) : Tlačil auto,
tlačiť - V
auto - N
Tlačil auto,
[2-4) : ktoré sa pokazilo
ktorý - PRON
byť - PRON
pokaziť - V
ktoré sa pokazilo.
}}}

== Commands ==

The input for the Synt parser is a morphologically annotated sentence (annotated with majka for Czech and RFTagger for Slovak) in vertical format.

To run a basic syntactic analysis: [[BR]]
{{{
# Slovak
cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical
# Czech
cat sentence.vert | /nlp/synt/synt/synt -i vertical
}}}

To run a syntactic analysis with phrase-structure tree output: [[BR]]
{{{
# Slovak
cat sentence.vert | /nlp/projekty/syntax_sk/synt_sk/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py
# Czech
cat sentence.vert | /nlp/synt/synt/synt -i vertical -tt- | /nlp/projekty/set/set/TreeViewer/TreeViewer.py
}}}
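
When the parser has to be called from a script rather than from the shell, the commands above can be wrapped as follows. This is a minimal sketch, assuming the installation paths shown above exist on the machine. The language keys "sk" and "cs" and the function names `synt_command` and `parse_file` are hypothetical and are not part of Synt.

```python
import subprocess

# Binary locations copied from the commands above; the dictionary keys
# "sk" and "cs" are labels chosen for this sketch only.
SYNT_BINARIES = {
    "sk": "/nlp/projekty/syntax_sk/synt_sk/synt/synt",
    "cs": "/nlp/synt/synt/synt",
}


def synt_command(language, tree=False):
    """Build the argument list for a basic or tree-output analysis."""
    cmd = [SYNT_BINARIES[language], "-i", "vertical"]
    if tree:
        # Same flag as in the phrase-structure tree commands above.
        cmd.append("-tt-")
    return cmd


def parse_file(path, language, tree=False):
    """Feed a vertical file to synt on stdin and return its raw stdout."""
    with open(path, "rb") as vert:
        result = subprocess.run(
            synt_command(language, tree),
            stdin=vert,
            capture_output=True,
            check=True,
        )
    return result.stdout
```

A call such as `parse_file("sentence.vert", "cs")` then corresponds to the first Czech command; piping the tree output into TreeViewer.py is left to the caller.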