Synt Meta-grammar

The grammar exists in three forms, denoted G1, G2 and G3. The G1 meta-grammar form is designed for human experts. It contains high-level generative constructs that reflect natural language phenomena (such as word order constraints). The G2 grammar is generated automatically from the G1 meta-grammar by expanding the meta-grammar rules.
The G2 grammar form consists of context-free rules with feature agreement tests and other contextual actions. This grammar is used directly by the Synt parser.
The G3 grammar is generated from the G2 grammar by expanding the feature agreement tests. This form can be used only if there are no other contextual actions, hence it is no longer used for Czech and Slovak.
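
The step from G2 to G3 can be pictured as replacing an agreement test with one plain context-free rule per feature value. The following Python sketch is only an illustration under stated assumptions: the rule np -> adj noun, the restriction to case agreement, and the suffix naming scheme (np_c1 to np_c7) are hypothetical and are not taken from the Synt sources.

```python
# Illustrative only: expand a case-agreement test into plain context-free rules.
CASES = range(1, 8)  # Czech distinguishes seven grammatical cases

def expand_case_agreement(lhs, rhs):
    """Return one rule per case; every symbol in a rule carries the same case suffix."""
    rules = []
    for case in CASES:
        suffix = f"_c{case}"  # hypothetical naming scheme
        rules.append((lhs + suffix, [sym + suffix for sym in rhs]))
    return rules

g3_rules = expand_case_agreement("np", ["adj", "noun"])
for left, right in g3_rules:
    print(left, "->", " ".join(right))
```

Expanding number and gender in the same way would multiply the number of rules further.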

The G1 meta-grammar form

The meta-grammar consists of global order constraints that govern the succession of given terminals. It also contains special flags that impose partial restrictions on given non-terminals and terminals on the right-hand side of a rule.

Flags

There are two types of flags, head and reflexivity, together with the groupflag and getgroupflag functions.

Arrow Types

Different arrow marks (e.g. ->, -->, ==>, ===>) are used in the grammar rules to specify the rule type. As a rule of thumb, "the thicker and longer the arrow, the more actions are performed during rule translation":

  • '->' arrow denotes an ordinary context free grammar transcription
  • '=>' checks the correct order of enclitics (every arrow containing '=' performs this check as well)
  • '-->' inserts a possible inter_segment between right-hand side constituents, but not before the first or after the last element
  • '--->' inserts a possible inter_segment between right-hand side constituents, and also before the first and after the last element
  • '---->' inserts a possible inter_segment between right-hand side constituents, also before the first and after the last element, and in addition inserts CONJENCL ("ale" or "tehdy") and CONJ (conjunction) at the beginning
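
The arrow semantics listed above can be sketched in Python as follows. This is a minimal illustration, not the Synt translator: the function name expand_arrow is hypothetical, inter_segment is modelled as an ordinary symbol (its optionality is assumed to be handled elsewhere), arrows written with '=' are assumed to behave like their '-' counterparts of the same length plus the enclitic check, and the CONJENCL/CONJ prefix of the longest arrow is left out because its exact placement is not specified here.

```python
def expand_arrow(arrow, rhs):
    """Return (expanded_rhs, check_enclitics) for a rule written with `arrow`."""
    check_enclitics = "=" in arrow      # arrows containing '=' also check enclitic order
    length = len(arrow) - 1             # number of '-' or '=' characters before '>'
    if length < 2:                      # '->' and '=>': no inter_segment insertion
        return list(rhs), check_enclitics
    expanded = []
    for i, symbol in enumerate(rhs):
        if i > 0:
            expanded.append("inter_segment")   # between constituents
        expanded.append(symbol)
    if length >= 3:                     # '--->' and longer: also at both edges
        expanded = ["inter_segment"] + expanded + ["inter_segment"]
    # The CONJENCL/CONJ prefix of the longest arrow is omitted from this sketch.
    return expanded, check_enclitics

print(expand_arrow("->", ["A", "B", "C"]))
print(expand_arrow("-->", ["A", "B", "C"]))
print(expand_arrow("===>", ["A", "B"]))
```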


Combining constructs

The G1 combining constructs generate variants of the given terminals and non-terminals:

  • order(): generates all possible permutations of its components
  • rhs(): inserts all possible rewritings of non-terminal N; the resulting terms are then subject to the standard constraints, enclitic checking and inter-segmentation.
  • first(): ensures that N is firmly positioned as first non-terminal in the rule

Example

I will ask:  clause ===> order(VBU,R,VRI)
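
As a rough illustration of what order() produces for this example, the Python sketch below enumerates all permutations of the three components. The helper name order is reused only for readability; the real expansion additionally applies the checks implied by the arrow type (such as enclitic ordering), which this sketch does not attempt.

```python
from itertools import permutations

def order(*components):
    """Return every ordering of the given right-hand side components."""
    return [list(p) for p in permutations(components)]

variants = order("VBU", "R", "VRI")
for rhs in variants:
    print("clause ===>", " ".join(rhs))
```

Three components yield six candidate right-hand sides.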

List generation

The grammar contains several generative constructs whose names start with %list_. These constructs define rule templates that automatically produce new rules for a list of given non-terminals.

There are several functions:

  • %list_coord - provides coordination on non-terminals
  • %list_nocoord - provides no coordination on non-terminals
  • %list_coord_case_number_gender - provides coordination of case, number and gender on non-terminals. Case, number and gender can also be used individually.
  • %list_coord_propagate_plural - provides coordination on non-terminals and propagates plural. Can be used with case, number and gender.

Rule groups

A significant portion of the grammar is made up of verb group rules, which contain frequently repeated constructions within a given verb group.

Example:

%group verbP={
  V:     groupflag($1,"head"),
  VR R:  groupflag($1,"head"),
}

/* ctu/ptam se - I am reading/I am asking */
  clause ====> order(group(verbP), vi_list)
  depends(getgroupflag($1,"head"), $2)

Here, the verbP group denotes two sets of non-terminals with the corresponding actions that are substituted for the expression group(verbP) on the RHS of the clause non-terminal.
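
The substitution described above can be sketched in Python. This is an assumption-laden illustration: the GROUPS table and the expand function are hypothetical, the groupflag and depends actions are ignored, and only the right-hand sides are generated.

```python
from itertools import permutations

# The two alternatives of the verbP group, without their actions.
GROUPS = {"verbP": [["V"], ["VR", "R"]]}

def expand(components):
    """Expand order(...) over components; ("group", name) stands for group(name)."""
    results = []
    for perm in permutations(components):
        partial = [[]]
        for comp in perm:
            if isinstance(comp, tuple):          # a group reference
                alternatives = GROUPS[comp[1]]
            else:                                # a plain symbol
                alternatives = [[comp]]
            partial = [p + alt for p in partial for alt in alternatives]
        results.extend(partial)
    return results

rules = expand([("group", "verbP"), "vi_list"])
for rhs in rules:
    print("clause ====>", " ".join(rhs))
```

Two group alternatives combined with two orderings give four right-hand sides.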

Rule levels

Rule levels represent a complementary mechanism to rule probabilities. While rule probabilities are automatically calculated from a training treebank and hence account for statistical characteristics of the rules, rule levels are manually estimated and are used for expressing mutual exclusion among rules: rules at lower levels take precedence over rules at higher levels.

Example:

3: np -> adj_group
   propagate_case_number_gender($1)

This rule is of level 3. When the grammar level is set to at least 3, adjective groups are allowed to form a separate inter-segment.
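
One way to picture the level mechanism is as a filter over the rule set. In the Python sketch below, the Rule class, the default level of 0 for unlabelled rules, and the second rule (np -> noun) are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    lhs: str
    rhs: tuple
    level: int = 0   # assumption: rules without an explicit level are level 0

RULES = [
    Rule("np", ("noun",)),               # hypothetical unlabelled rule
    Rule("np", ("adj_group",), level=3), # the level 3 rule from the example
]

def active_rules(rules, grammar_level):
    """Keep only the rules whose level does not exceed the grammar level."""
    return [r for r in rules if r.level <= grammar_level]

print(len(active_rules(RULES, 2)))
print(len(active_rules(RULES, 3)))
```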

Actions

Contextual actions

  • agree_case_and_propagate() - check the case and propagate the gender, number, case and pattern
  • agree_case_and_propagate_semantic() - check the case and propagate semantic
  • agree_case_and_propagate_plural() - check the case and propagate plural
  • agree_number_gender_and_propagate() - check the number and gender and propagate gender, number, case and pattern
  • agree_poss_number_gender_and_propagate() - check possible numbers and genders and propagate all possible genders, numbers, cases and patterns
  • agree_case_number_gender_and_propagate() - check case, number, gender and propagate gender, number, case and pattern
  • agree_case_number_gender_and_propagate_plural() - check case, number, gender and propagate plural
  • agree_pred_copula()
  • agree_pred_subj()
  • test_nominative() - test nominative
  • test_genitive() - test genitive
  • test_dative() - test dative
  • test_instrumental() - test instrumental
  • test_third_person() - test third person (he, she, it)
  • test_singular() - test singular
  • test_plural() - test plural
  • test_comparative() - test comparative
  • propagate() - propagate gender, number, case and pattern
  • set_R() - set word as reflexive
  • test_comma() - test comma in input
  • test_deverbativum()
  • test_adverbium_time_local()
  • test_nominative_instrumental() - test nominative and instrumental
  • test_valencies()
  • add_conjunction()
  • test_comma_bracketing()
  • clear_level()
  • propagate_nothing() - no propagation of gender, number, case, pattern

Dependency actions

  • head(): mark dependency head
  • depends(): mark dependency link between rule non-terminals, e.g.: depends(A,B,C) means that B and C depend on A.
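
The reading of depends(A,B,C) given above can be sketched as a function that returns (head, dependent) pairs. The Python function below is only a model of that description, not the action as implemented in the parser.

```python
def depends(head, *dependents):
    """Return one (head, dependent) link for each dependent."""
    return [(head, dep) for dep in dependents]

links = depends("A", "B", "C")
print(links)
```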

Second grammar form (G2)

As mentioned earlier, several pre-defined grammatical tests and procedures are used to describe the contextual actions associated with each grammatical rule of the system.

The pruning actions include:

  • grammatical case test for particular words and noun groups
  • agreement test of case in prepositional constructions
  • agreement test of number and gender for relative pronouns
  • agreement test of case, number and gender for noun groups

The rule schema action prescribes how to build a logical construction out of the sub-constructions on the right-hand side. The propagate_all and agree_*_and_propagate actions compute and propagate all relevant grammatical information from the selected non-terminals on the right-hand side to the non-terminal on the left-hand side of the rule.
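
An agreement test followed by propagation can be modelled as a set intersection. The Python sketch below assumes that each constituent carries a set of possible (case, number, gender) tags; the function name agree_and_propagate and the sample tag values are hypothetical and do not reflect the internal representation used by the parser.

```python
def agree_and_propagate(*constituents):
    """Intersect the tag sets of the constituents.

    An empty result means the constituents do not agree, so the rule is pruned;
    otherwise the surviving tags are what gets propagated to the left-hand side.
    """
    common = set(constituents[0])
    for tags in constituents[1:]:
        common &= set(tags)
    return common

# Hypothetical tag sets: (case, number, gender)
adj = {(1, "sg", "m"), (4, "sg", "m"), (1, "pl", "f")}
noun = {(1, "sg", "m"), (2, "pl", "m")}

np_tags = agree_and_propagate(adj, noun)
print(np_tags)
```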

Last modified on Mar 2, 2014, 11:07:35 PM