Version 9 (modified by xmedved1, 10 years ago)

--

The Synt parser

Synt is a tool for automatic syntactic analysis of morphologically rich languages, primarily Czech and Slovak. The parser is built on a context-free backbone enhanced with contextual actions and performs stochastic, agenda-based, head-driven chart analysis. Its input is a morphologically annotated sentence in vertical form. Its outputs are a phrase-structure tree, a dependency graph, and a set of syntactic structures.

Downloads

A development version can be downloaded here: ​http://nlp.fi.muni.cz/projekty/synt/synt.tar.xz

Licensing

SYNT is distributed under the ​GPLv3 license.

Contact

Drop a mail to synt@…

Input

The input for the Synt parser is a morphologically annotated sentence in [vertical,http://nlp.fi.muni.cz/trac/synt/wiki/input] or [brief,http://nlp.fi.muni.cz/trac/synt/wiki/input] form.

Synt grammar specification

The Synt parser uses a meta-grammar concept: to ease the maintenance and development of a wide-coverage grammar, the full grammar is generated automatically from a hand-written meta-grammar. The parser itself is based on a context-free backbone enhanced with contextual actions and performs stochastic, agenda-based, head-driven chart analysis.

See full meta-grammar documentation

Synt output

See possible outputs here: https://nlp.fi.muni.cz/trac/synt/wiki/output

Phrase-structure tree

Dependency graph

Syntactic structure

An unambiguous, optimal decomposition of the input sentence into the set of syntactic structures requested by the user.

Sentence: Tlačil auto, ktoré sa pokazilo.
[0-7) : Tlačil      auto,
        tlačiť - V  auto - N
[2-4) : ktoré          sa           pokazilo
        ktorý - PRON   byť - PRON   pokaziť - V

Here, [2-4) denotes the interval of word indexes covered by the structure. The words that follow the interval (ktoré sa pokazilo) are the corresponding words from the sentence. The line beneath gives the lemma of each of these words, followed by an abbreviation of its part of speech.
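When post-processing this output, the interval at the start of each structure line is usually the first thing to extract. The following is a minimal sketch, assuming every structure line begins with "[START-END)" exactly as in the example above. The helper name parse_interval is hypothetical and is not part of Synt.

```shell
# Hypothetical helper, not part of Synt.
# Prints "START END" for a line that begins with "[START-END)".
# Lines without a leading interval (such as the lemma lines) print nothing.
parse_interval() {
  printf '%s\n' "$1" |
    sed -n 's/^\[\([0-9][0-9]*\)-\([0-9][0-9]*\)).*/\1 \2/p'
}
```

Because lemma lines produce no output, the same function can be used to tell structure lines and lemma lines apart while reading the output line by line.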

Usage

The input for the Synt parser is a morphologically annotated sentence in either vertical (word-per-line) format or brief format (which allows ambiguous morphological input). Multiple sentences are separated by a blank line.
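Since sentences are separated by a blank line, a multi-sentence input file can be inspected with awk in paragraph mode before it is passed to the parser. This is only a sketch: the helper names count_sentences and nth_sentence are hypothetical, and nothing here depends on the annotation columns of the vertical format.

```shell
# Hypothetical helpers, not part of Synt.
# Setting RS="" puts awk into paragraph mode: each block of lines
# delimited by one or more blank lines is read as a single record.

# Print the number of sentences in the file given as $1.
count_sentences() {
  awk 'BEGIN { RS = "" } END { print NR }' "$1"
}

# Print the N-th sentence (1-based, N given as $2) of the file given as $1.
nth_sentence() {
  awk -v n="$2" 'BEGIN { RS = "" } NR == n' "$1"
}
```

A single sentence selected this way could then be piped into the parser in the same manner as a whole file.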

To run a basic syntactic analysis:

cat sentence.vert | synt -i vertical
cat sentence.brief | synt -i brief

To run the analysis with phrase-structure tree output:

cat sentence.vert | synt -i vertical -tt-
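When several input files are processed, the value passed to -i can be chosen from the file name. The sketch below assumes the naming convention used in the examples above (.vert for vertical input, .brief for brief input). That convention and the helper names synt_format_for and run_synt are assumptions made for illustration, not something Synt itself requires or provides.

```shell
# Hypothetical helpers, not part of Synt.

# Map a file name to the value passed to "synt -i".
# Assumes the .vert / .brief naming used in the examples.
synt_format_for() {
  case "$1" in
    *.vert)  echo vertical ;;
    *.brief) echo brief ;;
    *)       return 1 ;;
  esac
}

# Run a basic analysis on one file, feeding it to synt on standard input.
run_synt() {
  fmt=$(synt_format_for "$1") || {
    echo "unknown input format: $1" >&2
    return 1
  }
  synt -i "$fmt" < "$1"
}
```

With these definitions, run_synt sentence.vert would be equivalent to the first basic-analysis command shown above.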
