Version 4 (modified by Adam, 17 years ago) (diff)
--

VisDic XML format

XML files in VisDic consist of tags and their values. The value of a tag TAG is enclosed between the strings <TAG> and </TAG>. Tags can be nested, so the value of one tag can itself contain other tags. Whitespace characters (spaces, tabs and newlines) at the start or end of each tag value are trimmed. However, the XML files parsed by VisDic differ from standard XML in the following points:

An XML dictionary contains entries. Each entry is in effect a small XML document of its own; no tag encloses the whole dictionary.
XML tags have no attributes.
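
Because no tag encloses the whole dictionary, a standard XML parser will typically reject such a file as having more than one root element. One possible workaround, shown here as a minimal Python sketch, is to wrap the entries in a synthetic root before parsing. The function name parse_visdic, the wrapper tag ROOT and the second synset ID are made up for illustration, and the sketch assumes the file has no XML declaration and contains only well-formed entries.

```python
import xml.etree.ElementTree as ET

def parse_visdic(text):
    """Parse VisDic entries that are not enclosed in a single root tag."""
    # ROOT is a hypothetical wrapper tag, not part of the VisDic format.
    # It only exists so that a standard parser sees one root element.
    root = ET.fromstring("<ROOT>" + text + "</ROOT>")
    return list(root)

# Two entries back to back, as in a VisDic dictionary file.
# The second ID is a placeholder, not a real synset ID.
sample = """
<SYNSET><ID>ENG21-00001740-n</ID><POS>n</POS></SYNSET>
<SYNSET><ID>EXAMPLE-00000002-n</ID><POS>n</POS></SYNSET>
"""

entries = parse_visdic(sample)
ids = [entry.findtext("ID").strip() for entry in entries]
print(ids)
```

For large dictionaries, reading the whole file into one string may be too costly; splitting the input on the closing entry tag and parsing entry by entry would be an alternative.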

Example of one WordNet synset:

<SYNSET>
 <ID>ENG21-00001740-n</ID>
 <POS>n</POS>
 <SYNONYM>
  <LITERAL>entity<SENSE>1</SENSE></LITERAL>
 </SYNONYM>
 <DEF>that which is perceived or known or inferred to have its own distinct existence (living or nonliving)</DEF>
 <BCS>2</BCS>
 <DOMAIN>factotum</DOMAIN>
</SYNSET>

Tags and their values:

ID: unique synset identification
POS: Part of speech (n=noun, v=verb, a=adjective, b=adverb)
SYNONYM: synonyms
LITERAL: one literal
SENSE: sense number
DEF: definition
BCS: Common Base Concepts set number
DOMAIN: synset domain
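
As an illustration of how these tags map onto a record, the following Python sketch reads the example synset above into a dictionary. The function read_synset and the dictionary keys are hypothetical names chosen for this example. Values are stripped of leading and trailing whitespace, mirroring the trimming rule described earlier.

```python
import xml.etree.ElementTree as ET

SYNSET_XML = """<SYNSET>
 <ID>ENG21-00001740-n</ID>
 <POS>n</POS>
 <SYNONYM>
  <LITERAL>entity<SENSE>1</SENSE></LITERAL>
 </SYNONYM>
 <DEF>that which is perceived or known or inferred to have its own distinct existence (living or nonliving)</DEF>
 <BCS>2</BCS>
 <DOMAIN>factotum</DOMAIN>
</SYNSET>"""

def read_synset(xml_text):
    """Return the fields of one VisDic synset as a plain dictionary."""
    synset = ET.fromstring(xml_text)

    def value(tag):
        # Trim surrounding whitespace; return None if the tag is absent.
        text = synset.findtext(tag)
        return text.strip() if text is not None else None

    literals = []
    for literal in synset.iterfind("SYNONYM/LITERAL"):
        # The literal itself is the text before the nested SENSE tag.
        word = (literal.text or "").strip()
        sense = (literal.findtext("SENSE") or "").strip()
        literals.append((word, sense))

    return {
        "id": value("ID"),
        "pos": value("POS"),
        "literals": literals,
        "definition": value("DEF"),
        "bcs": value("BCS"),
        "domain": value("DOMAIN"),
    }

record = read_synset(SYNSET_XML)
print(record["id"], record["literals"])
```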

DEBVisDic XML format

DEBVisDic XML is almost the same as VisDic XML. The only difference is that the sense number of a literal is expressed as an attribute instead of a nested tag. For example:

 <SYNONYM>
  <LITERAL sense="1">entity</LITERAL>
 </SYNONYM>

You can convert XML in VisDic format to DEBVisDic format using this XSLT template: http://deb.fi.muni.cz/vis2deb.xslt
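
For readers who prefer not to run XSLT, the same transformation can be approximated in Python. This is only a sketch of the one difference described above (moving SENSE from a nested tag to a sense attribute); it is not derived from the XSLT template and may not match it in every detail. The function name visdic_to_debvisdic is hypothetical.

```python
import xml.etree.ElementTree as ET

def visdic_to_debvisdic(synset):
    """Move each nested SENSE tag into a sense attribute on its LITERAL."""
    # Collect the literals first so the tree is not modified mid-iteration.
    for literal in list(synset.iter("LITERAL")):
        sense = literal.find("SENSE")
        if sense is not None:
            literal.set("sense", (sense.text or "").strip())
            literal.remove(sense)
    return synset

visdic = "<SYNONYM><LITERAL>entity<SENSE>1</SENSE></LITERAL></SYNONYM>"
converted = visdic_to_debvisdic(ET.fromstring(visdic))
output = ET.tostring(converted, encoding="unicode")
print(output)
```

The sketch assumes SENSE is the last thing inside LITERAL, as in the example synset; any text following the SENSE tag would be dropped when the tag is removed.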
