Changes between Version 2 and Version 3 of private/AdvancedNlpCourse/ParsingCzech

Oct 19, 2015, 12:44:10 AM (5 years ago)
Miloš Jakubíček



  • private/AdvancedNlpCourse/ParsingCzech

    v2 v3  
    9 9 === References ===
    11 Approx. 3 current papers (preferably from the best NLP conferences/journals, e.g. [[|ACL Anthology]]) that will be used as a source for the one-hour lecture:
    13  1. paper1
    14  1. paper2
    15  1. paper3
     11 1. PEI, Wenzhe; GE, Tao; CHANG, Baobao. An effective neural network model for graph-based dependency parsing. In: Proc. of ACL. 2015.
     12 1. CHOI, Jinho D.; TETREAULT, Joel; STENT, Amanda. It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool. In: Proc. of ACL. 2015.
     13 1. DURRETT, Greg; KLEIN, Dan. Neural CRF Parsing. In: Proc. of ACL. 2015.
    17 15 == Practical Session ==
    19 Concrete description of the work assignment for students for the second one-hour part of the lecture. The work will consist of tasks involving practical implementations of algorithms related to the current topic (probably not the state-of-the-art algorithms mentioned in the first part), applied to real data. Students can test the algorithms, evaluate them and possibly try some short adaptations for various subtasks.
     17 1. Go to, log in and create a shadow copy of the Czech Wikipedia corpus by clicking on "Create grammar development corpus".
     18 1. Develop your own sketch grammar that captures the following semantic relations in this corpus: hypernymy/hyponymy and meronymy/holonymy (hint: use the {{{DUAL}}} directive). Optionally, you can develop more relations (e.g. "is-defined-as").
     19    Read the related [ documentation]. Start with a couple of simple CQL queries that you pretest in the interface.
     20 1. You can iteratively expand the grammar, upload it into the system, have the system compute word sketches and review the results.
     21 1. When you are happy with the grammar, log on to the {{{}}} server and use the {{{dumpws}}} command to export the content of the word sketch database:
    21 Students can also be required to generate some results of their work and hand them in to prove completing the tasks.
     23    {{{dumpws /corpora/ca/user_data/<YOUR_USERNAME_IN_SKETCH_ENGINE>/registry/<YOUR_CORPUS_ID>}}}
     24 5. Process the output of {{{dumpws}}} with a simple Bash or Python script to select the 100 most salient headword-collocation pairs for each relation. Upload the resulting list into the IS.
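
The selection in the last step could be sketched in Python roughly as follows. The exact output layout of {{{dumpws}}} is not described here, so this sketch assumes (hypothetically) one tab-separated line per pair with the columns headword, relation, collocate and salience score; the function name {{{top_pairs}}} is likewise illustrative. Check the real output and adjust the column order before using it.

```python
import heapq
from collections import defaultdict


def top_pairs(lines, n=100):
    """Group pairs by relation and keep the n highest-scoring ones.

    ASSUMPTION: each line is 'headword<TAB>relation<TAB>collocate<TAB>score'.
    This column layout is hypothetical; adapt it to the real dumpws output.
    """
    by_relation = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        headword, relation, collocate, score = fields[:4]
        try:
            value = float(score)
        except ValueError:
            continue  # skip lines whose score column is not numeric
        by_relation[relation].append((value, headword, collocate))
    # nlargest returns the pairs sorted from highest to lowest score
    return {rel: heapq.nlargest(n, pairs) for rel, pairs in by_relation.items()}
```

To produce the list for upload, the function can be applied to the saved {{{dumpws}}} output (for example, an open file object) and each relation's pairs written out one per line.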