
Version 1 (modified by xpopelk, 2 months ago)


Simple question answering database version 3.2 (SQAD 3.2)

Description

The Simple Question Answering Database version 3.2 (SQAD v3.2) was created from Czech Wikipedia. This version contains more than 16,000 records. Each SQAD record consists of several files: question, answer extraction, answer selection, URL, question metadata, and, in some cases, answer context.

Example

Example of SQAD record:
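Original text: Létající jaguár je novela spisovatele Josefa Formánka z roku 2004. (English: "Létající jaguár" is a 2004 novella by the writer Josef Formánek.)
Question: Kdo je autorem novely Létající jaguár? (English: Who is the author of the novella "Létající jaguár"?)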
Answer: Josef Formánek
URL: http://cs.wikipedia.org/wiki/L%C3%A9taj%C3%ADc%C3%AD_jagu%C3%A1r
Question type: Person
Answer type: Person
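
The record above can be pictured as a small data structure. The following is a minimal sketch only: the class name `SqadRecord`, its field names, and the helper `article_title` are hypothetical and do not come from the SQAD distribution. It also assumes that "Original text" corresponds to the answer selection part and "Answer" to the answer extraction part of a record.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote


@dataclass
class SqadRecord:
    """Hypothetical in-memory view of one SQAD record (names are assumptions)."""
    question: str
    answer_extraction: str                # assumed: the exact answer phrase
    answer_selection: str                 # assumed: the sentence holding the answer
    url: str
    question_type: str
    answer_type: str
    answer_context: Optional[str] = None  # only some records include a context

    def article_title(self) -> str:
        """Decode the percent-encoded Wikipedia article title from the URL."""
        last_segment = self.url.rsplit("/", 1)[-1]
        return unquote(last_segment).replace("_", " ")


# Values copied from the example record above.
record = SqadRecord(
    question="Kdo je autorem novely Létající jaguár?",
    answer_extraction="Josef Formánek",
    answer_selection="Létající jaguár je novela spisovatele Josefa Formánka z roku 2004.",
    url="http://cs.wikipedia.org/wiki/L%C3%A9taj%C3%ADc%C3%AD_jagu%C3%A1r",
    question_type="Person",
    answer_type="Person",
)

print(record.article_title())
```

Note that the answer "Josef Formánek" appears in the original text in an inflected form ("Josefa Formánka"), so a plain substring match between the two fields would not succeed for this record.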

More information about the database can be found on the SQAD project page.

LINDAT handle

http://hdl.handle.net/11234/1-5019

Acknowledgements

If you use the system, please cite the related publication as well as the LINDAT/CLARIAH infrastructure: http://hdl.handle.net/11234/1-5019.

Project code: LM2018101

Project name: LINDAT/CLARIAH-CZ: Digitální výzkumná infrastruktura pro jazykové technologie, umění a humanitní vědy (Digital Research Infrastructure for Language Technologies, Arts and Humanities)

Older versions

Publication info

  • HORÁK, Aleš and Marek MEDVEĎ. SQAD: Simple Question Answering Database. In Eighth Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU, 2014. p. 121-128. ISSN 2336-4289.
  • MEDVEĎ, Marek and Aleš HORÁK. AQA: Automatic Question Answering System for Czech. In Sojka Petr, Horák Aleš, Kopeček Ivan, Pala Karel. Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno, Czech Republic, September 12–16, 2016, Proceedings. Switzerland: Springer International Publishing, 2016. p. 270-278. ISBN 978-3-319-45510-5. doi:10.1007/978-3-319-45510-5_31.
  • MEDVEĎ, Marek, Radoslav SABOL and Aleš HORÁK. Czech Question Answering with Extended SQAD v3.0 Benchmark Dataset. In Horák Aleš, Rychlý Pavel, Rambousek Adam. Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019. Brno: Tribun EU, 2019. p. 99-108. ISBN 978-80-263-1530-8.
  • MEDVEĎ, Marek, Aleš HORÁK and Radoslav SABOL. Improving RNN-based Answer Selection for Morphologically Rich Languages. In Ana Rocha, Luc Steels, Jaap van den Herik. Proceedings of the 12th International Conference on Agents and Artificial Intelligence. Portugal: SCITEPRESS, 2020. p. 644-651. ISBN 978-989-758-395-7. doi:10.5220/0008979206440651.
  • MEDVEĎ, Marek, Radoslav SABOL and Aleš HORÁK. Evaluating Long Contexts in the Czech Answer Selection Task. In Horák, Rychlý, Rambousek. Recent Advances in Slavonic Natural Language Processing (RASLAN 2021). Brno: Tribun EU, 2021. p. 61-69. ISBN 978-80-263-1670-1.
  • MEDVEĎ, Marek, Aleš HORÁK and Radoslav SABOL. Comparing RNN and Transformer Context Representations in the Czech Answer Selection Task. In Ana Paula Rocha, Luc Steels, Jaap van den Herik. Proceedings of the 14th International Conference on Agents and Artificial Intelligence (ICAART). Portugal: SCITEPRESS, 2022. p. 388-394. ISBN 978-989-758-547-0. doi:10.5220/0000155600003116.

If you cite SQAD, please use this citation:

@conference{icaart22,
   author={Marek Medveď and Radoslav Sabol and Aleš Horák},
   title={Comparing RNN and Transformer Context Representations in the Czech Answer Selection Task},
   booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
   year={2022},
   pages={388-394},
   publisher={SciTePress},
   organization={INSTICC},
   doi={10.5220/0010827000003116},
   isbn={978-989-758-547-0},
   issn={2184-433X},
}

License

Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)