{{{
#!html
NoSketch Engine
}}}

Welcome to !NoSketch Engine, an open-source project combining Manatee, Bonito and Crystal into a powerful and free corpus management system. !NoSketch Engine is a limited version of the software powering the famous [https://www.sketchengine.eu/ Sketch Engine] service, a commercial variant offering word sketches, thesaurus, keyword computation, user-friendly corpus creation and many other excellent features.

Try a [https://auth.sketchengine.eu/#register Sketch Engine trial account] for word sketches, thesaurus, keywords, online corpus building and space for your corpora, online availability and technical support.

See the overview of [https://www.sketchengine.eu/nosketch-engine/ Sketch Engine versus NoSketch Engine].

= News =
To receive updates about new versions and features, please subscribe to the [https://groups.google.com/a/sketchengine.co.uk/forum/#!forum/noske NoSketch Engine Google group].

= Documentation =
You are free to use the [https://www.sketchengine.eu/documentation/ documentation available for the commercial Sketch Engine].

= System requirements =
!NoSketch Engine packages are available for the 64-bit CentOS 7 Linux distribution. The requirements:
 * 8 GB of RAM
 * 20 GB of disk space, SSD is strongly preferred
 * CPU: contemporary 64-bit Intel or AMD processor

= !NoSketch Engine packages =
== manatee ==
Manatee is a corpus management and query system. License: GPLv2+.

== bonito ==
Bonito is an API for the Manatee corpus management system. License: GPLv2+.

== gdex ==
GDEX (Good Dictionary Examples) is a Bonito module for sorting concordances according to their suitability as dictionary examples. License: GPLv3.

== crystal ==
Crystal is a web interface for Sketch Engine. License: GPLv3.
== Third-party dependencies ==
 - You may install [https://github.com/seveas/python-prctl python-prctl] and Bonito will use it to set up nice process titles for background jobs (that you will see in the output of the {{{ps}}} command and similar).\\\\The Bonito RPM package requires python3-prctl; you can ignore the dependency ({{{rpm -ivh ./bonito-open-*.el7.noarch.rpm --nodeps}}}) or you can build the python3-prctl RPM package from the python-prctl sources ({{{git clone https://github.com/seveas/python-prctl.git ; cd python-prctl/ ; sed -i 's|name = "python-prctl"|name = "python3-prctl"|' setup.py ; ./setup.py bdist_rpm}}}).
 - You may install [https://foss.heptapod.net/openpyxl/openpyxl openpyxl] and Bonito will use it to export into the Office Open XML format (xlsx).\\\\The Bonito RPM package requires python3-openpyxl; you can ignore the dependency ({{{rpm -ivh ./bonito-open-*.el7.noarch.rpm --nodeps}}}) or you can build the python3-openpyxl RPM package from the openpyxl sources ({{{hg clone https://foss.heptapod.net/openpyxl/openpyxl/ ; cd openpyxl ; sed -i "s|name='openpyxl'|name='python3-openpyxl'|" setup.py ; sed -i "1c#\!/usr/bin/python3" setup.py ; ./setup.py bdist_rpm}}}).

= Downloads =
== Latest stable release ==
'''You should always download the latest versions of all components.'''
|| ||= '''manatee-open''' =||= '''bonito-open''' =||= '''gdex''' =||= '''crystal-open''' =||= '''sample corpus''' =||
|| tar.gz || [https://corpora.fi.muni.cz/noske/current/src/manatee-open-2.214.1.tar.gz manatee-open-2.214.1.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/bonito-open-5.58.1.tar.gz bonito-open-5.58.1.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/gdex-4.12.tar.gz gdex-4.12.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/crystal-open-2.129.tar.gz crystal-open-2.129.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/susanne-example-source.tar.bz2 susanne-example-source.tar.bz2] ||
|| rpm (CentOS 7) || [https://corpora.fi.muni.cz/noske/current/centos7/manatee-open/ 2.214.1] || [https://corpora.fi.muni.cz/noske/current/centos7/bonito-open/ 5.58.1] || [https://corpora.fi.muni.cz/noske/current/centos7/gdex/ 4.12] || [https://corpora.fi.muni.cz/noske/current/centos7/crystal-open/ 2.129] || [https://corpora.fi.muni.cz/noske/current/centos7/manatee-open/ 2.214.1] ||

== Release notes ==
[[span(style=color: #FF0000, IMPORTANT)]]: When updating from a version of Manatee before 2.207.2, existing corpora need to be recompiled or updated in-place by executing {{{ corpus4fsa CORPUS_CONFIG_FILE }}} for every corpus. If you don't do this, the process will be started automatically as a background job when new indices are needed, making users wait (up to several minutes for big corpora) until it finishes.

== Older releases ==
Older releases can be downloaded from the [https://corpora.fi.muni.cz/noske/archive archive].
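The in-place update mentioned in the release notes has to be run once per corpus. The following is a minimal sketch of such a loop, assuming all corpus configuration files sit together in one registry directory. The helper name `update_corpora` and the path in the example call are hypothetical and not part of Manatee; only the `corpus4fsa CORPUS_CONFIG_FILE` invocation comes from the release notes.

```shell
#!/bin/sh
# Sketch: run corpus4fsa on every corpus configuration file in one directory.
# Assumption: the configuration files are regular files directly inside
# the directory given as the first argument.

update_corpora() {
    registry_dir=$1
    # The command is overridable (second argument) to allow dry runs.
    update_cmd=${2:-corpus4fsa}
    failed=0
    for config in "$registry_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$config" ] || continue
        echo "Updating $config"
        if ! "$update_cmd" "$config"; then
            echo "FAILED: $config" >&2
            failed=1
        fi
    done
    # Non-zero exit status if any corpus could not be updated.
    return "$failed"
}

# Example call (placeholder path, adjust to your installation):
# update_corpora /path/to/registry
```

Passing `echo` as the second argument prints the configuration files that would be processed without touching any corpus.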
= Build and installation =
== manatee ==
{{{
tar xzvf manatee-open-.tar.gz
cd manatee-open-
./configure --with-pcre
make
sudo make install
}}}

== bonito ==
{{{
tar xzvf bonito-open-.tar.gz
cd bonito-open-
./configure
make
sudo make install
sudo ./setupbonito
# where CGIPATH is your webserver CGI directory and DATAPATH is a data directory writable by the webserver
}}}

== gdex ==
{{{
tar xzvf gdex-.tar.gz
cd gdex-
VERSION=
sed -i "s//$VERSION/g" setup.py
./setup.py build
sudo ./setup.py install
}}}

== crystal ==
{{{
tar xzvf crystal-open-.tar.gz
cd crystal-open-
make
sudo make install VERSION=
}}}

== Installation from RPM packages ==
!NoSketch Engine packages:
{{{
rpm -ivh crystal-open-*.el7.noarch.rpm bonito-open-*.el7.noarch.rpm manatee-open-*.el7.x86_64.rpm manatee-open-python3-*.el7.x86_64.rpm
}}}
Sample corpora:
{{{
rpm -ivh manatee-open-susanne-*.el7.noarch.rpm
}}}

= Configuration =
== Apache (httpd) configuration without authentication ==
{{{
Alias /crystal /var/www/crystal
Alias /bonito /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
}}}

== Apache (httpd) configuration with authentication ==
{{{
Alias /crystal-auth /var/www/crystal
Alias /bonito-auth /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
AuthType Basic
AuthName "Secure Content"
AuthUserFile /var/lib/bonito/htpasswd
Require valid-user
}}}

== Apache (httpd) configuration with authentication and self-registration ==
{{{
Alias /crystal-registration /var/www/crystal
Alias /bonito-registration /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
AuthType Basic
AuthName "Secure Content"
AuthUserFile /var/lib/bonito/htpasswd
Require expr %{REQUEST_URI} =~ m#^/bonito-registration/registration.cgi/register_user_new.*#
Require valid-user
}}}

== Bonito (run.cgi) configuration ==
The Bonito configuration file is `run.cgi`; you may run multiple instances just by copying this file and changing the configuration.
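Creating a second instance by copying `run.cgi` could look like the sketch below. The helper name `copy_instance` and both paths in the example call are hypothetical; use the CGI directory chosen when Bonito was installed.

```shell
#!/bin/sh
# Sketch: create another Bonito instance by copying run.cgi.
# Assumption: the copy lives next to the original and is edited afterwards.

copy_instance() {
    src=$1
    dst=$2
    # Refuse to overwrite an existing instance configuration.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    # -p preserves the mode bits, so the copy stays executable as a CGI script.
    cp -p "$src" "$dst"
}

# Example call (placeholder paths):
# copy_instance /path/to/cgi/run.cgi /path/to/cgi/run-second.cgi
```

After copying, edit the settings in the new file so that the two instances differ.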
To enable authentication, set `_anonymous = False`. If you run Crystal on a different hostname than Bonito, you may need to set up CORS headers appropriately; see the top of the `run.cgi` file.

Bonito provides a simple **registration feature**, which is configured in `registration.cgi`. The registration module works in three different modes:
 * Disabled – no registration is allowed – **default**
 * Self registration – a user can register and enter the system
   * add the `URL_REGISTER_NEW_USER` endpoint to Crystal (`config.js`)
   * set `self._enable_registration = True` inside `registration.cgi`
 * Registration with approval – after a user registers, an access request e-mail is sent to the admins, who have to allow or deny the user access
   * to turn this feature on, set `self._enable_mail = True` inside `registration.cgi`
   * set the variables `self._smtp_server` and `self._from_mail` in `registration.cgi` appropriately
   * you may change the subject and content of the e-mails in `registration.cgi` as well

Files related to registration:
 * `/var/lib/ske/htpasswd` – standard .htpasswd file [https://httpd.apache.org/docs/2.4/misc/password_encryptions.html documentation]
 * `/var/lib/ske/registration/admins` – list of all admin users that can allow or deny new registrations, one per line
 * `/var/lib/ske/registration/users` – list of all users (including admins, approved and denied users) – `\t\t\t\t\t` tab-separated values
 * `/var/lib/ske/registration/invalid_users` – list of denied users, one per line

== Crystal (config.js) configuration ==
{{{
# set URL to run.cgi script of bonito
URL_BONITO: "https://no.sketchengine.eu/bonito/run.cgi/",
# URL of endpoint for registering new users (e.g. bonito/registration.cgi). Leave empty to disable registration.
URL_REGISTER_NEW_USER: "https://no.sketchengine.eu/bonito-registration/registration.cgi/register_user_new",
}}}

= Credits =
Finlib, Manatee and Bonito have been crafted by [https://www.fi.muni.cz/~pary/ Pavel Rychlý], starting with his [https://www.fi.muni.cz/~pary/dis.pdf PhD thesis]. Sketch Engine is a product of [https://www.lexicalcomputing.com Lexical Computing].

When using !NoSketch Engine for research purposes, please cite the following two publications:

''' ''RYCHLÝ, Pavel. !Manatee/Bonito-A Modular Corpus Manager. In: RASLAN. 2007. p. 65-70.'' '''
{{{
@inproceedings{rychly2007manatee,
  title={Manatee/Bonito-A Modular Corpus Manager.},
  author={Rychl{\'y}, Pavel},
  booktitle={RASLAN},
  pages={65--70},
  year={2007}
}
}}}

''' ''KILGARRIFF, Adam, et al. The Sketch Engine: Ten Years on. Lexicography, 2014, 1.1: 7-36.'' '''
{{{
@article{kilgarriff2014sketch,
  title={The Sketch Engine: ten years on},
  author={Kilgarriff, Adam and Baisa, V{\'\i}t and Bu{\v{s}}ta, Jan and Jakub{\'\i}{\v{c}}ek, Milo{\v{s}} and Kov{\'a}{\v{r}}, Vojt{\v{e}}ch and Michelfeit, Jan and Rychl{\'y}, Pavel and Suchomel, V{\'\i}t},
  journal={Lexicography},
  volume={1},
  number={1},
  pages={7--36},
  year={2014},
  publisher={Springer}
}
}}}

When using the GDEX module, please cite:

''' ''KOSEM, Iztok, et al. Identification and automatic extraction of good dictionary examples: the case(s) of GDEX. International Journal of Lexicography, 2019, 32.2: 119-137.'' '''
{{{
@article{kosem2019identification,
  title={Identification and automatic extraction of good dictionary examples: the case(s) of GDEX},
  author={Kosem, Iztok and Koppel, Kristina and Zingano Kuhn, Tanara and Michelfeit, Jan and Tiberius, Carole},
  journal={International Journal of Lexicography},
  volume={32},
  number={2},
  pages={119--137},
  year={2019},
  publisher={Oxford University Press}
}
}}}

For a list of related publications, please refer to the Sketch Engine [https://www.sketchengine.eu/bibliography-of-sketch-engine/ publications page].

= Testing installation =
!NoSketch Engine plain installation:
 * https://no.sketchengine.eu/crystal
!NoSketch Engine with enabled authentication (test/t):
 * https://no.sketchengine.eu/crystal-auth
!NoSketch Engine with enabled authentication and registration:
 * https://no.sketchengine.eu/crystal-registration

= (No)Sketch Engine installations around the world =
{{{
#!html
}}}