}}}
Welcome to !NoSketch Engine, an open-source project combining Manatee, Bonito and Crystal into a powerful and free corpus management system. !NoSketch Engine is a limited version of the software that powers the famous [https://www.sketchengine.eu/ Sketch Engine] service, a commercial variant offering word sketches, thesaurus, keyword computation, user-friendly corpus creation and many other excellent features.
Try a [https://auth.sketchengine.eu/#register Sketch Engine trial account] to get word sketches, thesaurus, keywords, online corpus building and space for your corpora, online availability and technical support. See the overview of [https://www.sketchengine.eu/nosketch-engine/ Sketch Engine versus NoSketch Engine].
= News =
To receive updates about new versions and features, please subscribe to the [https://groups.google.com/a/sketchengine.co.uk/forum/#!forum/noske NoSketch Engine Google group].
= Documentation =
You are free to use the [https://www.sketchengine.eu/documentation/ documentation available for the commercial Sketch Engine].
= System requirements =
!NoSketch Engine packages are available for the 64-bit CentOS 7 Linux distribution. Requirements: 8 GB of RAM; 20 GB of disk space (SSD strongly preferred); CPU: a contemporary 64-bit Intel or AMD processor.
= !NoSketch Engine packages =
== manatee ==
Manatee is a corpus management and query system. License: GPLv2+.
== bonito ==
Bonito is an API interface for the Manatee corpus management system. License: GPLv2+.
== gdex ==
GDEX (Good Dictionary Examples) is a Bonito module for sorting concordances according to their suitability as dictionary examples. License: GPLv3.
== crystal ==
Crystal is a web interface for Sketch Engine. License: GPLv3.
== third party dependencies ==
- You may install [https://github.com/seveas/python-prctl python-prctl] and Bonito will use it to set descriptive process titles for background jobs (visible in the output of the {{{ps}}} command and similar tools).\\\\The Bonito RPM package requires python3-prctl; you can either ignore the dependency ({{{rpm -ivh ./bonito-open-*.el7.noarch.rpm --nodeps}}}) or build a python3-prctl RPM package from the python-prctl sources ({{{git clone https://github.com/seveas/python-prctl.git ; cd python-prctl/ ; sed -i 's|name = "python-prctl"|name = "python3-prctl"|' setup.py ; ./setup.py bdist_rpm}}}).
- You may install [https://foss.heptapod.net/openpyxl/openpyxl openpyxl] and Bonito will use it to export into Office Open XML format (xlsx).\\\\The Bonito RPM package requires python3-openpyxl; you can either ignore the dependency ({{{rpm -ivh ./bonito-open-*.el7.noarch.rpm --nodeps}}}) or build a python3-openpyxl RPM package from the openpyxl sources ({{{hg clone https://foss.heptapod.net/openpyxl/openpyxl/ ; cd openpyxl ; sed -i "s|name='openpyxl'|name='python3-openpyxl'|" setup.py ; sed -i "1c#\!/usr/bin/python3" setup.py ; ./setup.py bdist_rpm}}}).
= Downloads =
== Latest stable release ==
'''You should always download the latest versions of all components.'''
|| ||= '''manatee-open''' =||= '''bonito-open''' =||= '''gdex''' =||= '''crystal-open''' =||= '''sample corpus''' =||
|| tar.gz || [https://corpora.fi.muni.cz/noske/current/src/manatee-open-2.208.tar.gz manatee-open-2.208.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/bonito-open-5.57.6.tar.gz bonito-open-5.57.6.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/gdex-4.12.tar.gz gdex-4.12.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/crystal-open-2.114.tar.gz crystal-open-2.114.tar.gz] || [https://corpora.fi.muni.cz/noske/current/src/susanne-example-source.tar.bz2 susanne-example-source.tar.bz2] ||
|| rpm (Centos 7) || [https://corpora.fi.muni.cz/noske/current/centos7/manatee-open/ 2.208] || [https://corpora.fi.muni.cz/noske/current/centos7/bonito-open/ 5.57.6] || [https://corpora.fi.muni.cz/noske/current/centos7/gdex/ 4.12] || [https://corpora.fi.muni.cz/noske/current/centos7/crystal-open/ 2.114] || [https://corpora.fi.muni.cz/noske/current/centos7/manatee-open/ 2.208] ||
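The source tarball links in the table share one URL pattern, so fetching a release can be scripted. The following is a minimal sketch, not part of the official tooling: the helper name `noske_src_url` is hypothetical, and the pattern is inferred from the links above, so it may not hold for archived or future releases.

```shell
#!/bin/sh
# Hypothetical helper: print the source tarball URL for a component and version.
# ASSUMPTION: the URL pattern is inferred from the download table above.
noske_src_url() {
    component=$1
    version=$2
    printf 'https://corpora.fi.muni.cz/noske/current/src/%s-%s.tar.gz\n' "$component" "$version"
}

# Example (requires network access and wget):
#   wget "$(noske_src_url manatee-open 2.208)"
```

The function only builds the URL; downloading is left to the caller so that the version numbers stay in one visible place.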
== Release notes ==
[[span(style=color: #FF0000, IMPORTANT)]]: When updating from a version of Manatee before 2.207.2, existing corpora need to be recompiled or updated in-place by executing
{{{
corpus4fsa CORPUS_CONFIG_FILE
}}}
for every corpus. If you don't do this, the process will be started automatically as a background job when new indices are needed, making users wait (up to several minutes for big corpora) until it finishes.
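With many corpora, the in-place update can be run in a loop. This is a minimal sketch, assuming all corpus configuration files sit directly in one registry directory; the helper name `update_all_corpora` and the example path are assumptions, so adjust them to your installation.

```shell
#!/bin/sh
# Hypothetical helper: run corpus4fsa on every corpus configuration file
# found directly in the given registry directory.
update_all_corpora() {
    registry_dir=$1
    for config in "$registry_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$config" ] || continue
        echo "Updating $config"
        corpus4fsa "$config" || echo "corpus4fsa failed for $config" >&2
    done
}

# Example (ASSUMPTION: configurations live in /var/lib/manatee/registry):
#   update_all_corpora /var/lib/manatee/registry
```

A failure for one corpus is reported on standard error and the loop continues with the next one.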
== Older releases ==
Older releases can be downloaded from the [https://corpora.fi.muni.cz/noske/archive archive].
= Build and installation =
== manatee ==
{{{
tar xzvf manatee-open-.tar.gz
cd manatee-open-
./configure --with-pcre
make
sudo make install
}}}
== bonito ==
{{{
tar xzvf bonito-open-.tar.gz
cd bonito-open-
./configure
make
sudo make install
sudo ./setupbonito CGIPATH DATAPATH
# where CGIPATH is your webserver's CGI directory and DATAPATH is a data directory writable by the webserver
}}}
== gdex ==
{{{
tar xzvf gdex-.tar.gz
cd gdex-
VERSION=
sed -i "s//$VERSION/g" setup.py
./setup.py build
sudo ./setup.py install
}}}
== crystal ==
{{{
tar xzvf crystal-open-.tar.gz
cd crystal-open-
make
sudo make install VERSION=
}}}
== Installation from RPM packages ==
!NoSketch Engine packages
{{{
rpm -ivh crystal-open-*.el7.noarch.rpm bonito-open-*.el7.noarch.rpm manatee-open-*.el7.x86_64.rpm manatee-open-python3-*.el7.x86_64.rpm
}}}
sample corpora
{{{
rpm -ivh manatee-open-susanne-*.el7.noarch.rpm
}}}
= Configuration =
== Apache (httpd) configuration without authentication ==
{{{
Alias /crystal /var/www/crystal
Alias /bonito /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
}}}
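Apache accepts `AllowOverride` only inside a `<Directory>` section, so the per-directory directives above are presumably meant to be enclosed in one. A minimal sketch under that assumption follows; the choice of `/var/www/bonito` as the directory, the helper name `write_noske_httpd_conf` and the example target file are assumptions, not taken from the official packages.

```shell
#!/bin/sh
# Hypothetical helper: write the unauthenticated configuration to a file.
# ASSUMPTION: the per-directory directives belong in a <Directory> section
# for /var/www/bonito; adjust the path to match your installation.
write_noske_httpd_conf() {
    cat > "$1" <<'EOF'
Alias /crystal /var/www/crystal
Alias /bonito /var/www/bonito
<Directory /var/www/bonito>
    AllowOverride All
    Options +ExecCGI -Indexes
    AddHandler cgi-script .cgi
</Directory>
EOF
}

# Example (the target file name is an assumption):
#   write_noske_httpd_conf /etc/httpd/conf.d/noske.conf
#   apachectl configtest
```

Running `apachectl configtest` afterwards checks the syntax before the webserver is reloaded.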
== Apache (httpd) configuration with authentication ==
{{{
Alias /crystal-auth /var/www/crystal
Alias /bonito-auth /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
AuthType Basic
AuthName "Secure Content"
AuthUserFile /var/lib/bonito/htpasswd
Require valid-user
}}}
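The file named in `AuthUserFile` has to exist and contain at least one user before anyone can log in. A minimal sketch for managing it with Apache's `htpasswd` tool, assuming the tool is installed; the helper name `add_bonito_user` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: add a user to the password file named in AuthUserFile.
# htpasswd prompts for the password. The -c flag creates the file and is used
# only when the file does not exist yet, because it overwrites existing entries.
add_bonito_user() {
    user=$1
    file=${2:-/var/lib/bonito/htpasswd}
    if [ -f "$file" ]; then
        htpasswd "$file" "$user"
    else
        htpasswd -c "$file" "$user"
    fi
}

# Example (run with sufficient privileges to write the file):
#   add_bonito_user alice
```

The resulting file must be readable by the user the webserver runs as.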
== Apache (httpd) configuration with authentication and self-registration ==
{{{
Alias /crystal-registration /var/www/crystal
Alias /bonito-registration /var/www/bonito
AllowOverride All
Options +ExecCGI -Indexes
AddHandler cgi-script .cgi
AuthType Basic
AuthName "Secure Content"
AuthUserFile /var/lib/bonito/htpasswd
Require expr %{REQUEST_URI} =~ m#^/bonito-registration/registration.cgi/register_user_new.*#
Require valid-user
}}}
== Bonito (run.cgi) configuration ==
The Bonito configuration file is `run.cgi`. You can run multiple instances by copying this file and changing the configuration in each copy.
To enable authentication set `_anonymous = False`.
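Creating a second, authenticated instance can be sketched as a copy followed by a one-line substitution. This assumes that `run.cgi` contains the setting written as `_anonymous = True`; the helper name `new_bonito_instance` and the example file names are hypothetical. Check the resulting file by hand, since the exact layout of `run.cgi` may differ between versions.

```shell
#!/bin/sh
# Hypothetical helper: copy a Bonito run.cgi and switch the copy to
# authenticated mode.
new_bonito_instance() {
    src=$1
    dest=$2
    # cp -p keeps the executable bit; the redirection below reuses that file.
    cp -p "$src" "$dest" || return 1
    # ASSUMPTION: the source file contains "_anonymous = True".
    sed 's/_anonymous *= *True/_anonymous = False/' "$src" > "$dest"
    # Fail if the setting did not end up in the copy.
    grep -q '_anonymous = False' "$dest"
}

# Example (paths are assumptions):
#   new_bonito_instance /var/www/bonito/run.cgi /var/www/bonito/run-auth.cgi
```

The function returns a non-zero status when the expected setting is not found, so a silent no-op does not go unnoticed.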
If you run Crystal on a different hostname than Bonito, you may need to set up CORS headers appropriately; see the top of the `run.cgi` file.
Bonito provides a simple registration feature, which is configured in `registration.cgi`. Registration is disabled by default. To enable self-registration, add the `URL_REGISTER_NEW_USER` endpoint in Crystal (`config.js`) and set `self._enable_registration = True` in `registration.cgi`. The registration module also allows the admins listed in `/var/lib/bonito/registration/admins` to approve or deny access requests via e-mail.
== Crystal (config.js) configuration ==
{{{
// URL of the Bonito run.cgi script
URL_BONITO: "https://no.sketchengine.co.uk/bonito/run.cgi/",
// URL of the endpoint for registering new users (e.g. bonito/registration.cgi). Leave empty to disable registration.
URL_REGISTER_NEW_USER: "https://no.sketchengine.co.uk/bonito-registration/registration.cgi/register_user_new",
}}}
= Credits =
Finlib, Manatee and Bonito have been crafted by [https://www.fi.muni.cz/~pary/ Pavel Rychlý], starting with his [https://www.fi.muni.cz/~pary/dis.pdf PhD thesis]. Sketch Engine is a product of [https://www.lexicalcomputing.com Lexical Computing]. When using !NoSketch Engine for research purposes, please cite the following two publications:
''' ''RYCHLÝ, Pavel. !Manatee/Bonito-A Modular Corpus Manager. In: RASLAN. 2007. p. 65-70.'' '''
{{{
@inproceedings{rychly2007manatee,
title={Manatee/Bonito-A Modular Corpus Manager.},
author={Rychl{\'y}, Pavel},
booktitle={RASLAN},
pages={65--70},
year={2007}
}
}}}
''' ''KILGARRIFF, Adam, et al. The Sketch Engine: Ten Years on. Lexicography, 2014, 1.1: 7-36.'' '''
{{{
@article{kilgarriff2014sketch,
title={The Sketch Engine: ten years on},
author={Kilgarriff, Adam and Baisa, V{\'\i}t and Bu{\v{s}}ta, Jan and Jakub{\'\i}{\v{c}}ek, Milo{\v{s}} and Kov{\'a}{\v{r}}, Vojt{\v{e}}ch and Michelfeit, Jan and Rychl{\'y}, Pavel and Suchomel, V{\'\i}t},
journal={Lexicography},
volume={1},
number={1},
pages={7--36},
year={2014},
publisher={Springer}
}
}}}
When using the GDEX module, please cite:
''' ''KOSEM, Iztok, et al. Identification and automatic extraction of good dictionary examples: the case (s) of GDEX. International Journal of Lexicography, 2019, 32.2: 119-137.'' '''
{{{
@article{kosem2019identification,
title={Identification and automatic extraction of good dictionary examples: the case (s) of GDEX},
author={Kosem, Iztok and Koppel, Kristina and Zingano Kuhn, Tanara and Michelfeit, Jan and Tiberius, Carole},
journal={International Journal of Lexicography},
volume={32},
number={2},
pages={119--137},
year={2019},
publisher={Oxford University Press}
}
}}}
For a list of related publications, please refer to the Sketch Engine [https://www.sketchengine.eu/bibliography-of-sketch-engine/ publications page].
= Testing installation =
!NoSketch Engine plain installation:
* https://no.sketchengine.co.uk/crystal
!NoSketch Engine with enabled authentication (test/t):
* https://no.sketchengine.co.uk/crystal-auth
!NoSketch Engine with enabled authentication and registration:
* https://no.sketchengine.co.uk/crystal-registration
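The same kind of check can be run against your own installation from the command line. A minimal sketch, assuming `curl` is available; the helper name `check_noske_url` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report the final HTTP status of a URL and succeed
# only when it is 200. -L follows redirects, -s silences progress output.
check_noske_url() {
    url=$1
    status=$(curl -s -L -o /dev/null -w '%{http_code}' "$url")
    echo "$url -> HTTP $status"
    [ "$status" = "200" ]
}

# Example (requires network access):
#   check_noske_url https://no.sketchengine.co.uk/crystal
```

For an instance protected by Basic authentication, a request without credentials is expected to report 401 rather than 200.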
= (No)Sketch Engine installations over the world =
{{{
#!html
}}}