
Version 3 (modified by Miloš Jakubíček, 3 months ago)


NoSketch Engine

Welcome to NoSketch Engine, an open-source project combining Manatee, Bonito and Crystal into a powerful and free corpus management system. NoSketch Engine is a limited version of the software powering the famous Sketch Engine service, a commercial variant offering word sketches, a thesaurus, keyword computation, user-friendly corpus creation and many other excellent features.

Try a Sketch Engine trial account: word sketches, thesaurus, keywords, online corpus building and space for your corpora, online availability and technical support. See the overview of Sketch Engine versus NoSketch Engine.

News

To receive updates about new versions and features, please subscribe to the NoSketch Engine Google group.

Documentation

You are free to use the documentation available for the commercial Sketch Engine.

NoSketch Engine packages

manatee

Manatee is a corpus management and query system. License: GPLv2+.

bonito

Bonito is an API interface for the Manatee corpus management system. License: GPLv2+.

gdex

GDEX (Good Dictionary Examples) is a Bonito module for sorting concordances according to their suitability as dictionary examples. License: GPLv3.

crystal

Crystal is a web interface for Sketch Engine. License: GPLv3.

third party packages

Bonito requires python-prctl and python-signalfd to be installed.
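
A quick way to confirm that both dependencies are visible to the interpreter Bonito runs under is to try importing them. The sketch below is only an illustration: the helper name check_py_modules is made up here, and it assumes the two packages expose modules named prctl and signalfd, which is worth verifying against the versions you installed.

```shell
# Report which Python modules cannot be imported by the given interpreter.
# Usage: check_py_modules <interpreter> <module>...
# Prints one "missing: <module>" line per failure; returns 1 if any is missing.
# ASSUMPTION: the module names passed in match what the packages install.
check_py_modules() {
    py=$1
    shift
    status=0
    for mod in "$@"; do
        # Try the import quietly; only the exit status matters here.
        if ! "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "missing: $mod"
            status=1
        fi
    done
    return $status
}
```

With the build instructions on this page, a call would look like `check_py_modules python2 prctl signalfd`. No output and a zero exit status mean both imports succeeded.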

Downloads

Latest stable release

You should always download the latest versions of all components.

                manatee-open                  bonito-open                gdex              crystal-open              sample corpus
tar.gz          manatee-open-2.167.10.tar.gz  bonito-open-4.24.6.tar.gz  gdex-3.12.tar.gz  crystal-open-2.14.tar.gz  susanne-example-source.tar.bz2
rpm (CentOS 7)  2.167.10                      4.24.6                     3.12              2.14                      2.167.10

Older releases

Older releases can be downloaded from the archive.

Build and installation

manatee

tar xzvf manatee-open-<version>.tar.gz
cd manatee-open-<version>
./configure PYTHON=python2 --with-pcre
make
sudo make install
sudo ldconfig

bonito

tar xzvf bonito-open-<version>.tar.gz
cd bonito-open-<version>
./configure
make
sudo make install
sudo ./setupbonito <CGIPATH> <DATAPATH>
# where CGIPATH is your web server's CGI directory and DATAPATH is a data directory writable by the web server

gdex

tar xzvf gdex-<version>.tar.gz
cd gdex-<version>
sudo python2 setup.py install

crystal

tar xzvf crystal-open-<version>.tar.gz
cd crystal-open-<version>
make
sudo make install VERSION=<version>

Installation from RPM packages

NoSketch Engine packages

rpm -ivh crystal-open-*.el7.noarch.rpm bonito4-open-*.el7.noarch.rpm manatee-open-*.el7.x86_64.rpm manatee-open-python-*.el7.x86_64.rpm

sample corpora

rpm -ivh manatee-open-susanne-*.el7.noarch.rpm

Configuration

Apache (httpd) configuration without authentication

        Alias /crystal /var/www/crystal

        Alias /bonito /var/www/bonito

        <Directory /var/www/bonito>
                AllowOverride All
                Options +ExecCGI
                AddHandler cgi-script .cgi
        </Directory>

Apache (httpd) configuration with authentication

        Alias /crystal-auth /var/www/crystal-auth

        Alias /bonito-auth /var/www/bonito

        <Directory /var/www/bonito>
                AllowOverride All
                Options +ExecCGI
                AddHandler cgi-script .cgi
        </Directory>

        # <Location> is not allowed inside <Directory>, so the two
        # protected URL paths are configured at the top level.
        # OPTIONS requests stay unauthenticated.
        <Location "/crystal-auth">
                AuthType Basic
                AuthName "Secure Content"
                AuthUserFile /var/lib/bonito/htpasswd
                <LimitExcept OPTIONS>
                        Require valid-user
                </LimitExcept>
        </Location>

        <Location "/bonito-auth">
                AuthType Basic
                AuthName "Secure Content"
                AuthUserFile /var/lib/bonito/htpasswd
                <LimitExcept OPTIONS>
                        Require valid-user
                </LimitExcept>
        </Location>
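
The AuthUserFile named in this configuration has to exist before Apache can authenticate anyone. A minimal sketch using Apache's htpasswd utility follows. The function name add_bonito_user is hypothetical, and the only htpasswd option relied on is -c, which creates the file and would overwrite an existing one, hence the existence check.

```shell
# Add a user to an Apache password file, creating the file on first use.
# Usage: add_bonito_user <htpasswd file> <username>
# htpasswd prompts for the password interactively.
# Run as a user who may write to the target file.
add_bonito_user() {
    file=$1
    user=$2
    if [ -e "$file" ]; then
        # File exists: add or update the entry without truncating the file.
        htpasswd "$file" "$user"
    else
        # First user: -c creates the password file.
        htpasswd -c "$file" "$user"
    fi
}
```

For the path used above, this would be called as `add_bonito_user /var/lib/bonito/htpasswd <username>`, typically as root, followed by a reload of httpd.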

Bonito (run.cgi) configuration

The Bonito configuration file is run.cgi; you can run multiple instances by copying this file and changing the configuration. To enable authentication, set _anonymous = False. If you run Crystal on a different hostname than Bonito, you may need to set up CORS headers appropriately; see the top of the run.cgi file.
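
As an illustration of the copy-and-edit approach, the sketch below derives a second instance with authentication switched on. It rests on an assumption about the file layout: that run.cgi contains the setting on a single line of the form `_anonymous = <value>`. The function name and the target file name are hypothetical, so check the result against your own run.cgi before using it.

```shell
# Create a second Bonito instance from an existing run.cgi with
# authentication enabled. The source file is left untouched.
# Usage: make_auth_instance <source run.cgi> <new instance file>
# ASSUMPTION: the setting is written on one line as "_anonymous = <value>".
make_auth_instance() {
    src=$1
    dst=$2
    # Keep the indentation and the "_anonymous =" prefix, replace the value.
    sed -E 's/^([[:space:]]*_anonymous[[:space:]]*=[[:space:]]*).*$/\1False/' "$src" > "$dst" &&
        chmod +x "$dst"   # CGI scripts must be executable
}
```

For example, `make_auth_instance /var/www/bonito/run.cgi /var/www/bonito/run-auth.cgi` would leave the anonymous instance in place and add an authenticated one next to it, assuming Bonito was set up under /var/www/bonito as in the Apache examples.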

Crystal (config.js) configuration

// URL of Bonito's run.cgi script
URL_BONITO: "https://no.sketchengine.co.uk/bonito/run.cgi/",
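
Before pointing Crystal at a Bonito URL, it can help to confirm that the URL answers at all. The helper below is a small sketch built on standard curl options; the name http_status is made up for this page, and which status code a healthy run.cgi returns for a bare request is not specified here, so treat the number as a reachability hint only.

```shell
# Print the HTTP status code returned for a URL.
# curl prints 000 when no HTTP response was received at all.
# Usage: http_status <url>
http_status() {
    # -s: no progress output; -o /dev/null: discard the body;
    # -w '%{http_code}': print only the numeric status code.
    curl -s -o /dev/null -w '%{http_code}' "$1"
}
```

Calling `http_status "https://no.sketchengine.co.uk/bonito/run.cgi/"` prints a three-digit code. A value of 000 usually means the host or path in URL_BONITO is unreachable, and a 401 would be expected from an instance protected by the authentication setup if no credentials are sent.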

Credits

Finlib, Manatee and Bonito have been crafted by Pavel Rychlý, starting with his PhD thesis. All the components are still being developed in collaboration with the Sketch Engine development team. When using NoSketch Engine for research purposes, please cite:

Rychlý, Pavel. Manatee/Bonito - A Modular Corpus Manager. In 1st Workshop on Recent Advances in Slavonic Natural Language Processing. Brno : Masaryk University, 2007. p. 65-70. ISBN 978-80-210-4471-5.

@inproceedings{rychly2007manatee,
  title={Manatee/Bonito - A Modular Corpus Manager},
  author={Rychl{\'y}, Pavel},
  booktitle={RASLAN},
  pages={65--70},
  year={2007}
}

For a list of related publications, please refer to the Sketch Engine publications page.

Testing installation

NoSketch Engine plain installation:

NoSketch Engine with enabled authentication (test/t):

(No)Sketch Engine installations around the world