Getting started

PolyglotDB is the Python API for interacting with Polyglot databases and is installed through pip. There are other dependencies that must be installed prior to using a Polyglot database, depending on the user’s platform.

Note

Another way to use Polyglot functionality is through setting up an ISCAN server. An Integrated Speech Corpus Analysis (ISCAN) server can be set up on a lab’s central server, or you can run it on your local computer as well (though many of PolyglotDB’s algorithms benefit from having more processors and memory available). Please see the ISCAN documentation for more information on setting it up (http://iscan.readthedocs.io/en/latest/getting_started.html). The main feature benefits of ISCAN are multiple Polyglot databases (separating out different corpora and allowing any of them to be started or shutdown), graphical interfaces for inspecting data, and a user authentication system with different levels of permission for remote access through a web application.

Installation

To install via pip:

pip install polyglotdb

To install from source (primarily for development):

  1. Clone or download the Git repository (https://github.com/MontrealCorpusTools/PolyglotDB).

  2. Navigate to the directory via command line and install the dependencies via pip install -r requirements.txt

  3. Install PolyglotDB via python setup.py install, which will install the pgdb utility that can be run anywhere and manages a local database.

Note

The use of sudo is not recommended for installation. Ideally your Python installation should be managed by either Anaconda or Homebrew (for Macs).

Set up local database

Installing the PolyglotDB package also installs a utility script (pgdb) that is then callable from the command line anywhere on the system. The pgdb command allows for the administration of a single Polyglot database (install/start/stop/uninstall). Using pgdb requires that several prerequisites be installed first, and the remainder of this section will detail how to install these on various platforms. Please be aware that using the pgdb utility to set up a database is not recommended for larger groups or those needing remote access. See the ISCAN server for a more fully featured solution.

Mac

  1. Ensure Java 11 is installed inside Anaconda distribution (conda install -c anaconda openjdk) if using Anaconda, or via Homebrew otherwise (brew cask install java)

  2. Check Java version is 11 via java --version

  3. Once PolyglotDB is installed, run pgdb install /path/to/where/you/want/data/to/be/stored, or pgdb install to save data in the default directory.

Warning

Do not use sudo with this command on Macs, as it will lead to permissions issues later on.

Once you have installed PolyglotDB, to start it run pgdb start. Likewise, you can close PolyglotDB by running pgdb stop.

To uninstall, run pgdb uninstall

Windows

  1. Ensure Java 11 is installed (https://www.java.com/) and on the path (java --version works in the command prompt)

  2. Check Java version is 11 via java --version

  3. Start an Administrator command prompt (right click on cmd.exe and select “Run as administrator”), as Neo4j will be installed as a Windows service.

  4. Run pgdb install /path/to/where/you/want/data/to/be/stored, or pgdb install to save data in the default directory.

To start the database, you likewise have to use an administrator command prompt before entering the commands pgdb start or pgdb stop.

To uninstall, run pgdb uninstall (also requires an administrator command prompt).

Linux

Ensure Java 11 is installed. On Ubuntu:

sudo apt-get update
sudo apt-get install openjdk-11-jdk-headless

Once installed, double check that java --version returns Java 11. Then run pgdb install /path/to/where/you/want/data/to/be/stored, or pgdb install to save data in the default directory.

Once you have installed PolyglotDB, to start it run pgdb start. Likewise, you can close PolyglotDB by running pgdb stop.

To uninstall, navigate to the PolyglotDB directory and type pgdb uninstall

Conda Development Environment

You can also use conda to create an isolated polyglotdb development environment:

  1. Install either Anaconda, miniconda, or miniforge

  2. Make sure your conda is up to date (conda update conda)

  3. Make sure that you have cloned the polyglotdb repository

  4. cd to the source directory

We’ll now kick off a two-step process:

  1. Create a conda environment to isolate polyglotdb dependencies

  2. Build and install polyglotdb

conda env create -f environment.yml
conda activate polyglotdb-dev

From within the polyglotdb-dev environment, double check that java --version returns Java 11. Then:

python setup.py install
pgdb install /path/to/where/you/want/data/to/be/stored

You now have a new conda environment that is isolated from any existing environments or python installations. The pgdb executable will only be accessible from within the polyglotdb-dev environment.

To view your environments:

conda info -e

To return to your root environment:

conda deactivate