Corpus API
Corpus classes
Base corpus
- class polyglotdb.corpus.BaseContext(*args, **kwargs)
Base CorpusContext class. Inherit from this and extend to create more functionality.
- Parameters
- *args
If the first argument is not a CorpusConfig object, it is the name of the corpus
- **kwargs
If a CorpusConfig object is not specified, all arguments and keyword arguments are passed to a CorpusConfig object
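The argument handling described above can be sketched without a database connection. This is a minimal illustration only: `StandInConfig` and `resolve_config` are hypothetical names that do not exist in polyglotdb, and the real `BaseContext` may resolve its arguments differently.

```python
class StandInConfig:
    """Hypothetical stand-in for CorpusConfig; not the polyglotdb class."""

    def __init__(self, corpus_name, **kwargs):
        self.corpus_name = corpus_name
        self.options = dict(kwargs)


def resolve_config(*args, **kwargs):
    """Return a config object following the rule described for *args and **kwargs."""
    if args and isinstance(args[0], StandInConfig):
        # A ready-made config object was supplied: use it directly.
        return args[0]
    # Otherwise forward everything to the config constructor, so the
    # first positional argument is treated as the corpus name.
    return StandInConfig(*args, **kwargs)


# Illustrative values; the corpus names and the port are assumptions.
by_name = resolve_config("my_corpus", graph_port=7474)
ready_made = StandInConfig("other_corpus")
by_object = resolve_config(ready_made)
```

In this sketch, both call styles end with a config object, which is why a context class can accept either a corpus name or an existing configuration.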
Phonological functionality
Syllabic functionality
Lexical functionality
Pause functionality
Utterance functionality
Audio functionality
Summarization functionality
Spoken functionality
Structured functionality
Annotation functionality
Omnibus class
- class polyglotdb.corpus.CorpusContext(*args, **kwargs)
Main corpus context, inherits from the more specialized contexts.
- Parameters
- args : args
Either a CorpusConfig object or a sequence of arguments to be passed to a CorpusConfig object
- kwargs : kwargs
Sequence of keyword arguments to be passed to a CorpusConfig object
Corpus structure class
- class polyglotdb.structure.Hierarchy(data=None, corpus_name=None)
Class containing information about how a corpus is structured.
Hierarchical data is stored in the form of a dictionary with keys for linguistic types, and values for the linguistic type that contains them. If no other type contains a given type, its value is None.
Subannotation data is stored in the form of a dictionary with keys for linguistic types, and values of sets of types of subannotations.
- Parameters
- data : dict
Information about the hierarchy of linguistic types
- corpus_name : str
Name of the corpus
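The two dictionary layouts described above can be illustrated with plain Python dictionaries. This sketch does not use the Hierarchy class itself; the type names (`phone`, `syllable`, `word`, `utterance`), the subannotation names, and the helper functions are assumptions chosen for illustration.

```python
# Keys are linguistic types; each value is the type that contains the key.
# A type that no other type contains maps to None.
hierarchy = {
    "phone": "syllable",
    "syllable": "word",
    "word": "utterance",
    "utterance": None,
}

# Keys are linguistic types; each value is a set of subannotation types.
subannotations = {"phone": {"burst", "closure"}}


def containing_types(data, linguistic_type):
    """List the types that contain `linguistic_type`, nearest first."""
    chain = []
    parent = data[linguistic_type]
    while parent is not None:
        chain.append(parent)
        parent = data[parent]
    return chain


def top_level_types(data):
    """Return the types that are not contained by any other type."""
    return sorted(t for t, parent in data.items() if parent is None)
```

Storing only the immediate container for each type keeps the dictionary small, and the full chain of containing types can still be recovered by following the values upward until None is reached.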
Corpus config class
- class polyglotdb.config.CorpusConfig(corpus_name, data_dir=None, **kwargs)
Class for storing configuration information about a corpus.
- Parameters
- corpus_name : str
Identifier for the corpus
- kwargs : keyword arguments
All keywords will be converted to attributes of the object
- Attributes
- corpus_name : str
Identifier of the corpus
- graph_user : str
Username for connecting to the graph database
- graph_password : str
Password for connecting to the graph database
- graph_host : str
Host for the graph database
- graph_port : int
Port for connecting to the graph database
- engine : str
Type of SQL database
- base_dir : str
Base directory to store information and temporary files for the corpus; defaults to “.pgdb” under the current user’s home directory
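The conversion of keywords into attributes can be sketched as follows. `SketchConfig` is a hypothetical stand-in, not the polyglotdb implementation. In particular, treating `data_dir` as the source of `base_dir` is an assumption made for this sketch, and the host and port values are illustrative rather than documented defaults.

```python
import os


class SketchConfig:
    """Hypothetical stand-in for CorpusConfig; not the polyglotdb class."""

    def __init__(self, corpus_name, data_dir=None, **kwargs):
        self.corpus_name = corpus_name
        # Assumption: with no directory given, fall back to ".pgdb"
        # under the current user's home directory.
        if data_dir is None:
            data_dir = os.path.join(os.path.expanduser("~"), ".pgdb")
        self.base_dir = data_dir
        # Every keyword argument becomes an attribute of the object.
        for key, value in kwargs.items():
            setattr(self, key, value)


# Illustrative connection settings supplied as keywords.
config = SketchConfig("my_corpus", graph_host="localhost", graph_port=7474)
```

After construction, the keywords are readable as ordinary attributes, for example `config.graph_host` and `config.graph_port`.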