DiscourseData

class polyglotdb.io.discoursedata.DiscourseData(name, annotation_types, hierarchy)[source]

Class for collecting information about a discourse to be loaded

Parameters:
name : str

Identifier for the discourse

annotation_types : list

List of BaseAnnotationType objects

hierarchy : Hierarchy

Details of how linguistic types relate to one another

Attributes:
name : str

Identifier for the discourse

data : dict

Dictionary containing BaseAnnotationType objects indexed by their name

segment_type : str or None

Identifier of the segment linguistic annotation, if it exists

wav_path : str or None

Path to sound file if it exists

Methods

__init__(name, annotation_types, hierarchy) Initialize self.
highest_to_lowest() Orders the hierarchy from highest to lowest
items() Returns tuple of items in corpus
keys() Returns corpus keys
types(corpus_name) Get all the types in the discourse and return them along with header information
values() Returns tuple of values in corpus
annotation_types

Returns corpus annotation types

highest_to_lowest()[source]

Orders the hierarchy from highest to lowest

Returns:
ats : dict

The ordered hierarchy
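
As a rough illustration of what ordering a hierarchy from highest to lowest involves, the sketch below works on a plain dict that maps each annotation type to the type directly above it (None for the top). The dict layout and the helper name order_highest_to_lowest are assumptions made for this example and are not part of polyglotdb; the real method operates on the Hierarchy object and the collected annotation types.

```python
def order_highest_to_lowest(supertypes):
    """Return annotation type names ordered from the top of the hierarchy down.

    `supertypes` is a hypothetical stand-in for a hierarchy: it maps each
    type name to the name of the type directly above it, or None for the top.
    """
    ordered = []
    current = None  # start by looking for the type that has no supertype
    while len(ordered) < len(supertypes):
        # Find the type that sits directly below the one placed most recently.
        below = [name for name, parent in supertypes.items() if parent == current]
        if len(below) != 1:
            raise ValueError("expected a single linear chain of annotation types")
        current = below[0]
        ordered.append(current)
    return ordered


# Hypothetical three-level hierarchy: phones inside words inside utterances.
example = {"phone": "word", "word": "utterance", "utterance": None}
print(order_highest_to_lowest(example))
```

The sketch assumes a single linear chain and raises an error otherwise; it says nothing about how polyglotdb handles other shapes.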

items()[source]

Returns tuple of items in corpus

keys()[source]

Returns corpus keys

speakers

Returns speakers from a discourse

token_headers

Get the headers for the CSV file for importing annotation tokens

Returns:
list

Token headers

types(corpus_name)[source]

Get all the types in the discourse and return them along with header information

Parameters:
corpus_name : str

The name of the corpus

Returns:
dict

Type data

list

Type headers

values()[source]

Returns tuple of values in corpus
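
The dictionary-style accessors above (keys(), values(), items()) can be pictured with a small stand-in class. This is a minimal sketch, not the polyglotdb implementation: DiscourseDataSketch and AnnotationTypeSketch are hypothetical names, and the only behaviour taken from the documentation is that data holds the annotation types indexed by their name. Returning tuples from each accessor is an assumption based on the wording of the descriptions.

```python
from collections import namedtuple

# Hypothetical stand-in for a BaseAnnotationType object; only `name` matters here.
AnnotationTypeSketch = namedtuple("AnnotationTypeSketch", ["name", "kind"])


class DiscourseDataSketch:
    """Illustrative stand-in, NOT polyglotdb.io.discoursedata.DiscourseData."""

    def __init__(self, name, annotation_types):
        self.name = name
        # Index the annotation types by their name, as `data` is described.
        self.data = {at.name: at for at in annotation_types}

    def keys(self):
        # Names of the annotation types collected for this discourse.
        return tuple(self.data.keys())

    def values(self):
        # The annotation type objects themselves.
        return tuple(self.data.values())

    def items(self):
        # (name, annotation type) pairs.
        return tuple(self.data.items())


word = AnnotationTypeSketch("word", "tier")
phone = AnnotationTypeSketch("phone", "tier")
discourse = DiscourseDataSketch("interview_01", [word, phone])

for type_name, annotation_type in discourse.items():
    print(type_name, annotation_type.kind)
```

The discourse name "interview_01" and the kind field are made up for the example.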