bp_text.pool
This module implements functionality for a (text) pool.
A text pool is a collection of annotated/tokenized, text-holding objects (e.g. PdfFiles, TxtFiles) and can be generated from a BibTexDatabase object. Its main purpose is to facilitate interacting with a corpus of texts and the metadata provided by the BibTex entries.
Created: 2025-04-25
Author: Ruben Philipp <me@rubenphilipp.com>
Last modified: 22:36:31 Mon Apr 28 2025 CEST
Functions
cycle_data(ob) | This data getter function returns the index of the next data item and also sets the index to return on the following call.
random_data(ob) | This data getter function returns a random data item index.
Classes
Pool | Implementation of the Pool class.
PoolItem | This class implements a PoolItem.
- class bp_text.pool.Pool(data={})
Bases: object
Implementation of the Pool class.
A (text) Pool is a collection of annotated/tokenized, text-holding objects (e.g. PdfFiles, TxtFiles) and can be generated from a BibTexDatabase object. Its main purpose is to facilitate interacting with a corpus of texts and the metadata provided by the BibTex entries.
The data of the pool is a dict of PoolItem objects. The keys of the dict are typically citation keys (e.g. when the Pool is created from a BibTexDatabase).
- Parameters:
data (dict) – A dict with an initial set of PoolItem objects.
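The shape of the pool's data described above can be pictured with a plain dict. The following is a minimal sketch that does not import bp_text: `SketchItem` is a hypothetical stand-in for a PoolItem, and the citation keys, titles, and file names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchItem:
    """Hypothetical stand-in for a PoolItem (not the bp_text class)."""
    key: str
    meta: dict = field(default_factory=dict)
    data: list = field(default_factory=list)


# A dict keyed by citation key, mirroring how the Pool's data is described.
pool_data = {
    "doe2020": SketchItem("doe2020", {"title": "An Example Title"}, ["doe2020.pdf"]),
    "roe2021": SketchItem("roe2021", {"title": "Another Title"}, ["roe2021.txt"]),
}

# Look up one item by its citation key and read its metadata.
item = pool_data["doe2020"]
title = item.meta["title"]
```

Keying by citation key keeps the lookup from a BibTeX entry to its text-holding objects a single dict access.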
- class bp_text.pool.PoolItem(key, meta={}, data=[], default_get_data_func=None)
Bases: object
This class implements a PoolItem.
PoolItems are containers for metadata (e.g. retrieved from a BibTeX entry in a BibTexDatabase) and text-holding objects (in the data attribute), e.g. PdfFile objects. They are meant to be placed into a Pool.
- Parameters:
key (string) – A (unique) key. This is most likely a BibTeX citekey.
meta (dict) – A dict holding metadata, most likely derived from a BibTeX entry.
data (list) – A list holding one or more text-holding objects (e.g. a PdfFile).
default_get_data_func (int, function, or None) – The default way to retrieve a data object via get_data() (cf. get_data()). This is either an integer index to an element in the data attribute of the PoolItem, or a function that takes the PoolItem as its argument and returns the index of the element of data to retrieve. Default = None, which causes get_data() to always return the first element of data.
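The three accepted forms of default_get_data_func (None, an integer, or a function) can be illustrated with a small resolver. This is a sketch under assumptions, not the bp_text implementation: `resolve_index` is a hypothetical helper, and a `SimpleNamespace` with a `data` attribute stands in for a PoolItem.

```python
from types import SimpleNamespace


def resolve_index(item, getter=None):
    """Hypothetical helper: turn a None / int / function setting into an index."""
    if getter is None:
        # No default configured: fall back to the first element.
        return 0
    if callable(getter):
        # A function receives the item and returns an index into item.data.
        return getter(item)
    # Otherwise the setting is taken to be an integer index.
    return getter


# Stand-in for a PoolItem; the file names are made up.
item = SimpleNamespace(data=["a.pdf", "b.pdf", "c.txt"])

idx_default = resolve_index(item)
idx_fixed = resolve_index(item, 2)
idx_func = resolve_index(item, lambda it: len(it.data) - 1)
```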
- property data
Getter/setter for the data list.
- property default_get_data_func
- get_data(index=0)
This function returns a single data object from the data list instead of the data list itself. The index argument specifies which element is returned: it is either a (zero-based) integer index or a function that takes the PoolItem as its argument and returns the index of the element of data to retrieve. By default, the first item of the data list is returned. Two predefined functions can also be passed as index: random_data() and cycle_data().
Example:
    # This example uses a function instead of an integer. The
    # function cycles through the data by using the pre-defined
    # `cycle_data` function.
    # NB: `pitm` here is a `PoolItem` object
    pitm.get_data(cycle_data)
- Parameters:
index (int or function) – Either a (zero-based) integer index to an element in data, or a function that takes the PoolItem as its argument and returns the index of the element of data to retrieve. Default = 0.
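The int-or-function behaviour of the index argument can be sketched as follows. The class below is a hypothetical stand-in written for illustration and is not the bp_text source; `last_item` is likewise an invented getter.

```python
class SketchItem:
    """Hypothetical stand-in for a PoolItem; not the bp_text implementation."""

    def __init__(self, key, data):
        self.key = key
        self.data = data

    def get_data(self, index=0):
        # A function is called with the item itself and must return an index.
        if callable(index):
            index = index(self)
        return self.data[index]


def last_item(item):
    """Hypothetical getter: always pick the last element of data."""
    return len(item.data) - 1


pitm = SketchItem("doe2020", ["a.pdf", "b.pdf", "c.txt"])

first = pitm.get_data()            # default index 0
second = pitm.get_data(1)          # explicit integer index
last = pitm.get_data(last_item)    # index computed by a function
```

Passing a function keeps the selection policy outside the item, so the same item can be read with different strategies.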
- property key
Getter/setter for the key.
- property meta
Getter/setter for the meta dict.
- property next_data
Getter/setter for the next_data index (int). This is a zero-based index to the next element that should be retrieved from the data list when using get_data().
- bp_text.pool.cycle_data(ob)
This data getter function returns the index of the next data item and also sets the index to return on the following call. See PoolItem.get_data() for details.
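A cycling getter of this kind might look like the sketch below. It is not the bp_text code: `SketchItem` and `cycle_sketch` are hypothetical names, and wrapping back to index 0 after the last element is an assumption suggested by the word "cycle".

```python
class SketchItem:
    """Hypothetical stand-in for a PoolItem with a next_data index."""

    def __init__(self, data):
        self.data = data
        self.next_data = 0  # zero-based index of the next element to return


def cycle_sketch(ob):
    """Return the current next_data index, then advance it.

    Assumption: the index wraps around to 0 after the last element.
    """
    index = ob.next_data
    ob.next_data = (index + 1) % len(ob.data)
    return index


item = SketchItem(["a.pdf", "b.pdf", "c.txt"])

# Four consecutive calls walk through the three elements and wrap around.
picks = [cycle_sketch(item) for _ in range(4)]
```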
- bp_text.pool.random_data(ob)
This data getter function returns a random data item index. See PoolItem.get_data() for details.
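A random getter can be sketched in the same way. The names `SketchItem` and `random_sketch` are hypothetical, and the optional `rng` parameter is an addition for reproducible illustration; the documented random_data() takes only the item.

```python
import random


class SketchItem:
    """Hypothetical stand-in for a PoolItem holding a data list."""

    def __init__(self, data):
        self.data = data


def random_sketch(ob, rng=random):
    """Return a random valid index into ob.data.

    `rng` is an assumption added here so the example can be seeded.
    """
    return rng.randrange(len(ob.data))


item = SketchItem(["a.pdf", "b.pdf", "c.txt"])

# A seeded generator makes repeated runs of the sketch draw the same indices.
rng = random.Random(0)
picks = [random_sketch(item, rng) for _ in range(20)]
```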