Common

General-purpose helpers that are shared by several other modules.

class gismo.common.MixInIO

Provide basic save/load capabilities to other classes.

dump(filename: str, path='.', overwrite=False, compress=True)

Save instance to file.

Parameters:
  • filename (str) – The stem of the filename.

  • path (str or Path, optional) – The location path.

  • overwrite (bool) – Whether to overwrite the file if it already exists.

  • compress (bool) – Whether to use gzip compression.

Examples

>>> import tempfile
>>> from pathlib import Path
>>> v1 = ToyClass(42)
>>> v2 = ToyClass()
>>> v2.value
0
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', compress=True, path=tmpdirname)
...     dir_content = [file.name for file in Path(tmpdirname).glob('*')]
...     v2 = ToyClass.load(filename='myfile', path=Path(tmpdirname))
...     v1.dump(filename='myfile', compress=True, path=tmpdirname) # doctest: +ELLIPSIS
File ...myfile.pkl.gz already exists! Use overwrite option to overwrite.
>>> dir_content
['myfile.pkl.gz']
>>> v2.value
42
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', compress=False, path=tmpdirname)
...     v1.dump(filename='myfile', compress=False, path=tmpdirname) # doctest: +ELLIPSIS
File ...myfile.pkl already exists! Use overwrite option to overwrite.
>>> v1.value = 51
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', path=tmpdirname, compress=False)
...     v1.dump(filename='myfile', path=tmpdirname, overwrite=True, compress=False)
...     v2 = ToyClass.load(filename='myfile', path=tmpdirname)
...     dir_content = [file.name for file in Path(tmpdirname).glob('*')]
>>> dir_content
['myfile.pkl']
>>> v2.value
51
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v2 = ToyClass.load(filename='thisfilenamedoesnotexist', path=tmpdirname) # doctest: +ELLIPSIS
Traceback (most recent call last):
 ...
FileNotFoundError: [Errno 2] No such file or directory: ...

classmethod load(filename: str, path='.')

Load instance from file.

Parameters:
  • filename (str) – The stem of the filename.

  • path (str or Path, optional) – The location path.
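
To make the save/load behaviour described above concrete, here is a minimal sketch of how such a mixin could be written with pickle and gzip. It is not gismo's implementation: the names PickleIOSketch and Counter are hypothetical, and the choices to pickle only the attribute dictionary and to look for the uncompressed file first are assumptions made for this illustration. Only the file suffixes and the "already exists" message are taken from the examples above.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


class PickleIOSketch:
    """Hypothetical save/load mixin (an illustration, not gismo's code)."""

    def dump(self, filename, path='.', overwrite=False, compress=True):
        # The stem gets '.pkl', plus '.gz' when compression is requested.
        suffix = '.pkl.gz' if compress else '.pkl'
        destination = Path(path) / (filename + suffix)
        if destination.exists() and not overwrite:
            print(f"File {destination} already exists! "
                  "Use overwrite option to overwrite.")
            return
        opener = gzip.open if compress else open
        with opener(destination, 'wb') as f:
            # Assumption: storing the attribute dict is enough for this sketch.
            pickle.dump(self.__dict__, f)

    @classmethod
    def load(cls, filename, path='.'):
        # Assumption: prefer the uncompressed file, else fall back to gzip.
        source = Path(path) / (filename + '.pkl')
        opener = open
        if not source.exists():
            source = Path(path) / (filename + '.pkl.gz')
            opener = gzip.open
        with opener(source, 'rb') as f:
            state = pickle.load(f)
        # Rebuild an instance without calling __init__, then restore its state.
        instance = cls.__new__(cls)
        instance.__dict__.update(state)
        return instance


class Counter(PickleIOSketch):
    """Hypothetical class that gains dump/load through the mixin."""

    def __init__(self, value=0):
        self.value = value


with tempfile.TemporaryDirectory() as tmpdirname:
    Counter(42).dump('myfile', path=tmpdirname)
    files = sorted(p.name for p in Path(tmpdirname).glob('*'))
    restored = Counter.load('myfile', path=tmpdirname)
```

In this sketch, loading a stem that matches no file raises FileNotFoundError from the final open call, which mirrors the behaviour shown in the last doctest above.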

class gismo.common.ToyClass(value=0)

gismo.common.auto_k(data, order=None, max_k=100, target=1.0)

Propose a number k of significant values, based on a relevance vector.

Parameters:
  • data (ndarray) – Vector with positive relevance values.

  • order (list of int, optional) – Ordered indices of data.

  • max_k (int) – Maximal number of entries to return; also number of entries used to determine threshold.

  • target (float) – Threshold modulation. A higher target yields fewer results. A target of 1.0 uses the average of the max_k top values as the threshold.

Returns:

k – Recommended number of values.

Return type:

int

Example

>>> import numpy as np
>>> data = np.array([30, 1, 2, .3, 4, 50, 80])
>>> auto_k(data)
3
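
The thresholding rule described in the target parameter can be sketched in plain Python so that the arithmetic is visible. This is an illustration under stated assumptions, not gismo's code: the name auto_k_sketch is hypothetical, the order argument is left out, the input is assumed to be non-empty, and both the use of >= for the comparison and the minimum of one returned value are guesses.

```python
def auto_k_sketch(data, max_k=100, target=1.0):
    """Count the top values that reach target times the mean of the top max_k values."""
    # Keep the max_k largest values, in decreasing order.
    top = sorted(data, reverse=True)[:max_k]
    # With target=1.0 the threshold is the plain average of those values.
    threshold = target * sum(top) / len(top)
    # Assumption: always recommend at least one value.
    return max(1, sum(1 for value in top if value >= threshold))


k = auto_k_sketch([30, 1, 2, .3, 4, 50, 80])  # mean is 23.9, so k == 3
k_strict = auto_k_sketch([30, 1, 2, .3, 4, 50, 80], target=2.0)  # threshold 47.8, so 2
```

On the example data the seven values average to 23.9, and only 80, 50 and 30 reach that threshold, which agrees with the result of 3 shown above.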

gismo.common.toy_source_dict = [{'content': 'Gizmo is a Mogwaï.', 'title': 'First Document'}, {'content': 'This is a sentence about Blade.', 'title': 'Second Document'}, {'content': 'This is another sentence about Shadoks.', 'title': 'Third Document'}, {'content': 'This very long sentence, with a lot of stuff about Star Wars inside, makes at some point a side reference to the Gremlins movie by comparing Gizmo and Yoda.', 'title': 'Fourth Document'}, {'content': 'In chinese folklore, a Mogwaï is a demon.', 'title': 'Fifth Document'}]

A minimal source example where each item is a dict with keys title and content.

gismo.common.toy_source_text = ['Gizmo is a Mogwaï.', 'This is a sentence about Blade.', 'This is another sentence about Shadoks.', 'This very long sentence, with a lot of stuff about Star Wars inside, makes at some point a side reference to the Gremlins movie by comparing Gizmo and Yoda.', 'In chinese folklore, a Mogwaï is a demon.']

A minimal source example where each item is a str.
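
Judging from the values listed above, the text source holds the content fields of the dict source in the same order. The short sketch below shows that relation on an excerpt: the first two entries are copied into a local variable (source_excerpt, a name chosen for this illustration) so that the snippet runs without gismo installed.

```python
# Excerpt: the first two entries of toy_source_dict, copied from the listing above.
source_excerpt = [
    {'content': 'Gizmo is a Mogwaï.', 'title': 'First Document'},
    {'content': 'This is a sentence about Blade.', 'title': 'Second Document'},
]

# Extracting the 'content' field gives items shaped like those of toy_source_text.
texts = [item['content'] for item in source_excerpt]
titles = [item['title'] for item in source_excerpt]
```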