Common
Multi-purpose module of utilities that are shared by more than one other module.
- class gismo.common.MixInIO
Provides basic save/load capabilities to other classes.
- dump(filename: str, path='.', overwrite=False, compress=True)
Save instance to file.
- Parameters:
Examples
>>> import tempfile
>>> v1 = ToyClass(42)
>>> v2 = ToyClass()
>>> v2.value
0
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', compress=True, path=tmpdirname)
...     dir_content = [file.name for file in Path(tmpdirname).glob('*')]
...     v2 = ToyClass.load(filename='myfile', path=Path(tmpdirname))
...     v1.dump(filename='myfile', compress=True, path=tmpdirname) # doctest.ELLIPSIS
File ...myfile.pkl.gz already exists! Use overwrite option to overwrite.
>>> dir_content
['myfile.pkl.gz']
>>> v2.value
42

>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', compress=False, path=tmpdirname)
...     v1.dump(filename='myfile', compress=False, path=tmpdirname) # doctest.ELLIPSIS
File ...myfile.pkl already exists! Use overwrite option to overwrite.

>>> v1.value = 51
>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v1.dump(filename='myfile', path=tmpdirname, compress=False)
...     v1.dump(filename='myfile', path=tmpdirname, overwrite=True, compress=False)
...     v2 = ToyClass.load(filename='myfile', path=tmpdirname)
...     dir_content = [file.name for file in Path(tmpdirname).glob('*')]
>>> dir_content
['myfile.pkl']
>>> v2.value
51

>>> with tempfile.TemporaryDirectory() as tmpdirname:
...     v2 = ToyClass.load(filename='thisfilenamedoesnotexist')
Traceback (most recent call last):
 ...
FileNotFoundError: [Errno 2] No such file or directory: ...
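The examples above rely on a ToyClass that is not defined in this section. The following is a minimal sketch of how such a save/load mixin could be written with the standard library; it is not gismo's implementation. The names PickleMixIn and ToyClass, the choice to pickle the attribute dictionary rather than the instance, and the lookup order in load are all assumptions made for illustration. Only the file suffixes and the warning message are taken from the doctest output.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


class PickleMixIn:
    """Hypothetical stand-in for MixInIO (illustration only)."""

    def dump(self, filename, path=".", overwrite=False, compress=True):
        # Suffixes follow the doctest output: '.pkl.gz' or '.pkl'.
        suffix = ".pkl.gz" if compress else ".pkl"
        destination = Path(path) / (filename + suffix)
        if destination.exists() and not overwrite:
            print(f"File {destination} already exists! "
                  "Use overwrite option to overwrite.")
            return
        opener = gzip.open if compress else open
        with opener(destination, "wb") as f:
            # Assumption: store the attribute dict, not the instance itself.
            pickle.dump(self.__dict__, f)

    @classmethod
    def load(cls, filename, path="."):
        compressed = Path(path) / (filename + ".pkl.gz")
        if compressed.exists():
            with gzip.open(compressed, "rb") as f:
                state = pickle.load(f)
        else:
            # open() raises FileNotFoundError if the plain file is missing too.
            with open(Path(path) / (filename + ".pkl"), "rb") as f:
                state = pickle.load(f)
        instance = cls.__new__(cls)
        instance.__dict__.update(state)
        return instance


class ToyClass(PickleMixIn):
    """Hypothetical toy class holding a single value."""

    def __init__(self, value=0):
        self.value = value


with tempfile.TemporaryDirectory() as tmpdirname:
    original = ToyClass(42)
    original.dump("myfile", path=tmpdirname)
    original.dump("myfile", path=tmpdirname)  # only prints the warning
    files = sorted(p.name for p in Path(tmpdirname).glob("*"))
    restored = ToyClass.load("myfile", path=tmpdirname)
```

Because only built-in values are pickled, the sketch does not depend on where ToyClass is defined. A real mixin may well pickle the whole instance instead.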
- gismo.common.auto_k(data, order=None, max_k=100, target=1.0)
Proposes a threshold k of significant values according to a relevance vector.
- Parameters:
data (ndarray) – Vector with positive relevance values.
max_k (int) – Maximal number of entries to return; also number of entries used to determine threshold.
target (float) – Threshold modulation. A higher target yields fewer results. A target set to 1.0 corresponds to using the average of the max_k top values as threshold.
- Returns:
k – Recommended number of values.
- Return type:
int
Example
>>> data = np.array([30, 1, 2, .3, 4, 50, 80])
>>> auto_k(data)
3
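One way to read the description of target is sketched below in plain Python. This is a re-implementation derived from the parameter descriptions only, not gismo's code: the name auto_k_sketch is hypothetical, the order parameter is left out because its role is not described here, and returning at least 1 is an assumption. The input is assumed to be non-empty.

```python
def auto_k_sketch(data, max_k=100, target=1.0):
    """Hypothetical reading of the auto_k description (illustration only)."""
    # Keep the max_k largest values (all of them if there are fewer).
    top = sorted((float(x) for x in data), reverse=True)[:max_k]
    # Threshold: target times the average of those top values.
    threshold = target * sum(top) / len(top)
    # Count the values reaching the threshold.
    # Assumption: never recommend fewer than one value.
    return max(1, sum(1 for x in top if x >= threshold))


data = [30, 1, 2, .3, 4, 50, 80]
k_default = auto_k_sketch(data)              # average is 23.9: 30, 50, 80 pass
k_strict = auto_k_sketch(data, target=2.0)   # threshold 47.8: 50, 80 pass
```

With the data from the example, the default target gives 3, in agreement with the documented output, and doubling the target reduces the count to 2.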
- gismo.common.toy_source_dict = [{'content': 'Gizmo is a Mogwaï.', 'title': 'First Document'}, {'content': 'This is a sentence about Blade.', 'title': 'Second Document'}, {'content': 'This is another sentence about Shadoks.', 'title': 'Third Document'}, {'content': 'This very long sentence, with a lot of stuff about Star Wars inside, makes at some point a side reference to the Gremlins movie by comparing Gizmo and Yoda.', 'title': 'Fourth Document'}, {'content': 'In chinese folklore, a Mogwaï is a demon.', 'title': 'Fifth Document'}]
A minimal source example where items are dict with keys title and content.
- gismo.common.toy_source_text = ['Gizmo is a Mogwaï.', 'This is a sentence about Blade.', 'This is another sentence about Shadoks.', 'This very long sentence, with a lot of stuff about Star Wars inside, makes at some point a side reference to the Gremlins movie by comparing Gizmo and Yoda.', 'In chinese folklore, a Mogwaï is a demon.']
A minimal source example where items are str.
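The two toy sources hold the same five texts: the str version matches the content field of the dict version. The sketch below copies the values listed above into local variables, so it runs without gismo installed; in practice they would presumably be imported from gismo.common.

```python
# Local copies of the values documented above.
toy_source_dict = [
    {'content': 'Gizmo is a Mogwaï.', 'title': 'First Document'},
    {'content': 'This is a sentence about Blade.', 'title': 'Second Document'},
    {'content': 'This is another sentence about Shadoks.',
     'title': 'Third Document'},
    {'content': 'This very long sentence, with a lot of stuff about Star Wars '
                'inside, makes at some point a side reference to the Gremlins '
                'movie by comparing Gizmo and Yoda.',
     'title': 'Fourth Document'},
    {'content': 'In chinese folklore, a Mogwaï is a demon.',
     'title': 'Fifth Document'},
]
toy_source_text = [
    'Gizmo is a Mogwaï.',
    'This is a sentence about Blade.',
    'This is another sentence about Shadoks.',
    'This very long sentence, with a lot of stuff about Star Wars inside, '
    'makes at some point a side reference to the Gremlins movie by comparing '
    'Gizmo and Yoda.',
    'In chinese folklore, a Mogwaï is a demon.',
]

# Project each dict item onto one of its two keys.
contents = [item['content'] for item in toy_source_dict]
titles = [item['title'] for item in toy_source_dict]
```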