Utils#

Various functions and classes.

Common#

All-purpose functions.

class gismap.utils.common.LazyRepr[source]#: Mixin that hides empty fields in dataclass reprs.
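
The idea can be illustrated with a minimal sketch. This is not the gismap implementation: the names `HideEmptyRepr` and `Author` are hypothetical, and "empty" is assumed to mean `None` or a zero-length value.

```python
from dataclasses import dataclass, field, fields


class HideEmptyRepr:
    """Hypothetical mixin: the repr lists only non-empty dataclass fields."""

    def __repr__(self):
        shown = []
        for f in fields(self):
            value = getattr(self, f.name)
            # Assumption: "empty" means None or a zero-length container/string.
            if value is None or (hasattr(value, "__len__") and len(value) == 0):
                continue
            shown.append(f"{f.name}={value!r}")
        return f"{type(self).__name__}({', '.join(shown)})"


# repr=False keeps the dataclass decorator from overwriting the mixin's __repr__.
@dataclass(repr=False)
class Author(HideEmptyRepr):
    name: str
    aliases: list = field(default_factory=list)
    orcid: str = None


print(Author("Ana Busic"))  # Author(name='Ana Busic')
```

Without `repr=False`, the decorator would generate its own `__repr__` on the class and the mixin would have no effect.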

gismap.utils.common.get_classes(root, key='name')[source]#

Parameters:

root (class) – Starting class (can be abstract).
key (str, default='name') – Attribute to look up.

Returns:

Dictionary of all subclasses that have a key attribute (as in class attribute key), indexed by the value of that attribute.

Return type:

dict

Examples

>>> from gismap.sources.models import DB
>>> subclasses = get_classes(DB, key='db_name')
>>> dict(sorted(subclasses.items())) 
{'dblp': <class 'gismap.sources.dblp.DBLP'>,
'hal': <class 'gismap.sources.hal.HAL'>,
'ldb': <class 'gismap.sources.ldb.LDB'>}
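
For readers curious about how such a registry can be built, here is a minimal sketch based on Python's built-in `__subclasses__()`. It is an illustration only: `collect_subclasses` and the toy classes are hypothetical, and the rule that only attributes defined in the class body count is an assumption.

```python
def collect_subclasses(root, key="name"):
    """Hypothetical stand-in: map each subclass's `key` value to the subclass."""
    found = {}
    for cls in root.__subclasses__():
        # Assumption: only attributes defined in the class body are registered,
        # so an intermediate class does not re-register an inherited value.
        if key in cls.__dict__:
            found[getattr(cls, key)] = cls
        # Recurse so that indirect subclasses are found as well.
        found.update(collect_subclasses(cls, key=key))
    return found


class Source:  # root class, no db_name
    pass


class Remote(Source):  # intermediate class, no db_name
    pass


class Alpha(Remote):
    db_name = "alpha"


class Beta(Source):
    db_name = "beta"


print(sorted(collect_subclasses(Source, key="db_name")))  # ['alpha', 'beta']
```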

gismap.utils.common.list_of_objects(clss, dico, default=None)[source]#

Versatile way to specify a list of objects, each given either directly or as a reference (key) into a dictionary dico.

Parameters:

clss (object) – Object or reference to an object or list of objects / references to objects.
dico (dict) – Dictionary of references to objects.
default (list, optional) – Default list to return if clss is None.

Returns:

Proper list of objects.

Return type:

list

Examples

>>> from gismap.sources.models import DB
>>> subclasses = get_classes(DB, key='db_name')
>>> from gismap import HAL, DBLP
>>> list_of_objects([HAL, 'dblp'], subclasses)
[<class 'gismap.sources.hal.HAL'>, <class 'gismap.sources.dblp.DBLP'>]
>>> list_of_objects(None, subclasses, [DBLP])
[<class 'gismap.sources.dblp.DBLP'>]
>>> list_of_objects(DBLP, subclasses)
[<class 'gismap.sources.dblp.DBLP'>]
>>> list_of_objects('hal', subclasses)
[<class 'gismap.sources.hal.HAL'>]
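
The behaviour shown in these examples can be approximated by the sketch below. It is a hypothetical re-implementation, not the library code: `resolve_objects` is an invented name, and returning an empty list when both `clss` and `default` are `None` is an assumption.

```python
def resolve_objects(clss, dico, default=None):
    """Hypothetical stand-in: normalize `clss` into a list of objects."""
    if clss is None:
        # Assumption: with no default, fall back to an empty list.
        return [] if default is None else default
    if isinstance(clss, str):
        # A string is treated as a reference into the dictionary.
        return [dico[clss]]
    if isinstance(clss, (list, tuple)):
        resolved = []
        for item in clss:
            resolved.extend(resolve_objects(item, dico))
        return resolved
    # Anything else is assumed to be the object itself.
    return [clss]


registry = {"int": int, "str": str}
print(resolve_objects([int, "str"], registry))  # [<class 'int'>, <class 'str'>]
print(resolve_objects(None, registry, [str]))   # [<class 'str'>]
print(resolve_objects("int", registry))         # [<class 'int'>]
```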

gismap.utils.common.unlist(x)[source]#

Parameters:: x (str or list or int) – Something.
Returns:: x – If it’s a list, make it flat.
Return type:: str or int

Requests#

Functions related to HTTP requests.

gismap.utils.requests.get(url, params=None, n_trials=10, verify=True)[source]#

Parameters:

url (str) – Entry point to fetch.
params (dict, optional) – Get arguments (appended to URL).
n_trials (int, default=10) – Number of attempts to fetch URL.
verify (bool, default=True) – Verify certificates.

Returns:

Result of the request.

Return type:

str
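
The `n_trials` parameter suggests a retry loop. The sketch below shows that general pattern on a simulated flaky endpoint so that it runs without network access. It is not the gismap code: `fetch_with_retries`, `fetch`, and `delay` are hypothetical names, and raising an error once all attempts fail is an assumption, since the documentation does not say what happens in that case.

```python
import time


def fetch_with_retries(fetch, n_trials=10, delay=0.0):
    """Call `fetch()` up to `n_trials` times; return its first successful result."""
    last_error = None
    for _ in range(n_trials):
        try:
            return fetch()
        except Exception as error:  # a real client would catch narrower exceptions
            last_error = error
            time.sleep(delay)
    # Assumption: give up with an error once every attempt has failed.
    raise RuntimeError(f"all {n_trials} attempts failed") from last_error


# Simulated endpoint: fails twice, then answers.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "payload"


result = fetch_with_retries(flaky, n_trials=5)
print(result, calls["count"])  # payload 3
```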

Logger#

Keep track of things.

gismap.utils.logger.logger = <Logger GisMap (WARNING)>#: Default logging interface.
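
Since this is a standard `logging.Logger`, its verbosity can be changed with the standard library. The sketch below assumes the logger is registered under the name shown in its repr, `GisMap`.

```python
import logging

# Assumption: the logger is registered under the name shown in its repr.
gismap_logger = logging.getLogger("GisMap")

# Lower the threshold from WARNING to INFO to see more messages.
gismap_logger.setLevel(logging.INFO)
print(gismap_logger)  # <Logger GisMap (INFO)>
```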

Zlist#

Convert a list into a succession of compressed frames. Reduces memory footprint at the price of slower random access (sequential access is unaffected).

class gismap.utils.zlist.ZList(frame_size=1000)[source]#

List compressed by frames of elements. Stores compressed data in memory while keeping decent seek and scan performance.

Parameters:: frame_size (int) – Size of each frame in number of elements.
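
To make the trade-off concrete, here is a minimal sketch of a frame-compressed list. It is a simplified illustration, not the `ZList` implementation: the class `FramedList`, its methods, and the choice of `pickle` plus `zlib` are all assumptions.

```python
import pickle
import zlib


class FramedList:
    """Hypothetical append-only list stored as zlib-compressed frames."""

    def __init__(self, frame_size=1000):
        self.frame_size = frame_size
        self.frames = []  # compressed, full frames
        self.buffer = []  # current, uncompressed partial frame

    def append(self, item):
        self.buffer.append(item)
        if len(self.buffer) == self.frame_size:
            self.frames.append(zlib.compress(pickle.dumps(self.buffer)))
            self.buffer = []

    def __len__(self):
        return len(self.frames) * self.frame_size + len(self.buffer)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        frame_index, offset = divmod(index, self.frame_size)
        if frame_index == len(self.frames):
            return self.buffer[offset]
        # Random access pays for decompressing one whole frame.
        return pickle.loads(zlib.decompress(self.frames[frame_index]))[offset]

    def __iter__(self):
        # A sequential scan decompresses each frame exactly once.
        for blob in self.frames:
            yield from pickle.loads(zlib.decompress(blob))
        yield from self.buffer


numbers = FramedList(frame_size=100)
for i in range(250):
    numbers.append(i * i)
print(len(numbers), numbers[123], sum(1 for _ in numbers))  # 250 15129 250
```

In this sketch a smaller `frame_size` makes random access cheaper, because less data is decompressed per lookup, but usually compresses less well.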

Text#

Text manipulation tools.

class gismap.utils.text.Corrector(voc, score_cutoff=20, min_length=3)[source]#

A simple word corrector based on an input vocabulary. Short words are discarded.

Parameters:

voc (list) – Words (each entry may contain multiple words).
score_cutoff (int, default=20) – Threshold for correction.
min_length (int, default=3) – Minimal number of characters for correction to kick in.

Examples

>>> vocabulary = ['My Taylor Swift is Rich']
>>> phrase = "How riche ise Tailor Swyft"
>>> cor = Corrector(vocabulary, min_length=4)
>>> cor(phrase)
'How rich ise taylor swift'
>>> cor = Corrector(vocabulary, min_length=2)
>>> cor(phrase)
'How rich is taylor swift'

gismap.utils.text.asciify(text)[source]#

Parameters:: text (str) – Some text (typically names) with annoying accents.
Returns:: Same text simplified into ASCII.
Return type:: str

Examples

>>> asciify('Ana Bušić')
'Ana Busic'
>>> asciify("Thomas Deiß")
'Thomas Deiss'
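
A standard-library approximation of this behaviour is sketched below. It is not necessarily how `asciify` works internally: `to_ascii` and the `SPECIAL` table are hypothetical, and the table is needed because Unicode decomposition alone leaves characters such as `ß` untouched.

```python
import unicodedata

# Assumption: characters that NFKD cannot decompose need an explicit mapping.
SPECIAL = str.maketrans({"ß": "ss", "ø": "o", "Ø": "O", "æ": "ae", "Æ": "AE"})


def to_ascii(text):
    """Hypothetical stand-in: strip accents and map a few special letters."""
    text = text.translate(SPECIAL)
    # NFKD splits accented letters into a base letter plus combining marks;
    # encoding to ASCII with errors ignored then drops the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Ana Bušić"))    # Ana Busic
print(to_ascii("Thomas Deiß"))  # Thomas Deiss
```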

gismap.utils.text.clean_aliases(name, alias_list)[source]#

Parameters:

name (str) – Main name.
alias_list (list or set) – Aliases.

Returns:

Aliases deduped, sorted, and with main name removed.

Return type:

list
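
The documented result (deduped, sorted, main name removed) can be sketched with a set difference. The function name `dedupe_aliases` is hypothetical and the sample aliases are made up for illustration.

```python
def dedupe_aliases(name, alias_list):
    """Hypothetical stand-in: unique aliases, sorted, without the main name."""
    return sorted(set(alias_list) - {name})


aliases = ["A. Busic", "Ana Busic", "Busic Ana", "A. Busic"]
print(dedupe_aliases("Ana Busic", aliases))  # ['A. Busic', 'Busic Ana']
```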

gismap.utils.text.normalized_name(txt)[source]#

Try to normalize names to facilitate comparisons. The name is lowercased, split, asciified, sorted, and filtered.

Parameters:: txt (str)
Return type:: str

Examples

>>> normalized_name("Thomas Deiß")
'deiss thomas'
>>> normalized_name("Dario Rossi 001")
'dario rossi'
>>> normalized_name("James W. Roberts")
'james roberts'
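
The pipeline described above (lowercase, split, asciify, sort, filter) can be sketched as follows. This is a guess that reproduces the three documented examples, not the library code: `normalize_name` is a hypothetical name, and the filter rule (keep only purely alphabetic tokens of at least two letters) is an assumption that would, for instance, also drop hyphenated names.

```python
import unicodedata


def normalize_name(txt):
    """Hypothetical stand-in: lower, asciify, split, filter, sort."""
    txt = txt.lower().replace("ß", "ss")
    txt = unicodedata.normalize("NFKD", txt).encode("ascii", "ignore").decode("ascii")
    # Assumption: initials ("w.") and numeric suffixes ("001") are filtered out
    # by keeping only purely alphabetic tokens of at least two letters.
    tokens = [token for token in txt.split() if token.isalpha() and len(token) > 1]
    return " ".join(sorted(tokens))


print(normalize_name("Thomas Deiß"))       # deiss thomas
print(normalize_name("Dario Rossi 001"))   # dario rossi
print(normalize_name("James W. Roberts"))  # james roberts
```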

gismap.utils.text.reduce_keywords(kws)[source]#

Remove redundant subparts.

Parameters:: kws (list) – List of words / collocations.
Returns:: Reduced list.
Return type:: list

Examples

>>> reduce_keywords(['P2P', 'Millimeter Waves', 'Networks', 'P2P Networks', 'Waves'])
['Millimeter Waves', 'P2P Networks']
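
One simple way to obtain this result is to drop every keyword that is contained in another one. The sketch below uses plain substring matching, which is an assumption (the library may compare at word level), and `drop_contained_keywords` is a hypothetical name.

```python
def drop_contained_keywords(kws):
    """Hypothetical stand-in: remove keywords contained in another keyword."""
    unique = list(dict.fromkeys(kws))  # dedupe while keeping first-seen order
    return [
        kw
        for kw in unique
        # Assumption: "redundant" means being a substring of another keyword.
        if not any(kw != other and kw in other for other in unique)
    ]


keywords = ["P2P", "Millimeter Waves", "Networks", "P2P Networks", "Waves"]
print(drop_contained_keywords(keywords))  # ['Millimeter Waves', 'P2P Networks']
```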
