Laboratory#

A group of people and their publications is managed through the LabMap abstract class.

LabMaps#

class gismap.lab.labmap.LabMap(name=None, dbs=None)[source]#

Abstract class for labs.

Actual Lab classes can be created by implementing the _author_iterator method.

Labs can be saved with the dump method and loaded with the load method.

Parameters:
  • name (str) – Name of the lab. Can be set as class or instance attribute.

  • dbs (list, default=[HAL, LDB]) – List of DB sources to use.
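
The subclassing pattern described above (a concrete lab only has to say who its members are) can be sketched without gismap installed. Everything below is a hypothetical stand-in: ToyLabMap is not gismap's LabMap, and the way update_authors indexes members here is an assumption made for illustration only.

```python
from abc import ABC, abstractmethod


class ToyLabMap(ABC):
    """Stand-in that mimics the pattern only; this is NOT gismap's LabMap."""

    name = None  # may be set at class level...

    def __init__(self, name=None):
        if name is not None:
            self.name = name  # ...or overridden per instance
        self.authors = {}

    @abstractmethod
    def _author_iterator(self):
        """Yield the lab members, one at a time."""

    def update_authors(self):
        # Assumption: members are simply indexed by name.
        self.authors = {author: author for author in self._author_iterator()}


class TwoPersonLab(ToyLabMap):
    name = "Toy lab"

    def _author_iterator(self):
        yield from ["Fabien Mathieu", "The-Dang Huynh"]


lab = TwoPersonLab()
lab.update_authors()
print(lab.name, sorted(lab.authors))
```

Because _author_iterator is abstract, the base class itself cannot be instantiated; only subclasses that implement it can.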

author_selectors#

Author filters. Default: minimal filtering.

Type:

list

publication_selectors#

Publication filters. Default: fewer than 10 authors, not an editorial, and at least two words in the title.

Type:

list

html(**kwargs)[source]#

Generate HTML representation of the collaboration graph.

Parameters:

**kwargs – Passed to make_vis().

Returns:

HTML content as a string.

Return type:

str

save_html(name=None, **kwargs)[source]#

Save the collaboration graph as an HTML file.

Parameters:
  • name (str, optional) – Output filename. Defaults to lab name.

  • **kwargs – Passed to html().

Return type:

None

show_html(**kwargs)[source]#

Display the collaboration graph in a Jupyter notebook.

Parameters:

**kwargs – Passed to html().

Return type:

None

update_authors(desc='Author information')[source]#

Populate the authors attribute (dict[str, LabAuthor]).

Return type:

None

update_publis(desc='Publications information')[source]#

Populate the publications attribute (dict[str, SourcedPublication]).

Return type:

None

class gismap.lab.labmap.ListMap(author_list, *args, **kwargs)[source]#

Simplest way to create a lab: with a list of names.

Parameters:
  • author_list (list of str) – List of author names.

  • args (list) – Arguments to pass to the LabMap constructor.

  • kwargs (dict) – Keyword arguments to pass to the LabMap constructor.

EgoMaps#

class gismap.lab.egomap.EgoMap(star, *args, **kwargs)[source]#

Egocentric view of a researcher’s collaboration network.

Displays the star (central researcher), their planets (direct co-authors), and optionally moons (co-authors of co-authors).

Parameters:
  • star (str or LabAuthor) – The central researcher. Can be a name string or LabAuthor object.

  • *args – Passed to LabMap.

  • **kwargs – Passed to LabMap.

Examples

>>> dang = EgoMap("The-Dang Huynh")
>>> dang.build(target=20)
>>> sorted(a.name for a in dang.authors.values() if len(a.name.split())<3)  
['Bruno Kauffmann', 'Diego Perino', 'Dohy Hong', 'Fabien Mathieu', 'François Baccelli',...]

build(**kwargs)[source]#

Build the ego network by fetching publications and adding planets/moons.

Parameters:
  • target (int, default=50) – Target number of authors in the final map.

  • **kwargs – Passed to expand().

Return type:

None
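
The star/planet/moon vocabulary used by EgoMap can be illustrated on a toy co-authorship graph. This is a minimal sketch on a hand-written dictionary; the function name ego_layers and the graph itself are hypothetical, and the code does not reproduce how gismap fetches publications or reaches the target size.

```python
# Toy co-authorship graph: each author maps to the set of their co-authors.
coauthors = {
    "Star": {"A", "B"},
    "A": {"Star", "B", "C"},
    "B": {"Star", "A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}


def ego_layers(graph, star):
    """Return (planets, moons) around `star` in a co-authorship graph."""
    # Planets: direct co-authors of the star.
    planets = set(graph[star]) - {star}
    # Moons: co-authors of planets that are neither the star nor planets.
    moons = set()
    for planet in planets:
        moons |= graph.get(planet, set())
    moons -= planets | {star}
    return planets, moons


planets, moons = ego_layers(coauthors, "Star")
print(sorted(planets))  # ['A', 'B']
print(sorted(moons))    # ['C', 'D']
```

Author E is two hops away from every planet's co-author list and therefore belongs to neither layer.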

Utilities#

Lab author#

class gismap.lab.lab_author.AuthorMetadata(url: str = None, img: str = None, group: str = None, position: tuple = None)[source]#

Optional information about an author, used to enhance their presentation.

url#

Homepage of the author.

Type:

str

img#

URL of a picture.

Type:

str

group#

Group of the author.

Type:

str

position#

Coordinates of the author.

Type:

tuple

class gismap.lab.lab_author.LabAuthor(name: str, sources: list = <factory>, metadata: ~gismap.lab.lab_author.AuthorMetadata = <factory>)[source]#

Examples

The metadata and DB key(s) of an author can be given in parentheses after the name, as comma-separated key:value pairs.

Improper key/values are ignored (with a warning).
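
The parenthesised syntax described above can be illustrated with a small stand-alone parser. This is a sketch under assumptions, not gismap's parsing code: the function parse_author is hypothetical, the accepted keys are limited to the AuthorMetadata fields and the two DB prefixes that appear in this section, and ignored items are returned instead of triggering a warning.

```python
import re

METADATA_KEYS = {"url", "img", "group"}  # AuthorMetadata fields used here
DB_KEYS = {"hal", "ldb"}                 # DB prefixes seen in this section


def parse_author(text):
    """Split 'Name (key: value, ...)' into name, metadata, sources, ignored."""
    match = re.fullmatch(r"\s*([^()]*?)\s*(?:\((.*)\))?\s*", text)
    if match is None:
        raise ValueError(f"Cannot parse author description: {text!r}")
    name, inner = match.group(1), match.group(2) or ""
    metadata, sources, ignored = {}, [], []
    for item in inner.split(","):
        if not item.strip():
            continue
        # Split on the first colon only, so URLs keep their own colons.
        key, sep, value = item.partition(":")
        key, value = key.strip().lower(), value.strip()
        if sep and key in METADATA_KEYS:
            metadata[key] = value
        elif sep and key in DB_KEYS:
            sources.append((key, value))
        else:
            ignored.append(item.strip())
    return name, metadata, sources, ignored


name, metadata, sources, ignored = parse_author(
    "My Name(img: https://my.url.img, group:me,url:https://mysite.org,"
    "hal:key1,ldb:toto,badkey:hello,no_colon_separator)"
)
print(name, metadata, sources, ignored)
```

Splitting each item on its first colon is what lets a value such as https://mysite.org survive intact.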

>>> dummy= LabAuthor("My Name(img: https://my.url.img, group:me,url:https://mysite.org,hal:key1,ldb:toto,badkey:hello,no_colon_separator)")
>>> dummy.metadata
AuthorMetadata(url='https://mysite.org', img='https://my.url.img', group='me')
>>> dummy.sources
[HALAuthor(name='My Name', key='key1'), LDBAuthor(name='My Name', key='toto')]

You can enter multiple keys for the same DB. HAL key types are automatically detected.

>>> dummy2= LabAuthor("My Name (hal:key1,hal:123456,hal: My Other Name )")
>>> dummy2.sources
[HALAuthor(name='My Name', key='key1'), HALAuthor(name='My Name', key='123456', key_type='pid'), HALAuthor(name='My Name', key='My Other Name', key_type='fullname')]

auto_sources(dbs=None)[source]#

Automatically populate the sources based on author’s name.

Parameters:

dbs (list, default=[HAL, DBLP]) – List of DB sources to use.

Return type:

None

gismap.lab.lab_author.labify_author(author, rosetta)[source]#

Convert a database author to a LabAuthor if possible.

Parameters:
  • author (Author) – Author to convert.

  • rosetta (dict) – Mapping from keys/names to LabAuthor objects.

Returns:

LabAuthor if found in rosetta, otherwise the original author.

Return type:

LabAuthor or original author

gismap.lab.lab_author.labify_publications(pubs, rosetta)[source]#

Convert publication authors to LabAuthors in place.

Parameters:
  • pubs (list) – Publications to update.

  • rosetta (dict) – Mapping from keys/names to LabAuthor objects.

Return type:

None
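
The rosetta lookup performed by the two functions above can be sketched with plain dataclasses. All class and function names below are hypothetical stand-ins, and the lookup order (key first, then name) is an assumption; only the overall behaviour (replace when found, keep the original otherwise, update publications in place) comes from the descriptions above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAuthor:  # hypothetical database author
    name: str
    key: str = None


@dataclass
class ToyLabAuthor:  # hypothetical stand-in for LabAuthor
    name: str


@dataclass
class ToyPublication:  # hypothetical publication with an author list
    title: str
    authors: list = field(default_factory=list)


def labify_author_sketch(author, rosetta):
    """Return the lab author registered for `author`, else `author` itself."""
    for handle in (author.key, author.name):  # assumed order: key, then name
        if handle in rosetta:
            return rosetta[handle]
    return author


def labify_publications_sketch(pubs, rosetta):
    """Replace authors by their lab counterpart, modifying `pubs` in place."""
    for pub in pubs:
        pub.authors = [labify_author_sketch(a, rosetta) for a in pub.authors]


member = ToyLabAuthor("My Name")
outsider = ToyAuthor("Someone Else", key="k2")
pubs = [ToyPublication("A title", [ToyAuthor("My Name", key="key1"), outsider])]
labify_publications_sketch(pubs, {"key1": member})
print(pubs[0].authors)
```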

Expansion#

class gismap.lab.expansion.Member(name: str, key: str)[source]#

Basic information about a lab member for name matching.

Parameters:
  • name (str) – Normalized name.

  • key (str) – Author key.

class gismap.lab.expansion.Prospect(author, strengths)[source]#

Candidate for integration into the lab.

Parameters:
  • author (Author) – Reference author. Must have a key.

  • strengths (dict) – Dictionary of ProspectStrength.

class gismap.lab.expansion.ProspectStrength(coauthors: int, publications: int)[source]#

Measures the interaction between an external author and a lab by counting co-authors and publications.

Addition follows a (max,+) rule to deal with multiple keys: co-author counts are combined with max, publication counts are summed.
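
A minimal sketch of such a (max,+) addition is given here, assuming a lexicographic ordering on (coauthors, publications). The class name StrengthSketch is hypothetical and this is not gismap's implementation; it is only written to agree with the documented example.

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class StrengthSketch:
    """Ordered by co-authors first, then by publications."""

    coauthors: int
    publications: int

    def __add__(self, other):
        # (max, +): keep the larger co-author count, sum the publications.
        return StrengthSketch(
            max(self.coauthors, other.coauthors),
            self.publications + other.publications,
        )


a1 = StrengthSketch(3, 5)
a2 = StrengthSketch(2, 10)
print(a1 > a2)
print(a1 + a2)
```

Taking the max of the co-author counts avoids counting the same lab members twice when one person appears under several keys, while publication counts from distinct keys are added.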

Examples

>>> a1 = ProspectStrength(3, 5)
>>> a2 = ProspectStrength(2, 10)
>>> a1 > a2
True
>>> a1 + a2
ProspectStrength(coauthors=3, publications=15)

gismap.lab.expansion.count_prospect_entries(lab)[source]#

Associate each external co-author (prospect) with their lab strength.

Parameters:

lab (LabMap) – Reference lab.

Returns:

Lab strengths.

Return type:

dict of str to ProspectStrength
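
One plausible way to compute such strengths is sketched below on plain lists of author names. It assumes that the co-author count is the number of distinct lab members an external author has published with, and that publications without any lab member are skipped; the function count_strengths and the (coauthors, publications) tuples are hypothetical simplifications.

```python
from collections import defaultdict


def count_strengths(publications, members):
    """Map each external author to (distinct lab co-authors, publications)."""
    lab_coauthors = defaultdict(set)
    n_publications = defaultdict(int)
    for authors in publications:
        insiders = set(authors) & members
        if not insiders:
            continue  # no lab member involved: not a lab publication
        for outsider in set(authors) - members:
            lab_coauthors[outsider] |= insiders
            n_publications[outsider] += 1
    return {
        name: (len(lab_coauthors[name]), n_publications[name])
        for name in n_publications
    }


members = {"A", "B"}
publications = [["A", "X"], ["A", "B", "X"], ["B", "Y"], ["Z", "Y"]]
strengths = count_strengths(publications, members)
print(strengths)
```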

gismap.lab.expansion.get_member_names(lab)[source]#

Parameters:

lab (LabMap) – Reference lab.

Returns:

List of (simplified name, key) tuples.

Return type:

list

gismap.lab.expansion.get_prospects(lab)[source]#

Parameters:

lab (LabMap) – Reference lab.

Returns:

List of prospects.

Return type:

list of Prospect

gismap.lab.expansion.proper_prospects(lab, length_impact=0.05, threshold=80, n_range=4, max_new=None, trim=True)[source]#

Find and rank external collaborators for potential lab expansion.

Identifies authors from publications who are not already lab members, groups them by name similarity, and ranks by collaboration strength.

Parameters:
  • lab (LabMap) – Reference lab.

  • length_impact (float, default=0.05) – Length impact for name similarity matching.

  • threshold (int, default=80) – Similarity threshold for grouping authors.

  • n_range (int, default=4) – N-gram range for name comparison.

  • max_new (int, optional) – Maximum number of new authors to return.

  • trim (bool, default=True) – If True, keep only one source per database for each author.

Returns:

(existing, new_rosetta) where existing maps external keys to lab member keys, and new_rosetta maps source keys to new LabAuthor objects.

Return type:

tuple
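
The idea of grouping authors by name similarity with character n-grams and a threshold can be sketched as follows. This is an illustration under assumptions, not the scoring used by proper_prospects: the Jaccard-based score, the 0 to 100 scale, and the function names are hypothetical, and the length_impact parameter is not modelled.

```python
def char_ngrams(text, n_range=4):
    """Set of all character n-grams of `text` for n = 1..n_range."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(1, n_range + 1)
        for i in range(len(text) - n + 1)
    }


def similarity(a, b, n_range=4):
    """Jaccard similarity of the n-gram sets, scaled to 0..100."""
    grams_a, grams_b = char_ngrams(a, n_range), char_ngrams(b, n_range)
    if not grams_a or not grams_b:
        return 0.0
    return 100 * len(grams_a & grams_b) / len(grams_a | grams_b)


def same_person(a, b, threshold=80, n_range=4):
    """Group two names together when their score reaches the threshold."""
    return similarity(a, b, n_range) >= threshold


print(similarity("Fabien Mathieu", "Fabien Mathieu"))
print(similarity("Fabien Mathieu", "Fabien Matthieu"))
print(similarity("Fabien Mathieu", "Diego Perino"))
```

A higher threshold makes the grouping stricter, so fewer spelling variants are merged into the same author.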

gismap.lab.expansion.trim_sources(author)[source]#

In-place reduction of sources, keeping a single source per database.

Parameters:

author (SourcedAuthor) – An author.

Return type:

None
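
An in-place reduction of this kind can be sketched on a plain list of (db, key) pairs. Which source is retained for each database is not stated above; keeping the first one encountered is an assumption, and the function name and tuple representation are hypothetical.

```python
def trim_sources_sketch(sources):
    """Keep only the first source seen for each database, in place."""
    seen = set()
    kept = []
    for db, key in sources:
        if db not in seen:
            seen.add(db)
            kept.append((db, key))
    sources[:] = kept  # slice assignment mutates the caller's list


sources = [("hal", "key1"), ("hal", "123456"), ("ldb", "toto")]
trim_sources_sketch(sources)
print(sources)
```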

Filters#

gismap.lab.filters.author_taboo_filter(w=None)[source]#

Parameters:

w (list, optional) – List of words to filter.

Returns:

Filter function on authors.

Return type:

callable

gismap.lab.filters.publication_oneword_filter(n_min=2)[source]#

Parameters:

n_min (int, default=2) – Minimum number of words required in the title.

Returns:

Filter on number of words required in the title.

Return type:

callable

gismap.lab.filters.publication_size_filter(n_max=9)[source]#

Parameters:

n_max (int, default=9) – Maximum number of co-authors allowed.

Returns:

Filter on number of co-authors.

Return type:

callable
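
Filter factories of this kind, and the way a list of selectors can be combined, are sketched below. The sketch assumes that a filter returns True for publications to keep and that publications expose title and authors attributes; ToyPublication, size_filter, oneword_filter and keep are hypothetical names, not gismap code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPublication:  # hypothetical publication record
    title: str
    authors: list = field(default_factory=list)


def size_filter(n_max=9):
    """Keep publications with at most `n_max` authors."""
    return lambda pub: len(pub.authors) <= n_max


def oneword_filter(n_min=2):
    """Keep publications whose title has at least `n_min` words."""
    return lambda pub: len(pub.title.split()) >= n_min


selectors = [size_filter(), oneword_filter()]


def keep(pub):
    """A publication is kept only if every selector accepts it."""
    return all(selector(pub) for selector in selectors)


print(keep(ToyPublication("Graph theory", ["A", "B"])))
print(keep(ToyPublication("Editorial", ["A"])))
print(keep(ToyPublication("A very large collaboration", list("ABCDEFGHIJ"))))
```

Returning closures lets each threshold be fixed once, when the selector list is built, rather than at every call.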

gismap.lab.filters.publication_taboo_filter(w=None)[source]#

Parameters:

w (list, optional) – List of words to filter.

Returns:

Filter function on publications.

Return type:

callable

gismap.lab.filters.re_filter(words)[source]#

Parameters:

words (list or str) – List of word(s) to filter.

Returns:

Filter function.

Return type:

callable
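
A word-based filter built on a regular expression can be sketched as follows. The matching rules are assumptions (whole words, case-insensitive, True meaning "keep"), the function operates on plain strings, and the name word_filter is hypothetical; the actual behaviour of re_filter may differ.

```python
import re


def word_filter(words):
    """Return a function that rejects any text containing one of `words`."""
    if isinstance(words, str):
        words = [words]  # accept a single word as well as a list
    if not words:
        return lambda text: True  # nothing is taboo
    pattern = re.compile(
        "|".join(rf"\b{re.escape(word)}\b" for word in words),
        re.IGNORECASE,
    )
    return lambda text: pattern.search(text) is None


keep_title = word_filter(["editorial", "preface"])
print(keep_title("Editorial: special issue on graphs"))
print(keep_title("Graph algorithms for collaboration networks"))
```

Escaping each word with re.escape keeps characters such as dots or parentheses from being interpreted as regular expression syntax.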