Laboratory#

Lab#

The Lab class glues researchers and publications together.

class gismap.lab.lab.Lab(members, db_dict=None)[source]#
Parameters:
  • members (list of str) – Names of the lab members.

  • db_dict (dict, optional) – Publication DBs to use. Defaults to all available DBs.

Examples

As an example, consider a two-person lab.

>>> mini_lab = Lab(['Fabien Mathieu', 'François Baccelli'])

Get DB ids. Entries that the automatic lookup misses are reported as warnings; they can be set manually with manual_update:

>>> from gismap import HALAuthor, DBLPAuthor
>>> mini_lab.manual_update([
...     HALAuthor(name='François Baccelli', id='francois-baccelli',
...               aliases=['Francois Baccelli']),
...     DBLPAuthor(name='François Baccelli', id='b/FrancoisBaccelli'),
...     DBLPAuthor(name='Fabien Mathieu', id='66/2077')])
>>> mini_lab.get_ids()
>>> mini_lab.member_list
[Member(name='Fabien Mathieu'),
Member(name='François Baccelli')]

Now we fetch publications:

>>> mini_lab.get_publications()

How many publications per member?

>>> production = [len(a.publications) for a in mini_lab.member_list]
>>> [p >= 100 for p in production]
[True, True]
>>> [p >= 250 for p in production]
[False, True]

Consider one publication.

>>> key = mini_lab.members['Fabien Mathieu'].publications[0]
>>> publi = mini_lab.publications[key]

Use the string property to have a simple bibliography entry:

>>> publi.string 
'Making most voting systems meet the Condorcet criterion reduces their manipulability,
by François Durand, Fabien Mathieu, Ludovic Noirie. unpublished, 2014.'

Use publi_to_text for something more content-oriented:

>>> mini_lab.publi_to_text(key) 
'Making most voting systems meet the Condorcet criterion reduces their manipulability\nSince any non-trivial voting
system is susceptible to manipulation, we investigate how it is possible to reduce the set of situations where it is
manipulable, that is, such that a coalition of voters, by casting an insincere ballot, may secure an outcome that is
better from their point of view. We prove that, for a large class of voting systems, a simple modification allows to
reduce manipulability. This modification is Condorcification: when there is a Condorcet winner, designate her;
otherwise, use the original rule. Our very general framework allows to do this for any voting system, whatever the
form of the original ballots. Hence, when searching for a voting system whose manipulability is minimal, one can
restrict to those that meet the Condorcet criterion.'

Use member_to_text to get the content of a member:

>>> mini_lab.member_to_text("Fabien Mathieu")[:100]
'Making most voting systems meet the Condorcet criterion reduces their manipulability\nSince any non-t'

List the publications that appear in more than one member's publication list:

>>> from collections import Counter
>>> copublis = [k for k, v in Counter(p for m in mini_lab.members.values() for p in m.publications).items() if v > 1]
>>> print("\n".join(mini_lab.publications[p].string for p in copublis)) 
On Spatial Point Processes with Uniform Births and Deaths by Random Connection,
by François Baccelli, Fabien Mathieu, Ilkka Norros. unpublished, 2014.
Mutual Service Processes in Euclidean Spaces: Existence and Ergodicity,
by François Baccelli, Fabien Mathieu, Ilkka Norros. Queueing Systems, 2017.
Spatial Interactions of Peers and Performance of File Sharing Systems,
by François Baccelli, Fabien Mathieu, Ilkka Norros. unpublished, 2012.
Can P2P Networks be Super-Scalable?,
by François Baccelli, Fabien Mathieu, Ilkka Norros, Rémi Varloot. IEEE Infocom 2013 - 32nd IEEE International
Conference on Computer Communications, 2013.
Supra-extensibilité des réseaux P2P,
by François Baccelli, Fabien Mathieu, Ilkka Norros, Rémi Varloot. 15èmes Rencontres Francophones sur les Aspects
Algorithmiques des Télécommunications (AlgoTel), 2013.
Performance of P2P Networks with Spatial Interactions of Peers,
by François Baccelli, Fabien Mathieu, Ilkka Norros. CoRR, 2011.

compute_keys()[source]#

Builds a key dictionary that maps any name, alias, or DB identifier of a member to that member's key.

Return type:

None
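
The idea behind such a key dictionary can be illustrated with a minimal, self-contained sketch. It does not use gismap: SketchMember and build_key_dict are hypothetical names introduced only for illustration, and the real attribute layout of Member may differ.

```python
from dataclasses import dataclass, field


@dataclass
class SketchMember:
    """Hypothetical stand-in for a lab member (not the gismap Member class)."""
    key: str
    name: str
    aliases: list = field(default_factory=list)
    db_ids: list = field(default_factory=list)


def build_key_dict(members):
    """Map every name, alias, and DB identifier of each member to that member's key."""
    key_dict = {}
    for member in members:
        for label in [member.name, *member.aliases, *member.db_ids]:
            key_dict[label] = member.key
    return key_dict


# Identifiers taken from the Lab example; the data layout is an assumption.
members = [
    SketchMember(key="Fabien Mathieu", name="Fabien Mathieu", db_ids=["66/2077"]),
    SketchMember(
        key="François Baccelli",
        name="François Baccelli",
        aliases=["Francois Baccelli"],
        db_ids=["francois-baccelli", "b/FrancoisBaccelli"],
    ),
]
key_dict = build_key_dict(members)
print(key_dict["Francois Baccelli"])
```

With such a dictionary, a lookup by alias or by DB identifier resolves to the same member key as a lookup by name.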

constructor#

Class attribute: constructor for members of the lab.

alias of Member

get_ids(rewrite=False)[source]#

Get DB identifiers.

Parameters:

rewrite (bool, default=False) – Update even if identifiers are already set.

Return type:

None

get_publications(threshold=0.9, length_impact=0.2)[source]#
  • Retrieve all publications from members in their databases

  • Remove full duplicates

  • Gather pseudo-duplicates and populate publications

  • Populate each member’s publication list with her publications’ keys.

Return type:

None
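
The deduplication steps listed above can be sketched roughly as follows. This is not the gismap implementation: group_publications is a hypothetical helper, title similarity is computed with the standard library's difflib (an assumption), the entries are invented toy data, and the role of length_impact is not documented here, so it is left out.

```python
from difflib import SequenceMatcher


def group_publications(raw_entries, threshold=0.9):
    """Drop exact duplicates, then group entries whose titles are nearly identical."""
    # Step 1: remove full duplicates (same title, venue, and year).
    seen = set()
    unique = []
    for entry in raw_entries:
        signature = (entry["title"], entry["venue"], entry["year"])
        if signature not in seen:
            seen.add(signature)
            unique.append(entry)

    # Step 2: gather pseudo-duplicates, comparing each entry to the first
    # entry of every existing group (case-insensitive title similarity).
    groups = []
    for entry in unique:
        for group in groups:
            ratio = SequenceMatcher(
                None, entry["title"].lower(), group[0]["title"].lower()
            ).ratio()
            if ratio >= threshold:
                group.append(entry)
                break
        else:
            groups.append([entry])
    return groups


# Invented toy entries, for illustration only.
raw = [
    {"title": "Toy Paper on Graphs", "venue": "Toy Conference", "year": 2013},
    {"title": "Toy Paper on Graphs", "venue": "Toy Conference", "year": 2013},
    {"title": "Toy paper on graphs", "venue": "CoRR", "year": 2012},
    {"title": "Completely Different Work", "venue": "Toy Journal", "year": 2017},
]
groups = group_publications(raw)
print([len(g) for g in groups])
```

Each resulting group would then correspond to one publication, with several raw sources behind it.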

manual_update(up_list)[source]#

Inject pre-populated DBAuthor entries into the lab.

Parameters:

up_list (list of DBAuthor) – Info to inject.

Return type:

None

property member_list#

list – List of lab members.

member_to_text(key)[source]#

Simple texter that concatenates all publications of a member (titles and possibly abstracts).

Parameters:

key (str) – Identifier of a member.

Returns:

Member description.

Return type:

str

publi_to_text(key)[source]#

Simple texter that gives title and abstract (if any) from a publication key.

Parameters:

key (str) – Identifier of a publication.

Returns:

Publication description.

Return type:

str
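
Judging from the outputs in the Lab example, both texters produce plain strings in which the title and the abstract are separated by a newline. The sketch below mimics that behavior on plain dictionaries. The function names, the dictionary layout, and the newline used to join publications are assumptions, not the gismap implementation.

```python
def publi_to_text_sketch(publication):
    """Return the title, followed by the abstract on a new line if there is one."""
    title = publication["title"]
    abstract = publication.get("abstract")
    return f"{title}\n{abstract}" if abstract else title


def member_to_text_sketch(publication_keys, publications):
    """Concatenate the text of all publications of a member."""
    return "\n".join(publi_to_text_sketch(publications[k]) for k in publication_keys)


# Invented toy data, for illustration only.
publications = {
    "p1": {"title": "First toy title", "abstract": "A short abstract."},
    "p2": {"title": "Second toy title"},  # no abstract available
}
text = member_to_text_sketch(["p1", "p2"], publications)
print(text)
```

The resulting string can be passed to any text-processing tool that expects one document per member.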

Member#

Handling one single researcher.

class gismap.lab.member.Member(name, pid=None, db_dict=None)[source]#
Parameters:
  • name (str) – Member name.

  • pid (str, optional) – Unique id (in case of homonyms in the lab).

  • db_dict (dict, optional) – Publication DBs to use. Defaults to all available DBs.

get_papers(s=None, backoff=False)[source]#

Fetch publications from databases.

Parameters:
  • s (Session, optional) – A session (may be None).

  • backoff (bool, default=False) – Wait between queries.

Returns:

Raw publications. Note that their integration is performed in get_publications().

Return type:

list

property key#

str – Index key of the member.

prepare(s=None, backoff=False, rewrite=False)[source]#

Fetch the member's identifiers in the databases.

Parameters:
  • s (Session, optional) – A session (may be None).

  • backoff (bool, default=False) – Wait between queries.

  • rewrite (bool, default=False) – Update even if identifiers are already set.

Return type:

None

Publication#

Handling one single publication.

class gismap.lab.publication.Publication(raw_list)[source]#
Parameters:

raw_list (list) – Raw sources of the publication. All entries are assumed to refer to the same publication.

abstract: str#

Abstract of publication (optional).

authors: list[DBAuthor]#

List of authors.

key: str#

Unique identifier of publication.

static score_raw_publi(paper)[source]#
Parameters:

paper (dict) – Raw publication entry (must have at least the keys origin, venue, type, year).

Returns:

Score to sort the publication.

Return type:

tuple

sources: dict#

Raw dictionaries of the publication (by design, a single dictionary per DB).

property string#

str – Textual description, as in a bibliography.

title: str#

Title of publication.

type: str#

Type of publication (conference, poster, journal, …).

venue: str#

Venue (name of conference/journal).

year: int#

Year of publication.

gismap.lab.publication.score_rosetta = {'origin': {'dblp': 1, 'hal': 2}, 'type': {'conference': 1, 'journal': 2}, 'venue': {'CoRR': -1, 'unpublished': -2}}#

Scoring system to decide the best representative of a publication in case of duplicate.

  • Prefer HAL entries over DBLP entries

  • arXiv (CoRR) entries are penalized, unpublished entries even more so

  • Prefer journal version over conference version

  • The year is used as a final tie-breaker (this rule is implemented in the scoring function itself, not in the dictionary)
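
To make these rules concrete, here is a minimal sketch of a scoring function built on the dictionary above. It is not the code of score_raw_publi: the order of the criteria inside the tuple, the default score of 0 for unlisted values, and the candidate entries are all assumptions made for illustration.

```python
score_rosetta = {
    "origin": {"dblp": 1, "hal": 2},
    "type": {"conference": 1, "journal": 2},
    "venue": {"CoRR": -1, "unpublished": -2},
}


def score_sketch(paper):
    """Return a tuple usable as a sort key; higher tuples are better representatives."""
    return (
        score_rosetta["venue"].get(paper["venue"], 0),    # penalize CoRR / unpublished
        score_rosetta["type"].get(paper["type"], 0),      # journal over conference
        score_rosetta["origin"].get(paper["origin"], 0),  # HAL over DBLP
        paper["year"],                                    # final tie-breaker
    )


# Invented toy candidates that all describe the same publication.
candidates = [
    {"origin": "dblp", "venue": "CoRR", "type": "journal", "year": 2015},
    {"origin": "hal", "venue": "Toy Journal", "type": "journal", "year": 2017},
    {"origin": "dblp", "venue": "Toy Journal", "type": "journal", "year": 2017},
]
best = max(candidates, key=score_sketch)
print(best["origin"], best["venue"])
```

Because Python compares tuples lexicographically, earlier criteria dominate later ones, and the year only matters when all other scores are equal.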