LabMaps

Note

For small tests, we query both HAL and DBLP (default behavior), but for large retrievals we use HAL only, as it is faster.

Lab authors

Lab authors are the main ingredient for analysing a single lab (i.e. a group of researchers). You can create one from just a name, then ask GisMap to automatically retrieve the DB endpoints for this author.

[1]:

from gismap.lab import LabAuthor

maria = LabAuthor("Maria Potop")
maria.auto_sources()

Let’s look at the sources:

[2]:

maria.sources

[2]:

[HALAuthor(name='Maria Potop', key='841868', key_type='pid'),
 LDBAuthor(name='Maria Popa', key='197/8353'),
 LDBAuthor(name='María Poó', key='133/9693'),
 LDBAuthor(name='Maria Potop-Butucaru', key='p/MariaPotopButucaru'),
 LDBAuthor(name='Maria Copot', key='353/9542'),
 LDBAuthor(name='Maria Fotopoulou', key='306/4372')]

Note that an author can have many names.

[3]:

maria.aliases

[3]:

['Maria Copot',
 'Maria Fotopoulou',
 'Maria Gradinariu Potop-Butucaru',
 'Maria Poo',
 'Maria Popa',
 'Maria Potop Butucaru',
 'Maria Potop-Butucaru',
 'María Poó']

When using auto_sources, you can specify which DBs should be used (only online DBLP and HAL are available right now).

[4]:

celine = LabAuthor("Céline Comte")
celine.auto_sources(dbs="dblp")
celine.sources

[4]:

[DBLPAuthor(name='Céline Comte', key='179/2173')]

[5]:

celine = LabAuthor("Céline Comte")
celine.auto_sources(dbs=["hal"])
celine.sources

WARNING:GisMap:Connection error. Auto-retry in 6 seconds.

[5]:

[HALAuthor(name='Céline Comte', key='celine-comte')]

Once the sources of an author are set, you can retrieve her publications.

[6]:

[p for p in celine.get_publications().values() if p.year == 2021]

[6]:

[SourcedPublication(title='Modèle de couplage stochastique non-biparti', authors=[LabAuthor(name='Céline Comte')], venue='ALGOTEL 2021 - 23èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2021),
 SourcedPublication(title='Pass-and-Swap Queues', authors=[LabAuthor(name='Céline Comte'), HALAuthor(name='Jan-Pieter Dorsman', key='1098513', key_type='pid')], venue='Queueing Systems', type='journal', year=2021),
 SourcedPublication(title='Load Balancing in Heterogeneous Server Clusters: Insights From a Product-Form Queueing Model', authors=[HALAuthor(name='Mark van Der Boor', key='Mark van Der Boor', key_type='fullname'), LabAuthor(name='Céline Comte')], venue='2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS)', type='conference', year=2021),
 SourcedPublication(title='Performance Evaluation of Stochastic Bipartite Matching Models', authors=[LabAuthor(name='Céline Comte'), HALAuthor(name='Jan-Pieter Dorsman', key='1098513', key_type='pid')], venue='Performance Engineering and Stochastic Modeling', type='chapter', year=2021)]

Note

Lab authors can have metadata that can be used for display and further analysis. This is not covered in this tutorial.

Your first lab

In GisMap, a LabMap is a class whose instances have three methods:

update_authors automatically refreshes the members of the lab. It is useful at creation or when a lab evolves.
update_publis fully refreshes the publications of a lab. All publications from lab members are considered (temporal filtering may be enabled later).
expand adds moons, i.e. additional researchers that gravitate around the lab.

The simplest usable subclass of LabMap is ListMap, which is built from a list of names. For example, consider the former team Gangsta from my Bell Labs days.

[7]:

from gismap.lab import ListMap

lab = ListMap(
    author_list=[
        "Fabien Mathieu",
        "Philippe Jacquet",
        "Alonso Silva",
        "Anne Bouillard",
        "François Durand (hal: fradurand)",
        "Amira Alloum",
        "Marc-Olivier Buob",
        "Mohamed Lamine Lamali",
    ],
    name="Gangsta",
    dbs="hal",
)
lab.update_authors()
lab.update_publis()
len(lab.publications)

[7]:

Maps can be saved with the dump method so you don’t have to re-update them all the time.

Once a lab is populated, you can export its collaboration graph with save_html. The result is standalone HTML that can be displayed in a notebook or saved for inclusion in a web page (e.g. with an iframe).
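As a rough illustration of the iframe option, the sketch below builds a small wrapper page around an exported graph. It uses only the standard library and does not call GisMap; the file name gangsta.html and the helper iframe_page are hypothetical, chosen for this example only.

```python
from html import escape


def iframe_page(graph_file: str, title: str, height: int = 600) -> str:
    """Return a minimal HTML page embedding `graph_file` in an iframe."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f'<iframe src="{escape(graph_file, quote=True)}" '
        f'width="100%" height="{height}" style="border:none"></iframe>\n'
        "</body></html>\n"
    )


# Assumption: the graph was previously exported to "gangsta.html" (hypothetical name).
page = iframe_page("gangsta.html", title="Gangsta collaboration graph")
print(page)
```

Keeping the graph in its own file and pointing an iframe at it means the host page does not need to know anything about the graph's internal scripts or styles.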

You can also display it directly inside a notebook:

[8]:

lab.show_html()


Let’s add some context with a few moons.

[9]:

lab.expand(target=4)

[10]:

lab.show_html()


A few things to note about the generated graph:

Authors are represented by their initials unless a picture URL is provided (implicitly or explicitly).
Comets are singletons (authors with no co-publications with the other nodes). They are hidden by default. For example, if you only show the moons, Bernard becomes a comet and is hidden.
Hover over an author to see her name. Clicking an author opens a modal with the list of her publications.
The width and length of an edge depend on the number of co-publications. Clicking an edge opens a modal with the list of co-publications.
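To make the notions of edge weight and comet concrete, here is a minimal sketch on toy data. It is not GisMap's implementation: the author names are made up, and the helpers copublication_counts and comets are hypothetical. It simply counts co-publications per pair of authors and lists the authors connected to nobody.

```python
from collections import Counter
from itertools import combinations

# Toy data: each publication is reduced to the set of its authors (made-up names).
publications = [
    {"Alice", "Bob"},
    {"Alice", "Bob", "Carol"},
    {"Carol", "Alice"},
    {"Dave"},
]


def copublication_counts(pubs):
    """Count, for each pair of authors, the publications they share."""
    counts = Counter()
    for authors in pubs:
        # Sorting gives each pair a canonical (alphabetical) order.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


def comets(pubs, counts):
    """Return the authors that share no publication with any other author."""
    everyone = set().union(*pubs)
    connected = {name for pair in counts for name in pair}
    return everyone - connected


edges = copublication_counts(publications)
print(edges[("Alice", "Bob")])       # 2
print(comets(publications, edges))   # {'Dave'}
```

In this toy graph, the Alice-Bob edge would be drawn more prominently than the Bob-Carol edge, and Dave would be hidden unless comets are shown.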

Make your own LabMap

GisMap is intended to make it easy to create LabMaps in many contexts.

The easiest way to manage a lab, apart from using ListMaps as shown above, is to specify an internal method _author_iterator that returns Lab authors. Once that is done, you can create and refresh LabMaps as you see fit.

How the iterator works is 100% up to you. Most of the time, this is done by scraping some Web page(s) (see the gallery for examples), but many other options exist, e.g. reading authors from a file, from an LDAP…
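As an illustration of the file-based option, the sketch below reads author names from a plain text file. The file layout (one name per line, lines starting with # ignored) and the helper iter_author_names are assumptions made for this example, not part of GisMap. The code only uses the standard library; inside a LabMap subclass, an _author_iterator could wrap each yielded name in a LabAuthor.

```python
from pathlib import Path
import tempfile


def iter_author_names(path):
    """Yield one author name per non-empty, non-comment line of a text file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            name = line.strip()
            if name and not name.startswith("#"):
                yield name


# Demo on a temporary file standing in for a real member list.
with tempfile.TemporaryDirectory() as tmp:
    members = Path(tmp) / "members.txt"
    members.write_text(
        "# team members\nFabien Mathieu\n\nAnne Bouillard\n", encoding="utf-8"
    )
    names = list(iter_author_names(members))

print(names)  # ['Fabien Mathieu', 'Anne Bouillard']

# In a LabMap subclass, the iterator could then look like this (untested sketch):
#     def _author_iterator(self):
#         for name in iter_author_names("members.txt"):
#             yield LabAuthor(name)
```

Using a generator keeps the source of names (file, web page, directory service) separate from the way authors are built, so the same subclass structure works whatever the source is.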

For example, this is the entire code required for handling the Solace team.
