Co-publication graph#

Lab authors#

Lab authors are the main building block for analysing a single lab (i.e. a group of researchers). You can create one from just a name, then ask GISMAP to automatically retrieve the database (DB) entries for this author.

[1]:
from gismap.lab import LabAuthor

maria = LabAuthor("Maria Potop")
maria.auto_sources()
WARNING:GisMap:Multiple entries for Maria Potop in hal

We see a warning here. Let’s look at the sources:

[2]:
maria.sources
[2]:
[HALAuthor(name='Maria Potop', key='858256', key_type='pid'),
 HALAuthor(name='Maria Potop', key='841868', key_type='pid'),
 DBLPAuthor(name='Maria Potop', key='p/MariaPotopButucaru')]

This is actually normal: Maria has multiple identities in HAL. The warning signals that there may be homonyms, but that is not the case here. Note that an author can also have several names.

[3]:
maria.aliases
[3]:
['Maria Gradinariu', 'Maria Gradinariu Potop-Butucaru', 'Maria Potop-Butucaru']

When using auto_sources, you can specify which DBs should be used (only the online DBLP and HAL databases are available right now).

[4]:
from gismap.sources.hal import HAL

celine = LabAuthor("Céline Comte")
celine.auto_sources(dbs=[HAL])
celine.sources
[4]:
[HALAuthor(name='Céline Comte', key='celine-comte')]

Once the sources of an author are set, one can retrieve her publications.

[5]:
celine.get_publications()
[5]:
{'2118156': SourcedPublication(title="0 = 0, c'est le truc du noyau ! Application aux files d'attente", authors=[HALAuthor(name='Anne Bouillard', key='anne-bouillard'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Élie de Panafieu', key='Élie de Panafieu', key_type='fullname'), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='ALGOTEL 2019 - 21èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2019, key='2118156', url='https://hal.science/hal-02118156v1'),
 '1889101': SourcedPublication(title='Of Kernels and Queues: when network calculus meets analytic combinatorics', authors=[HALAuthor(name='Anne Bouillard', key='anne-bouillard'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Élie de Panafieu', key='Élie de Panafieu', key_type='fullname'), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='NetCal 2018', type='conference', year=2018, key='1889101', url='https://hal.science/hal-01889101v1'),
 '2413496': SourcedPublication(title='Resource management in computer clusters : algorithm design and performance analysis', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='unpublished', type='thesis', year=2019, key='2413496', url='https://pastel.hal.science/tel-02413496v1'),
 '1306343': SourcedPublication(title='Performance of a Server Cluster with Parallel Processing and Randomized Load Balancing', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='unpublished', type='report', year=2016, key='1306343', url='https://hal.science/hal-01306343v1'),
 '5076337': SourcedPublication(title='Optimizing Asynchronous Federated Learning: A Delicate Trade-Off Between Model-Parameter Staleness and Update Frequency', authors=[HALAuthor(name='Abdelkrim Alahyane', key='1504678', key_type='pid'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Matthieu Jonckheere', key='matthieu-jonckheere'), HALAuthor(name='Éric Moulines', key='1350242', key_type='pid')], venue='unpublished', type='report', year=2025, key='5076337', url='https://hal.science/hal-04938472v2'),
 '2299321': SourcedPublication(title='Performance of Balanced Fairness in Resource Pools: A Recursive Approach', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='Proceedings of the ACM on Measurement and Analysis of Computing Systems', type='journal', year=2017, key='2299321', url='https://inria.hal.science/hal-01630420v3'),
 '2052607': SourcedPublication(title="Kleinberg's grid unchained", authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='Theoretical Computer Science', type='journal', year=2020, key='2052607', url='https://inria.hal.science/hal-02052607v1'),
 '1581786': SourcedPublication(title='Balanced Fair Resource Sharing in Computer Clusters', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='Performance Evaluation', type='journal', year=2017, key='1581786', url='https://hal.science/hal-01581786v1'),
 '3219422': SourcedPublication(title='Modèle de couplage stochastique non-biparti', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='ALGOTEL 2021 - 23èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2021, key='3219422', url='https://hal.science/hal-03219422v1'),
 '2340255': SourcedPublication(title='Dynamic load balancing with tokens', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='Computer Communications', type='journal', year=2019, key='2340255', url='https://hal.science/hal-02340255v1'),
 '1517150': SourcedPublication(title='À la racine du parallélisme', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='ALGOTEL 2017 - 19èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2017, key='1517150', url='https://hal.science/hal-01517150v1'),
 '2328981': SourcedPublication(title='Poly-symmetry in processor-sharing systems', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Virag Shah', key='Virag Shah', key_type='fullname'), HALAuthor(name='Gustavo de Veciana', key='Gustavo de Veciana', key_type='fullname')], venue='Queueing Systems', type='journal', year=2017, key='2328981', url='https://hal.science/hal-01513544v2'),
 '4780574': SourcedPublication(title='Online Stochastic Matching: A Polytope Perspective', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu'), HALAuthor(name='Sushil Mahavir Varma', key='1440476', key_type='pid'), HALAuthor(name='Ana Bušić', key='anabusic')], venue='unpublished', type='report', year=2024, key='4780574', url='https://hal.science/hal-03502084v5'),
 '1773674': SourcedPublication(title='Un seul serveur vous manque, et tout est découplé !', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='ALGOTEL 2018 - 20èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2018, key='1773674', url='https://hal.science/hal-01773674v1'),
 '4612740': SourcedPublication(title='Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Matthieu Jonckheere', key='matthieu-jonckheere'), HALAuthor(name='Jaron Sanders', key='Jaron Sanders', key_type='fullname'), HALAuthor(name='Albert Senen-Cerda', key='Albert Senen-Cerda', key_type='fullname')], venue='unpublished', type='report', year=2024, key='4612740', url='https://hal.science/hal-04329790v2'),
 '3224101': SourcedPublication(title='Pass-and-Swap Queues', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Jan-Pieter Dorsman', key='1098513', key_type='pid')], venue='Queueing Systems', type='journal', year=2021, key='3224101', url='https://hal.science/hal-03224101v1'),
 '3331759': SourcedPublication(title='Load Balancing in Heterogeneous Server Clusters: Insights From a Product-Form Queueing Model', authors=[HALAuthor(name='Mark van Der Boor', key='Mark van Der Boor', key_type='fullname'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS)', type='conference', year=2021, key='3331759', url='https://hal.science/hal-03331759v1'),
 '2118170': SourcedPublication(title='Rien ne sert de prédire ; il faut servir ancien.', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='ALGOTEL 2019 - 21èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2019, key='2118170', url='https://hal.science/hal-02118170v1'),
 '1517123': SourcedPublication(title='La Grille de Kleinberg, l’Univers et le Reste', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='ALGOTEL 2017 - 19èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2017, key='1517123', url='https://hal.science/hal-01517123v1'),
 '3507517': SourcedPublication(title='Stochastic Non-Bipartite Matching Models and Order-Independent Loss Queues', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='Stochastic Models', type='journal', year=2022, key='3507517', url='https://hal.science/hal-03468064v2'),
 '4956887': SourcedPublication(title='Graph-Based Product Form', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Isaac Grosof', key='1510244', key_type='pid')], venue='unpublished', type='report', year=2025, key='4956887', url='https://hal.science/hal-04956887v1'),
 '3507566': SourcedPublication(title='Performance Evaluation of Stochastic Bipartite Matching Models', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Jan-Pieter Dorsman', key='1098513', key_type='pid')], venue='Performance Engineering and Stochastic Modeling', type='chapter', year=2021, key='3507566', url='https://hal.science/hal-03468055v2'),
 '1314992': SourcedPublication(title='The multi-source model for dimensioning data networks', authors=[HALAuthor(name='Thomas Bonald', key='tbonald'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata())], venue='Computer Networks', type='journal', year=2016, key='1314992', url='https://hal.science/hal-01314992v1'),
 '5086684': SourcedPublication(title='Arrival Control in Quasi-Reversible Queueing Systems: Optimization and Reinforcement Learning', authors=[LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Pascal Moyal', key='pascal-moyal')], venue='unpublished', type='report', year=2025, key='5086684', url='https://hal.science/hal-05074406v2')}
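
The returned dictionary maps a publication key to a publication object, so it can be summarised with standard Python tools. The sketch below is only illustrative: it uses a small stand-in dataclass (hypothetical, not a GISMAP class) that mimics the `title`, `type` and `year` fields visible in the output above, with a few entries copied from that output.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Publication:
    """Stand-in for the publication objects shown above (not a GISMAP class)."""

    title: str
    type: str
    year: int


# A few entries copied from the output above, keyed like the returned dictionary.
publications = {
    "2052607": Publication("Kleinberg's grid unchained", "journal", 2020),
    "2340255": Publication("Dynamic load balancing with tokens", "journal", 2019),
    "2413496": Publication(
        "Resource management in computer clusters : algorithm design and performance analysis",
        "thesis",
        2019,
    ),
    "4956887": Publication("Graph-Based Product Form", "report", 2025),
}

# Count publications per type and per year.
by_type = Counter(p.type for p in publications.values())
by_year = Counter(p.year for p in publications.values())

# Most recent publication in the sample.
latest = max(publications.values(), key=lambda p: p.year)
```

Assuming the real objects expose the same attributes as their repr suggests, the same three expressions should work directly on the dictionary returned by `celine.get_publications()`.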

Lab authors can have metadata that can be used for display and further analysis but we will not cover that in this tutorial.

Your first lab#

In GISMAP, a Lab is a class whose instances have two methods:

  • update_authors automatically refreshes the members of the lab. It is useful at creation or when a lab evolves.

  • update_publis performs a full refresh of the publications of a lab. All publications from lab members are considered (temporal filtering may be enabled later).

The simplest usable subclass of Lab is ListLab, which uses a list of names. For example, consider the executive committee of the LINCS lab plus Fabien Mathieu (GISMAP author).

[6]:
from gismap.lab import ListLab

lab = ListLab(
    author_list=[
        "Tixeuil Sébastien",
        "Mathieu Fabien",
        "Kofman Daniel",
        "Baccelli François",
        "Noirie Ludovic",
        "Bassi Francesca",
    ],
    name="toy_example",
)
lab.update_authors()
lab.authors
[6]:
{'tixeuil': LabAuthor(name='Tixeuil Sébastien', metadata=AuthorMetadata()),
 'fabien-mathieu': LabAuthor(name='Mathieu Fabien', metadata=AuthorMetadata()),
 'daniel-kofman': LabAuthor(name='Kofman Daniel', metadata=AuthorMetadata()),
 'francois-baccelli': LabAuthor(name='Baccelli François', metadata=AuthorMetadata()),
 'ludovic-noirie': LabAuthor(name='Noirie Ludovic', metadata=AuthorMetadata()),
 'francesca-bassi': LabAuthor(name='Bassi Francesca', metadata=AuthorMetadata())}
[7]:
lab.update_publis()
len(lab.publications)
[7]:
939

Labs can be saved so you don’t have to re-update them all the time.

[8]:
lab.dump(lab.name)
File toy_example.pkl.zst already exists! Use overwrite option to overwrite.

When you have a populated lab, you can use lab2graph to create the collaboration graph. That graph is a standalone HTML document that can be displayed in a notebook or saved for inclusion in a web page (an iframe is recommended in that case).

[9]:
from gismap.lab import lab2graph
from IPython.display import display, HTML

display(HTML(lab2graph(lab)))
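
Saving the graph for a web page can be sketched with the standard library alone. The example below assumes that `lab2graph` returns the HTML as a string (as its use inside `HTML(...)` suggests); a placeholder string stands in for that result so the snippet runs without a populated lab. The file name and iframe dimensions are arbitrary choices.

```python
import tempfile
from pathlib import Path

# Placeholder standing in for the string returned by lab2graph(lab).
graph_html = "<div id='graph'>...</div>"

# Write the standalone HTML to a file (a temporary directory is used here).
out_dir = Path(tempfile.mkdtemp())
graph_file = out_dir / "toy_example.html"
graph_file.write_text(graph_html, encoding="utf-8")

# Tag to paste into the host page, next to which the HTML file is deployed.
iframe = (
    f'<iframe src="{graph_file.name}" width="100%" height="600" '
    'style="border:0"></iframe>'
)
print(iframe)
```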

A few things about the generated graph:

  • Singletons (authors with no co-publications) are discarded by default.

  • Authors are represented with their initials unless some picture url is provided.

  • You can hover over an author to get her name. Clicking on an author opens a modal with the list of her publications.

  • The width and length of an edge depend on the number of co-publications. Clicking on an edge opens a modal with the list of co-publications.
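
To make the notions of edge weight and singleton concrete, here is a minimal sketch of how co-publication counts can be derived from a list of publications. It is not GISMAP's actual implementation: the author keys are made up, and each publication is reduced to the set of lab members who signed it.

```python
from collections import Counter
from itertools import combinations

# Toy data with made-up author keys: one set of lab members per publication.
publication_authors = [
    {"alice", "bob"},
    {"alice", "bob", "carol"},
    {"alice"},
    {"dave"},
]


def copublication_weights(publications):
    """Count, for each pair of authors, the publications they share."""
    weights = Counter()
    for authors in publications:
        # Sorting gives each pair a canonical (a, b) order.
        for pair in combinations(sorted(authors), 2):
            weights[pair] += 1
    return weights


weights = copublication_weights(publication_authors)

# Authors that appear in no pair are the singletons dropped from the graph.
all_authors = set().union(*publication_authors)
connected = {author for pair in weights for author in pair}
singletons = all_authors - connected
```

Here `weights` holds one entry per edge of the graph, and `singletons` contains the authors who have publications but no co-publication inside the lab.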

Make your own lab#

The easiest way to manage a lab is to specify an internal method _author_iterator that returns Lab authors.

To build a GISMAP lab, you just need to implement that method. Most of the time, this is done by scraping some web page(s). See the references for examples.
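
The scraping part can be sketched with the standard library. The example below is an assumption-laden illustration: the page layout (members listed as `<li class="member">` items), the sample names and the helper names are all hypothetical. It only extracts names; in an actual Lab subclass, `_author_iterator` would presumably wrap each name in a `LabAuthor`.

```python
from html.parser import HTMLParser


class MemberNameParser(HTMLParser):
    """Collect the text of every <li class="member"> item (hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self._in_member = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "li" and ("class", "member") in attrs:
            self._in_member = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_member = False

    def handle_data(self, data):
        if self._in_member and data.strip():
            self.names.append(data.strip())


def iter_member_names(html):
    """Yield member names found in the page, in document order."""
    parser = MemberNameParser()
    parser.feed(html)
    yield from parser.names


# Made-up page content standing in for a downloaded team page.
sample_page = """
<ul>
  <li class="member">Ada Example</li>
  <li class="member">Grace Sample</li>
  <li class="visitor">Not A Member</li>
</ul>
"""

names = list(iter_member_names(sample_page))
```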

Example#

The LaasLab class automatically builds a lab representation from https://www.laas.fr/fr/equipes/*team_name*/

[10]:
from gismap.lab import LaasLab

display(HTML(lab2graph(LaasLab.load("sara"))))