LabMaps#
Note
By default, we query both HAL and LDB. Don’t hesitate to adapt depending on the use case.
Your first lab#
In GisMap, a LabMap is a class whose instances have three methods:
update_authors automatically refreshes the members of the lab. It is useful at creation or when a lab evolves.
update_publications makes a full refresh of the publications of a lab. All publications from lab members are considered (temporal filtering may be enabled later).
expand adds moons, i.e. additional researchers that gravitate around the lab.
The simplest usable subclass of LabMap is ListMap, which uses a list of names. For example, consider the former team Gangsta from my Bell Labs days.
[8]:
from gismap.lab import ListMap
lab = ListMap(
    author_list=[
        "Fabien Mathieu",
        "Philippe Jacquet",
        "Alonso Silva",
        "Anne Bouillard",
        "François Durand (hal: fradurand, ldb:38/11269)",
        "Amira Alloum",
        "Marc-Olivier Buob",
        "Mohamed Lamine Lamali (hal:mohamed-lamine-lamali, ldb: 43/11358)",
    ],
    name="Gangsta",
)
lab.update_authors()
lab.update_publis()
INFO:GisMap:Multiple entries for Philippe Jacquet in hal
INFO:GisMap:Multiple entries for Alonso Silva in hal
INFO:GisMap:Multiple entries for Anne Bouillard in hal
Maps can be saved with the dump method so you don’t have to re-update them all the time.
Once a lab is populated, save_html writes the collaboration graph as a standalone HTML file, which can be displayed in a notebook or included in a web page (e.g. with an iframe).
You can also display the graph directly inside a notebook with show_html. Options let you customize the result:
[9]:
groups = {"Gangsta": {"color": "rgb(255, 0, 255)"}}
lab.show_html(groups=groups)
Let’s add some context with a few moons.
[10]:
lab.expand(target=4)
[11]:
groups["moon"] = {"display": "Usual Suspects", "color": "rgb(0,255,0)"}
lab.show_html(groups=groups)
A few things to note about the generated graph:
Authors are represented by their initials unless a picture URL is provided (implicitly or explicitly).
Comets are singletons (authors with no co-publications with the other nodes). They are hidden by default. For example, if you only show the moons / usual suspects, Bernard becomes a comet and is hidden.
Hover over an author to see their name. Clicking an author opens a modal with their list of publications.
The width and length of an edge depend on the number of co-publications. Clicking an edge opens a modal with the list of co-publications.
The menu (top-left ☰) groups graph-level actions: Redraw, Full Screen, Show/Hide Legend, Download <lab>.bib (whole-lab BibTeX), Download PNG, Copy PNG to clipboard. The bottom-right expand/compress icon is a shortcut for Full Screen.
Inside any modal, each publication has a [.bib] toggle (and [abstract] when available, typically for HAL entries) revealing inline content with a hover-to-copy button. The modal header carries a Download .bib button that exports just the listed publications (one author’s, or one author pair’s joint output).
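To make the notions of edge weight and comet concrete, here is a minimal sketch in plain Python. It does not use GisMap's internals: the toy publications list and the helper names copublication_counts and comets are hypothetical, chosen for illustration only.

```python
from collections import Counter
from itertools import combinations

# Toy data (hypothetical): each publication is reduced to the set of its authors.
publications = [
    {"Alice", "Bob"},
    {"Alice", "Bob", "Carol"},
    {"Carol", "Dave"},
    {"Eve"},
]


def copublication_counts(pubs, visible):
    """Count joint publications for each pair of visible authors."""
    counts = Counter()
    for authors in pubs:
        shown = sorted(authors & visible)
        for pair in combinations(shown, 2):
            counts[pair] += 1
    return counts


def comets(pubs, visible):
    """Return visible authors sharing no publication with another visible author."""
    connected = {name for pair in copublication_counts(pubs, visible) for name in pair}
    return visible - connected


everyone = {"Alice", "Bob", "Carol", "Dave", "Eve"}
edges = copublication_counts(publications, everyone)
print(edges[("Alice", "Bob")])  # joint papers for this pair (what an edge would encode)
print(sorted(comets(publications, everyone)))
print(sorted(comets(publications, {"Alice", "Bob", "Dave"})))
```

The last call illustrates the remark about hidden groups: an author who is connected in the full graph can become a comet once their only co-authors are filtered out.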
Exporting a lab#
Beyond the interactive HTML, a populated lab can be serialized for downstream use: BibTeX for citation managers, JSON for any structured pipeline, CSV for spreadsheets and light analytics.
Every export takes an optional name= argument. When omitted, the lab’s own name (set at construction) is used as the filename stem, and the file lands in the current working directory: in a real session, you would just call lab.to_bib() and find Gangsta.bib next to the notebook. To keep this tutorial’s workspace clean, we redirect everything to a TemporaryDirectory instead:
[12]:
import tempfile
from pathlib import Path
# TemporaryDirectory cleans itself when the object is garbage-collected;
# we'll also call .cleanup() explicitly at the end of the section.
tmp = tempfile.TemporaryDirectory()
out = Path(tmp.name)
[13]:
# Whole-lab BibTeX, written to our tempdir as Gangsta.bib
lab.to_bib(name=out / "Gangsta")
len(lab.publications)
[13]:
835
Both to_bib and the Download <lab>.bib menu entry can be restricted via query=. A string is matched by exact key or fuzzy title similarity; a callable is used as a predicate f(pub) -> bool. The same logic powers LabMap.select_publications, which is handy to preview what will be exported:
[14]:
# Filtered BibTeX — recent publications, written as Gangsta_recent.bib
recent = lambda p: p.year is not None and p.year >= 2015
lab.to_bib(query=recent, name=out / "Gangsta_recent")
[p.short_str() for p in lab.select_publications(recent)][:10]
INFO:GisMap:354 publications found.
INFO:GisMap:354 publications found.
[14]:
['"torus packing for multisets." (2024, misc) - https://doi.org/10.4230/ARTIFACTS.22479',
'"Tree Walks and the Spectrum of Random Graphs." (2024, conference) - https://doi.org/10.4230/LIPICS.AOFA.2024.11',
'"Combinatorics of nondeterministic walks of the Dyck and Motzkin type" (2019, conference) - https://hal.science/hal-01910727v1',
'"Graphs with degree constraints." (2016, conference) - https://doi.org/10.1137/1.9781611974324.4',
'"Phase transition of random non-uniform hypergraphs." (2015, journal) - https://doi.org/10.1016/J.JDA.2015.01.009',
'"2-Xor Revisited: Satisfiability and Probabilities of Functions." (2016, journal) - https://doi.org/10.1007/S00453-016-0119-X',
'"Active clustering for labeling training data." (2021, conference) - https://proceedings.neurips.cc/paper/2021/hash/47841cc9e552bd5c40164db7073b817b-Abstract.html',
'"Of Kernels and Queues: when network calculus meets analytic combinatorics" (2018, conference) - https://hal.science/hal-01889101v1',
'"Robot Positioning Using Torus Packing for Multisets." (2024, conference) - https://doi.org/10.4230/LIPICS.ICALP.2024.43',
'"Threshold functions for small subgraphs in simple graphs and multigraphs." (2020, journal) - https://doi.org/10.1016/J.EJC.2020.103113']
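The query= behaviour described above (a string matched by key or by title similarity, a callable used as a predicate) can be pictured with the following standalone sketch. It is not GisMap's implementation: the Pub record, the select helper, the use of difflib and the 0.8 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Pub:
    # Hypothetical stand-in for a publication record.
    key: str
    title: str
    year: Optional[int]


def select(pubs, query=None, threshold=0.8):
    """Keep publications matching `query` (None keeps everything)."""
    if query is None:
        return list(pubs)
    if callable(query):
        # Callable: used as a predicate f(pub) -> bool.
        return [p for p in pubs if query(p)]
    # String: exact key match, or title similarity above the (assumed) threshold.
    wanted = query.lower()
    return [
        p
        for p in pubs
        if p.key == query
        or SequenceMatcher(None, wanted, p.title.lower()).ratio() >= threshold
    ]


pubs = [
    Pub("k1", "Graphs with degree constraints", 2016),
    Pub("k2", "An older paper", 2009),
    Pub("k3", "A paper without a year", None),
]
recent = lambda p: p.year is not None and p.year >= 2015
print([p.key for p in select(pubs, recent)])
print([p.key for p in select(pubs, "k2")])
print([p.key for p in select(pubs, "graphs with degree constraint")])
```

Note how the predicate guards against a missing year before comparing, just like the recent filter used in the cell above.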
JSON serializes authors and publications in a structured form (see Author.to_dict() / Publication.to_dict()). CSV produces two files, <name>_authors.csv and <name>_publications.csv, with | as a separator for multi-valued cells:
[15]:
lab.to_json(name=out / "Gangsta") # Gangsta.json
lab.to_csv(name=out / "Gangsta") # Gangsta_authors.csv and Gangsta_publications.csv
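When reading such a CSV back, multi-valued cells need to be split on the | separator. The sketch below uses only the standard library on an in-memory sample; the column names (title, year, authors) and the comma field delimiter are assumptions for illustration, not the actual layout written by to_csv.

```python
import csv
import io

# Hypothetical file content: the header below is an illustrative assumption,
# not the real header produced by to_csv.
sample = io.StringIO(
    "title,year,authors\n"
    "Some joint paper,2016,Alice|Bob\n"
    "A solo paper,2020,Carol\n"
)

rows = []
for row in csv.DictReader(sample):
    # Split the multi-valued cell on the "|" separator.
    row["authors"] = row["authors"].split("|")
    rows.append(row)

print(rows[0]["authors"])
print(rows[1]["authors"])
```

With a real export, replace the StringIO object with an open file handle and adapt the column name to the actual header.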
And of course, the HTML export:
[16]:
lab.save_html(name=out / "Gangsta") # Gangsta.html
Let’s list what we just produced, then wipe the tempdir. The explicit tmp.cleanup() keeps things tidy if the kernel stays alive for a while; even if you forget it, garbage collection of tmp would trigger the same cleanup.
[17]:
sorted(p.name for p in out.iterdir())
[17]:
['Gangsta.bib',
'Gangsta.html',
'Gangsta.json',
'Gangsta_authors.csv',
'Gangsta_publications.csv',
'Gangsta_recent.bib']
[18]:
tmp.cleanup()
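Outside a notebook, where everything can live in a single block, the same pattern is often written with TemporaryDirectory as a context manager, so cleanup happens automatically. A minimal sketch, in which a placeholder file stands in for the real export calls:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_name:
    out = Path(tmp_name)
    # Stand-in for an export call such as lab.to_bib(name=out / "Gangsta").
    (out / "Gangsta.bib").write_text("% placeholder\n", encoding="utf-8")
    produced = sorted(p.name for p in out.iterdir())

# The directory and its contents are removed once the with block exits.
print(produced, out.exists())
```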
How-to: Adding informal “publications”#
Since version 0.5.2, you can use the add_publication method to manually add publications to a lab. Author names are automatically resolved to known authors using fuzzy matching.
First we need to build a lab.
[19]:
from gismap.lab import ListMap
lab = ListMap(author_list=["Fabien Mathieu", "Céline Comte"], name="Dream Team")
lab.update_authors()
lab.update_publis()
Then we just call add_publication with a title and a list of author names. Known authors are matched automatically; unknown ones become Outsiders.
[20]:
lab.add_publication(
    title="Informal discussions on GisMap",
    authors=["Fabien Mathieu", "Céline Comte", "John Doe"],
    year=2026,
    venue="Zoom meetings",
)
lab.show_html()
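To give an intuition of how fuzzy name resolution can work, here is a small standalone sketch based on difflib. It is not GisMap's matching code: the resolve helper and the 0.8 cutoff are assumptions made for illustration.

```python
from difflib import get_close_matches

known_authors = ["Fabien Mathieu", "Céline Comte"]


def resolve(name, known, cutoff=0.8):
    """Return (resolved_name, is_known) for a raw author name."""
    lowered = {k.lower(): k for k in known}
    hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    if hits:
        return lowered[hits[0]], True
    # No close match: keep the raw name and flag it as an outsider.
    return name, False


for raw in ["Fabien Mathieu", "Celine Comte", "John Doe"]:
    print(resolve(raw, known_authors))
```

In this sketch, the unaccented spelling still resolves to the known author, while a name that matches nobody is kept as is and flagged as unknown.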
Make your own LabMap#
GisMap is designed to make it easy to create LabMaps in many contexts.
The easiest way to manage a lab, apart from using ListMaps as shown above, is to specify an internal method _author_iterator that returns the lab’s authors. Once that is done, you can create/refresh LabMaps as you see fit.
How the iterator works is 100% up to you. Most of the time, this is done by scraping some Web page(s) (see the gallery for examples), but many other options exist, e.g. reading authors from a file, from an LDAP directory…
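As an illustration of the file-based option, the sketch below reads one author name per line from a text file. It only produces raw names with the standard library: the helper name iter_author_names and the file format are hypothetical, and what _author_iterator must actually yield (names or author objects) should be checked against GisMap's API before wiring this into a subclass.

```python
import tempfile
from pathlib import Path


def iter_author_names(path):
    """Yield one author name per non-empty, non-comment line of a text file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name = line.strip()
            if name and not name.startswith("#"):
                yield name


# Demo on a throwaway file (the roster format is an assumption).
with tempfile.TemporaryDirectory() as tmp_name:
    roster = Path(tmp_name) / "members.txt"
    roster.write_text(
        "# team roster\nFabien Mathieu\n\nCéline Comte\n", encoding="utf-8"
    )
    names = list(iter_author_names(roster))

print(names)
```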
For example, this is the entire code required for handling the Solace team.