Recently Published
UNIGE Institutional Dashboard: OFS Indicators 2000–2026
This dashboard presents the main institutional indicators of the University of Geneva (UNIGE), based on public data from the Swiss Federal Statistical Office (OFS), retrieved in real time through the STAT-TAB API and the R package BFS. The data cover 1980–2026 and span five dimensions: student enrolment by field of study, level and sex (SHIS-studex); the share of foreign students; new Bachelor entrants; academic and administrative staff in full-time equivalents (FTE); and cost-per-student indicators compared with the universities of Lausanne and Zurich and the national average. This work is a proof of concept produced as part of an application for the statistician position at the Bureau des données institutionnelles et décisionnelles of the UNIGE Rectorate. It illustrates the ability to query the OFS API programmatically, to produce comparative inter-university indicators, and to document the data processing chain, from the SHIS-studex delivery to the interactive visualization.
Sources: OFS STAT-TAB · SHIS-studex · SHIS-FIN · Tables px-x-1502040100_106, px-x-1502040100_107, px-x-1502040100_121, px-x-1504040100_105, px-x-1506030100_211
Tools: R 4.4.1 · Quarto · BFS · pxweb · plotly · DT
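To give a feel for what querying STAT-TAB programmatically involves, here is a minimal Python sketch (the dashboard itself uses R with the BFS and pxweb packages). It only builds the JSON body of a PxWeb-style query and the table URL, and it sends nothing over the network. The base URL, the variable codes (`Hochschule`, `Jahr`) and the value codes are assumptions for illustration and should be checked against the table's metadata.

```python
import json

# Assumed base URL of the OFS PxWeb API; verify against the STAT-TAB documentation.
BASE_URL = "https://www.pxweb.bfs.admin.ch/api/v1/fr"


def build_pxweb_query(selections, response_format="json-stat"):
    """Build a PxWeb query body from a {variable_code: [value_codes]} mapping."""
    return {
        "query": [
            {"code": code, "selection": {"filter": "item", "values": list(values)}}
            for code, values in selections.items()
        ],
        "response": {"format": response_format},
    }


def table_url(table_id, base_url=BASE_URL):
    """Return the endpoint for one table (URL pattern is an assumption)."""
    return f"{base_url}/{table_id}/{table_id}.px"


# Hypothetical variable and value codes, for illustration only.
query = build_pxweb_query({"Hochschule": ["GE", "LS", "UZH"], "Jahr": ["2022", "2023"]})
url = table_url("px-x-1502040100_106")

print(url)
print(json.dumps(query, indent=2))
```

The body would then be sent as an HTTP POST to the table URL; in R, the pxweb package handles this step.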
ART Coverage Performance Visualization, Reimagined
Global health dashboards love averages. But averages hide geography, and geography hides scale.
This note pulls live programmatic data from The Global Fund's API — focusing on ART coverage (the percentage of people on treatment among all people living with HIV) across nationally representative grants closing in H2 2023. The goal: move beyond a single headline number.
To do that, we borrow from psychometrics. The Wright Map — typically used to plot test-takers against item difficulty — turns out to be a surprisingly natural fit for performance data. Here, countries replace test-takers, and performance against target replaces ability scores. The twist: label size scales with the size of the national target, so the visual weight of each country reflects its actual weight in the global portfolio.
The result is a single chart that answers three questions simultaneously: How is performance distributed? Who are the outliers? And whose results move the needle most?
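The two quantities behind the chart can be sketched in a few lines of Python. This is an illustration only, not the note's actual code: the grant figures are made up, the function names are hypothetical, and linear scaling of label size is one possible choice among several.

```python
def performance_ratio(result, target):
    """Performance against target, e.g. 0.9 means 90% of target reached."""
    return result / target


def label_size(target, min_target, max_target, min_pt=8.0, max_pt=28.0):
    """Scale label size linearly with the national target (assumed scaling)."""
    if max_target == min_target:
        return (min_pt + max_pt) / 2
    share = (target - min_target) / (max_target - min_target)
    return min_pt + share * (max_pt - min_pt)


# Made-up figures for three hypothetical countries.
grants = [
    {"country": "A", "target": 10_000, "result": 9_000},
    {"country": "B", "target": 500_000, "result": 400_000},
    {"country": "C", "target": 255_000, "result": 280_500},
]

targets = [g["target"] for g in grants]
smallest, largest = min(targets), max(targets)

points = [
    {
        "country": g["country"],
        "performance": performance_ratio(g["result"], g["target"]),
        "label_pt": label_size(g["target"], smallest, largest),
    }
    for g in grants
]

for p in points:
    print(p)
```

Each point's vertical position comes from `performance`, and `label_pt` gives a country with a large target more visual weight than one with a small target.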
Data Viewer: RStudio vs Positron — A Product Owner's Perspective
I spent years fighting RStudio's data viewer. Eventually I gave up and started dumping dataframes to CSV just to look at them properly.
When Positron came out, I did what any reasonable person would do:
I opened a dataframe, tried to filter it, and took notes.
What I found was a neat little product paradox — more steps, better experience.
As a certified Product Owner, I had thoughts.
Iris FAIRification
The famous Iris dataset isn't FAIR: it's just a CSV. No persistent IDs, no ontology links, no provenance, no license. This proof of concept fixes that in an afternoon using Airtable and R. We structure the data relationally with taxonomic metadata (NCBI Taxonomy) and measurement ontologies (Plant Ontology), then auto-generate JSON-LD that meets all four FAIR principles. Each of the 150 records gets a unique identifier and machine-readable semantics. The approach scales to clinical trials, genomics, or any tabular data. Code included. No enterprise tools required.
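As a rough idea of what one such record could look like, here is a Python sketch that turns a single row into JSON-LD with a stable identifier. It is not the post's Airtable and R pipeline: the `example.org` namespace, the function name and the schema.org terms are assumptions, and the NCBI Taxonomy and Plant Ontology links are left out rather than guessed.

```python
import json
import uuid

# Hypothetical namespace used only to derive stable identifiers.
BASE_IRI = "https://example.org/iris/"


def iris_record_to_jsonld(row_number, species, sepal_length_cm):
    """Describe one measurement as JSON-LD with a deterministic identifier."""
    record_id = uuid.uuid5(uuid.NAMESPACE_URL, f"{BASE_IRI}{row_number}")
    return {
        "@context": {"@vocab": "https://schema.org/"},
        "@id": f"urn:uuid:{record_id}",
        "@type": "PropertyValue",
        "name": "sepal length",
        "value": sepal_length_cm,
        "unitText": "cm",
        # Illustrative term; a real record would link to an ontology IRI here.
        "description": f"Measured on a specimen of {species} (row {row_number})",
    }


doc = iris_record_to_jsonld(1, "Iris setosa", 5.1)
print(json.dumps(doc, indent=2))
```

Because the identifier is derived with `uuid5`, regenerating the file gives the same ID for the same row, which is what makes it usable as a persistent reference.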
Automated Adverse Event Tagging with an LLM
We used the mistral-large-latest large language model to automatically map AELLT (Lowest Level Term) entries to their corresponding MedDRA System Organ Class (SOC) using adverse event data from the PhUSE CS Working Group 5 CDISC pilot submission.
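The parts of such a workflow that sit around the model call can be sketched without any API access. The Python below builds a constrained prompt and checks that the model's answer is one of the allowed classes. It is an assumption about how the mapping could be organised, not the post's code: the function names are hypothetical, the SOC list is a small subset for illustration, and the call to the model itself is left out.

```python
# A small subset of MedDRA System Organ Classes, for illustration only.
SOC_SUBSET = [
    "Cardiac disorders",
    "Gastrointestinal disorders",
    "Nervous system disorders",
    "Skin and subcutaneous tissue disorders",
]


def build_prompt(aellt, allowed_socs):
    """Ask for exactly one SOC from a closed list (hypothetical wording)."""
    options = "\n".join(f"- {soc}" for soc in allowed_socs)
    return (
        "Map the following MedDRA Lowest Level Term to one System Organ Class.\n"
        f"Term: {aellt}\n"
        "Answer with exactly one of these options and nothing else:\n"
        f"{options}"
    )


def normalize_answer(raw, allowed_socs):
    """Return the matching SOC, or None if the answer is not in the list."""
    cleaned = raw.strip().strip(".").casefold()
    for soc in allowed_socs:
        if soc.casefold() == cleaned:
            return soc
    return None


prompt = build_prompt("HEADACHE", SOC_SUBSET)
print(prompt)
print(normalize_answer(" nervous system disorders. ", SOC_SUBSET))
```

Rejecting answers outside the closed list gives a simple guard against free-text or invented classes before the result is compared with the reference SOC in the dataset.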