Martin Gubri

Research Lead

Parameter Lab

Biography

I am the research lead at Parameter Lab in Tübingen, Germany. My work focuses on auditing the hidden risks of black-box AI systems to make them more safe, transparent, and accountable. Through systematic external testing of large language models (LLMs) and computer vision models, I develop methods to independently assess risks and ensure compliance without requiring access to model internals.

I hold a PhD in computer science from the University of Luxembourg, where my thesis focused on adversarial machine learning. I am a data scientist who graduated from Ensae ParisTech and a statistician who graduated from Toulouse School of Economics.

I am considering opportunities in both academia and industry. If my profile matches what you are looking for, I would be glad to hear from you. You can view my CV here.

Interests

Trustworthy AI
Machine Learning Auditing
Large Language Model
Adversarial Machine Learning
Privacy and Security

Education

PhD in Computer Science, 2023

University of Luxembourg (Luxembourg)
Specialized Master's in Data Science, 2015

Ensae ParisTech (France)
Dual Master's Degrees in Statistics and Econometrics, 2014

Toulouse School of Economics and Paul Sabatier University (France)
Bachelor's in Economics and Mathematics, 2012

Toulouse School of Economics (France)

News

December 2025

I have been appointed Area Chair for UAI 2026.
I will attend NeurIPS 2025 in San Diego to present C-SEO Bench

November 2025

I will give my talk titled “Revealing the Invisible: Auditing the Hidden Risks of Black-Box LLMs” at the University of Luxembourg (Luxembourg) on November 20th at 3pm

October 2025

I have been appointed Area Chair for ARR ACL.
I had the pleasure of giving a talk titled “Revealing the Invisible: Auditing the Hidden Risks of Black-Box LLMs” at the University of Mannheim (Germany) and the University of Trento (Italy)
Our new Dr.LLM paper was selected as one of the top AI papers of the week by DAIR.AI!

September 2025

Two papers that I had the pleasure of supervising were accepted: C-SEO Bench at NeurIPS 2025 D&B and Leaky Thoughts at EMNLP 2025

June 2025

Our new Leaky Thoughts paper was selected as one of the top AI papers of the week by DAIR.AI and The AI Timeline!

Selected Publications

TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification

The paper introduces Targeted Random Adversarial Prompt (TRAP), an LLM fingerprinting method, to identify a specific LLM in third-party applications.

Martin Gubri, Dennis Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh

ACL 2024 (findings)

Details PDF Slides Video Code Poster

Publications

Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
Details PDF Code
DISCO: Diversifying Sample Condensation for Accelerating Model Evaluation
Details PDF Code Website
Dr.LLM: Dynamic Layer Routing in LLMs
Details PDF Code
Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers
Details PDF Slides Code Dataset
C-SEO Bench: Does Conversational SEO Work?
Details PDF Slides Code Dataset Poster
Social Science Is Necessary for Operationalizing Socially Responsible Foundation Models
Details PDF Poster
Testing Uniform Random Samplers: Methods, Datasets and Protocols
Details PDF Code
Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models
Details PDF Code Dataset Poster
Calibrating Large Language Models Using Their Generations Only
Details PDF Slides Code Poster Models
TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification
Details PDF Slides Video Code Poster
ProPILE: Probing Privacy Leakage in Large Language Models
Details PDF Slides Video Poster
What Matters in Model Training to Transfer Adversarial Examples
Details PDF Slides Video
Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability
Details PDF Code
LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Details PDF Slides Video Code Poster
Efficient and Transferable Adversarial Examples from Bayesian Neural Networks
Details PDF Code Poster
Influence-driven data poisoning in graph-based semi-supervised classifiers
Details PDF Slides Code
Search-Based Adversarial Testing and Improvement of Constrained Credit Scoring Systems
Details PDF Slides Video Code
Adversarial perturbation intensity strategy achieving chosen intra-technique transferability level for logistic regression
Details PDF Code

Development

FLOSS Contributions

Technical Supervisor

MASEval. Supervised the development of MASEval, an open-source LLM-based multi-agent evaluation framework.

Significant Contributions

Adversarial Robustness Toolbox (ART)
Torchattacks
Implementation of state-of-the-art predictors for Spatial Regression Models in the spdep R package
Integration of sentences in French into Common Voice (epub parsing and cleaning)

Minor Contributions

Teaching

2023 – 2025

Master: Erasmus Mundus Joint Master in Cybersecurity

The Erasmus Mundus Joint Master in cybersecurity (CYBERUS) at the University of Luxembourg. One session about adversarial examples against LLMs per year (2023, 2024, 2025).

Before Class: Preparation (15 mins)

Read these short resources before our session:

GCG paper abstract (2 mins)
Watch: What Is a Prompt Injection Attack? (10 mins)
LLM alignment defintion (3 mins)

Reflection question: Why LLM developers might want to prevent certain outputs from their LLMs?

Class Content

Slides.
Notebook showcasing jailbreaking: Google Colab, ipynb file.

2022

Master: Advanced topics in Applied Machine Learning

2nd year of Master in Computer Science. Two lectures, design and correction of the project, planning.

Recalls of machine learning, machine learning frameworks, first part project
Overview of adversarial machine learning, model calibration

Project: Creation and evaluation of fill-in-the-blank notebooks (part 1 on preprocessing and part 3 on adversarial examples)

Master: Introduction to Machine Learning

2nd year of Master in Space Science. Six sessions, including four based on the Machine Learning Refined book and one based on the Applied Machine Learning course of Andreas C. Müller. Summary slides.

Recalls of linear algebra: “Machine Learning Refined” book, “Essence of linear algebra” videos serie of 3blue1brown, some formal definitions from the “Mathematics for Machine Learning” book.
Zero-order optimization: Chapter 2, Appendix B.
First-order optimization: Chapter 3, “Gradient descent, how neural networks learn” video from 3Blue1Brown.
Linear regression and linear classification: Chapters 5 and 6
ML project lifecycle: Data preparation, feature engineering, overfitting & underfitting, model evaluation. Slides.
Neural Networks (slides), Keras & Convolutional Neural Nets (slides) and Advanced Neural Networks (slides).
Written examination

2021

Master: Introduction to Machine Learning

2nd year of Master in Space Science. Two introductory lectures on Machine Learning. Slides.

2020

Bachelor: Software engineering 2

3rd year of Bachelor in Computer Science. Four introductory lectures on Machine Learning Engineering. Course given online during lockdown. Quizzes on Moodle. Videos, Slides

Introduction to Machine Learning: Useful Definitions, Types of Tasks in Machine Learning
Introduction to Machine Learning: Recalls of Statistics, Model’s elements, Elements of Statistical Learning Theory
Machine Learning Project Lifecycle: When to (not) use Machine Learning, Goal Definition, Data Collection & Preparation
Machine Learning Project Lifecycle: Feature Engineering, Choosing and Training a model, Model Evaluation, Feedback loop

Academic Services

Meta-Reviewer

Area Chair for UAI 2026
Area Chair for ARR ACL (Oct. 2025, Jan. 2026)

Reviewer

I served as a reviewer for the following venues.

ACL ARR (May 2025)
UAI (2024, 2023)
International Journal of Computer Vision
IEEE Transactions on Neural Networks and Learning Systems
IEEE Transactions on Pattern Analysis and Machine Intelligence
SiMLA Workshop (2023)

Miscellaneous

I organized and animated the weekly Machine Learning Reading Group at the SerVal group (University of Luxembourg) from February 2021 to August 2023.

White-Hat

Contributions to FLOSS Security

Vulnerabilities Discovered

CVE	Software	Type	Description/Impact	Links
CVE-2017-6877	Lutim	Stored XSS	Exposed all images uploaded by the user and their encryption keys	issue
CVE-2017-10975	Lutim	Stored XSS	Idem. Hard to exploit in pratice	issue
CVE-2017-1000051	CryptPad	Stored XSS	Exposed encryption keys of user data	blog post
CVE-2017-11594	Loomio	Stored XSS	Markdown not sanitized. Allows to cast users’ votes using their identity	commit, demo
	Loomio	Stored XSS	No restrictions to attached files (when served locally). Allows to cast users’ votes using their identity	demo
CVE-2017-1000039	Framadate	Formula Injection		issue, MR
CVE-2017-11593	Markdown Preview Plus Chrome’s Extension	Stored XSS	Led its users vulnerable to XSS in a ton of websites, by converting text, markdown and rst files to HTML without sanitization	issue
CVE-2016-6127	Request Tracker (RT)	Stored XSS	Exposed all users data to remote attacker. Concurrent finding.
	TeleR	RCE	3 Arbitrary Code Executions on their server
	Turtl	Stored XSS	3 XSS exposing encrypted data (incl. passwords)
	NCrypt	Stored XSS		issue
	Sympa	Stored XSS		No public record
	GNU Social	Stored XSS		No public record
	Shaarli	Stored XSS	Markdown plugin	MR
	Drupal/webform	Stored XSS	Exposed user and admin data when visiting an uploaded file. Concurrent finding.	issue
	Framaforms	Improper Access Control	Exposed URL of all users forms	No public record
	Framaforms	Stored XSS	Exposed responses of user forms. Too permissive formats allowed to untrusted users	issue
	Framaforms	Stored XSS		issue
	Framaslides	Stored XSS	Markdown not sanitized	commit
	Framaslides	Stored XSS	Escape markdown link sanitization (marked lib not updated)	issue
	Framaslides	Stored XSS		issue
	Framemo & Sandstorm’s Scrumblr	Stored XSS	Markdown not sanitized	issue, PR
	Framemo & Sandstorm’s Scrumblr	Formula Injection		issue, MR
	Framaboard	Stored XSS	Affected Framaboard, but default Kanboard not vulnerable	commit
	Wallabag 2 & Graby	Stored XSS		PR
	Kresus	Stored Self-XSS	Possible to leverage it by importing a malicious JSON	issue
	Dolomon	Stored (Self)-XSS	Multiple XSS. Some can be leveraged using a CSRF issue	issue
	Dolomon	Improper Access Control	Gave access to the URLs saved by all users	issue
	Dolomon	Formula Injection		issue
	Mastodon	Mime-sniffing	Low (only Internet Explorer affected)	No public record
	share-on-diaspora Wordpress Plugin	Reflected Client XSS	Fixed, but not discovered.	PR

Contact

Fell free to contact me

Contact me preferably by email and follow me on Twitter, Bluesky or Mastodon.

martin at gubri dot eu