HOME
ABOUT
- RESULTS
- differences
- BENEFITS
- HISTORY
- TEAM
- LOCATION
- FACILITIES
- BANKING
- MEMBERSHIPS
- APPROVALS
- LICENCES
- SUPPLIERS
- SPONSORSHIPS
- MEDIA
- PRIVACY
AUCTIONS
SHIPPING
FEES
- TS REWARDS
TOOLS
guides
FAQ
CONTACT
- CONNECT

VEHICLES
BRAND
- JAPANESE CARS
  - DAIHATSU
  - EUNOS
  - FORD
  - HONDA
  - ISUZU
  - LEXUS
  - MAZDA
  - MITSUBISHI
  - MITSUOKA
  - NISSAN
  - SUBARU
  - SUZUKI
  - TOYOTA
- GERMAN CARS
- AMERICAN CARS
- BRITISH CARS
- ITALIAN CARS
- FRENCH CARS
- SWEDISH CARS
- KOREAN CARS
TYPE
- mobility
- VENDING
- instruction
- TAXIS
- AMBULANCES
- FIRE ENGINES
- HEARSES
- LIMOUSINES
- COMMERCIAL
CLASS
FUEL
TRUCKS
minitrucks
- DAIHATSU
- HONDA
- MAZDA
- MITSUBISHI
- NISSAN
- SUBARU
- SUZUKI
- DUMP
- CRANE
- CAMPER
- REFRIGERATED
- 4WD
- NEW
BUSES
MOTORHOMES
- YAHOO!
- RAKUTEN
- DEALER

PARTS
- FREE REPORT
- PARTS CONTAINERS
- PARTS SYSTEMS
- PARTS PROTECTION
- BODY SHELLS
- DISMANTLING
- ONLINE PARTS
- NEW PARTS
- INTERIOR PARTS
- EXTERIOR PARTS
  - BONNETS
  - BUMPERS
  - GRILLES
  - FENDERS
  - DOORS
  - TRUNKS
  - SPOILERS
  - LIGHTS
  - EMBLEMS
  - CAMERAS
- ENGINES
- TRANSMISSIONS
- WHEELS & TYRES
  - WHEELS
  - TYRES
CUTS
PERFORMANCE PARTS
TRUCK PARTS
MOTORBIKE PARTS
- MOTORBIKE ENGINES
- MOTORBIKE ACCESSORIES

MOTORBIKES
MARINE
FORKLIFTS
MACHINERY
AGRICULTURAL
OTHER
COUNTRY
- AUSTRALIA
- CANADA
- KENYA
- MYANMAR
- NEW ZEALAND
- PAKISTAN
- TANZANIA
- UNITED STATES

CARVIEW

MOTORHOMES

Select Language

HTTP/2 200 server: GitHub.com content-type: text/html; charset=utf-8 last-modified: Sat, 06 Dec 2025 20:56:14 GMT access-control-allow-origin: * etag: W/"6934986e-761a" expires: Mon, 29 Dec 2025 18:05:20 GMT cache-control: max-age=600 content-encoding: gzip x-proxy-cache: MISS x-github-request-id: 2F48:21D6A4:93EB80:A5BF35:6952C087 accept-ranges: bytes age: 0 date: Mon, 29 Dec 2025 17:55:20 GMT via: 1.1 varnish x-served-by: cache-bom-vanm7210033-BOM x-cache: MISS x-cache-hits: 0 x-timer: S1767030920.424995,VS0,VE206 vary: Accept-Encoding x-fastly-request-id: e1a87329b22b478d120f1ffc3621270585985639 content-length: 9965 Igor Shilov

home (current)
publications
cv

Igor Shilov

PhD student at Imperial College London working on ML privacy & security. Big Tech dropout.

about

Hi, I’m Igor!

I’m currently a PhD student at the Computational Privacy Group at Imperial College London under the supervision of Yves-Alexandre de Montjoye. Recently, I’ve been a part of the inaugural Anthropic AI Safety Fellowship Program.

My research interests include:

🏛️ LLM Privacy & Security
🏛️ Differential Privacy
🏛️ Model internals

I come from a strong engineering background, having spent over a decade in the industry as a Software Engineer. Most recently, I was at Meta AI as a Research Engineer in Privacy Preserving ML group. During my time there, I led the development of Opacus, an open-source PyTorch library for training models with Differential Privacy that has gained significant community adoption. I also helped design StopNCII.org, a privacy-preserving platform that uses on-device perceptual hashing to combat non-consensual intimate image sharing.

contact

I echo the standing invitation: I like getting email and I like talking to people. Please do reach out if you’re interested in discussing research, collaboration opportunities, or the future of privacy in AI!

I can also be found in Twitter.

selected publications

NeurIPS 2025

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, George Kaissis, Katherine Lee, and 10 more authors

arXiv preprint 2505.18773, 2025

Abs Paper

State-of-the-art membership inference attacks (MIAs) typically require training many reference models, making it difficult to scale these attacks to large pre-trained language models (LLMs). As a result, prior research has either relied on weaker attacks that avoid training reference models (e.g., fine-tuning attacks), or on stronger attacks applied to small-scale models and datasets. However, weaker attacks have been shown to be brittle - achieving close-to-arbitrary success - and insights from strong attacks in simplified settings do not translate to today’s LLMs. These challenges have prompted an important question: are the limitations observed in prior work due to attack design choices, or are MIAs fundamentally ineffective on LLMs? We address this question by scaling LiRA - one of the strongest MIAs - to GPT-2 architectures ranging from 10M to 1B parameters, training reference models on over 20B tokens from the C4 dataset. Our results advance the understanding of MIAs on LLMs in three key ways: (1) strong MIAs can succeed on pre-trained LLMs; (2) their effectiveness, however, remains limited (e.g., AUC<0.7) in practical settings; and, (3) the relationship between MIA success and related privacy metrics is not as straightforward as prior work has suggested.
USENIX 2025

Free Record-Level Privacy Risk Evaluation Through Artifact-Based Methods

Joseph Pollock^*, Igor Shilov^*, Euodia Dodd, and Yves-Alexandre Montjoye

In 34th USENIX Security Symposium (USENIX Security 25), 2025

Abs Paper Code Website

Membership inference attacks (MIAs) are widely used to empirically assess the privacy risks of samples used to train a target machine learning model. State-of-the-art methods however require training hundreds of shadow models, with the same size and architecture of the target model, solely to evaluate the privacy risk. While one might be able to afford this for small models, the cost often becomes prohibitive for medium and large models. We here instead propose a novel approach to identify the at-risk samples using only artifacts available during training, with little to no additional computational overhead. Our method analyzes individual per-sample loss traces and uses them to identify the vulnerable data samples. We demonstrate the effectiveness of our artifact-based approach through experiments on the CIFAR10 dataset, showing high precision in identifying vulnerable samples as determined by a SOTA shadow model-based MIA (LiRA). Impressively, our method reaches the same precision as another SOTA MIA when measured against LiRA, despite it being orders of magnitude cheaper. We then show LT-IQR to outperform alternative loss aggregation methods, perform ablation studies on hyperparameters, and validate the robustness of our method to the target metric. Finally, we study the evolution of the vulnerability score distribution throughout training as a metric for model-level risk assessment.
SaTML 2025

SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)

Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre Montjoye

In 3rd IEEE Conference on Secure and Trustworthy Machine Learning, 2025

Abs Paper Code

Whether LLMs memorize their training data and what this means, from privacy leakage to detecting copyright violations – has become a rapidly growing area of research over the last two years. In recent months, more than 10 new methods have been proposed to perform Membership Inference Attacks (MIAs) against LLMs. Contrary to traditional MIAs which rely on fixed – but randomized – records or models, these methods are mostly evaluated on datasets collected post-hoc. Sets of members and non-members, used to evaluate the MIA, are constructed using informed guesses after the release of a model. This lack of randomization raises concerns of a distribution shift between members and non-members. In the first part, we review the literature on MIAs against LLMs. While most work focuses on sequence-level MIAs evaluated in post-hoc setups, we show that a range of target models, motivations and units of interest have been considered in the literature. We then quantify distribution shifts present in the 6 datasets used in the literature, ranging from books to papers, using a bag of word classifier. Our analysis reveals that all of them suffer from severe distribution shifts. This challenges the validity of using such setups to measure LLM memorization and may undermine the benchmarking of recently proposed methods. Yet, all hope might not be lost. In the second part, we introduce important considerations to properly evaluate MIAs against LLMs and discuss potential ways forward: randomized test splits, injections of randomized (unique) sequences, randomized finetuning, and post-hoc control methods. While each option comes with its advantages and limitations, we believe they collectively provide solid grounds to guide the development of MIA methods and study LLM memorization. We conclude by proposing comprehensive, easy-to-use benchmarks for sequence- and document-level MIAs against LLMs.
ICML 2024

Copyright Traps for Large Language Models

Matthieu Meeus^*, Igor Shilov^*, Manuel Faysse, and Yves-Alexandre Montjoye

In Forty-first International Conference on Machine Learning, 2024

Press coverage in MIT Technology Review and Nature News.

Abs Paper Code

Questions of fair use of copyright-protected content to train Large Language Models (LLMs) are being very actively debated. Document-level inference has been proposed as a new task: inferring from black-box access to the trained model whether a piece of content has been seen during training. SOTA methods however rely on naturally occurring memorization of (part of) the content. While very effective against models that memorize a lot, we hypothesize–and later confirm–that they will not work against models that do not naturally memorize, e.g. medium-size 1B models. We here propose to use copyright traps, the inclusion of fictitious entries in original content, to detect the use of copyrighted materials in LLMs with a focus on models where memorization does not naturally occur. We carefully design an experimental setup, randomly inserting traps into original content (books) and train a 1.3B LLM. We first validate that the use of content in our target model would be undetectable using existing methods. We then show, contrary to intuition, that even medium-length trap sentences repeated a significant number of times (100) are not detectable using existing methods. However, we show that longer sequences repeated a large number of times can be reliably detected (AUC=0.75) and used as copyright traps. We further improve these results by studying how the number of times a sequence is seen improves detectability, how sequences with higher perplexity tend to be memorized more, and how taking context into account further improves detectability.
NeurIPS Workshop

Opacus: User-Friendly Differential Privacy Library in PyTorch

Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, and 6 more authors

In NeurIPS Workshop on Privacy in Machine Learning, 2021

Abs Paper Code

We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at opacus.ai). Opacus is designed for simplicity, flexibility, and speed. It provides a simple and user-friendly API, and enables machine learning practitioners to make a training pipeline private by adding as little as two lines to their code. It supports a wide variety of layers, including multi-head attention, convolution, LSTM, GRU (and generic RNN), and embedding, right out of the box and provides the means for supporting other user-defined layers. Opacus computes batched per-sample gradients, providing higher efficiency compared to the traditional "micro batch" approach. In this paper we present Opacus, detail the principles that drove its implementation and unique features, and benchmark it against other frameworks for training models with differential privacy as well as standard PyTorch.

selected projects

Opacus: User-Friendly Diﬀerential Privacy Library in PyTorch

Tech Lead (Meta)

I was the lead developer and maintainer of Opacus, a PyTorch library to train ML models with Differential Privacy. With 1k+ stars on github and 300+ citations, the tool is helping to advance the state-of-the-art in privacy preserving ML both internally and for the wider community of researchers.

Talks: PyTorch Developer Day 2021, UCL Privacy and Security Seminar

Blog Code Website
StopNCII.org: Stop Non-Consensual Intimate Image Abuse

Tech Lead (Meta)

I have lead the team developing a privacy-preserving platform helping combat non-consensual intimate image sharing, a joint effort between Meta and a UK-based NGO running “Revenge Porn Helpline”. The platform takes advantage of on-device perceptual hashing to protect privacy. Since it’s launch, many major social media platforms has signed up as industry partners, inclusing Reddit, Snap, TilTok, OnlyFans and PornHub.

Selected press: NBC News, Bloomberg, Mashable, Cosmopolitan

Blog Website

🇺🇦 Support Ukraine 🇺🇦

HOME
ABOUT
AUCTIONS
SHIPPING
FEES
TOOLS
HOW
FAQ
CONTACT

Original Source | Taken Source