This GitHub repository contains the code for the reproducible experiments presented in our paper Differentially Private Permutation Tests: Applications to Kernel Methods.
The code is written in JAX, which can leverage GPUs to provide considerable computational speedups.
Requirements: Python 3.9.
The packages in env_cpu.yml/env_gpu.yml are required to run our tests and the ones we compare against.
The instructions to install only the dependencies required to run dpMMD and dpHSIC are available on the dpkernel repository.
In a chosen directory, clone the repository and change to its directory by executing
git clone git@github.com:antoninschrab/dpkernel-paper.git
cd dpkernel-paper
We then recommend creating a conda environment with the required dependencies:
- for GPU:
conda env create -f env_gpu.yml
conda activate dpkernel-env
# can be deactivated by running:
# conda deactivate
- or, for CPU:
conda env create -f env_cpu.yml
conda activate dpkernel-env
# can be deactivated by running:
# conda deactivate
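Once the environment is activated, a quick check can confirm which Python version is running and whether JAX sees a GPU. The following is a minimal sketch: the helper name `describe_environment` is hypothetical and not part of this repository, and it only relies on `jax.default_backend()` and `jax.devices()`.

```python
import importlib.util
import sys


def describe_environment():
    """Return the Python version and, if JAX is installed, its backend and devices.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "jax_installed": False,
    }
    if importlib.util.find_spec("jax") is not None:
        import jax  # imported lazily so the check also runs without JAX

        info["jax_installed"] = True
        info["backend"] = jax.default_backend()  # for example "cpu" or "gpu"
        info["devices"] = [str(device) for device in jax.devices()]
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

With the GPU environment, the reported backend should not be the CPU one; if it is, the JAX installation probably did not pick up the GPU.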
The img_align_celeba.zip file containing the CelebA images can either be downloaded manually from the official website or be downloaded by executing
wget https://cseweb.ucsd.edu/\~weijian/static/datasets/celeba/img_align_celeba.zip
The file then needs to be unzipped in the main dpkernel-paper directory by running
unzip img_align_celeba.zip
The resulting img_align_celeba directory then contains all the CelebA images.
The list of attributes for the CelebA images is already provided as the list_attr_celeba.txt file.
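The attributes can be loaded in Python along the following lines. This is a minimal sketch, assuming list_attr_celeba.txt follows the standard CelebA layout: the first line is the number of images, the second line the attribute names, then one line per image with the file name followed by 1 or -1 for each attribute. The function name `parse_celeba_attributes` is hypothetical and not part of this repository; sampler_celeba.py may read the file differently.

```python
def parse_celeba_attributes(lines):
    """Parse attribute lines into (attribute_names, {filename: {name: bool}}).

    Hypothetical helper; assumes the standard CelebA attribute file layout.
    """
    rows = [line.split() for line in lines if line.strip()]
    # Assumed layout: rows[0] holds the image count, rows[1] the attribute names.
    names = rows[1]
    table = {}
    for row in rows[2:]:
        filename, values = row[0], row[1:]
        if len(values) != len(names):
            raise ValueError(f"unexpected number of columns for {filename}")
        # An attribute is present when its value is "1" and absent when "-1".
        table[filename] = {name: value == "1" for name, value in zip(names, values)}
    return names, table


# Tiny inline sample in the assumed layout (made-up values, not real annotations).
sample = """2
Eyeglasses Smiling
000001.jpg -1  1
000002.jpg  1 -1
""".splitlines()

names, table = parse_celeba_attributes(sample)
smiling = [filename for filename, attrs in table.items() if attrs["Smiling"]]
print(names, smiling)

# On the real file, one would instead pass the open file object:
# with open("list_attr_celeba.txt") as handle:
#     names, table = parse_celeba_attributes(handle)
```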
The code to reproduce the experiments of the paper can be found in the following notebooks:
- experiments_perturbations.ipynb: perturbed uniform experiment,
- experiments_celeba.ipynb: CelebA experiment.
These rely on the samplers sampler_perturbations.py and sampler_celeba.py, and can, for example, be opened through Jupyter Lab by running
jupyter lab
The results of the experiments are saved in the results directory. Running the code of the figures.ipynb notebook generates the figures and saves them in the figures directory.
Our proposed dpMMD and dpHSIC tests are implemented in dpkernel.py.
To use our tests in practice, we recommend using our dpkernel package which is available on the dpkernel repository.
It can be installed by running
pip install git+https://github.com/antoninschrab/dpkernel.git
Installation instructions and example code are available on the dpkernel repository.
We also illustrate how to use the dpMMD and dpHSIC tests in the demo.ipynb notebook.
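For intuition only, the general shape of a differentially private permutation test can be sketched in plain Python. This is not the dpkernel.py implementation and does not use the dpkernel API. It assumes a Laplace-noise construction in which each of the B+1 statistics receives independent Laplace noise of scale 2Δ/ε, where Δ is the global sensitivity of the statistic. A toy mean-difference statistic stands in for the MMD, and all names below are hypothetical.

```python
import random


def laplace(rng):
    """Standard Laplace sample as the difference of two unit exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def mean_difference(x, y):
    """Toy statistic |mean(x) - mean(y)|, standing in for an MMD estimate."""
    return abs(sum(x) / len(x) - sum(y) / len(y))


def dp_permutation_test(x, y, epsilon, sensitivity, alpha=0.05, num_perms=500, seed=0):
    """Return (reject, p_value) for a Laplace-noised permutation test (sketch)."""
    rng = random.Random(seed)
    scale = 2.0 * sensitivity / epsilon  # assumed noise scale 2 * Delta / epsilon
    pooled = list(x) + list(y)
    n = len(x)
    noisy_original = mean_difference(x, y) + scale * laplace(rng)
    count = 1  # the original (noisy) statistic counts once
    for _ in range(num_perms):
        rng.shuffle(pooled)  # uniformly random permutation of the pooled sample
        stat = mean_difference(pooled[:n], pooled[n:])
        if stat + scale * laplace(rng) >= noisy_original:
            count += 1
    p_value = count / (num_perms + 1)
    return p_value <= alpha, p_value


# Data in [0, 1] with equal sample sizes: changing one point moves the toy
# statistic by at most 1 / n, which is used as the sensitivity here.
x = [0.0] * 100
y = [1.0] * 100
reject, p_value = dp_permutation_test(x, y, epsilon=10.0, sensitivity=1 / 100)
print(reject, p_value)
```

Smaller values of `epsilon` increase the noise scale, so the test becomes more private but less powerful. The dpMMD and dpHSIC tests should be run through dpkernel.py or the dpkernel package rather than this sketch.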
We implement two general methods for privatising the non-private MMD and HSIC tests, which we refer to as TOT (tot.py) and SARRM (sarrm.py):
- TOT: The Test of Tests: A Framework For Differentially Private Hypothesis Testing, Zeki Kazan, Kaiyan Shi, Adam Groce, Andrew Bray.
- SARRM: Differentially Private Hypothesis Testing with the Subsampled and Aggregated Randomized Response Mechanism, Victor Pena, Andres F. Barrientos.
We also compare dpMMD to the test of A Differentially Private Kernel Two-Sample Test, Anant Raj, Ho Chung Leon Law, Dino Sejdinovic, Mijung Park, using their implementation from the private_tst repository, which corresponds to the cloned directory private_me.
If you have any issues running our code, please do not hesitate to contact Antonin Schrab.
Centre for Artificial Intelligence, Department of Computer Science, University College London
Gatsby Computational Neuroscience Unit, University College London
Inria London
@unpublished{kim2023differentially,
title={Differentially Private Permutation Tests: {A}pplications to Kernel Methods},
author={Ilmun Kim and Antonin Schrab},
year={2023},
url = {https://arxiv.org/abs/2310.19043},
eprint={2310.19043},
archivePrefix={arXiv},
primaryClass={math.ST}
}
MIT License (see LICENSE.md).