Welcome! I’m Taco Cohen, AI research scientist at Meta / FAIR, interested in RL, LLMs, generative models, and equivariant networks.
Here is my CV and Google Scholar page.
Bio
Taco Cohen is an AI Research Scientist at Meta. He received a BSc in computer science from Utrecht University, and an MSc in AI and a PhD in ML (with Prof. Max Welling) from the University of Amsterdam (all three cum laude). During his PhD he developed and popularized equivariant networks and geometric deep learning. He was a co-founder of Scyfer, a company focussed on deep active learning, acquired by Qualcomm in 2017. At Qualcomm, he founded and led the generative models and data compression team. His current research is focused on code generation and reinforcement learning. During his studies he interned at Google DeepMind (working with Geoff Hinton) and OpenAI. He received the 2014 University of Amsterdam MSc thesis prize, a Google PhD Fellowship, and the ICLR 2018 best paper award for “Spherical CNNs”, was named one of 35 innovators under 35 by MIT Tech Review, and won the 2022 ELLIS PhD Award and the 2022 Kees Schouhamer Immink prize for his PhD research.
Research
Equivariant Networks & Geometric Deep Learning
During my PhD I developed the first group equivariant convolutional networks (G-CNNs) and several generalizations such as Steerable CNNs, Spherical CNNs, and Gauge Equivariant CNNs on manifolds, as well as a general theory of equivariant convolutions. Equivariant networks leverage knowledge of the symmetries of a learning problem to improve data efficiency, and have been shown to enjoy better scaling behavior than non-equivariant methods in problems with symmetries. Symmetries can be found in many applications of machine learning, such as medical imaging, drug design, materials design, analysis of protein structure, global climate modeling, lattice gauge theory, cryo-electron microscopy, graph-structured data, combinatorial optimization, robotics, and many others. In recent years, the field of equivariant networks (also known as geometric deep learning) has blossomed into a very active area of research, with many researchers developing equivariant networks for use in these domains.
In my work I have tried to be systematic, developing a general class of network architectures, a theory, and an approach to modeling rather than a specific network architecture. The general theory of equivariant convolutional networks, described in Part II of my PhD thesis, covers a large class of neural network layers, which can be categorized by the base space on which the data live (e.g. images on the plane, sphere, a graph, or other spaces), the kind of geometric quantity attached to each point in the base space (e.g. scalars, vectors, tensors), and the group of symmetries (permutation, translation, rotation, scaling, etc.). By varying these, one obtains many of the methods that can be found in the literature on equivariant networks. The “convolution is all you need” theorem states that the most general (i.e. least restricted) equivariant linear map between feature spaces in this class is a generalized convolution.
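As a toy illustration of the theorem (a sketch of the simplest special case, not code from the thesis), take 1D signals of length n with the group of cyclic translations. Averaging an arbitrary linear map over the group yields a map that commutes with every shift, and that map turns out to be a circulant matrix, i.e. a circular convolution. The helper names `shift_matrix`, `project_to_equivariant`, and `is_circulant` are hypothetical and introduced here only for illustration.

```python
import numpy as np

def shift_matrix(n, k):
    """Permutation matrix that cyclically shifts a length-n signal by k steps."""
    return np.roll(np.eye(n), k, axis=0)

def project_to_equivariant(W):
    """Average W over all cyclic shifts; the result commutes with every shift."""
    n = W.shape[0]
    shifts = [shift_matrix(n, k) for k in range(n)]
    # P.T is the inverse of P because P is a permutation matrix.
    return sum(P.T @ W @ P for P in shifts) / n

def is_circulant(M, tol=1e-10):
    """Circulant means M[i, j] depends only on (i - j) mod n."""
    return np.allclose(np.roll(np.roll(M, 1, axis=0), 1, axis=1), M, atol=tol)

rng = np.random.default_rng(0)
n = 6
W = rng.normal(size=(n, n))      # unconstrained linear map
W_eq = project_to_equivariant(W)  # shift-equivariant linear map
x = rng.normal(size=n)
P = shift_matrix(n, 2)

# Equivariance: shifting the input and then applying W_eq equals
# applying W_eq and then shifting the output.
equivariant = np.allclose(W_eq @ (P @ x), P @ (W_eq @ x))

# The equivariant map is a circular convolution with its first column.
kernel = W_eq[:, 0]
conv = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real
same_as_convolution = np.allclose(W_eq @ x, conv)

print(equivariant, is_circulant(W_eq), same_as_convolution)
```

The general theory replaces the cyclic group by other symmetry groups and base spaces, and scalar signals by fields of vectors or tensors; this sketch covers only the scalar, cyclic case.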
Together with Michael Bronstein, Joan Bruna and Petar Veličković, I am writing a book on geometric deep learning that aims to make equivariant networks accessible to a general machine learning audience. An early draft and lecture course can be found here.
Generative models & data compression
In 2019 I started the generative models & data compression team at Qualcomm, with the goal of applying the spectacular advances in deep generative modeling to the problem of image, video and speech compression. Generative models hold great potential for compression, not only because better probability models correspond to more efficient codes, but also because generative models have the unique capacity to generate details that are not coded in the bitstream. So whereas a classical video codec will produce artifacts such as blocking at low bitrates, a generative compression system can generate realistic-looking textures. Together with the semantic understanding afforded by deep networks, this allows for fine-grained control over which areas to encode with high fidelity, and which ones to generate on the receiver side.
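The link between probability models and code length can be made concrete with a small sketch. It assumes an idealized entropy coder that spends -log2 q(x) bits on a symbol x under model q, so the expected bitrate is the cross-entropy between the data distribution and the model. The function name `expected_code_length` and the example distributions are hypothetical, chosen only for illustration.

```python
import math

def expected_code_length(p, q):
    """Expected bits per symbol when data follows p but the code is built for q.

    Assumes an ideal entropy coder that uses -log2(q[i]) bits for symbol i.
    """
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# True symbol distribution of a toy source with four symbols.
p = [0.5, 0.25, 0.125, 0.125]

# A poor model (uniform) and a perfect model (equal to p).
uniform = [0.25, 0.25, 0.25, 0.25]

bits_uniform = expected_code_length(p, uniform)
bits_perfect = expected_code_length(p, p)

print(bits_uniform)  # 2.0 bits per symbol
print(bits_perfect)  # 1.75 bits per symbol, the entropy of p
```

The gap between the two numbers is the KL divergence from the model to the data distribution, which is why a better fit of the model translates directly into fewer bits.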
We published one of the first neural video codecs and introduced various novel concepts, such as instance-adaptive compression, feedback-recurrent autoencoders for speech and video coding, and the first transformer-based neural video codec. After several years of dedicated effort, our neural speech and video compression methods now significantly outperform classical codecs and run efficiently on low-power devices. Next-generation video coding standards will likely incorporate some of these advances.
RL & LLMs
I’m currently working on RL with LLMs for codegen and mathematics.
Links
Twitter: @TacoCohen
Google Scholar: link
Geometric Deep Learning Book & Video Lectures: link
NeurIPS Tutorial on Equivariant Networks with Risi Kondor: link
awesome-equivariant-network: overview of papers & videos on equivariant networks
Article in Quanta / Wired (2020)
TWIML Podcast on Natural Graph Networks (2020)
Publications
Selected Publications on Equivariant Networks
J. Brehmer, S. Behrends, P. De Haan, T. Cohen, Does equivariance matter at scale?, TMLR 2025
[ArXiv]
T.S. Cohen, Equivariant Convolutional Networks, PhD Thesis, University of Amsterdam, 2021
[pdf, defence ceremony] (Note: Part II contains a lot of new material, not published before)
Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković, Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, 2021
[ArXiv, website, youtube lectures]
P. de Haan, T. Cohen, M. Welling, Natural Graph Networks, NeurIPS 2020
[ArXiv]
T.S. Cohen, M. Weiler, B. Kicanaoglu, M. Welling, Gauge Equivariant Convolutional Networks and the Icosahedral CNN, Proceedings of the International Conference on Machine Learning (ICML), 2019
[ArXiv]
T.S. Cohen, M. Geiger, M. Weiler, A General Theory of Equivariant CNNs on Homogeneous Spaces, NeurIPS 2019
[ArXiv]
M. Winkels, T.S. Cohen, 3D G-CNNs for Pulmonary Nodule Detection. International Conference on Medical Imaging with Deep Learning (MIDL), 2018.
[ArXiv]
T.S. Cohen, M. Geiger, J. Koehler, M. Welling, Spherical CNNs. ICLR 2018 (Best paper award).
[pdf] [code]
T.S. Cohen, M. Welling, Steerable CNNs. International Conference on Learning Representations (ICLR), 2017
[pdf]
T.S. Cohen, M. Welling, Group Equivariant Convolutional Networks. Proceedings of the International Conference on Machine Learning (ICML), 2016
[pdf] [supp. mat.] [code for experiments] [G-Conv code]
All Publications
B. M. Wood, M. Dzamba, X. Fu, M. Gao, M. Shuaibi, L. Barroso-Luque, K. Abdelmaqsoud, V. Gharakhanyan, J. R. Kitchin, D. S. Levine, K. Michel, A. Sriram, T. Cohen, A. Das, A. Rizvi, S. J. Sahoo, Z. W. Ulissi, C. L. Zitnick, UMA: A Family of Universal Models for Atoms, 2025
[ArXiv]
Ori Yoran, Kunhao Zheng, Fabian Gloeckle, Jonas Gehring, Gabriel Synnaeve, Taco Cohen, The koLMogorov test: Compression by code generation, ICLR 2025
[ArXiv]
Yunhao Tang, Taco Cohen, David W Zhang, Michal Valko, Rémi Munos, RL-finetuning LLMs from on- and off-policy data with a single algorithm, 2025
[ArXiv]
T. Cohen, D. W. Zhang, K. Zheng, Y. Tang, R. Munos, G. Synnaeve, Soft Policy Optimization: Online Off-Policy RL for Sequence Models, 2025
[ArXiv]
J. Brehmer, S. Behrends, P. De Haan, T. Cohen, Does equivariance matter at scale?, TMLR 2025
[ArXiv]
K. Zheng, J. Decugis, J. Gehring, T. Cohen, B. Negrevergne, G. Synnaeve, What Makes Large Language Models Reason in (Multi-Turn) Code Generation?, ICLR 2025
[ArXiv]
J. Gehring, K. Zheng, J. Copet, V. Mella, Q. Carbonneaux, T. Cohen, G. Synnaeve, RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning, ICML 2025
[ArXiv]
R. Vuorio, P. De Haan, J. Brehmer, H. Ackermann, D. Dijkman, T. Cohen, Deconfounding Imitation Learning with Variational Inference, TMLR 2024
[ArXiv]
P. Mazzaglia, T. Cohen, D. Dijkman, Information-driven Affordance Discovery for Efficient Robotic Manipulation, ICRA 2024
[ArXiv, project page]
M. Defferrard, C. Rainone, D. W. Zhang, B. Manczak, N. Butt, T. Cohen, Towards Self-Improving Language Models for Code Generation, LLMAgents workshop @ ICLR 2024
[coming soon]
N. Butt, B. Manczak, A. Wiggers, C. Rainone, D. Zhang, M. Defferrard, T. Cohen, CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay, ICML 2024
(also shared at the AGI Workshop @ ICLR 2024)
[ArXiv]
P. De Haan, T. Cohen, J. Brehmer, Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers, NeurReps workshop @ NeurIPS 2023 & AISTATS 2024
[ArXiv]
J. Brehmer, P. De Haan, S. Behrends, T. Cohen, Geometric Algebra Transformers, NeurIPS 2023
[ArXiv] [code]
A. Duval, S. V. Mathis, C. K. Joshi, V. Schmidt, S. Miret, F. D. Malliaros, T. Cohen, P. Liò, Y. Bengio, M. Bronstein, A Hitchhiker’s Guide to Geometric GNNs for 3D Atomic Systems, 2023
[ArXiv]
P. Lippe, S. Magliacane, S. Löwe, Y. M. Asano, T. Cohen, E. Gavves, BISCUIT: Causal Representation Learning from Binary Interactions, UAI 2023
[pdf]
J. Brehmer*, J. Bose*, P. De Haan, T. Cohen, EDGI: Equivariant diffusion for planning with embodied agents, Reincarnating RL workshop @ ICLR 2023
[ArXiv]
C. K. Joshi, C. Bodnar, S. V. Mathis, T. Cohen, P. Liò, On the Expressive Power of Geometric Graph Neural Networks, ICML 2023
[ArXiv]
T. van Rozendaal, J. Brehmer, Y. Zhang, R. Pourreza, T.S. Cohen, Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set, TMLR 2023
[ArXiv]
N. Butt, A. Wiggers, T. Cohen, M. Welling, Program Synthesis for Integer Sequence Generation, MATH-AI Workshop @ NeurIPS 2022
[pdf]
R. Vuorio, J. Brehmer, H. Ackermann, D. Dijkman, T. Cohen, P. de Haan, Deconfounded Imitation Learning, Deep RL Workshop @ NeurIPS 2022
[ArXiv]
J. Brehmer, P. De Haan, P. Lippe, T. Cohen, Weakly supervised causal representation learning, NeurIPS 2022
Earlier versions published at the OSC Workshop @ ICLR 2022 and Causal Representation Learning Workshop @ UAI 2022
[ArXiv]
G. Cesa, A. Behboodi, T.S. Cohen, M. Welling, On the symmetries of the synchronization problem in Cryo-EM: Multi-Frequency Vector Diffusion Maps on the Projective Plane, NeurIPS 2022
[soon]
A. Behboodi, G. Cesa, T.S. Cohen, A PAC-Bayesian Generalization Bound for Equivariant Networks, NeurIPS 2022
[ArXiv]
T.S. Cohen, Towards a Grounded Theory of Causation for Embodied AI, Causal Representation Learning Workshop @ UAI 2022
[ArXiv]
P. Lippe, S. Magliacane, S. Löwe, Y. M. Asano, T. Cohen, E. Gavves, Intervention Design for Causal Representation Learning, Causal Representation Learning Workshop @ UAI 2022
[pdf]
P. Lippe, S. Magliacane, S. Löwe, Y. M. Asano, T. Cohen, E. Gavves, iCITRIS: Causal Representation Learning for Instantaneous Temporal Effects, Causal Representation Learning Workshop @ UAI 2022
[ArXiv]
Y. Perugachi-Diaz, G. Sautière, D. Abati, Y. Yang, A. Habibian, T. Cohen, Region-of-Interest Based Neural Video Compression, BMVC 2022
[ArXiv]
S. Basu, J. Gallego-Posada, F. Viganò, J. Rowbottom, T. Cohen, Equivariant Mesh Attention Networks, TMLR 2022
[ArXiv]
P. Lippe, S. Magliacane, S. Löwe, Y. M. Asano, T. Cohen, E. Gavves, CITRIS: Causal Identifiability from Temporal Intervened Sequences, ICML 2022
Earlier version published at OSC Workshop @ ICLR, 2022
[ArXiv]
Y. Zhang, T. van Rozendaal, J. Brehmer, M. Nagel, T. Cohen, Implicit Neural Video Compression, DGM4HSD Workshop @ ICLR, 2022
[ArXiv]
P. Lippe, T. Cohen, E. Gavves, Efficient Neural Causal Discovery without Acyclicity Constraints, ICLR 2022
[ArXiv]
Y. Zhu, Y. Yang, T. Cohen, Transformer-based transform coding, ICLR 2022
[pdf]
T.S. Cohen, Equivariant Convolutional Networks, PhD Thesis, University of Amsterdam, 2021
[pdf, defence ceremony] (Note: Part II contains a lot of new material, not published before)
Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković, Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, 2021
[ArXiv, website, youtube lectures]
A. K. Singh, H. E. Egilmez, R. Pourreza, M. Coban, M. Karczewicz, T. S. Cohen, A Combined Deep Learning based End-to-End Video Coding Architecture for YUV Color Space
[ArXiv]
H. E. Egilmez, A. K. Singh, M. Coban, M. Karczewicz, Y. Zhu, Y. Yang, A. Said, T. S. Cohen, Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces, IEEE Open Journal of Signal Processing, 2021.
[ArXiv]
R. Pourreza, T. Cohen, Extending Neural P-frame Codecs for B-frame Coding, ICCV 2021.
[ArXiv]
A. Habibian, D. Abati, T. Cohen, B. Ehteshami Bejnordi, Skip-Convolutions for Efficient Video Processing, CVPR 2021.
[ArXiv]
Y. Lu, Y. Zhu, Y. Yang, A. Said, T. Cohen, Progressive Neural Image Compression with Nested Quantization and Latent Ordering, ICIP 2021
[ArXiv]
T. van Rozendaal*, I.A.M. Huijben*, T. Cohen, Overfitting for Fun and Profit: Instance-Adaptive Data Compression, ICLR 2021
(*equal contribution)
[ArXiv]
P. de Haan, M. Weiler, T. Cohen, M. Welling, Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs, ICLR 2021 (spotlight)
[ArXiv]
E. Hoogeboom, T.S. Cohen, J.M. Tomczak, Learning Discrete Distributions by Dequantization, 3rd AABI Symposium 2021
[ArXiv]
P. de Haan, T. Cohen, M. Welling, Natural Graph Networks, NeurIPS 2020
[ArXiv]
D. Kianfar, A. Wiggers, A. Said, R. Pourreza, T. Cohen, Parallelized Rate-Distortion Optimized Quantization using Deep Learning, IEEE MMSP 2020
[ArXiv]
A. Pervez, T. Cohen, E. Gavves, Low Bias Low Variance Gradient Estimates for Hierarchical Boolean Stochastic Networks, ICML 2020
[pdf]
A. Golinski*, R. Pourreza*, Y. Yang*, G. Sautiere, T. Cohen, Feedback Recurrent Autoencoder for Video Compression, ACCV 2020
(*equal contribution)
[ArXiv]
T. van Rozendaal, G. Sautière, T.S. Cohen, Lossy Compression with Distortion Constrained Optimization, Workshop and Challenge on Learned Image Compression (CLIC) at CVPR 2020.
[ArXiv]
V. Veerabadran, R. Pourreza, A. Habibian, T. Cohen, Adversarial Distortion for Learned Video Compression, Workshop and Challenge on Learned Image Compression (CLIC) at CVPR 2020.
[ArXiv]
M. Mohamed, G. Cesa, T.S. Cohen, M. Welling, A Data and Compute Efficient Design for Limited-Resources Deep Learning, Practical Machine Learning for Developing Countries Workshop (ICLR), 2020
[ArXiv]
Y. Yang, G. Sautière, J. Jon Ryu, T.S. Cohen, Feedback Recurrent AutoEncoder, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
[ArXiv]
A. Habibian, T. van Rozendaal, J. Tomczak, T.S. Cohen, Video Compression with Rate-Distortion Autoencoders, International Conference on Computer Vision (ICCV), 2019
[ArXiv]
Miranda C.N. Cheng, Vassilis Anagiannis, Maurice Weiler, Pim de Haan, Taco S. Cohen, Max Welling, Covariance in Physics and Convolutional Networks, Theoretical Physics for Deep Learning Workshop @ ICML, 2019
[ArXiv]
T.S. Cohen, M. Weiler, B. Kicanaoglu, M. Welling, Gauge Equivariant Convolutional Networks and the Icosahedral CNN, Proceedings of the International Conference on Machine Learning (ICML), 2019
[ArXiv]
T.S. Cohen, M. Geiger, M. Weiler, A General Theory of Equivariant CNNs on Homogeneous Spaces, NeurIPS 2019
[ArXiv]
M. Weiler, M. Geiger, M. Welling, W. Boomsma, T.S. Cohen, 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data, Advances in Neural Information Processing Systems (NeurIPS), 2018
[ArXiv] [code] [video]
L. Falorsi, P. de Haan, T. R. Davidson, N. De Cao, M. Weiler, P. Forré and T. S. Cohen, Explorations in Homeomorphic Variational Auto-Encoding, ICML Workshop on Theoretical Foundations and Applications of Generative Models, 2018
[ArXiv] [code]
T.S. Cohen, M. Geiger, M. Weiler, Intertwiners between Induced Representations (with applications to the theory of equivariant neural networks), ArXiv preprint 1803.10743, 2018.
[ArXiv]
M. Winkels, T.S. Cohen, Pulmonary Nodule Detection in CT Scans with Equivariant CNNs, Medical Image Analysis, 2018.
[Link]
M. Winkels, T.S. Cohen, 3D G-CNNs for Pulmonary Nodule Detection. International Conference on Medical Imaging with Deep Learning (MIDL), 2018.
[ArXiv]
M. Winkels, T.S. Cohen, 3D Group-Equivariant Neural Networks for Octahedral and Square Prism Symmetry Groups, FAIM/ICML Workshop on Towards learning with limited labels: Equivariance, Invariance, and Beyond, 2018.
B.S. Veeling, J. Linmans, J. Winkens, T.S. Cohen, M. Welling, Rotation Equivariant CNNs for Digital Pathology. International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2018.
[ArXiv] [PCam repo] [Code]
J. Winkens, J. Linmans, B.S. Veeling, T.S. Cohen, M. Welling, Improved Semantic Segmentation for Histopathology using Rotation Equivariant Convolutional Networks. International Conference on Medical Imaging with Deep Learning (MIDL workshop), 2018.
[pdf]
J. Linmans, J. Winkens, B.S. Veeling, T.S. Cohen, M. Welling, Sample Efficient Semantic Segmentation using Rotation Equivariant Convolutional Networks, FAIM/ICML Workshop on Towards learning with limited labels: Equivariance, Invariance, and Beyond, 2018.
T.S. Cohen, M. Geiger, J. Koehler, M. Welling, Spherical CNNs. ICLR 2018 (Best paper award).
[pdf] [code]
E. Hoogeboom, J.W.T. Peters, T.S. Cohen, M. Welling, HexaConv. ICLR 2018.
[pdf]
T.S. Cohen, M. Geiger, J. Koehler, M. Welling, Convolutional Networks for Spherical Signals. In Principled Approaches to Deep Learning Workshop ICML 2017.
[ArXiv]
A. Eck, L.M. Zintgraf, E.F.J. de Groot, T.G.J. de Meij, T.S. Cohen, P.H.M. Savelkoul, M. Welling, A.E. Budding, Interpretation of microbiota-based diagnostics by explaining individual classifier decisions, BMC Bioinformatics, 2017.
[pubmed]
T. Matiisen, A. Oliver, T.S. Cohen, J. Schulman, Teacher-Student Curriculum Learning. IEEE Transactions on Neural Networks and Learning Systems, 2019. (An earlier version was presented at the Deep Reinforcement Learning Symposium, NIPS 2017)
[ArXiv]
T.S. Cohen, M. Welling, Steerable CNNs. International Conference on Learning Representations (ICLR), 2017
[pdf]
L.M. Zintgraf, T.S. Cohen, T. Adel, M. Welling, Visualizing Deep Neural Network Decisions: Prediction Difference Analysis. International Conference on Learning Representations (ICLR), 2017
[pdf]
T. Adel, T.S. Cohen, M. Caan, M. Welling, 3D Scattering Transforms for Disease Classification in Neuroimaging. Neuroimage: clinical, 2017.
[pubmed]
T.S. Cohen, M. Welling, Group Equivariant Convolutional Networks. Proceedings of the International Conference on Machine Learning (ICML), 2016
[pdf] [supp. mat.] [code for experiments] [G-Conv code]
L.M. Zintgraf, T.S. Cohen, M. Welling, A New Method to Visualize Deep Neural Networks. ArXiv preprint 1603.02518, 2016
[ArXiv]
T.S. Cohen, M. Welling, Harmonic Exponential Families on Manifolds. Proceedings of the International Conference on Machine Learning (ICML), 2015
[pdf] [supp. mat.]
T.S. Cohen, M. Welling, Transformation Properties of Learned Visual Representations. International Conference on Learning Representations (ICLR), 2015.
[ArXiv]
T.S. Cohen, M. Welling, Learning the Irreducible Representations of Commutative Lie Groups. Proceedings of the International Conference on Machine Learning (ICML), 2014.
[pdf] [supp. mat.]
T.S. Cohen, Learning Transformation Groups and their Invariants. Master’s thesis, University of Amsterdam, 2013. (1st place University of Amsterdam thesis prize 2014)
[pdf]





