I am a PhD student at the Gatsby Computational Neuroscience Unit, UCL, advised by Peter Latham and Andrew Saxe. I am interested in how neural networks with various architectures learn. Previously, I worked on 3D computer vision at DJI R&D and CUHK CSE, and did my undergrad at HUST.
Papers
Saddle-to-Saddle Dynamics Explains A Simplicity Bias Across Neural Network Architectures
Yedi Zhang, Andrew Saxe, Peter E. Latham
Preprint 2025 webpage | arxiv
Training Dynamics of In-Context Learning in Linear Attention
Yedi Zhang, Aaditya K. Singh, Peter E. Latham*, Andrew Saxe*
ICML 2025 (Spotlight) pmlr | arxiv | openreview | code | talk
When Are Bias-Free ReLU Networks Effectively Linear Networks?
Yedi Zhang, Andrew Saxe, Peter E. Latham
TMLR 2025 tmlr | arxiv
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang, Peter E. Latham, Andrew Saxe
ICML 2024 webpage | pmlr | arxiv | code