Michael Hu
I am a fourth-year PhD student at the NYU Center for Data Science, advised by Kyunghyun Cho and Tal Linzen. I am supported by the NSF Graduate Research Fellowship.
I study how to train and adapt large language models: [Aioli], [pre-pretraining].
I'm also interested in cognitive science and training dynamics.
- Cognitive science: [On human-scale LMs], [BabyLM]
- Training dynamics: [Latent state models], [Visualization]
Previously, I completed a BSE at Princeton CS, where I spent two lovely years working with Karthik Narasimhan and Tom Griffiths. I then spent two years at Yobi AI as its first employee.
In my spare time, I enjoy cooking, running, and playing basketball.
News
| Date | News |
| --- | --- |
| August 2025 | Gave a talk on Aioli and pre-pretraining at Harvard ML Foundations. |
| July 2025 | Pre-pretraining won an Outstanding Paper Award at ACL 2025! 🏅 |
| July 2025 | New preprints: Scaling Laws Are Unreliable for Downstream Tasks and RELIC: Evaluating Compositional Instruction Following via Language Recognition. |
| Spring 2025 | Gave talks on pre-pretraining at École Normale Supérieure CoML, FLaNN, Ryco Lab Reading Group, and the CDS Seminar. |
| April 2025 | Presenting Aioli: A Unified Optimization Framework for Language Model Data Mixing and How to visualize training dynamics in neural networks at ICLR 2025. |