I like rice pudding
Jerry Huang
Starving Grad Student
About Me
- I am a PhD student at Mila.
- My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. I also dabble in applied natural language processing and AI research.
- I am grateful to have been supported by NSERC, FRQNT, Hydro-Québec and the Japan Society for the Promotion of Science (JSPS) during my studies, allowing me to conduct research on topics of personal fulfillment and interest.
- I am always on the hunt for research internships or visiting opportunities. Please reach out by e-mail if there is overlap in research interests or if you simply feel I would be a good fit for your team. For a sense of my background and skills, refer to my (current) representative papers (below) and GitHub (which may be slightly out of date). Any unavailable manuscripts or code are available upon request.
- Additionally, I anticipate graduating in the 2026-2027 academic year and am looking for post-doctoral or research scientist positions; if you see a fit within your group, please do not hesitate to contact me. References are available upon request.
Relevant Works
A list of papers representing my current research interests is below. For a more complete list, refer to DBLP.
- Mamba Modulation: On the Length Generalization of Mamba Models [Link]
- Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
- Calibrated Language Models and How to Find Them with Label Smoothing [Link]
- ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
- How Well Can a Long Sequence Model Model Long Sequences? [Link]
- Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]
Working With Me
- I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out! I am always open to discussing topics you might want to work on and to providing support to start something new.
- I generally prioritize intermediate-to-advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning, in that order. Expectations for coding experience and a more advanced math background (differential equations, measure/function theory, optimization, etc.) vary depending on the topics we might work on together.
- Nevertheless, everyone's experience is different, and I encourage those from unconventional backgrounds (in particular, outside engineering or math) to contact me if interested.
Contacting Me
- I am always reachable through e-mail, with addresses structured as "[first_name].[last_name]@[domain_name]" for Mila (mila.quebec) and Université de Montréal (umontreal.ca). Though I try to be as responsive as possible, do anticipate potential delays of up to 48 hours if I am busy.