I like rice pudding
Jerry Huang
Starving Grad Student
About Me
- I am a PhD student at Mila.
- My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. I also dabble in applied natural language processing and AI research.
- I am grateful to have been supported by NSERC, FRQNT, Hydro-Québec and the Japan Society for the Promotion of Science (JSPS) during my studies, allowing me to conduct research on topics of personal fulfillment and interest.
- I am always on the hunt for research internships or visiting opportunities. Please reach out by e-mail if there is overlap in research interests or if you simply feel I would be a good fit for your team. For a sense of my background and skills, refer to my (current) representative papers (below) and GitHub (which may be slightly out of date). Any unavailable manuscripts or code are available upon request.
- Additionally, I anticipate graduating in the 2026-2027 academic year and am looking for post-doctoral or research scientist positions; if you see a fit within your group, please do not hesitate to contact me. References are available upon request.
Relevant Works
A list of papers representing my current research interests is below. For a more complete list, refer to DBLP.
- Mamba Modulation: On the Length Generalization of Mamba Models [Link]
- Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
- Calibrated Language Models and How to Find Them with Label Smoothing [Link]
- ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
- How Well Can a Long Sequence Model Model Long Sequences? [Link]
- Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]
Working With Me
- I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out! I am always open to discussing topics you might want to work on and to providing support to start something new.
- I generally prioritize intermediate-to-advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning, in that order. Expectations for coding experience and a more advanced math background (differential equations, measure/function theory, optimization, etc.) vary depending on the topics we might work on together.
- Nevertheless, everyone's experience is different, and I encourage those from unconventional backgrounds (in particular, outside engineering or math) to contact me if interested.
Contacting Me
- I am always reachable through e-mail, with addresses structured as "[first_name].[last_name]@[domain_name]" for Mila (mila.quebec) and Université de Montréal (umontreal.ca). Though I try to be as responsive as possible, do anticipate potential delays of up to 48 hours if I am busy.