Heng Dong | ByteDance
Researcher
ByteDance
drdhxi (at) gmail.com
Heng Dong
About Me
I am currently a researcher at ByteDance Seed in Beijing, where I work on embodied intelligence and multi-modal foundation models for robotics.
I received my PhD from the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University, under the supervision of Prof. Chongjie Zhang and Prof. Yi Wu. This institute is led by Turing Award laureate Prof. Andrew Yao.
The goal of my research is to endow agents with superhuman intelligence, which I believe can be achieved through learning from interactions using modern models. My biography is here.
Research Interests
- Learning from Interactions: RL, Robot Control & Design, Multi-Agent
- Modern Models: LLMs & VLMs, Diffusion Models, Flow Models
Experience
- [Jun. 2025 - Present] Researcher at ByteDance Seed
- [Sep. 2020 - Jun. 2025] Ph.D. in Artificial Intelligence, Tsinghua University
- [Sep. 2016 - Jun. 2020] B.S. in Computer Science, University of Science and Technology of China
News
- [Sep. 2025] We released Robix, a unified robot brain that integrates reasoning, planning, and natural interaction, demonstrating superior performance over GPT-4o and Gemini-2.5-pro.
- [May 2025] Our paper on enhancing the decision-making ability of LLMs was accepted to ICML 2025.
Publications
- [Tech Report 2025] Huang Fang*, Mengxi Zhang*, Heng Dong*, Wei Li*, Zixuan Wang, Qifeng Zhang, Xueyun Tian, Yucheng Hu, Hang Li. TL;DR: Robix is a unified robot brain that integrates reasoning, planning, and human interaction within a single vision-language architecture, outperforming GPT-4o and Gemini-2.5-pro through its novel chain-of-thought reasoning and three-stage training strategy.
- [ICML 2025] Heng Dong*, Kefei Duan*, Chongjie Zhang. TL;DR: We propose a novel LLM-based Actor-Critic framework that enhances LLMs' decision-making through long-term action evaluations and efficient policy improvements.
- [arXiv 2025] Xinyi Yang, Liang Zeng, Heng Dong, Chao Yu, Xiaoran Wu, Huazhong Yang, Yu Wang, Milind Tambe, Tonghan Wang. TL;DR: We train LLMs to generate better agent explanations using RL with feedback from generative models, outperforming standard RLHF methods.
- [AAMAS 2025] TL;DR: We show that diffusion models can reconstruct global states in decentralized partially observable multiagent systems, with approximation errors leading to deviations that can be bounded for convergence to the true state.
- [ICLR 2024] Heng Dong*, Junyu Zhang*, Chongjie Zhang. TL;DR: We propose to design multi-cellular robots in a coarse-to-fine manner and leverage hyperbolic embeddings for implementation.
- [ICML 2023] TL;DR: We exploit the structure of the design space in robot design problems with symmetry characteristics and generate robots with high performance more efficiently.
- [NeurIPS 2022] TL;DR: A muscle synergy inspired RL framework that groups actuators into synchronised modules for efficient control of high-DoF robots, demonstrating superior efficiency and generalizability on complex morphologies like Humanoids++ and UNIMALs.
- [arXiv 2021] TL;DR: We identify that fragile cooperation in sequential social dilemmas stems from second-order conflicts in incentive mechanisms, and propose a homophily-based learning framework that achieves stable cooperation in public goods and tragedy of the commons scenarios.
- [ICLR 2021] TL;DR: A decomposed policy gradient method that outperforms state-of-the-art multi-agent RL algorithms by enabling efficient off-policy learning and better credit assignment.
- [ICML 2020] TL;DR: A role-based MARL framework where agents autonomously learn specialized roles, improving performance on StarCraft II micromanagement.
Most of my research is about robotics and agents. Some papers are highlighted.
Services
Conference Reviewer
- [2022 - present] Annual Conference on Neural Information Processing Systems (NeurIPS)
- [2022 - present] International Conference on Machine Learning (ICML)
- [2022 - present] International Conference on Learning Representations (ICLR)
- [2025 - present] Association for the Advancement of Artificial Intelligence (AAAI)
Teaching Assistant
- [Fall, 2021] Artificial Intelligence: Principles and Techniques, IIIS, Tsinghua
- [Spring, 2022] Reinforcement Learning, IIIS, Tsinghua