Binxuan Huang
Principal Scientist at Amazon | Ph.D. in Computer Science
I am a Principal Scientist at Amazon, working on large language model pretraining for the Amazon Rufus shopping assistant. Prior to joining Amazon, I received my Ph.D. in Computer Science from Carnegie Mellon University, advised by Prof. Kathleen M. Carley. My doctoral work focused on social network analysis and data mining.
Education
Carnegie Mellon University
Ph.D. in Computer Science | 2020
Zhejiang University
B.S. in Physics, B.E. in Computer Science | 2015
Work Experience
Principal Scientist, Amazon, June 2020 - Present
Research Intern, Amazon, May 2019 - Aug 2019
Research Assistant, Carnegie Mellon University, Aug 2015 - May 2020
Recent Publications
- DocTalk: Scalable Graph-based Dialogue Synthesis for Enhancing LLM Conversational Capabilities
  Jing Yang Lee, Hamed Bonab, Nasser Zalmout, Ming Zeng, Sanket Lokegaonkar, Colin Lockard, Binxuan Huang, Ritesh Sarkhel, Haodong Wang
  SIGDIAL, 2025
- Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training
  Yuchen Zhuang, Jingfeng Yang, Haoming Jiang, Xin Liu, Kewei Cheng, Sanket Lokegaonkar, Yifan Gao, Qing Ping, Tianyi Liu, Binxuan Huang, Zheng Li, Zhengyang Wang, Pei Chen, Ruijie Wang, Rongzhi Zhang, Nasser Zalmout, Priyanka Nigam, Bing Yin, Chao Zhang
  NAACL, 2025
- Scaling laws for predicting downstream performance in LLMs
  Yangyi Chen, Binxuan Huang, Yifan Gao, Zhengyang Wang, Jingfeng Yang, Heng Ji
  TMLR, 2024
- Concept2Box: Joint Geometric Embeddings for Learning Two-View Knowledge Graphs
  Zijie Huang, Daheng Wang, Binxuan Huang, Chenwei Zhang, Jingbo Shang, Yan Liang, Zhengyang Wang, Xian Li, Christos Faloutsos, Yizhou Sun, Wei Wang
  ACL, 2023
- Tab-cleaner: Weakly supervised tabular data cleaning via pre-training for E-commerce catalog
  Kewei Cheng, Xian Li, Zhengyang Wang, Chenwei Zhang, Binxuan Huang, Yifan Ethan Xu, Xin Luna Dong, Yizhou Sun
  ACL, 2023