Yang Liu (刘洋)
Principal Researcher, Microsoft
About Me
I am a Principal Researcher at Microsoft GenAI working on Large Language Models. I received my PhD from the University of Edinburgh in 2020, under the supervision of Prof. Mirella Lapata.
News
- [2025.03] 🚩 We present KodCode, the largest verified synthetic coding dataset for Code LLM training.
- [2024.07] 🚩We present Samba, a powerful hybrid LLM.
- [2024.05] 🚩I built GPT-4 Japanese.
- [2023.03] We proposed G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment.
- [2022.11] We release UniSumm, a SOTA few-shot summarization model.
- [2022.10] Five papers accepted by EMNLP 2022 and its findings.
- [2022.03] Two papers accepted by ACL 2022 and its findings.
- [2022.01] Gave a talk on Dialogue Summarization at Amazon. (Slides)
- [2021.03] Two long papers and one short paper accepted by NAACL 2021.
- [2021.01] Our RE-T5 model achieved first place in the CommonGen competition.
- [2020.10] Achieved first place in the FEVER competition.
Selected Publications
- KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
  Zhangchen Xu, Yang Liu, Yueqin Yin, Mingyuan Zhou, Radha Poovendran
- Efficiently Editing Mixture-of-Experts Models with Compressed Experts
  Yifei He, Yang Liu, Chen Liang, Hany Hassan Awadalla
- Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
  Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen
- Sparse Modular Activation for Efficient Sequence Modeling
  Liliang Ren, Yang Liu, Shuohang Wang, Yichong Xu, Chenguang Zhu, ChengXiang Zhai. NeurIPS 2023.
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu. EMNLP 2023.
- Task Compass: Scaling Multi-task Pre-training with Task Prefix
  Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng. EMNLP 2022 findings.
- Towards a Unified Multi-Dimensional Evaluator for Text Generation
  Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji, Jiawei Han. EMNLP 2022.
- AdaPrompt: Adaptive Model Training for Prompt-based NLP
  Yulong Chen, Yang Liu, Li Dong, Shuohang Wang, Chenguang Zhu, Michael Zeng and Yue Zhang. EMNLP 2022 findings.
- Unsupervised Summarization with Customized Granularities
  Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han. EMNLP 2022 findings.
- Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data
  Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu and Michael Zeng. ACL 2022.
- DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization
  Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng. AAAI 2022.
- Want To Reduce Labeling Cost? GPT-3 Can Help
  Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng. EMNLP 2021 findings.
- Retrieval Enhanced Model for Commonsense Generation
  Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng. ACL 2021 findings. [code]
- DialogSum: A Real-Life Scenario Dialogue Summarization Dataset
  Yulong Chen, Yang Liu, Liang Chen, Yue Zhang. ACL 2021 findings. [dataset]
- MediaSum: A Large-scale Media Interview Dataset for Dialogue Summarization
  Chenguang Zhu, Yang Liu, Jie Mei, Michael Zeng. NAACL 2021. (* indicates equal contribution) [dataset]
- Noisy Self-Knowledge Distillation for Text Summarization
  Yang Liu, Sheng Shen and Mirella Lapata. NAACL 2021.
- QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization
  Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev. NAACL 2021.
- Text Summarization with Pretrained Encoders
  Yang Liu and Mirella Lapata. EMNLP 2019. [code] [supplementary material]
- Hierarchical Transformers for Multi-document Summarization
  Yang Liu and Mirella Lapata. ACL 2019. [code]
- Generating Summaries with Topic Templates and Structured Convolutional Decoders
  Laura Perez-Beltrachini, Yang Liu, Mirella Lapata. ACL 2019.
- Fine-tune BERT for Extractive Summarization
  Yang Liu. [code]
- Single Document Summarization as Tree Induction
  Yang Liu, Ivan Titov and Mirella Lapata. NAACL 2019. [code]
Services
- Action Editor: ACL Rolling Review
- Senior Area Chair: AACL 2022
- Area Chair: NAACL 2021, COLING 2022
- Conference Program Committee Member: ACL 2016-2021, EMNLP 2017-2021, NAACL 2017-2020
Education
- PhD, Institute for Language, Cognition and Computation, University of Edinburgh
  Supervisor: Prof. Mirella Lapata
- MSc, Institute of Computational Linguistics, Peking University
  Supervisor: Prof. Sujian Li
- BE, School of Computer Science, Tianjin University