I'm Yuancheng Wang (王远程), a PhD student in the School of Data Science (SDS) at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), supervised by Prof. Zhizheng Wu. Before that, I received my B.S. degree from CUHK-Shenzhen.
My research interests include multi-modal LLMs, generative AI for speech and audio, post-training, and representation learning. I am currently a research scientist intern at Meta Superintelligence Labs, working on enhancing the speech capabilities of Llama models. Previously, I interned at Microsoft Research Asia (MSRA) and ByteDance.
I have developed several advanced TTS models, including NaturalSpeech 3 and MaskGCT, and I am one of the main contributors and leaders of the open-source Amphion toolkit. My work has been published at top international AI conferences such as NeurIPS, ICML, ICLR, ACL, and IEEE SLT.
I am currently looking for a full-time position; feel free to contact me if you are interested in my experience!
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling".
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.