Biography
Hi! I am Wenyan. My research focuses on building and interpreting multimodal and NLP models. I completed my PhD with a focus on multimodal learning at the CoAStaL NLP Group, University of Copenhagen, where I was supervised by Anders Søgaard.
I was also a senior NLP researcher at SenseTime and at Comcast AI Research Lab. Before that, I spent a wonderful time at the University of Maryland, College Park for my MS, where I worked with Prof. Jordan Boyd-Graber on Natural Language Processing.
Feel free to reach out for collaboration on related projects.
In my free time, I enjoy painting, cooking, yoga, and table tennis :)
NEWS:
- 10/2025 – I have successfully defended my PhD!
- 08/2025 – Two papers (one main and one findings) have been accepted to EMNLP 2025!
- 07/2025 – CultureCLIP is accepted to COLM 2025!
- I’m currently on a research visit at RycoLab in Zürich until the end of June 2025.
- 11/2024 – Invited talk and short visit at MIT.
- 11/2024 – Our W1KP paper won the Outstanding Paper Award at EMNLP 2024!
- 11/2024 – I will present FoodieQA at EMNLP 2024, see you in Miami!
- 09/2024 – FoodieQA and W1KP are accepted to EMNLP 2024 main conference!
- 05/2024 – One paper accepted to ACL 2024 main conference!
- Multimodal Learning
- Natural Language Processing
- Information Retrieval
- Speech Technology
- Ph.D. in Computer Science, 2022-2025
  University of Copenhagen, Denmark
- M.S. in EECS (Thesis Track), 2016-2018
  University of Maryland, College Park
Selected Publications
Experience
- Knowledge-enhanced QA and dialogue systems
- Multimodal and prompt learning
- Designed an unsupervised auto-annotation system for voice queries with user behavioral modeling to automatically identify errors in speech recognition and NLP systems and suggest corrections
- Built an active learning pipeline with auto-labeled user transcriptions to improve the ASR system for Comcast X1, increasing system recognition accuracy by 9% (summarized the work in a conference paper as first author and filed a patent as the main inventor)
- Developed a context-based approach that discovered misclassified user queries in question answering systems by performing semantic search with Sentence-BERT
- Leveraged subword-level query representation and adversarial training in a customer care dialogue system to handle misspelled user queries, which improved classification accuracy by 18% and increased user experience stability
- Mentored interns and new hires on projects relevant to multi-task learning and query representation