Our team currently focuses on NLP applications in Embodied AI, Information Extraction, Value-based Communication, and Reinforcement Learning. Please feel free to contact us if you share similar research interests.
Our three papers “Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs”, “Reinforced Query Reasoners for Reasoning-intensive Retrieval Tasks”, and “Adaptive Preference Optimization with Uncertainty-aware Utility Anchor” were accepted to the Main Conference and Findings of EMNLP 2025!
Prior works on joint Information Extraction (IE) typically model interactions between instances (e.g., event triggers, entities, roles, relations) via representation enhancement, type dependency scoring, or global decoding. We find that previous models generally consider only binary type dependency scoring of a pair of instances, and rely on local search such as beam search to approximate global solutions. To better integrate cross-instance interactions, in this work, we introduce a joint IE framework (CRFIE) that formulates joint IE as a high-order Conditional Random Field. Specifically, we design binary factors and ternary factors to directly model interactions between not only pairs of instances but also triplets. These factors are then used to jointly predict the labels of all instances. To address the intractability of exact high-order inference, we incorporate a high-order neural decoder unfolded from a mean-field variational inference method, which achieves consistent learning and inference. Experimental results show that our approach achieves consistent improvements on three IE tasks compared with our baseline and prior work.
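As a rough illustration of the kind of update such an unfolded decoder performs (a minimal sketch with NumPy, not the authors' implementation; the shapes, the `mean_field_step` name, and the assumption that self-interaction potentials are zeroed out are ours), one mean-field iteration over instance label distributions with pairwise factors might look like:

```python
import numpy as np

def mean_field_step(Q, unary, binary):
    """One mean-field update for a pairwise CRF over n instances, L labels.

    Q      : (n, L) current approximate marginals
    unary  : (n, L) unary log-potentials
    binary : (n, n, L, L) pairwise log-potentials; we assume binary[i, i] = 0
             so an instance sends no message to itself.
    """
    # Message to instance i, label l: sum over neighbors j and their labels m.
    msg = np.einsum('jm,ijlm->il', Q, binary)
    logits = unary + msg
    # Normalize with a stable softmax so each row is a distribution.
    logits -= logits.max(axis=1, keepdims=True)
    Q_new = np.exp(logits)
    return Q_new / Q_new.sum(axis=1, keepdims=True)
```

Unfolding a fixed number of such steps into network layers is what makes learning and inference consistent: the same update rule is used at training and test time, and gradients flow through the iterations. A ternary factor would add a second message term contracting over two neighbor distributions at once.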
2022
AAAI2022
Span-based semantic role labeling with argument pruning and second-order inference
Zixia Jia*, Zhaohui Yan*, Haoyi Wu, and 1 more author
Proceedings of the AAAI Conference on Artificial Intelligence, Jul 2022
2021
ACL2021
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor
Xinyu Wang*, Yong Jiang*, Zhaohui Yan*, and 6 more authors
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, Jul 2021
Knowledge distillation is a critical technique to transfer knowledge between models, typically from a large model (the teacher) to a smaller one (the student). The objective function of knowledge distillation is typically the cross-entropy between the teacher’s and the student’s output distributions. However, for structured prediction problems, the output space is exponential in size; therefore, the cross-entropy objective becomes intractable to compute and optimize directly. In this paper, we derive a factorized form of the knowledge distillation objective for structured prediction, which is tractable for many typical choices of the teacher and student models. In particular, we show the tractability and empirical effectiveness of structural knowledge distillation between sequence labeling and dependency parsing models under four different scenarios: 1) the teacher and student share the same factorization form of the output structure scoring function; 2) the student factorization produces more fine-grained substructures than the teacher factorization; 3) the teacher factorization produces more fine-grained substructures than the student factorization; 4) the factorization forms of the teacher and the student are incompatible.
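The key observation behind the factorized objective can be sketched as follows (a schematic derivation under the assumption that the student's score decomposes over substructures $u$; notation is ours, not necessarily the paper's):

```latex
% Assume the student's log-probability factorizes over substructures u of y:
%   \log P_s(y \mid x) = \sum_{u \in y} s(u) - \log Z_s(x).
% Then the cross-entropy distillation objective collapses from an exponential
% sum over structures y to a sum over substructures u:
\mathcal{L}_{\mathrm{KD}}
  = -\sum_{y} P_t(y \mid x)\, \log P_s(y \mid x)
  = -\sum_{u} P_t(u \mid x)\, s(u) + \log Z_s(x)
```

Here $P_t(u \mid x)$ is the teacher's marginal probability of substructure $u$, which for models such as linear-chain CRFs or arc-factored parsers can typically be computed with dynamic programming, making the whole objective tractable.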
2020
ACL2020
Semi-supervised semantic dependency parsing using CRF autoencoders
Zixia Jia, Youmi Ma, Jiong Cai, and 1 more author
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020
Semantic dependency parsing, which aims to find rich bi-lexical relationships, allows words to have multiple dependency heads, resulting in graph-structured representations. We propose an approach to semi-supervised learning of semantic dependency parsers based on the CRF autoencoder framework. Our encoder is a discriminative neural semantic dependency parser that predicts the latent parse graph of the input sentence. Our decoder is a generative neural model that reconstructs the input sentence conditioned on the latent parse graph. Our model is arc-factored and therefore parsing and learning are both tractable. Experiments show our model achieves significant and consistent improvement over the supervised baseline.
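The semi-supervised objective on unlabeled sentences can be sketched as follows (a schematic under our own notation, treating the parse graph as a latent variable; the exact formulation in the paper may differ):

```latex
% Unsupervised term: reconstruction likelihood marginalized over latent graphs y,
% with a discriminative encoder and a generative decoder:
\log P(\hat{x} \mid x)
  = \log \sum_{y} P_{\mathrm{enc}}(y \mid x)\, P_{\mathrm{dec}}(\hat{x} \mid y)
```

Because both encoder and decoder are arc-factored, the probability of a graph decomposes into independent per-arc terms, so the marginalization over all graphs reduces to a product of per-arc sums rather than a sum over exponentially many graphs. This is what keeps both parsing and learning tractable.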