I am passionate about understanding what knowledge a language model has captured and whether it
can reliably apply that knowledge across diverse (unseen) contexts and perspectives: what will its impact
be in the real world? I am particularly interested in analyzing and developing tools to ensure
responsible deployment, with a strong focus on robustness for safety-critical applications such as hate speech detection.
I earned my BSc and MSc degrees in Artificial Intelligence from the University of Amsterdam.
For a brief overview of my research interests, see the research page.
Note: if you have recently reached out to my VU e-mail address and did not get a reply, this is most likely because that account has been closed; the e-mails do not bounce back. Please contact me again at my TU e-mail address.
Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech Definitions
Khurana, Urja,
Vermeulen, Ivar,
Nalisnick, Eric,
Van Noorloos, Marloes,
and Fokkens, Antske
In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
2022
The subjectivity of automatic hate speech detection makes it a complex task, reflected in the different and incomplete definitions used in NLP. We present hate speech criteria, developed with insights from a law and social science expert, that help researchers create more explicit definitions and annotation guidelines on five aspects: (1) target groups, (2) dominance, (3) perpetrator characteristics, (4) explicit presence of negative interactions, and (5) the type of consequences/effects. Definitions can be structured to cover a broader or narrower phenomenon, and conscious choices can be made about specifying criteria or leaving them open. We argue that the goal and exact task developers have in mind should determine how the scope of hate speech is defined. We provide an overview of the properties of datasets from hatespeechdata.com that may help select the most suitable dataset for a specific scenario.
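Purely as an illustration (not from the paper), the sketch below shows how such a modular definition could be encoded as a small data structure, with each of the five criteria either specified or deliberately left open; all field names and example values are hypothetical.

from dataclasses import dataclass
from typing import Optional

# Hypothetical encoding of the five criteria; fields and values are illustrative only.
@dataclass
class HateSpeechDefinition:
    target_groups: Optional[list] = None              # (1) which groups are covered; None = left open
    marginalised_targets_only: Optional[bool] = None  # (2) dominance: restrict to non-dominant groups?
    perpetrator_constraints: Optional[str] = None     # (3) e.g. "member of a dominant group"
    requires_explicit_negativity: Optional[bool] = None  # (4) must negative interactions be explicit?
    consequence_types: Optional[list] = None          # (5) e.g. ["incitement to violence"]

    def is_broad(self) -> bool:
        """Leaving more criteria open corresponds to covering a broader phenomenon."""
        return sum(value is None for value in vars(self).values()) >= 3

# Example: a narrow, task-specific definition for moderating calls to violence.
narrow = HateSpeechDefinition(
    target_groups=["religion", "ethnicity"],
    marginalised_targets_only=True,
    requires_explicit_negativity=True,
    consequence_types=["incitement to violence"],
)
print(narrow.is_broad())  # False: most criteria are explicitly specified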
@inproceedings{khurana-etal-2022-hate,
  bibtex_show = {true},
  title       = {Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech Definitions},
  author      = {Khurana, Urja and Vermeulen, Ivar and Nalisnick, Eric and Van Noorloos, Marloes and Fokkens, Antske},
  booktitle   = {Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)},
  month       = jul,
  year        = {2022},
  address     = {Seattle, Washington (Hybrid)},
  publisher   = {Association for Computational Linguistics},
  url         = {https://aclanthology.org/2022.woah-1.17},
  html        = {https://aclanthology.org/2022.woah-1.17},
  doi         = {10.18653/v1/2022.woah-1.17},
  pages       = {176--191},
  selected    = {true}
}
How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task
Khurana, Urja,
Nalisnick, Eric,
and Fokkens, Antske
In Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems
2021
Despite their success, modern language models are fragile. Even small changes in their training pipeline can lead to unexpected results. We study this phenomenon by examining the robustness of ALBERT (Lan et al., 2020) in combination with Stochastic Weight Averaging (SWA), a cheap way of ensembling, on a sentiment analysis task (SST-2). In particular, we analyze SWA's stability via CheckList criteria (Ribeiro et al., 2020), examining the agreement on errors made by models differing only in their random seed. We hypothesize that SWA is more stable because it ensembles model snapshots taken along the gradient descent trajectory. We quantify stability by comparing the models' mistakes with Fleiss' Kappa (Fleiss, 1971) and overlap ratio scores. We find that SWA reduces error rates in general; yet the models still suffer from their own distinct biases (according to CheckList).
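As a rough illustration of this kind of stability analysis (not the paper's code), the sketch below treats each seed's model as a "rater" that marks whether it errs on each test example, then computes Fleiss' Kappa with statsmodels together with a simple pairwise error-overlap ratio. The overlap definition used here is an assumption and may differ from the paper's exact metric, and the error matrix is toy data standing in for, e.g., five ALBERT (+ SWA) runs on SST-2.

import numpy as np
from itertools import combinations
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# error_matrix[i, j] = 1 if the model trained with seed j misclassifies example i, else 0.
rng = np.random.default_rng(0)
error_matrix = (rng.random((1000, 5)) < 0.08).astype(int)  # toy data, 1000 examples, 5 seeds

# Fleiss' Kappa: each seed acts as a rater assigning "error" / "no error" to each example.
counts, _ = aggregate_raters(error_matrix)   # shape: (n_examples, n_categories)
kappa = fleiss_kappa(counts, method="fleiss")

# Pairwise overlap ratio (assumed definition): shared errors / union of errors per seed pair.
def overlap_ratio(a, b):
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

pair_overlaps = [
    overlap_ratio(error_matrix[:, i], error_matrix[:, j])
    for i, j in combinations(range(error_matrix.shape[1]), 2)
]

print(f"Fleiss' kappa on errors: {kappa:.3f}")
print(f"Mean pairwise error overlap: {np.mean(pair_overlaps):.3f}")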
@inproceedings{khurana-etal-2021-emotionally,
  bibtex_show = {true},
  title       = {How Emotionally Stable is {ALBERT}? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task},
  author      = {Khurana, Urja and Nalisnick, Eric and Fokkens, Antske},
  booktitle   = {Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems},
  month       = nov,
  year        = {2021},
  address     = {Punta Cana, Dominican Republic},
  publisher   = {Association for Computational Linguistics},
  url         = {https://aclanthology.org/2021.eval4nlp-1.3},
  html        = {https://aclanthology.org/2021.eval4nlp-1.3},
  pages       = {16--31},
  selected    = {true}
}