2020 The 6th Workshop on Noisy User-generated Text (W-NUT)
Nov 19, 2020 -- WNUT workshop is going virtual together with EMNLP 2020
The WNUT workshop focuses on Natural Language Processing applied to noisy user-generated text, such as that found in social media, online reviews, crowdsourced data, web forums, clinical records and language learner essays. The workshop hashtag is #wnut.
News! We will hold our workshop completely live online (registration for EMNLP 2020 is now open): 4 live invited talks with Q&A, 1-min or 5-min live talks for 33 regular papers, as well as an interactive social event for two different time zones (4:00-8:00 GMT and 15:00-19:00 GMT; a more detailed schedule is available). We accepted 33 regular workshop papers and 47 shared-task papers.
We are organizing three shared tasks:
(1) Entity and relation recognition over wet-lab protocols. Data was released on June 8, 2020. Official evaluation: August 31 ~ September 4 (entity) and September 9 ~ September 15 (relation), 2020.
(2) Identification of informative COVID-19 English Tweets. Data was released on June 21, 2020. Official evaluation: August 17 ~ 21, 2020.
(3) COVID-19 Event Extraction from Twitter. Data was released on June 22, 2020. Official evaluation: September 7 ~ 11, 2020.
Congratulations to the winners of the best paper awards, which are sponsored by Twitter this year:
- Detecting Objectifying Language in Online Professor Reviews
  Angie Waller and Kyle Gorman
- Fine-Tuning MT systems for Robustness to Second-Language Speaker Variations
  Md Mahfuz Ibn Alam and Antonios Anastasopoulos
Workshop Organizers
- Wei Xu (Georgia Institute of Technology)
- Afshin Rahimi (University of Queensland)
- Alan Ritter (Georgia Institute of Technology)
- Tim Baldwin (University of Melbourne)
- Leon Derczynski (IT University of Copenhagen)
Invited Speakers
- Eduardo Blanco (University of North Texas)
- Manaal Faruqui (Google)
- Robert Munro (Machine Learning Consulting; former CTO of Figure Eight)
- Irwin King (The Chinese University of Hong Kong)
Important Dates
- Submission Deadline: August 25, 2020 (anywhere on earth; dual-submission allowed)
- Retraction of workshop papers accepted for EMNLP main conference: September 15, 2020
- Reviews Due: September 20, 2020
- Acceptance Notification: September 29, 2020
- Retraction of workshop papers accepted for COLING main conference: October 2, 2020
- Camera-Ready Deadline: October 12, 2020 (originally October 8, 2020)
- Workshop Day: November 19, 2020
Program (accepted papers)
Call for Papers
We seek submissions of long and short papers on original and unpublished work (same page limit as the EMNLP main conference). All accepted submissions will be presented as pre-recorded talks at the workshop, following the EMNLP 2020 main conference (more details here).
Topics of interest include but are not limited to:
- NLP Preprocessing of Noisy Text
  - Part of speech tagging
  - Named entity tagging, including a wide range of categories, e.g. product names
  - Chunking of user-generated text
  - Parsing
- Text Normalization and Error Correction
  - Normalizing noisy text for downstream tasks and for human readability
  - Error detection and correction
- Robustness to Noise, both Natural and Adversarial
- Multilingual NLP in noisy text
- Machine Translation of Noisy Text
- Sentiment analysis
- Crowdsourcing of text data
- User prediction, e.g. gender, age, etc.
- Stylistics, e.g. formality, politeness, etc.
- Colloquial language, e.g. code-switching, idiom detection
- Bilingual translation of the noisy text
- Paraphrase identification and semantic similarity of short text or noisy text
- Information extraction from noisy text
- Domain adaptation to user-generated text
- Geolocation prediction
- Global and regional trend detection and event extraction
- Detecting rumors, contradictory information, sarcasm and humor on social media
- Extracting user demographics, profiles, and major life events
- Temporal aspects of user-generated content (resolving time expressions, concept drift, diachronic analyses, etc.)
Double Submission Policy: Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of a paper accepted for presentation must notify the workshop organizers by the camera-ready deadline as to whether the paper will be presented or withdrawn.
Shared task 1: Entity and Relation Extraction over Wet Lab Protocols
Lab protocols specify steps in performing a lab procedure. They are noisy, dense, and domain-specific. Automatic or semi-automatic conversion of protocols into machine-readable format benefits biological research. In this task, system entries are invited for event recognition and relation extraction over these lab protocols. Note that these protocols are written by researchers and lab technicians worldwide, some of which may contain non-standard language or spelling errors. Here's a sample of the input data:
Initial data was released on June 8, 2020. Please register here to receive future data for the official evaluation (Aug 31 - Sep 4, 2020).
Details on the shared task are here. Contacts: Jeniya Tabassum, Wei Xu, Alan Ritter.
Shared task 2: Identification of informative COVID-19 English Tweets
The goals of this shared task are (1) to develop a language processing task that potentially impacts research and downstream applications, and (2) to provide the community with a new dataset for identifying informative COVID-19 English Tweets.
For this task, participants are asked to develop systems that automatically identify whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not. Such informative Tweets provide information about recovered, suspected, confirmed and death cases as well as location or travel history of the cases. The dataset and systems developed for this shared task will be beneficial for the development of COVID-19 related monitoring systems.
Details on the shared task are here. Contacts: Dat Quoc Nguyen, Thanh Vu, Afshin Rahimi.
Shared task 3: Extracting COVID-19 Events from Twitter
People usually share a wide variety of information related to COVID-19 publicly on social media. For example, Twitter users often indicate when they might be at increased risk of COVID-19 due to a coworker or other close contact testing positive for the virus, or when they have symptoms but were denied access to testing. In this shared task, participants are invited to develop systems that automatically extract COVID-19 related events from Twitter using our newly built corpus. Here is an example of our annotated data:
Initial data was released on June 22, 2020. Please register here to receive future data for the official evaluation (Sep 7 - Sep 11, 2020).
Details on the shared task are here. Contacts: Shi Zong, Wei Xu, Alan Ritter.
Program Committee
- Muhammad Abdul-Mageed (University of British Columbia)
- Željko Agić (Corti)
- Sweta Agrawal (University of Maryland)
- Gustavo Aguilar (University of Houston)
- Nikolaos Aletras (University of Sheffield)
- Rahul Aralikatte (University of Copenhagen)
- Eiji Aramaki (NAIST)
- JinYeong Bak (Sungkyunkwan University)
- Francesco Barbieri (Universitat Pompeu Fabra)
- John Beieler (ODNI Science and Technology)
- Eric Bell (PNNL)
- Anya Belz (University of Brighton)
- Adrian Benton (JHU)
- Eduardo Blanco (University of North Texas)
- Su Lin Blodgett (UMass Amherst)
- Julian Brooke (University of British Columbia)
- Cornelia Caragea (University of Illinois at Chicago)
- Tuhin Chakrabarty (Columbia University)
- Stevie Chancellor (Northwestern University)
- Mingda Chen (Toyota Technological Institute at Chicago)
- Sihao Chen (University of Pennsylvania)
- Dhivya Chinnappa (Thomson Reuters)
- Colin Cherry (Google)
- Zewei Chu (University of Chicago)
- Manuel R. Ciosici (IT University of Copenhagen)
- Oana Cocarascu (Imperial College London)
- Nigel Collier (University of Cambridge)
- Çağrı Çöltekin (University of Tübingen)
- Paul Cook (University of New Brunswick)
- Marina Danilevsky (IBM Research)
- Pradipto Das (Rakuten Institute of Technology)
- Leon Derczynski (IT University of Copenhagen)
- Jay DeYoung (Northeastern University)
- Bhuwan Dhingra (Carnegie Mellon University)
- Seza Doğruöz (Tilburg University)
- Xinya Du (Cornell University)
- Jacob Eisenstein (Google)
- Heba Elfardy (Amazon)
- Micha Elsner (Ohio State University)
- Alexander Fabbri (Yale University)
- Manaal Faruqui (Google)
- Yansong Feng (Peking University)
- Catherine Finegan-Dollak (IBM Research)
- Tim Finin (UMBC)
- Lucie Flek (Mainz University of Applied Sciences)
- Lisheng Fu (New York University)
- Yoshinari Fujinuma (University of Colorado, Boulder)
- Juri Ganitkevitch (Google)
- Dan Garrette (Google)
- Sahil Garg (University of Southern California)
- Spandana Gella (Amazon)
- Debanjan Ghosh (MIT)
- Kevin Gimpel (TTIC)
- Amit Goyal (Amazon)
- Yvette Graham (Dublin City University)
- Chulaka Gunasekara (IBM Research)
- Mika Hämäläinen (University of Helsinki)
- William L. Hamilton (McGill University/MILA)
- Xiaochuang Han (Carnegie Mellon University)
- Devamanyu Hazarika (National University of Singapore)
- Hua He (Amazon)
- Jack Hessel (Cornell University)
- Graeme Hirst (University of Toronto)
- Nathan Hodas (PNNL)
- Junjie Hu (Carnegie Mellon University)
- Dirk Hovy (Bocconi University)
- Binxuan Huang (Carnegie Mellon University)
- Sarthak Jain (Northeastern University)
- Nanjiang Jiang (Ohio State University)
- Lifeng Jin (Ohio State University)
- Ishan Jindal (IBM Research)
- Kristen Johnson (Michigan State University)
- Kenny Joseph (University at Buffalo)
- Katharina Kann (University of Colorado, Boulder)
- David Kauchak (Pomona College)
- Ashique KhudaBukhsh (Carnegie Mellon University)
- Roman Klinger (University of Stuttgart)
- Hayato Kobayashi (Yahoo! Research)
- Ekaterina Kochmar (University of Cambridge)
- Reno Kriz (University of Pennsylvania)
- Sachin Kumar (Carnegie Mellon University)
- Vivek Kulkarni (Stanford University)
- Jonathan Kummerfeld (University of Michigan)
- Ophélie Lacroix (Siteimprove)
- Wuwei Lan (Ohio State University)
- Jiwei Li (ShannonAI)
- Jessy Junyi Li (University of Texas Austin)
- Jing Li (Hong Kong Polytechnic University)
- Yitong Li (University of Melbourne)
- Nut Limsopatham (University of Glasgow)
- Zhiyuan Liu (Tsinghua University)
- Fei Liu (University of Melbourne)
- Nikola Ljubešić (Jožef Stefan Institute)
- Wei-Yun Ma (Academia Sinica)
- Mounica Maddela (Georgia Institute of Technology)
- Peter Makarov (University of Zurich)
- Héctor Martínez Alonso (Apple)
- Aaron Masino (The Children's Hospital of Philadelphia)
- Nitika Mathur (University of Melbourne)
- Ahmed Mourad (RMIT University)
- Yasuhide Miura (Fuji Xerox)
- Hamdy Mubarak (Qatar Computing Research Institute)
- Graham Mueller (Leidos)
- Maria Nadejde (Grammarly)
- Guenter Neumann (German Research Center for Artificial Intelligence)
- Vincent Ng (University of Texas at Dallas)
- Thien Huu Nguyen (University of Oregon)
- Eric Nichols (Honda Research Institute)
- Tong Niu (University of North Carolina at Chapel-Hill)
- Benjamin Nye (Northeastern University)
- Alice Oh (KAIST)
- Naoaki Okazaki (Tohoku University)
- Naoki Otani (CMU)
- Myle Ott (Facebook AI)
- Symeon Papadopoulos (CERTH-ITI)
- Umashanthi Pavalanathan (Georgia Tech)
- Yuval Pinter (Georgia Tech)
- Christopher Potts (Stanford University)
- Vinodkumar Prabhakaran (Stanford University)
- Daniel Preoţiuc-Pietro (Bloomberg)
- Ella Rabinovich (University of Toronto)
- Dianna Radpour (University of Colorado Boulder)
- Preethi Raghavan (IBM Research)
- Afshin Rahimi (University of Queensland)
- Revanth Rameshkumar (Microsoft Research)
- Adithya Renduchintala (JHU)
- Carolyn Rose (CMU)
- Alla Rozovskaya (City University of New York)
- Derek Ruths (McGill University)
- Koustuv Saha (Georgia Tech)
- Keisuke Sakaguchi (Allen Institute for Artificial Intelligence)
- Maarten Sap (University of Washington)
- Amirreza Shirani (University of Houston)
- Dan Simonson (BlackBoiler)
- Kevin Small (Amazon)
- Jan Šnajder (University of Zagreb)
- Xingyi Song (University of Sheffield)
- Evangelia Spiliopoulou (Carnegie Mellon University)
- Gabriel Stanovsky (Allen Institute for Artificial Intelligence)
- Ian Stewart (Georgia Tech)
- Nadiya Straton (Copenhagen Business School)
- Shivashankar Subramanian (University of Melbourne)
- Jeniya Tabassum (Ohio State University)
- Yi Tay (Google)
- Zhiyang Teng (Westlake University)
- Joel Tetreault (Dataminr)
- James Thorne (University of Cambridge)
- Rob van der Goot (University of Groningen)
- Vasudeva Varma (IIIT Hyderabad)
- Daniel Varab (IT University of Copenhagen)
- Olga Vechtomova (University of Waterloo)
- Nikhita Vedula (Ohio State University)
- Alakananda Vempala (Bloomberg)
- Rob Voigt (Northwestern University)
- Soroush Vosoughi (Dartmouth University)
- Xiaojun Wan (Peking University)
- Zeerak Waseem (University of Sheffield)
- Zhongyu Wei (Fudan University)
- Hong Wei (University of Maryland)
- Steven Wilson (University of Edinburgh)
- Zach Wood-Doughty (Johns Hopkins University)
- Ning Yu (Leidos)
- Marcos Zampieri (Rochester Institute of Technology)
- Guido Zarrella (MITRE)
- Vicky Zayats (University of Washington)
- Justine Zhang (Cornell University)
- Xiao Zhang (Purdue University)
- Shi Zong (Ohio State University)