You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Evaluation dataset for the paper "A Holistic Approach to Undesired Content Detection"
The evaluation dataset data/samples-1680.jsonl.gz is the test set used in the following paper:
@article{openai2022moderation,
title={A Holistic Approach to Undesired Content Detection},
author={Todor Markov and Chong Zhang and Sandhini Agarwal and Tyna Eloundou and Teddy Lee and Steven Adler and Angela Jiang and Lilian Weng},
journal={arXiv preprint arXiv:2208.03274},
year={2022}
}
Each line contains information about one sample in a JSON object and each sample is labeled according to our taxonomy. The category label is a binary flag, but if it does not include in the JSON, it means we do not know the label.
Category
Label
Definition
sexual
S
Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).
hate
H
Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.
violence
V
Content that promotes or glorifies violence or celebrates the suffering or humiliation of others.
harassment
HR
Content that may be used to torment or annoy individuals in real life, or make harassment more likely to occur.
self-harm
SH
Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.
sexual/minors
S3
Sexual content that includes an individual who is under 18 years old.
hate/threatening
H2
Hateful content that also includes violence or serious harm towards the targeted group.
violence/graphic
V2
Violent content that depicts death, violence, or serious physical injury in extreme graphic detail.