real_toxicity_prompts

  • Description:

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models

RealToxicityPrompts is a corpus of English prompts (specifically, sentence beginnings) of varying toxicity. These prompts are meant to be given as context to an autoregressive language model (e.g., GPT-2), which then generates completions.

More details are presented in the original paper.
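
As a rough illustration of the intended use, the sketch below passes each prompt to a text generator and collects several completions per prompt. The names `complete_prompts` and `placeholder_model` are hypothetical and not part of the dataset or any library; `placeholder_model` is only a stand-in that would be replaced by a call to a real autoregressive model.

```python
from typing import Callable, Dict, Iterable, List


def complete_prompts(
    prompts: Iterable[str],
    generate_fn: Callable[[str], str],
    num_samples: int = 1,
) -> Dict[str, List[str]]:
    """Collect `num_samples` continuations per prompt from `generate_fn`."""
    completions: Dict[str, List[str]] = {}
    for prompt in prompts:
        # Each call is expected to return one continuation for the prompt.
        completions[prompt] = [generate_fn(prompt) for _ in range(num_samples)]
    return completions


def placeholder_model(prompt: str) -> str:
    # Hypothetical stand-in: a real setup would sample from a language model.
    return " [continuation]"


results = complete_prompts(
    ["The weather today is"], placeholder_model, num_samples=2
)
```

The generated continuations would then typically be scored for toxicity, which is outside the scope of this sketch.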

  • Splits:
Split      Examples
'train'    99,442
  • Feature structure:
FeaturesDict({
    'begin': int32,
    'challenging': bool,
    'continuation': FeaturesDict({
        'flirtation': float32,
        'identity_attack': float32,
        'insult': float32,
        'profanity': float32,
        'severe_toxicity': float32,
        'sexually_explicit': float32,
        'text': Text(shape=(), dtype=string),
        'threat': float32,
        'toxicity': float32,
    }),
    'end': int32,
    'filename': Text(shape=(), dtype=string),
    'prompt': FeaturesDict({
        'flirtation': float32,
        'identity_attack': float32,
        'insult': float32,
        'profanity': float32,
        'severe_toxicity': float32,
        'sexually_explicit': float32,
        'text': Text(shape=(), dtype=string),
        'threat': float32,
        'toxicity': float32,
    }),
})
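
A minimal sketch of how the nested structure above might be used to select prompts by their `toxicity` score. The helper names and the 0.5 cutoff are assumptions made for illustration, not part of the dataset definition. The loading helper assumes `tensorflow_datasets` is installed and that the dataset is registered under the name `real_toxicity_prompts`; it is defined but not called here, since it would download the data.

```python
from typing import Any, Dict, Iterable, Iterator

# Assumed cutoff for illustration; choose a value that suits the analysis.
TOXICITY_THRESHOLD = 0.5


def is_toxic_prompt(
    example: Dict[str, Any], threshold: float = TOXICITY_THRESHOLD
) -> bool:
    """Return True if the prompt's toxicity score reaches the threshold."""
    return float(example["prompt"]["toxicity"]) >= threshold


def select_prompts(
    examples: Iterable[Dict[str, Any]],
    toxic: bool = True,
    threshold: float = TOXICITY_THRESHOLD,
) -> Iterator[Dict[str, Any]]:
    """Yield examples whose prompt is toxic (or non-toxic if toxic=False)."""
    for example in examples:
        if is_toxic_prompt(example, threshold) == toxic:
            yield example


def load_train_examples():
    """Load the 'train' split as NumPy-backed dicts (downloads on first use)."""
    import tensorflow_datasets as tfds  # imported lazily; optional dependency

    dataset = tfds.load("real_toxicity_prompts", split="train")
    return tfds.as_numpy(dataset)


# Hand-written records that mirror only the fields used above.
sample = [
    {"prompt": {"text": "first prompt", "toxicity": 0.91}},
    {"prompt": {"text": "second prompt", "toxicity": 0.07}},
]
toxic_texts = [ex["prompt"]["text"] for ex in select_prompts(sample)]
benign_texts = [ex["prompt"]["text"] for ex in select_prompts(sample, toxic=False)]
```

The same predicate could be applied to the `continuation` sub-dictionary, which carries the same score fields.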
  • Feature documentation:
Feature                         Class         Shape  Dtype    Description
                                FeaturesDict
begin                           Tensor               int32
challenging                     Tensor               bool
continuation                    FeaturesDict
continuation/flirtation         Tensor               float32
continuation/identity_attack    Tensor               float32
continuation/insult             Tensor               float32
continuation/profanity          Tensor               float32
continuation/severe_toxicity    Tensor               float32
continuation/sexually_explicit  Tensor               float32
continuation/text               Text                 string
continuation/threat             Tensor               float32
continuation/toxicity           Tensor               float32
end                             Tensor               int32
filename                        Text                 string
prompt                          FeaturesDict
prompt/flirtation               Tensor               float32
prompt/identity_attack          Tensor               float32
prompt/insult                   Tensor               float32
prompt/profanity                Tensor               float32
prompt/severe_toxicity          Tensor               float32
prompt/sexually_explicit        Tensor               float32
prompt/text                     Text                 string
prompt/threat                   Tensor               float32
prompt/toxicity                 Tensor               float32
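
The slash-separated names in the feature listing correspond to paths into the nested example dictionary. A small sketch of flattening one example into that naming scheme, which can be convenient for building a table of scores; `flatten_features` is a hypothetical helper, not a library function, and the record below is hand-written for illustration.

```python
from typing import Any, Dict


def flatten_features(example: Dict[str, Any], parent: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into 'outer/inner' keys, e.g. 'prompt/toxicity'."""
    flat: Dict[str, Any] = {}
    for key, value in example.items():
        path = f"{parent}/{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries such as 'prompt' and 'continuation'.
            flat.update(flatten_features(value, path))
        else:
            flat[path] = value
    return flat


# Hand-written record with a subset of the fields, for illustration only.
record = {
    "begin": 0,
    "challenging": False,
    "prompt": {"text": "some prompt", "toxicity": 0.2},
    "continuation": {"text": " some continuation", "toxicity": 0.1},
}
flat_record = flatten_features(record)
```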
  • Citation:
@article{gehman2020realtoxicityprompts,
  title={Realtoxicityprompts: Evaluating neural toxic degeneration in language models},
  author={Gehman, Samuel and Gururangan, Suchin and Sap, Maarten and Choi, Yejin and Smith, Noah A},
  journal={arXiv preprint arXiv:2009.11462},
  year={2020}
}