dices

Description:

The Diversity in Conversational AI Evaluation for Safety (DICES) dataset.

Machine learning approaches are often trained and evaluated with datasets that require a clear separation between positive and negative examples. This approach oversimplifies the natural subjectivity present in many tasks and content items. It also obscures the inherent diversity in human perceptions and opinions. Tasks that attempt to preserve the variance in content and the diversity among humans are often expensive and laborious. To fill this gap and facilitate more in-depth analyses of model performance, we propose the DICES dataset, a unique dataset with diverse perspectives on the safety of AI-generated conversations. We focus on the task of safety evaluation of conversational AI systems. The DICES dataset contains detailed demographic information about each rater and extremely high replication of unique ratings per conversation to ensure the statistical significance of further analyses. It also encodes rater votes as distributions across different demographic groups to allow in-depth exploration of different rating aggregation strategies.

This dataset is well suited to observing and measuring variance, ambiguity and diversity in the context of conversational AI safety. The dataset is accompanied by a paper describing a set of metrics that show how rater diversity influences the safety perception of raters from different geographic regions, ethnic groups, age groups and genders. The goal of the DICES dataset is to serve as a shared benchmark for safety evaluation of conversational AI systems.
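
As an illustration of treating rater votes as distributions across demographic groups, the following minimal sketch groups toy rating rows by conversation and by one rater attribute, then normalizes the vote counts. The rows are invented for the example, and the label strings ("Yes", "No", "Unsure") and gender values are assumptions made for readability; in the real dataset these fields are integer-coded ClassLabel features.

```python
from collections import Counter, defaultdict

# Toy rows shaped like per-rating records. Values are illustrative
# assumptions, not taken from the dataset.
rows = [
    {"item_id": 1, "rater_gender": "Woman", "Q_overall": "Yes"},
    {"item_id": 1, "rater_gender": "Woman", "Q_overall": "No"},
    {"item_id": 1, "rater_gender": "Man", "Q_overall": "No"},
    {"item_id": 1, "rater_gender": "Man", "Q_overall": "No"},
    {"item_id": 2, "rater_gender": "Woman", "Q_overall": "Unsure"},
    {"item_id": 2, "rater_gender": "Man", "Q_overall": "Yes"},
]


def vote_distributions(rows, group_key, label_key="Q_overall"):
    """Return {(item_id, group): {label: fraction}} for each item and group."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[(row["item_id"], row[group_key])][row[label_key]] += 1
    return {
        key: {label: n / sum(c.values()) for label, n in c.items()}
        for key, c in counts.items()
    }


dist = vote_distributions(rows, "rater_gender")
print(dist[(1, "Woman")])  # {'Yes': 0.5, 'No': 0.5}
print(dist[(1, "Man")])    # {'No': 1.0}
```

Keeping the full distribution per group, rather than collapsing to a single majority label, is what makes it possible to compare different aggregation strategies afterwards.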

CONTENT WARNING: This dataset contains adversarial examples of conversations that may be offensive.

Homepage: https://github.com/google-research-datasets/dices-dataset
Source code: tfds.datasets.dices.Builder
Versions:
- 1.0.0 (default): Initial release.
Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{aroyo2024dices,
  title={ {DICES} dataset: Diversity in conversational {AI} evaluation for safety},
  author={Aroyo, Lora and Taylor, Alex and Diaz, Mark and Homan, Christopher and Parrish, Alicia and Serapio-Garc{\'\i}a, Gregory and Prabhakaran, Vinodkumar and Wang, Ding},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

dices/350 (default config)

Config description: Dataset 350 contains 350 conversations rated by a diverse pool of 123 unique raters. Each conversation is rated on five top-level safety categories and one overall conversation comprehension question. Raters were recruited so that the pool was balanced by gender (man or woman) and race/ethnicity (White, Black, Latine, Asian, Multiracial). Each rater rated all conversations, so each conversation has 123 unique ratings. The total number of rows in this dataset is 43050.
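
The row count follows from the fully crossed design: 350 conversations times 123 raters gives 43050 rows. The sketch below checks that arithmetic and shows one way to test whether a set of (item, rater) pairs is fully crossed. The helper name `is_fully_crossed` and the toy pairs are hypothetical and not part of the dataset or of TFDS.

```python
NUM_CONVERSATIONS = 350
NUM_RATERS = 123

# Every rater rated every conversation, so rows = conversations * raters.
expected_rows = NUM_CONVERSATIONS * NUM_RATERS
print(expected_rows)  # 43050


def is_fully_crossed(pairs):
    """True if each rater rated each item exactly once.

    `pairs` is an iterable of (item_id, rater_id) tuples.
    """
    pairs = list(pairs)
    unique = set(pairs)
    items = {item for item, _ in unique}
    raters = {rater for _, rater in unique}
    # No duplicates, and the unique pairs fill the whole items x raters grid.
    return len(pairs) == len(unique) == len(items) * len(raters)


# Toy check: 3 items x 2 raters, then the same grid with one rating missing.
toy = [(item, rater) for item in range(3) for rater in range(2)]
print(is_fully_crossed(toy))       # True
print(is_fully_crossed(toy[:-1]))  # False
```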
Download size: 29.70 MiB
Dataset size: 74.43 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	43,050

Feature structure:

FeaturesDict({
    'Q1_whole_conversation_evaluation': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'Q2_harmful_content_dangerous_content': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_derogation_of_boat': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_insensitive': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_medical_legal_financial_or_relationship_advice': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_obscene_and_profane': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_personal_information': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_regulated_goods': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_sexually_suggestive_content': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_beliefs': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_gender_sexual_orientation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_inherited_attributes': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_status': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q5_political_affiliation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_making_endorsement': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_polarizing_topics': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'answer_time_ms': Scalar(shape=(), dtype=int64, description=Amount of time spent by each rater on each safety annotation question.),
    'answer_timestamp': Scalar(shape=(), dtype=int64, description=Time when each conversation was rated by each rater.),
    'context': Text(shape=(), dtype=string),
    'degree_of_harm': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'harm_type': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=32)),
    'id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each row, representing all ratings by a single rater to a single conversation.),
    'item_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each conversation.),
    'phase': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_age': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_education': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_gender': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'rater_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each rater.),
    'rater_race': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'rater_raw_race': Text(shape=(), dtype=string),
    'response': Text(shape=(), dtype=string),
    'safety_gold': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'safety_gold_reason': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
Q1_whole_conversation_evaluation	ClassLabel		int64	Rating about the understandability of a conversation.
Q2_harmful_content_dangerous_content	ClassLabel		int64
Q2_harmful_content_derogation_of_boat	ClassLabel		int64
Q2_harmful_content_insensitive	ClassLabel		int64
Q2_harmful_content_medical_legal_financial_or_relationship_advice	ClassLabel		int64
Q2_harmful_content_obscene_and_profane	ClassLabel		int64
Q2_harmful_content_other	ClassLabel		int64
Q2_harmful_content_overall	ClassLabel		int64
Q2_harmful_content_personal_information	ClassLabel		int64
Q2_harmful_content_regulated_goods	ClassLabel		int64
Q2_harmful_content_sexually_suggestive_content	ClassLabel		int64
Q3_bias_other	ClassLabel		int64
Q3_bias_overall	ClassLabel		int64
Q3_bias_targeting_beliefs	ClassLabel		int64
Q3_bias_targeting_gender_sexual_orientation	ClassLabel		int64
Q3_bias_targeting_inherited_attributes	ClassLabel		int64
Q3_bias_targeting_status	ClassLabel		int64
Q4_misinformation	ClassLabel		int64
Q5_political_affiliation	ClassLabel		int64
Q6_policy_guidelines_making_endorsement	ClassLabel		int64
Q6_policy_guidelines_other	ClassLabel		int64
Q6_policy_guidelines_overall	ClassLabel		int64
Q6_policy_guidelines_polarizing_topics	ClassLabel		int64
Q_overall	ClassLabel		int64
answer_time_ms	Scalar		int64	Amount of time spent by each rater on each safety annotation question.
answer_timestamp	Scalar		int64	Time when each conversation was rated by each rater.
context	Text		string	The conversation turns before the final chatbot response.
degree_of_harm	ClassLabel		int64	Hand-annotated rating of the severity of the safety risk.
harm_type	Sequence(ClassLabel)	(None,)	int64	Hand-annotated harm topic(s) of the conversation.
id	Scalar		int64	Numerical identifier for each row, representing all ratings by a single rater to a single conversation.
item_id	Scalar		int64	Numerical identifier for each conversation.
phase	ClassLabel		int64	One of three distinct time periods.
rater_age	ClassLabel		int64	The rater's age group.
rater_education	ClassLabel		int64	The rater's education.
rater_gender	ClassLabel		int64	The rater's gender.
rater_id	Scalar		int64	Numerical identifier for each rater.
rater_race	ClassLabel		int64	The rater's race/ethnicity.
rater_raw_race	Text		string	The rater's self-reported raw race/ethnicity, before simplification to five categories.
response	Text		string	The final chatbot response in the conversation.
safety_gold	ClassLabel		int64	The gold standard safety label provided by experts.
safety_gold_reason	Text		string	The reason(s) (if given) for the gold safety label provided by experts.

Examples (tfds.as_dataframe):
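
A minimal sketch of how a few rows could be previewed with `tfds.load` and `tfds.as_dataframe`, assuming `tensorflow_datasets` is installed and the data can be downloaded. The helper names `dataset_name` and `preview` are hypothetical conveniences for this example and are not part of TFDS.

```python
def dataset_name(config: str = "350") -> str:
    """Build the TFDS name for one of the two configs listed on this page."""
    if config not in ("350", "990"):
        raise ValueError(f"unknown dices config: {config!r}")
    return f"dices/{config}"


def preview(config: str = "350", n: int = 4):
    """Load the 'train' split and return the first n rows as a DataFrame."""
    # Imported lazily so dataset_name() works without TFDS installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(dataset_name(config), split="train", with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


print(dataset_name("990"))  # dices/990
```

Calling `preview("350")` triggers the download on first use. ClassLabel columns come back as integers; the `info` object returned by `tfds.load` holds the label names, for example via `info.features["Q_overall"].names`.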

dices/990

Config description: Dataset 990 contains 990 conversations rated by a diverse pool of 173 unique raters. Each conversation is rated on three top-level safety categories and one overall conversation comprehension question. Raters were recruited so that the number of raters for each conversation was balanced by gender (Man, Woman) and locale (US, India). Each rater rated only a sample of the conversations. Each conversation has 60-70 unique ratings. The total number of rows in this dataset is 72103.
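
Because each rater saw only a sample of the conversations in this config, the number of ratings varies per conversation. The sketch below counts distinct raters per `item_id` on invented toy rows; the helper name `ratings_per_item` is hypothetical, and only the `item_id` and `rater_id` field names come from the feature structure.

```python
from collections import defaultdict


def ratings_per_item(rows):
    """Map item_id -> number of distinct raters who rated that conversation."""
    raters = defaultdict(set)
    for row in rows:
        raters[row["item_id"]].add(row["rater_id"])
    return {item: len(ids) for item, ids in raters.items()}


# Toy rows (not real data): item 10 has three raters, item 11 has two.
rows = [
    {"item_id": 10, "rater_id": 1},
    {"item_id": 10, "rater_id": 2},
    {"item_id": 10, "rater_id": 3},
    {"item_id": 11, "rater_id": 2},
    {"item_id": 11, "rater_id": 4},
]

counts = ratings_per_item(rows)
print(counts)  # {10: 3, 11: 2}
print(min(counts.values()), max(counts.values()))  # 2 3
```

Run on the real rows, the minimum and maximum give the actual range of ratings per conversation.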
Download size: 48.06 MiB
Dataset size: 150.38 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	72,103

Feature structure:

FeaturesDict({
    'Q1_whole_conversation_evaluation': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'Q2_harmful_content_explicitly_sexual': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_financial_advice': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_harm_to_people_or_animals': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_health_and_safety_claims': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_how_tos_about_harmful_activities': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_obscenities_or_profanities': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_personal_information': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_promotes_or_condones_use_of_harmful_regulated_goods': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_promotes_or_condones_violence': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_violent_or_gory': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_3_Racial_religious_or_other_slurs_and_stereotypes_that_incites_or_promote_hatred_based_on_any_of_the_attributes_noted_above': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_calling_individuals_or_groups_evil_corrupt_or_malicious': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_incites_hatred': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_insults_about_emotional_romantic_and_or_sexual_attraction': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_physically_or_mentally_inferior': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_subhuman_comparisons': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_superiority_or_subjugation_of_a_group': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_unfair_bias_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_conspiracy_theories': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_contradicts_expert_consensus': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_false_theories': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_political': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'context': Text(shape=(), dtype=string),
    'degree_of_harm': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'harm_type': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=32)),
    'id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each row, representing all ratings by a single rater to a single conversation.),
    'item_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each conversation.),
    'phase': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_age': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_education': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_gender': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'rater_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each rater.),
    'rater_locale': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'rater_race': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'rater_raw_race': Text(shape=(), dtype=string),
    'response': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
Q1_whole_conversation_evaluation	ClassLabel		int64	Rating about the understandability of a conversation.
Q2_harmful_content_explicitly_sexual	ClassLabel		int64
Q2_harmful_content_financial_advice	ClassLabel		int64
Q2_harmful_content_harm_to_people_or_animals	ClassLabel		int64
Q2_harmful_content_health_and_safety_claims	ClassLabel		int64
Q2_harmful_content_how_tos_about_harmful_activities	ClassLabel		int64
Q2_harmful_content_obscenities_or_profanities	ClassLabel		int64
Q2_harmful_content_other	ClassLabel		int64
Q2_harmful_content_overall	ClassLabel		int64
Q2_harmful_content_personal_information	ClassLabel		int64
Q2_harmful_content_promotes_or_condones_use_of_harmful_regulated_goods	ClassLabel		int64
Q2_harmful_content_promotes_or_condones_violence	ClassLabel		int64
Q2_harmful_content_violent_or_gory	ClassLabel		int64
Q3_3_Racial_religious_or_other_slurs_and_stereotypes_that_incites_or_promote_hatred_based_on_any_of_the_attributes_noted_above	ClassLabel		int64
Q3_bias_calling_individuals_or_groups_evil_corrupt_or_malicious	ClassLabel		int64
Q3_bias_incites_hatred	ClassLabel		int64
Q3_bias_insults_about_emotional_romantic_and_or_sexual_attraction	ClassLabel		int64
Q3_bias_other	ClassLabel		int64
Q3_bias_physically_or_mentally_inferior	ClassLabel		int64
Q3_bias_subhuman_comparisons	ClassLabel		int64
Q3_bias_superiority_or_subjugation_of_a_group	ClassLabel		int64
Q3_unfair_bias_overall	ClassLabel		int64
Q4_misinformation_conspiracy_theories	ClassLabel		int64
Q4_misinformation_contradicts_expert_consensus	ClassLabel		int64
Q4_misinformation_false_theories	ClassLabel		int64
Q4_misinformation_other	ClassLabel		int64
Q4_misinformation_overall	ClassLabel		int64
Q4_misinformation_political	ClassLabel		int64
Q_overall	ClassLabel		int64
context	Text		string	The conversation turns before the final chatbot response.
degree_of_harm	ClassLabel		int64	Hand-annotated rating of the severity of the safety risk.
harm_type	Sequence(ClassLabel)	(None,)	int64	Hand-annotated harm topic(s) of the conversation.
id	Scalar		int64	Numerical identifier for each row, representing all ratings by a single rater to a single conversation.
item_id	Scalar		int64	Numerical identifier for each conversation.
phase	ClassLabel		int64	One of three distinct time periods.
rater_age	ClassLabel		int64	The rater's age group.
rater_education	ClassLabel		int64	The rater's education.
rater_gender	ClassLabel		int64	The rater's gender.
rater_id	Scalar		int64	Numerical identifier for each rater.
rater_locale	ClassLabel		int64	The rater's locale.
rater_race	ClassLabel		int64	The rater's race/ethnicity.
rater_raw_race	Text		string	The rater's self-reported raw race/ethnicity, before simplification to five categories.
response	Text		string	The final chatbot response in the conversation.

Examples (tfds.as_dataframe):
