dices

Description:

The Diversity in Conversational AI Evaluation for Safety (DICES) datasets.

Machine learning approaches are often trained and evaluated with datasets that require a clear separation between positive and negative examples. This approach oversimplifies the natural subjectivity present in many tasks and content items, and it obscures the inherent diversity in human perceptions and opinions. Tasks that try to preserve the variance in content and the diversity among humans are often expensive and laborious. To fill this gap and facilitate more in-depth analyses of model performance, we propose the DICES dataset, a unique dataset with diverse perspectives on the safety of AI-generated conversations. We focus on the task of safety evaluation of conversational AI systems. The DICES dataset contains detailed demographic information about each rater, has extremely high replication of unique ratings per conversation to ensure the statistical significance of further analyses, and encodes rater votes as distributions across different demographics to allow in-depth exploration of different rating aggregation strategies.
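To make "rater votes as distributions across different demographics" concrete, the following is a minimal sketch rather than the authors' method. It assumes each rating can be reduced to a (group, label id) pair, for example taken from fields such as rater_gender and Q_overall; the group names and values in `toy_ratings` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy ratings: (rater_group, label_id). The values are made up
# for illustration and do not come from the DICES data.
toy_ratings = [
    ("group_a", 0), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 1), ("group_b", 2),
]


def vote_distributions(ratings):
    """Return {group: {label_id: fraction of that group's votes}}."""
    counts = defaultdict(Counter)
    for group, label in ratings:
        counts[group][label] += 1
    return {
        group: {label: n / sum(c.values()) for label, n in sorted(c.items())}
        for group, c in counts.items()
    }


distributions = vote_distributions(toy_ratings)
for group, dist in sorted(distributions.items()):
    print(group, dist)
```

Keeping the full per-group distribution, instead of collapsing it to a single label, is what allows different aggregation strategies to be compared afterwards.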

This dataset is well suited to observing and measuring variance, ambiguity and diversity in the context of conversational AI safety. The dataset is accompanied by a paper describing a set of metrics that show how rater diversity influences the safety perception of raters from different geographic regions, ethnic groups, age groups and genders. The goal of the DICES dataset is to serve as a shared benchmark for the safety evaluation of conversational AI systems.

CONTENT WARNING: This dataset contains adversarial examples of conversations that may be offensive.

Homepage: https://github.com/google-research-datasets/dices-dataset
Source code: tfds.datasets.dices.Builder
Versions:
- 1.0.0 (default): Initial release.
Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{aroyo2024dices,
  title={ {DICES} dataset: Diversity in conversational {AI} evaluation for safety},
  author={Aroyo, Lora and Taylor, Alex and Diaz, Mark and Homan, Christopher and Parrish, Alicia and Serapio-Garc{\'\i}a, Gregory and Prabhakaran, Vinodkumar and Wang, Ding},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

dices/350 (default config)

Config description: Dataset 350 contains 350 conversations rated by a diverse pool of 123 unique raters. Each conversation is rated on five top-level safety categories and one overall conversation comprehension question. Raters were balanced by gender (man or woman) and race/ethnicity (White, Black, Latine, Asian, Multiracial). Each rater rated all conversations, so each conversation has 123 unique ratings. The dataset has 43,050 rows in total (350 conversations × 123 raters).
Download size: 29.70 MiB
Dataset size: 74.43 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	43,050
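As a hedged sketch of how a config and its single 'train' split might be loaded: the helper names `dices_config_name` and `load_dices` below are hypothetical, while `tfds.load` with `split` and `with_info` is the standard TensorFlow Datasets entry point. The import is deferred so that the name helper works even when TFDS is not installed.

```python
def dices_config_name(config: str = "350") -> str:
    """Build the TFDS name for a DICES config ("350" or "990")."""
    if config not in ("350", "990"):
        raise ValueError(f"unknown DICES config: {config!r}")
    return f"dices/{config}"


def load_dices(config: str = "350"):
    """Load the 'train' split and its metadata; downloads data on first use."""
    # Imported lazily so dices_config_name() stays usable without TFDS.
    import tensorflow_datasets as tfds

    return tfds.load(dices_config_name(config), split="train", with_info=True)
```

A call such as `ds, info = load_dices("350")` returns the dataset together with its metadata; `info.splits["train"].num_examples` should then match the example count in the table above.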

Feature structure:

FeaturesDict({
    'Q1_whole_conversation_evaluation': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'Q2_harmful_content_dangerous_content': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_derogation_of_boat': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_insensitive': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_medical_legal_financial_or_relationship_advice': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_obscene_and_profane': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_personal_information': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_regulated_goods': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_sexually_suggestive_content': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_beliefs': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_gender_sexual_orientation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_inherited_attributes': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_targeting_status': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q5_political_affiliation': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_making_endorsement': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q6_policy_guidelines_polarizing_topics': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'answer_time_ms': Scalar(shape=(), dtype=int64, description=Amount of time spent by each rater on each safety annotation question.),
    'answer_timestamp': Scalar(shape=(), dtype=int64, description=Time when each conversation was rated by each rater.),
    'context': Text(shape=(), dtype=string),
    'degree_of_harm': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'harm_type': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=32)),
    'id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each row, representing all ratings by a single rater to a single conversation.),
    'item_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each conversation.),
    'phase': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_age': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_education': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_gender': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'rater_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each rater.),
    'rater_race': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'rater_raw_race': Text(shape=(), dtype=string),
    'response': Text(shape=(), dtype=string),
    'safety_gold': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'safety_gold_reason': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
Q1_whole_conversation_evaluation	ClassLabel		int64	Rating about the understandability of a conversation.
Q2_harmful_content_dangerous_content	ClassLabel		int64
Q2_harmful_content_derogation_of_boat	ClassLabel		int64
Q2_harmful_content_insensitive	ClassLabel		int64
Q2_harmful_content_medical_legal_financial_or_relationship_advice	ClassLabel		int64
Q2_harmful_content_obscene_and_profane	ClassLabel		int64
Q2_harmful_content_other	ClassLabel		int64
Q2_harmful_content_overall	ClassLabel		int64
Q2_harmful_content_personal_information	ClassLabel		int64
Q2_harmful_content_regulated_goods	ClassLabel		int64
Q2_harmful_content_sexually_suggestive_content	ClassLabel		int64
Q3_bias_other	ClassLabel		int64
Q3_bias_overall	ClassLabel		int64
Q3_bias_targeting_beliefs	ClassLabel		int64
Q3_bias_targeting_gender_sexual_orientation	ClassLabel		int64
Q3_bias_targeting_inherited_attributes	ClassLabel		int64
Q3_bias_targeting_status	ClassLabel		int64
Q4_misinformation	ClassLabel		int64
Q5_political_affiliation	ClassLabel		int64
Q6_policy_guidelines_making_endorsement	ClassLabel		int64
Q6_policy_guidelines_other	ClassLabel		int64
Q6_policy_guidelines_overall	ClassLabel		int64
Q6_policy_guidelines_polarizing_topics	ClassLabel		int64
Q_overall	ClassLabel		int64
answer_time_ms	Scalar		int64	Amount of time spent by each rater on each safety annotation question.
answer_timestamp	Scalar		int64	Time when each conversation was rated by each rater.
context	Text		string	The conversation turns before the final chatbot response.
degree_of_harm	ClassLabel		int64	Hand-annotated rating of the severity of the safety risk.
harm_type	Sequence(ClassLabel)	(None,)	int64	Hand-annotated harm topic(s) of the conversation.
id	Scalar		int64	Numerical identifier for each row, representing all ratings by a single rater to a single conversation.
item_id	Scalar		int64	Numerical identifier for each conversation.
phase	ClassLabel		int64	One of three distinct time periods.
rater_age	ClassLabel		int64	The age group of the rater.
rater_education	ClassLabel		int64	The education of the rater.
rater_gender	ClassLabel		int64	The gender of the rater.
rater_id	Scalar		int64	Numerical identifier for each rater.
rater_race	ClassLabel		int64	The race/ethnicity of the rater.
rater_raw_race	Text		string	The raw race/ethnicity of the rater, before simplification to five categories.
response	Text		string	The final chatbot response in the conversation.
safety_gold	ClassLabel		int64	The gold standard safety label provided by experts.
safety_gold_reason	Text		string	The reason(s) (if given) for the gold safety label provided by experts.
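The id, item_id and rater_id fields describe a one-row-per-(rater, conversation) layout. The sketch below illustrates how such rows might be grouped by conversation. The function `summarize_by_conversation` and the rows in `toy_rows` are hypothetical, the label ids are arbitrary integers (the class names are not listed here), and majority vote is only one possible aggregation strategy.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows using the documented field names; values are made up.
toy_rows = [
    {"id": 0, "item_id": 10, "rater_id": 1, "Q_overall": 0},
    {"id": 1, "item_id": 10, "rater_id": 2, "Q_overall": 0},
    {"id": 2, "item_id": 10, "rater_id": 3, "Q_overall": 1},
    {"id": 3, "item_id": 11, "rater_id": 1, "Q_overall": 2},
    {"id": 4, "item_id": 11, "rater_id": 2, "Q_overall": 2},
    {"id": 5, "item_id": 11, "rater_id": 3, "Q_overall": 2},
]


def summarize_by_conversation(rows, label_key="Q_overall"):
    """Return {item_id: (number of unique raters, majority label id)}."""
    raters = defaultdict(set)
    votes = defaultdict(Counter)
    for row in rows:
        raters[row["item_id"]].add(row["rater_id"])
        votes[row["item_id"]][row[label_key]] += 1
    # most_common(1) breaks ties by first occurrence; a simplification.
    return {
        item: (len(raters[item]), votes[item].most_common(1)[0][0])
        for item in raters
    }


summary = summarize_by_conversation(toy_rows)
print(summary)  # {10: (3, 0), 11: (3, 2)}
```

On the real dices/350 rows, the first element of each pair would be expected to equal 123, since every rater rated every conversation.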

Examples (tfds.as_dataframe):

dices/990

Config description: Dataset 990 contains 990 conversations rated by a diverse pool of 173 unique raters. Each conversation is rated on three top-level safety categories and one overall conversation comprehension question. Raters were recruited so that the number of raters for each conversation was balanced by gender (man, woman) and locale (US, India). Each rater rated only a sample of the conversations. Each conversation has 60-70 unique ratings. The dataset has 72,103 rows in total.
Download size: 48.06 MiB
Dataset size: 150.38 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	72,103

Feature structure:

FeaturesDict({
    'Q1_whole_conversation_evaluation': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'Q2_harmful_content_explicitly_sexual': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_financial_advice': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_harm_to_people_or_animals': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_health_and_safety_claims': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_how_tos_about_harmful_activities': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_obscenities_or_profanities': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_personal_information': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_promotes_or_condones_use_of_harmful_regulated_goods': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_promotes_or_condones_violence': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q2_harmful_content_violent_or_gory': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_3_Racial_religious_or_other_slurs_and_stereotypes_that_incites_or_promote_hatred_based_on_any_of_the_attributes_noted_above': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_calling_individuals_or_groups_evil_corrupt_or_malicious': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_incites_hatred': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_insults_about_emotional_romantic_and_or_sexual_attraction': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_physically_or_mentally_inferior': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_subhuman_comparisons': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_bias_superiority_or_subjugation_of_a_group': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q3_unfair_bias_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_conspiracy_theories': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_contradicts_expert_consensus': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_false_theories': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_other': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q4_misinformation_political': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'Q_overall': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'context': Text(shape=(), dtype=string),
    'degree_of_harm': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'harm_type': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=32)),
    'id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each row, representing all ratings by a single rater to a single conversation.),
    'item_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each conversation.),
    'phase': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_age': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_education': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rater_gender': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'rater_id': Scalar(shape=(), dtype=int64, description=Numerical identifier for each rater.),
    'rater_locale': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'rater_race': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'rater_raw_race': Text(shape=(), dtype=string),
    'response': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
Q1_whole_conversation_evaluation	ClassLabel		int64	Rating about the understandability of a conversation.
Q2_harmful_content_explicitly_sexual	ClassLabel		int64
Q2_harmful_content_financial_advice	ClassLabel		int64
Q2_harmful_content_harm_to_people_or_animals	ClassLabel		int64
Q2_harmful_content_health_and_safety_claims	ClassLabel		int64
Q2_harmful_content_how_tos_about_harmful_activities	ClassLabel		int64
Q2_harmful_content_obscenities_or_profanities	ClassLabel		int64
Q2_harmful_content_other	ClassLabel		int64
Q2_harmful_content_overall	ClassLabel		int64
Q2_harmful_content_personal_information	ClassLabel		int64
Q2_harmful_content_promotes_or_condones_use_of_harmful_regulated_goods	ClassLabel		int64
Q2_harmful_content_promotes_or_condones_violence	ClassLabel		int64
Q2_harmful_content_violent_or_gory	ClassLabel		int64
Q3_3_Racial_religious_or_other_slurs_and_stereotypes_that_incites_or_promote_hatred_based_on_any_of_the_attributes_noted_above	ClassLabel		int64
Q3_bias_calling_individuals_or_groups_evil_corrupt_or_malicious	ClassLabel		int64
Q3_bias_incites_hatred	ClassLabel		int64
Q3_bias_insults_about_emotional_romantic_and_or_sexual_attraction	ClassLabel		int64
Q3_bias_other	ClassLabel		int64
Q3_bias_physically_or_mentally_inferior	ClassLabel		int64
Q3_bias_subhuman_comparisons	ClassLabel		int64
Q3_bias_superiority_or_subjugation_of_a_group	ClassLabel		int64
Q3_unfair_bias_overall	ClassLabel		int64
Q4_misinformation_conspiracy_theories	ClassLabel		int64
Q4_misinformation_contradicts_expert_consensus	ClassLabel		int64
Q4_misinformation_false_theories	ClassLabel		int64
Q4_misinformation_other	ClassLabel		int64
Q4_misinformation_overall	ClassLabel		int64
Q4_misinformation_political	ClassLabel		int64
Q_overall	ClassLabel		int64
context	Text		string	The conversation turns before the final chatbot response.
degree_of_harm	ClassLabel		int64	Hand-annotated rating of the severity of the safety risk.
harm_type	Sequence(ClassLabel)	(None,)	int64	Hand-annotated harm topic(s) of the conversation.
id	Scalar		int64	Numerical identifier for each row, representing all ratings by a single rater to a single conversation.
item_id	Scalar		int64	Numerical identifier for each conversation.
phase	ClassLabel		int64	One of three distinct time periods.
rater_age	ClassLabel		int64	The age group of the rater.
rater_education	ClassLabel		int64	The education of the rater.
rater_gender	ClassLabel		int64	The gender of the rater.
rater_id	Scalar		int64	Numerical identifier for each rater.
rater_locale	ClassLabel		int64	The locale of the rater.
rater_race	ClassLabel		int64	The race/ethnicity of the rater.
rater_raw_race	Text		string	The raw race/ethnicity of the rater, before simplification to five categories.
response	Text		string	The final chatbot response in the conversation.

Examples (tfds.as_dataframe):
