turk

References:

simplification

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds

ds = tfds.load('huggingface:turk/simplification')
  • Description:
TURKCorpus is a dataset for evaluating sentence simplification systems that focus on lexical paraphrasing,
as described in "Optimizing Statistical Machine Translation for Text Simplification". The corpus is composed of 2000 validation and 359 test original sentences that were each simplified 8 times by different annotators.
  • License: GNU General Public License v3.0
  • Version: 1.0.0
  • Splits:
Split          Examples
'test'         359
'validation'   2000
  • Features:
{
    "original": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simplifications": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
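
The feature schema above describes each record as one "original" string plus a variable-length sequence of "simplifications" strings. The following is a minimal sketch of how such a record could be expanded into (original, simplification) pairs for evaluation. The record contents and the names `example`, `to_pairs`, `SPLIT_SIZES`, and `REFERENCES_PER_SENTENCE` are illustrative assumptions, not taken from the dataset or from any library. The pair counts only restate the split sizes and the 8 simplifications per sentence given in the description.

```python
from typing import Dict, List, Tuple

# Hypothetical record shaped like the feature schema: the sentences are
# invented for illustration and do not come from the corpus.
example: Dict[str, object] = {
    "original": "The cat perched atop the wall.",
    "simplifications": [
        "The cat sat on the wall.",
        "The cat rested on top of the wall.",
    ],
}


def to_pairs(record: Dict[str, object]) -> List[Tuple[str, str]]:
    """Expand one record into (original, simplification) pairs."""
    original = record["original"]
    return [(original, simple) for simple in record["simplifications"]]


pairs = to_pairs(example)

# Split sizes as listed on this page; the description states that each
# sentence was simplified 8 times by different annotators.
SPLIT_SIZES = {"test": 359, "validation": 2000}
REFERENCES_PER_SENTENCE = 8

# Number of (original, simplification) pairs each split would yield
# if every sentence has exactly 8 simplifications.
expected_pairs = {
    name: size * REFERENCES_PER_SENTENCE for name, size in SPLIT_SIZES.items()
}
```

Keeping the simplifications grouped under their original sentence, as the schema does, suits multi-reference metrics. The flattened pairs are more convenient when each reference is scored separately.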