wiki_lingua

References:

Arabic

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/arabic')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  9995
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
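
The feature structure above nests every per-section field under `article`, a variable-length sequence. As a minimal sketch, assuming a record arrives as a plain Python dict in which `article` maps each field name to a list of equal length (the usual Hugging Face representation of a `Sequence` over a dict; how TFDS surfaces it may differ), the hypothetical helper `iter_sections` below turns one record into per-section dicts. The sample record is made up for illustration and is not taken from the dataset.

```python
from typing import Dict, Iterator, List


def iter_sections(record: Dict) -> Iterator[Dict[str, str]]:
    """Yield one dict per section from a record shaped like the features above.

    Assumes record["article"] maps each field name to a list and that all
    lists have the same length (one entry per section).
    """
    article: Dict[str, List[str]] = record["article"]
    fields = list(article)
    n_sections = len(article["document"])
    for i in range(n_sections):
        yield {name: article[name][i] for name in fields}


# Made-up record for illustration only; not real dataset content.
sample = {
    "url": "https://example.com/how-to",
    "article": {
        "section_name": ["Step A", "Step B"],
        "document": ["Long text A.", "Long text B."],
        "summary": ["Short A.", "Short B."],
        "english_url": ["https://example.com/en", "https://example.com/en"],
        "english_section_name": ["Step A (en)", "Step B (en)"],
    },
}

sections = list(iter_sections(sample))
# (document, summary) pairs are the inputs and targets for summarization.
pairs = [(s["document"], s["summary"]) for s in sections]
```

Each element of `pairs` is one article-summary pair, so a single record can contribute several training examples.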

Chinese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/chinese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  6541
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Czech

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/czech')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  2520
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Dutch

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/dutch')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  10862
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

English

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/english')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  57945
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
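
Unlike the other configurations on this page, the English features above list no `english_url` or `english_section_name`. Code that handles several configurations may want to tolerate both shapes. The sketch below is one way to do that. The helper name `english_reference` is hypothetical, and falling back to the record's own `url` and `section_name` for English records is an assumption of this sketch, not something the page states. Both sample records are made up.

```python
from typing import Dict, List, Tuple


def english_reference(record: Dict) -> List[Tuple[str, str]]:
    """Return (english_url, english_section_name) for each section.

    Assumption: when the English-specific fields are absent (as in the
    English configuration), the record's own url and section names are
    used as the English reference.
    """
    article = record["article"]
    names = article["section_name"]
    en_urls = article.get("english_url", [record["url"]] * len(names))
    en_names = article.get("english_section_name", names)
    return list(zip(en_urls, en_names))


# Made-up records for illustration only; not real dataset content.
en_record = {
    "url": "https://example.com/en",
    "article": {
        "section_name": ["Step A"],
        "document": ["Long text A."],
        "summary": ["Short A."],
    },
}
other_record = {
    "url": "https://example.com/xx",
    "article": {
        "section_name": ["Schritt A"],
        "document": ["Langer Text A."],
        "summary": ["Kurz A."],
        "english_url": ["https://example.com/en"],
        "english_section_name": ["Step A (en)"],
    },
}

en_refs = english_reference(en_record)
other_refs = english_reference(other_record)
```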

French

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/french')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  21690
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

German

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/german')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  20103
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Hindi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/hindi')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  3402
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Indonesian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/indonesian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  16308
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Italian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/italian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  17673
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Japanese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/japanese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  4372
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Korean

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/korean')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  4111
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Portuguese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/portuguese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  28143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Russian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/russian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  18143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Spanish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/spanish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  38795
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Thai

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/thai')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  5093
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Turkish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/turkish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  1512
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Vietnamese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/vietnamese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  6616
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
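
The 18 configurations above differ widely in size. The sketch below collects the 'train' counts listed on this page, builds the TFDS name for each configuration in the same form as the load commands, and reports the total, smallest, and largest. The names `TRAIN_EXAMPLES` and `tfds_name` are hypothetical helpers for this illustration. The total is lower than the ~770k pairs quoted in the description, presumably because each record can hold several sections in its `article` sequence.

```python
# 'train' split sizes copied from the configuration listings on this page.
TRAIN_EXAMPLES = {
    "arabic": 9995,
    "chinese": 6541,
    "czech": 2520,
    "dutch": 10862,
    "english": 57945,
    "french": 21690,
    "german": 20103,
    "hindi": 3402,
    "indonesian": 16308,
    "italian": 17673,
    "japanese": 4372,
    "korean": 4111,
    "portuguese": 28143,
    "russian": 18143,
    "spanish": 38795,
    "thai": 5093,
    "turkish": 1512,
    "vietnamese": 6616,
}


def tfds_name(config: str) -> str:
    """Build the dataset name in the form used by the load commands above."""
    return f"huggingface:wiki_lingua/{config}"


total = sum(TRAIN_EXAMPLES.values())
smallest = min(TRAIN_EXAMPLES, key=TRAIN_EXAMPLES.get)
largest = max(TRAIN_EXAMPLES, key=TRAIN_EXAMPLES.get)

print(f"{len(TRAIN_EXAMPLES)} configurations, {total} records in total")
print(f"smallest: {tfds_name(smallest)} ({TRAIN_EXAMPLES[smallest]})")
print(f"largest:  {tfds_name(largest)} ({TRAIN_EXAMPLES[largest]})")
```

Given this spread, sampling or weighting by configuration may be worth considering when training on several languages at once.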