wikipedia

  • Description:

Wikipedia dataset containing cleaned articles in all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/), with one split per language. Each example contains the content of one full Wikipedia article, cleaned to strip markdown and unwanted sections (references, etc.).

FeaturesDict({
    'text': Text(shape=(), dtype=string),
    'title': Text(shape=(), dtype=string),
})
  • Feature documentation:

Feature  Class         Shape  Dtype   Description
         FeaturesDict
text     Text                 string
title    Text                 string
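
Each example is a dictionary with two scalar string features, `text` and `title`. As a minimal sketch, assuming examples are read through a NumPy-style conversion in which string features arrive as UTF-8 `bytes`, a hypothetical helper `decode_example` could turn one example into plain Python strings. The sample record is made up for illustration and is not taken from the dataset.

```python
from typing import Dict, Union


def decode_example(example: Dict[str, Union[bytes, str]]) -> Dict[str, str]:
    """Return the 'title' and 'text' features as Python strings.

    Hypothetical helper: each value may be UTF-8 bytes or already a str.
    """
    decoded = {}
    for key in ("title", "text"):
        value = example[key]
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        decoded[key] = value
    return decoded


# Made-up record with the same keys as the FeaturesDict above.
sample = {"title": b"Example title", "text": b"Example article body."}
article = decode_example(sample)
print(article["title"])
```

Decoding only the two documented keys keeps the helper tied to the feature structure shown here; any other key in the input is ignored.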
  • Citation:

@ONLINE {wikidump,
    author = "Wikimedia Foundation",
    title  = "Wikimedia Downloads",
    url    = "https://dumps.wikimedia.org"
}
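
The config names in this listing follow the pattern `wikipedia/<dump date>.<language code>`. The sketch below builds and parses such names. The helper names (`config_name`, `parse_config_name`, `load_wikipedia`) are hypothetical. The loader assumes that the `tensorflow_datasets` package is installed and that the chosen config can be prepared in your environment, so it is defined here but not called.

```python
from typing import Tuple

DUMP_DATE = "20230201"  # dump date used by the configs in this listing


def config_name(language: str, dump_date: str = DUMP_DATE) -> str:
    """Build a config name such as 'wikipedia/20230201.aa'."""
    return f"wikipedia/{dump_date}.{language}"


def parse_config_name(name: str) -> Tuple[str, str]:
    """Split 'wikipedia/<dump date>.<language>' into (dump_date, language)."""
    dataset, _, config = name.partition("/")
    dump_date, _, language = config.partition(".")
    if dataset != "wikipedia" or not dump_date or not language:
        raise ValueError(f"not a wikipedia config name: {name!r}")
    return dump_date, language


def load_wikipedia(language: str, dump_date: str = DUMP_DATE):
    """Load one language config (assumes tensorflow_datasets is available)."""
    import tensorflow_datasets as tfds  # imported lazily; optional dependency

    # With no split argument, tfds.load returns a dict of all available splits.
    return tfds.load(config_name(language, dump_date))


print(config_name("aa"))
print(parse_config_name("wikipedia/20230201.ace"))
```

Importing the library inside `load_wikipedia` means the name helpers work without it installed.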

wikipedia/20230201.aa (default config)

  • Config description: Wikipedia dataset for aa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ab

  • Config description: Wikipedia dataset for ab, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ace

  • Config description: Wikipedia dataset for ace, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ady

  • Config description: Wikipedia dataset for ady, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.af

  • Config description: Wikipedia dataset for af, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ak

  • Config description: Wikipedia dataset for ak, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.als

  • Config description: Wikipedia dataset for als, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.am

  • Config description: Wikipedia dataset for am, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.an

  • Config description: Wikipedia dataset for an, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ang

  • Config description: Wikipedia dataset for ang, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ar

  • Config description: Wikipedia dataset for ar, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.arc

  • Config description: Wikipedia dataset for arc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.arz

  • Config description: Wikipedia dataset for arz, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.as

  • Config description: Wikipedia dataset for as, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ast

  • Config description: Wikipedia dataset for ast, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.atj

  • Config description: Wikipedia dataset for atj, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.av

  • Config description: Wikipedia dataset for av, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ay

  • Config description: Wikipedia dataset for ay, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.az

  • Config description: Wikipedia dataset for az, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.azb

  • Config description: Wikipedia dataset for azb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ba

  • Config description: Wikipedia dataset for ba, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bar

  • Config description: Wikipedia dataset for bar, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bcl

  • Config description: Wikipedia dataset for bcl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.be

  • Config description: Wikipedia dataset for be, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bg

  • Config description: Wikipedia dataset for bg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bh

  • Config description: Wikipedia dataset for bh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bi

  • Config description: Wikipedia dataset for bi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bjn

  • Config description: Wikipedia dataset for bjn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bm

  • Config description: Wikipedia dataset for bm, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bn

  • Config description: Wikipedia dataset for bn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bo

  • Config description: Wikipedia dataset for bo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bpy

  • Config description: Wikipedia dataset for bpy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.br

  • Config description: Wikipedia dataset for br, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bs

  • Config description: Wikipedia dataset for bs, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bug

  • Config description: Wikipedia dataset for bug, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.bxr

  • Config description: Wikipedia dataset for bxr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ca

  • Config description: Wikipedia dataset for ca, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cdo

  • Config description: Wikipedia dataset for cdo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ce

  • Config description: Wikipedia dataset for ce, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ceb

  • Config description: Wikipedia dataset for ceb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ch

  • Config description: Wikipedia dataset for ch, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cho

  • Config description: Wikipedia dataset for cho, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.chr

  • Config description: Wikipedia dataset for chr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.chy

  • Config description: Wikipedia dataset for chy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ckb

  • Config description: Wikipedia dataset for ckb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.co

  • Config description: Wikipedia dataset for co, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cr

  • Config description: Wikipedia dataset for cr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.crh

  • Config description: Wikipedia dataset for crh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cs

  • Config description: Wikipedia dataset for cs, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.csb

  • Config description: Wikipedia dataset for csb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cu

  • Config description: Wikipedia dataset for cu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cv

  • Config description: Wikipedia dataset for cv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.cy

  • Config description: Wikipedia dataset for cy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.da

  • Config description: Wikipedia dataset for da, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.de

  • Config description: Wikipedia dataset for de, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.din

  • Config description: Wikipedia dataset for din, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.diq

  • Config description: Wikipedia dataset for diq, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.dsb

  • Config description: Wikipedia dataset for dsb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.dty

  • Config description: Wikipedia dataset for dty, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.dv

  • Config description: Wikipedia dataset for dv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.dz

  • Config description: Wikipedia dataset for dz, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ee

  • Config description: Wikipedia dataset for ee, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.el

  • Config description: Wikipedia dataset for el, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.eml

  • Config description: Wikipedia dataset for eml, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.en

  • Config description: Wikipedia dataset for en, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.eo

  • Config description: Wikipedia dataset for eo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.es

  • Config description: Wikipedia dataset for es, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.et

  • Config description: Wikipedia dataset for et, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.eu

  • Config description: Wikipedia dataset for eu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ext

  • Config description: Wikipedia dataset for ext, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fa

  • Config description: Wikipedia dataset for fa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ff

  • Config description: Wikipedia dataset for ff, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fi

  • Config description: Wikipedia dataset for fi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fj

  • Config description: Wikipedia dataset for fj, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fo

  • Config description: Wikipedia dataset for fo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fr

  • Config description: Wikipedia dataset for fr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.frp

  • Config description: Wikipedia dataset for frp, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.frr

  • Config description: Wikipedia dataset for frr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fur

  • Config description: Wikipedia dataset for fur, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.fy

  • Config description: Wikipedia dataset for fy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ga

  • Config description: Wikipedia dataset for ga, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gag

  • Config description: Wikipedia dataset for gag, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gan

  • Config description: Wikipedia dataset for gan, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gd

  • Config description: Wikipedia dataset for gd, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gl

  • Config description: Wikipedia dataset for gl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.glk

  • Config description: Wikipedia dataset for glk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gn

  • Config description: Wikipedia dataset for gn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gom

  • Config description: Wikipedia dataset for gom, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gor

  • Config description: Wikipedia dataset for gor, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.got

  • Config description: Wikipedia dataset for got, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gu

  • Config description: Wikipedia dataset for gu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.gv

  • Config description: Wikipedia dataset for gv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ha

  • Config description: Wikipedia dataset for ha, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hak

  • Config description: Wikipedia dataset for hak, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.haw

  • Config description: Wikipedia dataset for haw, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.he

  • Config description: Wikipedia dataset for he, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hi

  • Config description: Wikipedia dataset for hi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hif

  • Config description: Wikipedia dataset for hif, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ho

  • Config description: Wikipedia dataset for ho, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hr

  • Config description: Wikipedia dataset for hr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hsb

  • Config description: Wikipedia dataset for hsb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ht

  • Config description: Wikipedia dataset for ht, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hu

  • Config description: Wikipedia dataset for hu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.hy

  • Config description: Wikipedia dataset for hy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ia

  • Config description: Wikipedia dataset for ia, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.id

  • Config description: Wikipedia dataset for id, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ie

  • Config description: Wikipedia dataset for ie, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ig

  • Config description: Wikipedia dataset for ig, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ii

  • Config description: Wikipedia dataset for ii, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ik

  • Config description: Wikipedia dataset for ik, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ilo

  • Config description: Wikipedia dataset for ilo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.inh

  • Config description: Wikipedia dataset for inh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.io

  • Config description: Wikipedia dataset for io, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.is

  • Config description: Wikipedia dataset for is, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.it

  • Config description: Wikipedia dataset for it, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.iu

  • Config description: Wikipedia dataset for iu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ja

  • Config description: Wikipedia dataset for ja, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.jam

  • Config description: Wikipedia dataset for jam, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.jbo

  • Config description: Wikipedia dataset for jbo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.jv

  • Config description: Wikipedia dataset for jv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ka

  • Config description: Wikipedia dataset for ka, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kaa

  • Config description: Wikipedia dataset for kaa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kab

  • Config description: Wikipedia dataset for kab, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kbd

  • Config description: Wikipedia dataset for kbd, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kbp

  • Config description: Wikipedia dataset for kbp, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kg

  • Config description: Wikipedia dataset for kg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ki

  • Config description: Wikipedia dataset for ki, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kj

  • Config description: Wikipedia dataset for kj, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kk

  • Config description: Wikipedia dataset for kk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kl

  • Config description: Wikipedia dataset for kl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.km

  • Config description: Wikipedia dataset for km, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kn

  • Config description: Wikipedia dataset for kn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ko

  • Config description: Wikipedia dataset for ko, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.koi

  • Config description: Wikipedia dataset for koi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.krc

  • Config description: Wikipedia dataset for krc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ks

  • Config description: Wikipedia dataset for ks, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ksh

  • Config description: Wikipedia dataset for ksh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ku

  • Config description: Wikipedia dataset for ku, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kv

  • Config description: Wikipedia dataset for kv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.kw

  • Config description: Wikipedia dataset for kw, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ky

  • Config description: Wikipedia dataset for ky, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.la

  • Config description: Wikipedia dataset for la, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lad

  • Config description: Wikipedia dataset for lad, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lb

  • Config description: Wikipedia dataset for lb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lbe

  • Config description: Wikipedia dataset for lbe, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lez

  • Config description: Wikipedia dataset for lez, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lfn

  • Config description: Wikipedia dataset for lfn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lg

  • Config description: Wikipedia dataset for lg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.li

  • Config description: Wikipedia dataset for li, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lij

  • Config description: Wikipedia dataset for lij, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lmo

  • Config description: Wikipedia dataset for lmo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ln

  • Config description: Wikipedia dataset for ln, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lo

  • Config description: Wikipedia dataset for lo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lrc

  • Config description: Wikipedia dataset for lrc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lt

  • Config description: Wikipedia dataset for lt, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ltg

  • Config description: Wikipedia dataset for ltg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.lv

  • Config description: Wikipedia dataset for lv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mai

  • Config description: Wikipedia dataset for mai, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mdf

  • Config description: Wikipedia dataset for mdf, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mg

  • Config description: Wikipedia dataset for mg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mh

  • Config description: Wikipedia dataset for mh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mhr

  • Config description: Wikipedia dataset for mhr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mi

  • Config description: Wikipedia dataset for mi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.min

  • Config description: Wikipedia dataset for min, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mk

  • Config description: Wikipedia dataset for mk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ml

  • Config description: Wikipedia dataset for ml, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mn

  • Config description: Wikipedia dataset for mn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mr

  • Config description: Wikipedia dataset for mr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mrj

  • Config description: Wikipedia dataset for mrj, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ms

  • Config description: Wikipedia dataset for ms, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mt

  • Config description: Wikipedia dataset for mt, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mus

  • Config description: Wikipedia dataset for mus, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mwl

  • Config description: Wikipedia dataset for mwl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.my

  • Config description: Wikipedia dataset for my, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.myv

  • Config description: Wikipedia dataset for myv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.mzn

  • Config description: Wikipedia dataset for mzn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.na

  • Config description: Wikipedia dataset for na, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nah

  • Config description: Wikipedia dataset for nah, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nap

  • Config description: Wikipedia dataset for nap, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nds

  • Config description: Wikipedia dataset for nds, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ne

  • Config description: Wikipedia dataset for ne, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.new

  • Config description: Wikipedia dataset for new, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ng

  • Config description: Wikipedia dataset for ng, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nl

  • Config description: Wikipedia dataset for nl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nn

  • Config description: Wikipedia dataset for nn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.no

  • Config description: Wikipedia dataset for no, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nov

  • Config description: Wikipedia dataset for nov, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nrm

  • Config description: Wikipedia dataset for nrm, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nso

  • Config description: Wikipedia dataset for nso, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.nv

  • Config description: Wikipedia dataset for nv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ny

  • Config description: Wikipedia dataset for ny, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.oc

  • Config description: Wikipedia dataset for oc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.olo

  • Config description: Wikipedia dataset for olo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.om

  • Config description: Wikipedia dataset for om, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.or

  • Config description: Wikipedia dataset for or, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.os

  • Config description: Wikipedia dataset for os, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pa

  • Config description: Wikipedia dataset for pa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pag

  • Config description: Wikipedia dataset for pag, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pam

  • Config description: Wikipedia dataset for pam, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pap

  • Config description: Wikipedia dataset for pap, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pcd

  • Config description: Wikipedia dataset for pcd, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pdc

  • Config description: Wikipedia dataset for pdc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pfl

  • Config description: Wikipedia dataset for pfl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pi

  • Config description: Wikipedia dataset for pi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pih

  • Config description: Wikipedia dataset for pih, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pl

  • Config description: Wikipedia dataset for pl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pms

  • Config description: Wikipedia dataset for pms, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pnb

  • Config description: Wikipedia dataset for pnb, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pnt

  • Config description: Wikipedia dataset for pnt, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ps

  • Config description: Wikipedia dataset for ps, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.pt

  • Config description: Wikipedia dataset for pt, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.qu

  • Config description: Wikipedia dataset for qu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.rm

  • Config description: Wikipedia dataset for rm, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.rmy

  • Config description: Wikipedia dataset for rmy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.rn

  • Config description: Wikipedia dataset for rn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ro

  • Config description: Wikipedia dataset for ro, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ru

  • Config description: Wikipedia dataset for ru, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.rue

  • Config description: Wikipedia dataset for rue, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.rw

  • Config description: Wikipedia dataset for rw, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sa

  • Config description: Wikipedia dataset for sa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sah

  • Config description: Wikipedia dataset for sah, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sat

  • Config description: Wikipedia dataset for sat, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sc

  • Config description: Wikipedia dataset for sc, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.scn

  • Config description: Wikipedia dataset for scn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sco

  • Config description: Wikipedia dataset for sco, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sd

  • Config description: Wikipedia dataset for sd, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.se

  • Config description: Wikipedia dataset for se, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sg

  • Config description: Wikipedia dataset for sg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sh

  • Config description: Wikipedia dataset for sh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.si

  • Config description: Wikipedia dataset for si, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.simple

  • Config description: Wikipedia dataset for simple, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sk

  • Config description: Wikipedia dataset for sk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sl

  • Config description: Wikipedia dataset for sl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sm

  • Config description: Wikipedia dataset for sm, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sn

  • Config description: Wikipedia dataset for sn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.so

  • Config description: Wikipedia dataset for so, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sq

  • Config description: Wikipedia dataset for sq, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sr

  • Config description: Wikipedia dataset for sr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.srn

  • Config description: Wikipedia dataset for srn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ss

  • Config description: Wikipedia dataset for ss, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.st

  • Config description: Wikipedia dataset for st, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.stq

  • Config description: Wikipedia dataset for stq, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.su

  • Config description: Wikipedia dataset for su, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sv

  • Config description: Wikipedia dataset for sv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.sw

  • Config description: Wikipedia dataset for sw, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.szl

  • Config description: Wikipedia dataset for szl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ta

  • Config description: Wikipedia dataset for ta, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tcy

  • Config description: Wikipedia dataset for tcy, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.te

  • Config description: Wikipedia dataset for te, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tet

  • Config description: Wikipedia dataset for tet, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tg

  • Config description: Wikipedia dataset for tg, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.th

  • Config description: Wikipedia dataset for th, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ti

  • Config description: Wikipedia dataset for ti, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tk

  • Config description: Wikipedia dataset for tk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tl

  • Config description: Wikipedia dataset for tl, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tn

  • Config description: Wikipedia dataset for tn, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.to

  • Config description: Wikipedia dataset for to, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tpi

  • Config description: Wikipedia dataset for tpi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tr

  • Config description: Wikipedia dataset for tr, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ts

  • Config description: Wikipedia dataset for ts, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tt

  • Config description: Wikipedia dataset for tt, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tum

  • Config description: Wikipedia dataset for tum, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tw

  • Config description: Wikipedia dataset for tw, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ty

  • Config description: Wikipedia dataset for ty, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.tyv

  • Config description: Wikipedia dataset for tyv, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.udm

  • Config description: Wikipedia dataset for udm, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ug

  • Config description: Wikipedia dataset for ug, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.uk

  • Config description: Wikipedia dataset for uk, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ur

  • Config description: Wikipedia dataset for ur, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.uz

  • Config description: Wikipedia dataset for uz, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.ve

  • Config description: Wikipedia dataset for ve, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.vec

  • Config description: Wikipedia dataset for vec, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.vep

  • Config description: Wikipedia dataset for vep, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.vi

  • Config description: Wikipedia dataset for vi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.vls

  • Config description: Wikipedia dataset for vls, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.vo

  • Config description: Wikipedia dataset for vo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.wa

  • Config description: Wikipedia dataset for wa, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.war

  • Config description: Wikipedia dataset for war, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.wo

  • Config description: Wikipedia dataset for wo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.wuu

  • Config description: Wikipedia dataset for wuu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.xal

  • Config description: Wikipedia dataset for xal, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.xh

  • Config description: Wikipedia dataset for xh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.xmf

  • Config description: Wikipedia dataset for xmf, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.yi

  • Config description: Wikipedia dataset for yi, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.yo

  • Config description: Wikipedia dataset for yo, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.za

  • Config description: Wikipedia dataset for za, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.zea

  • Config description: Wikipedia dataset for zea, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.zh

  • Config description: Wikipedia dataset for zh, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20230201.zu

  • Config description: Wikipedia dataset for zu, parsed from 20230201 dump.

  • Download size: Unknown size

  • Dataset size: Unknown size

  • Auto-cached (documentation): Unknown

  • Splits:

Split Examples

wikipedia/20220620.aa

  • Config description: Wikipedia dataset for aa, parsed from 20220620 dump.

  • Download size: 45.22 KiB

  • Dataset size: 3.46 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1
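
The config names in this listing follow the pattern `wikipedia/<dump date>.<language code>`. As a minimal sketch, assuming the `tensorflow_datasets` package is installed and the chosen config can be prepared locally, a config could be loaded as shown below. The helpers `config_name` and `load_wikipedia` are hypothetical names introduced for illustration, not part of the library. The default dump date is 20220620 because the 20230201 configs on this page list unknown sizes and no splits.

```python
def config_name(language: str, dump_date: str = "20220620") -> str:
    """Build a name such as 'wikipedia/20220620.aa' (hypothetical helper)."""
    if not (len(dump_date) == 8 and dump_date.isdigit()):
        raise ValueError(f"expected a YYYYMMDD dump date, got {dump_date!r}")
    return f"wikipedia/{dump_date}.{language}"


def load_wikipedia(language: str, dump_date: str = "20220620"):
    """Load the 'train' split of one language config (hypothetical helper)."""
    # Imported lazily so that building config names does not require the package.
    import tensorflow_datasets as tfds

    return tfds.load(config_name(language, dump_date), split="train")
```

Each example returned by `load_wikipedia("aa")` should be a dictionary with the `text` and `title` features described at the top of this page.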

wikipedia/20220620.ab

  • Config description: Wikipedia dataset for ab, parsed from 20220620 dump.

  • Download size: 2.39 MiB

  • Dataset size: 3.81 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,397

wikipedia/20220620.ace

  • Config description: Wikipedia dataset for ace, parsed from 20220620 dump.

  • Download size: 3.37 MiB

  • Dataset size: 4.27 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 13,778

wikipedia/20220620.ady

  • Config description: Wikipedia dataset for ady, parsed from 20220620 dump.

  • Download size: 1004.23 KiB

  • Dataset size: 522.19 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 582

wikipedia/20220620.af

  • Config description: Wikipedia dataset for af, parsed from 20220620 dump.

  • Download size: 122.32 MiB

  • Dataset size: 207.23 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 126,990
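
The "Auto-cached" values in this listing appear consistent with the rule in the TFDS performance guide: a dataset is auto-cached when its total size is known and below 250 MiB, and either file shuffling is disabled or only a single shard is read. The sketch below parses the size strings used on this page and applies only the size part of that rule. The 250 MiB threshold is an assumption taken from that guide, and the helper names are hypothetical. Shard counts are not listed here, so the sketch cannot tell "Yes" apart from "Only when shuffle_files=False".

```python
from typing import Optional

# Binary unit multipliers for the size strings used in this listing.
_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

# Assumed auto-cache threshold (250 MiB), not stated on this page.
AUTO_CACHE_LIMIT_BYTES = 250 * 1024**2


def parse_size(text: str) -> Optional[float]:
    """Convert e.g. '207.23 MiB' to bytes; return None for 'Unknown size'."""
    if text.strip() == "Unknown size":
        return None
    value, unit = text.split()
    return float(value) * _UNITS[unit]


def may_auto_cache(dataset_size: str) -> bool:
    """True if the dataset size is known and under the assumed threshold."""
    size = parse_size(dataset_size)
    return size is not None and size < AUTO_CACHE_LIMIT_BYTES
```

For instance, `may_auto_cache("207.23 MiB")` passes the size check (the `af` config), while `may_auto_cache("257.90 MiB")` does not (the `ba` config, listed as "No").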

wikipedia/20220620.ak

  • Config description: Wikipedia dataset for ak, parsed from 20220620 dump.

  • Download size: 618.99 KiB

  • Dataset size: 761.42 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 688

wikipedia/20220620.als

  • Config description: Wikipedia dataset for als, parsed from 20220620 dump.

  • Download size: 55.96 MiB

  • Dataset size: 74.98 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 31,316

wikipedia/20220620.am

  • Config description: Wikipedia dataset for am, parsed from 20220620 dump.

  • Download size: 7.91 MiB

  • Dataset size: 20.55 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 13,701

wikipedia/20220620.an

  • Config description: Wikipedia dataset for an, parsed from 20220620 dump.

  • Download size: 37.72 MiB

  • Dataset size: 52.60 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 57,562

wikipedia/20220620.ang

  • Config description: Wikipedia dataset for ang, parsed from 20220620 dump.

  • Download size: 4.52 MiB

  • Dataset size: 2.59 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,649

wikipedia/20220620.ar

  • Config description: Wikipedia dataset for ar, parsed from 20220620 dump.

  • Download size: 1.44 GiB

  • Dataset size: 2.78 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 2,179,406

wikipedia/20220620.arc

  • Config description: Wikipedia dataset for arc, parsed from 20220620 dump.

  • Download size: 1.12 MiB

  • Dataset size: 868.77 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,598

wikipedia/20220620.arz

  • Config description: Wikipedia dataset for arz, parsed from 20220620 dump.

  • Download size: 216.55 MiB

  • Dataset size: 1.15 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,601,331

wikipedia/20220620.as

  • Config description: Wikipedia dataset for as, parsed from 20220620 dump.

  • Download size: 32.06 MiB

  • Dataset size: 70.52 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,548

wikipedia/20220620.ast

  • Config description: Wikipedia dataset for ast, parsed from 20220620 dump.

  • Download size: 221.35 MiB

  • Dataset size: 456.76 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 141,302

wikipedia/20220620.atj

  • Config description: Wikipedia dataset for atj, parsed from 20220620 dump.

  • Download size: 692.29 KiB

  • Dataset size: 351.44 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 535

wikipedia/20220620.av

  • Config description: Wikipedia dataset for av, parsed from 20220620 dump.

  • Download size: 7.08 MiB

  • Dataset size: 4.89 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,692

wikipedia/20220620.ay

  • Config description: Wikipedia dataset for ay, parsed from 20220620 dump.

  • Download size: 2.47 MiB

  • Dataset size: 4.23 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,404

wikipedia/20220620.az

  • Config description: Wikipedia dataset for az, parsed from 20220620 dump.

  • Download size: 229.12 MiB

  • Dataset size: 383.51 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 218,285

wikipedia/20220620.azb

  • Config description: Wikipedia dataset for azb, parsed from 20220620 dump.

  • Download size: 97.93 MiB

  • Dataset size: 161.23 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 269,716

wikipedia/20220620.ba

  • Config description: Wikipedia dataset for ba, parsed from 20220620 dump.

  • Download size: 88.71 MiB

  • Dataset size: 257.90 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 67,501

wikipedia/20220620.bar

  • Config description: Wikipedia dataset for bar, parsed from 20220620 dump.

  • Download size: 35.40 MiB

  • Dataset size: 41.95 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 47,548

wikipedia/20220620.bat-smg

  • Config description: Wikipedia dataset for bat-smg, parsed from 20220620 dump.

  • Download size: 5.05 MiB

  • Dataset size: 6.80 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 20,196

wikipedia/20220620.bcl

  • Config description: Wikipedia dataset for bcl, parsed from 20220620 dump.

  • Download size: 14.67 MiB

  • Dataset size: 14.82 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 12,387

wikipedia/20220620.be

  • Config description: Wikipedia dataset for be, parsed from 20220620 dump.

  • Download size: 252.43 MiB

  • Dataset size: 530.10 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 219,500

wikipedia/20220620.be-x-old

  • Config description: Wikipedia dataset for be-x-old, parsed from 20220620 dump.

  • Download size: 96.33 MiB

  • Dataset size: 213.09 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 113,278

wikipedia/20220620.bg

  • Config description: Wikipedia dataset for bg, parsed from 20220620 dump.

  • Download size: 396.21 MiB

  • Dataset size: 992.48 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 407,551

wikipedia/20220620.bh

  • Config description: Wikipedia dataset for bh, parsed from 20220620 dump.

  • Download size: 16.47 MiB

  • Dataset size: 12.90 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,992

wikipedia/20220620.bi

  • Config description: Wikipedia dataset for bi, parsed from 20220620 dump.

  • Download size: 597.62 KiB

  • Dataset size: 333.71 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,492

wikipedia/20220620.bjn

  • Config description: Wikipedia dataset for bjn, parsed from 20220620 dump.

  • Download size: 4.93 MiB

  • Dataset size: 4.51 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,078

wikipedia/20220620.bm

  • Config description: Wikipedia dataset for bm, parsed from 20220620 dump.

  • Download size: 683.58 KiB

  • Dataset size: 394.17 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,179

wikipedia/20220620.bn

  • Config description: Wikipedia dataset for bn, parsed from 20220620 dump.

  • Download size: 291.30 MiB

  • Dataset size: 781.73 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 202,398

wikipedia/20220620.bo

  • Config description: Wikipedia dataset for bo, parsed from 20220620 dump.

  • Download size: 13.66 MiB

  • Dataset size: 119.54 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 11,891

wikipedia/20220620.bpy

  • Config description: Wikipedia dataset for bpy, parsed from 20220620 dump.

  • Download size: 5.27 MiB

  • Dataset size: 37.70 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 25,586

wikipedia/20220620.br

  • Config description: Wikipedia dataset for br, parsed from 20220620 dump.

  • Download size: 55.30 MiB

  • Dataset size: 77.98 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 83,938

wikipedia/20220620.bs

  • Config description: Wikipedia dataset for bs, parsed from 20220620 dump.

  • Download size: 138.13 MiB

  • Dataset size: 180.59 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 202,982

wikipedia/20220620.bug

  • Config description: Wikipedia dataset for bug, parsed from 20220620 dump.

  • Download size: 2.11 MiB

  • Dataset size: 2.92 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,674

wikipedia/20220620.bxr

  • Config description: Wikipedia dataset for bxr, parsed from 20220620 dump.

  • Download size: 5.07 MiB

  • Dataset size: 6.39 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,592

wikipedia/20220620.ca

  • Config description: Wikipedia dataset for ca, parsed from 20220620 dump.

  • Download size: 1.04 GiB

  • Dataset size: 1.72 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 807,863

wikipedia/20220620.cbk-zam

  • Config description: Wikipedia dataset for cbk-zam, parsed from 20220620 dump.

  • Download size: 3.35 MiB

  • Dataset size: 2.82 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,568

wikipedia/20220620.cdo

  • Config description: Wikipedia dataset for cdo, parsed from 20220620 dump.

  • Download size: 4.71 MiB

  • Dataset size: 4.04 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 16,912

wikipedia/20220620.ce

  • Config description: Wikipedia dataset for ce, parsed from 20220620 dump.

  • Download size: 77.35 MiB

  • Dataset size: 432.07 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 479,298

wikipedia/20220620.ceb

  • Config description: Wikipedia dataset for ceb, parsed from 20220620 dump.

  • Download size: 2.04 GiB

  • Dataset size: 4.10 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 6,126,575

wikipedia/20220620.ch

  • Config description: Wikipedia dataset for ch, parsed from 20220620 dump.

  • Download size: 748.23 KiB

  • Dataset size: 167.50 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 565

wikipedia/20220620.cho

  • Config description: Wikipedia dataset for cho, parsed from 20220620 dump.

  • Download size: 26.95 KiB

  • Dataset size: 7.44 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 14

wikipedia/20220620.chr

  • Config description: Wikipedia dataset for chr, parsed from 20220620 dump.

  • Download size: 692.01 KiB

  • Dataset size: 680.68 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,081

wikipedia/20220620.chy

  • Config description: Wikipedia dataset for chy, parsed from 20220620 dump.

  • Download size: 383.76 KiB

  • Dataset size: 123.19 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 826

wikipedia/20220620.ckb

  • Config description: Wikipedia dataset for ckb, parsed from 20220620 dump.

  • Download size: 44.22 MiB

  • Dataset size: 76.65 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 50,725

wikipedia/20220620.co

  • Config description: Wikipedia dataset for co, parsed from 20220620 dump.

  • Download size: 4.98 MiB

  • Dataset size: 7.05 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,070

wikipedia/20220620.cr

  • Config description: Wikipedia dataset for cr, parsed from 20220620 dump.

  • Download size: 304.75 KiB

  • Dataset size: 36.79 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 180

wikipedia/20220620.crh

  • Config description: Wikipedia dataset for crh, parsed from 20220620 dump.

  • Download size: 6.95 MiB

  • Dataset size: 6.14 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 19,273

wikipedia/20220620.cs

  • Config description: Wikipedia dataset for cs, parsed from 20220620 dump.

  • Download size: 998.37 MiB

  • Dataset size: 1.37 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 660,764

wikipedia/20220620.csb

  • Config description: Wikipedia dataset for csb, parsed from 20220620 dump.

  • Download size: 2.28 MiB

  • Dataset size: 3.50 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,826

wikipedia/20220620.cu

  • Config description: Wikipedia dataset for cu, parsed from 20220620 dump.

  • Download size: 779.31 KiB

  • Dataset size: 996.07 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,242

wikipedia/20220620.cv

  • Config description: Wikipedia dataset for cv, parsed from 20220620 dump.

  • Download size: 31.43 MiB

  • Dataset size: 71.14 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 54,006

wikipedia/20220620.cy

  • Config description: Wikipedia dataset for cy, parsed from 20220620 dump.

  • Download size: 85.25 MiB

  • Dataset size: 124.71 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 177,789

wikipedia/20220620.da

  • Config description: Wikipedia dataset for da, parsed from 20220620 dump.

  • Download size: 388.66 MiB

  • Dataset size: 502.98 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 281,262

wikipedia/20220620.de

  • Config description: Wikipedia dataset for de, parsed from 20220620 dump.

  • Download size: 6.17 GiB

  • Dataset size: 8.60 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 3,536,830

wikipedia/20220620.din

  • Config description: Wikipedia dataset for din, parsed from 20220620 dump.

  • Download size: 559.45 KiB

  • Dataset size: 530.66 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 507

wikipedia/20220620.diq

  • Config description: Wikipedia dataset for diq, parsed from 20220620 dump.

  • Download size: 11.87 MiB

  • Dataset size: 17.87 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 43,366

wikipedia/20220620.dsb

  • Config description: Wikipedia dataset for dsb, parsed from 20220620 dump.

  • Download size: 3.86 MiB

  • Dataset size: 3.20 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,587

wikipedia/20220620.dty

  • Config description: Wikipedia dataset for dty, parsed from 20220620 dump.

  • Download size: 7.12 MiB

  • Dataset size: 6.16 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,604

wikipedia/20220620.dv

  • Config description: Wikipedia dataset for dv, parsed from 20220620 dump.

  • Download size: 4.55 MiB

  • Dataset size: 12.84 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,326

wikipedia/20220620.dz

  • Config description: Wikipedia dataset for dz, parsed from 20220620 dump.

  • Download size: 705.88 KiB

  • Dataset size: 3.49 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 428

wikipedia/20220620.ee

  • Config description: Wikipedia dataset for ee, parsed from 20220620 dump.

  • Download size: 540.72 KiB

  • Dataset size: 247.82 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 653

wikipedia/20220620.el

  • Config description: Wikipedia dataset for el, parsed from 20220620 dump.

  • Download size: 460.85 MiB

  • Dataset size: 1.16 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 293,042

wikipedia/20220620.eml

  • Config description: Wikipedia dataset for eml, parsed from 20220620 dump.

  • Download size: 9.34 MiB

  • Dataset size: 3.33 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,211

wikipedia/20220620.en

  • Config description: Wikipedia dataset for en, parsed from 20220620 dump.

  • Download size: 19.50 GiB

  • Dataset size: 19.14 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 6,525,137

wikipedia/20220620.eo

  • Config description: Wikipedia dataset for eo, parsed from 20220620 dump.

  • Download size: 314.80 MiB

  • Dataset size: 474.42 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 443,682

wikipedia/20220620.es

  • Config description: Wikipedia dataset for es, parsed from 20220620 dump.

  • Download size: 3.79 GiB

  • Dataset size: 5.35 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 3,142,937

wikipedia/20220620.et

  • Config description: Wikipedia dataset for et, parsed from 20220620 dump.

  • Download size: 245.98 MiB

  • Dataset size: 403.97 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 354,071

wikipedia/20220620.eu

  • Config description: Wikipedia dataset for eu, parsed from 20220620 dump.

  • Download size: 258.08 MiB

  • Dataset size: 487.68 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 502,222

wikipedia/20220620.ext

  • Config description: Wikipedia dataset for ext, parsed from 20220620 dump.

  • Download size: 2.64 MiB

  • Dataset size: 3.76 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,767

wikipedia/20220620.fa

  • Config description: Wikipedia dataset for fa, parsed from 20220620 dump.

  • Download size: 1.01 GiB

  • Dataset size: 1.76 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 2,714,891

wikipedia/20220620.ff

  • Config description: Wikipedia dataset for ff, parsed from 20220620 dump.

  • Download size: 654.87 KiB

  • Dataset size: 683.04 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 738

wikipedia/20220620.fi

  • Config description: Wikipedia dataset for fi, parsed from 20220620 dump.

  • Download size: 821.05 MiB

  • Dataset size: 1.02 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 732,779

wikipedia/20220620.fiu-vro

  • Config description: Wikipedia dataset for fiu-vro, parsed from 20220620 dump.

  • Download size: 2.58 MiB

  • Dataset size: 4.11 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,222

wikipedia/20220620.fj

  • Config description: Wikipedia dataset for fj, parsed from 20220620 dump.

  • Download size: 988.65 KiB

  • Dataset size: 544.66 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,277

wikipedia/20220620.fo

  • Config description: Wikipedia dataset for fo, parsed from 20220620 dump.

  • Download size: 14.82 MiB

  • Dataset size: 14.23 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 13,779

wikipedia/20220620.fr

  • Config description: Wikipedia dataset for fr, parsed from 20220620 dump.

  • Download size: 5.32 GiB

  • Dataset size: 7.05 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 2,432,896

wikipedia/20220620.frp

  • Config description: Wikipedia dataset for frp, parsed from 20220620 dump.

  • Download size: 4.00 MiB

  • Dataset size: 3.54 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,795

wikipedia/20220620.frr

  • Config description: Wikipedia dataset for frr, parsed from 20220620 dump.

  • Download size: 12.21 MiB

  • Dataset size: 9.20 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 17,780

wikipedia/20220620.fur

  • Config description: Wikipedia dataset for fur, parsed from 20220620 dump.

  • Download size: 2.57 MiB

  • Dataset size: 3.78 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,059

wikipedia/20220620.fy

  • Config description: Wikipedia dataset for fy, parsed from 20220620 dump.

  • Download size: 61.36 MiB

  • Dataset size: 115.38 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 48,785

wikipedia/20220620.ga

  • Config description: Wikipedia dataset for ga, parsed from 20220620 dump.

  • Download size: 33.72 MiB

  • Dataset size: 53.05 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 65,138

wikipedia/20220620.gag

  • Config description: Wikipedia dataset for gag, parsed from 20220620 dump.

  • Download size: 2.19 MiB

  • Dataset size: 2.30 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,122

wikipedia/20220620.gan

  • Config description: Wikipedia dataset for gan, parsed from 20220620 dump.

  • Download size: 4.21 MiB

  • Dataset size: 2.46 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 6,594

wikipedia/20220620.gd

  • Config description: Wikipedia dataset for gd, parsed from 20220620 dump.

  • Download size: 9.49 MiB

  • Dataset size: 13.35 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,926

wikipedia/20220620.gl

  • Config description: Wikipedia dataset for gl, parsed from 20220620 dump.

  • Download size: 301.24 MiB

  • Dataset size: 446.66 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 254,505

wikipedia/20220620.glk

  • Config description: Wikipedia dataset for glk, parsed from 20220620 dump.

  • Download size: 2.69 MiB

  • Dataset size: 5.27 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,925

wikipedia/20220620.gn

  • Config description: Wikipedia dataset for gn, parsed from 20220620 dump.

  • Download size: 4.30 MiB

  • Dataset size: 6.03 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,731

wikipedia/20220620.gom

  • Config description: Wikipedia dataset for gom, parsed from 20220620 dump.

  • Download size: 6.67 MiB

  • Dataset size: 29.01 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,224

wikipedia/20220620.gor

  • Config description: Wikipedia dataset for gor, parsed from 20220620 dump.

  • Download size: 3.84 MiB

  • Dataset size: 5.11 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 13,706

wikipedia/20220620.got

  • Config description: Wikipedia dataset for got, parsed from 20220620 dump.

  • Download size: 733.66 KiB

  • Dataset size: 1.35 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 979

wikipedia/20220620.gu

  • Config description: Wikipedia dataset for gu, parsed from 20220620 dump.

  • Download size: 32.14 MiB

  • Dataset size: 111.51 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 30,082

wikipedia/20220620.gv

  • Config description: Wikipedia dataset for gv, parsed from 20220620 dump.

  • Download size: 6.19 MiB

  • Dataset size: 4.86 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,413

wikipedia/20220620.ha

  • Config description: Wikipedia dataset for ha, parsed from 20220620 dump.

  • Download size: 21.81 MiB

  • Dataset size: 39.43 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 17,463

wikipedia/20220620.hak

  • Config description: Wikipedia dataset for hak, parsed from 20220620 dump.

  • Download size: 4.03 MiB

  • Dataset size: 4.02 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 12,102

wikipedia/20220620.haw

  • Config description: Wikipedia dataset for haw, parsed from 20220620 dump.

  • Download size: 1.21 MiB

  • Dataset size: 1.47 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,800

wikipedia/20220620.he

  • Config description: Wikipedia dataset for he, parsed from 20220620 dump.

  • Download size: 800.49 MiB

  • Dataset size: 1.69 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 501,548

wikipedia/20220620.hi

  • Config description: Wikipedia dataset for hi, parsed from 20220620 dump.

  • Download size: 185.43 MiB

  • Dataset size: 582.26 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 190,774

wikipedia/20220620.hif

  • Config description: Wikipedia dataset for hif, parsed from 20220620 dump.

  • Download size: 5.33 MiB

  • Dataset size: 4.87 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,520

wikipedia/20220620.ho

  • Config description: Wikipedia dataset for ho, parsed from 20220620 dump.

  • Download size: 19.22 KiB

  • Dataset size: 3.27 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3

wikipedia/20220620.hr

  • Config description: Wikipedia dataset for hr, parsed from 20220620 dump.

  • Download size: 295.14 MiB

  • Dataset size: 411.76 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 248,102

wikipedia/20220620.hsb

  • Config description: Wikipedia dataset for hsb, parsed from 20220620 dump.

  • Download size: 11.08 MiB

  • Dataset size: 14.97 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,200

wikipedia/20220620.ht

  • Config description: Wikipedia dataset for ht, parsed from 20220620 dump.

  • Download size: 18.26 MiB

  • Dataset size: 49.21 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 67,383

wikipedia/20220620.hu

  • Config description: Wikipedia dataset for hu, parsed from 20220620 dump.

  • Download size: 994.41 MiB

  • Dataset size: 1.35 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 708,658

wikipedia/20220620.hy

  • Config description: Wikipedia dataset for hy, parsed from 20220620 dump.

  • Download size: 400.92 MiB

  • Dataset size: 1.05 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 646,281

wikipedia/20220620.ia

  • Config description: Wikipedia dataset for ia, parsed from 20220620 dump.

  • Download size: 9.61 MiB

  • Dataset size: 12.44 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 21,210

wikipedia/20220620.id

  • Config description: Wikipedia dataset for id, parsed from 20220620 dump.

  • Download size: 799.18 MiB

  • Dataset size: 1002.86 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,180,714

wikipedia/20220620.ie

  • Config description: Wikipedia dataset for ie, parsed from 20220620 dump.

  • Download size: 3.28 MiB

  • Dataset size: 5.34 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,130

wikipedia/20220620.ig

  • Config description: Wikipedia dataset for ig, parsed from 20220620 dump.

  • Download size: 11.01 MiB

  • Dataset size: 19.25 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,568

wikipedia/20220620.ii

  • Config description: Wikipedia dataset for ii, parsed from 20220620 dump.

  • Download size: 31.88 KiB

  • Dataset size: 8.31 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 14

wikipedia/20220620.ik

  • Config description: Wikipedia dataset for ik, parsed from 20220620 dump.

  • Download size: 306.62 KiB

  • Dataset size: 119.28 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 760

wikipedia/20220620.ilo

  • Config description: Wikipedia dataset for ilo, parsed from 20220620 dump.

  • Download size: 18.43 MiB

  • Dataset size: 15.93 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,357

wikipedia/20220620.inh

  • Config description: Wikipedia dataset for inh, parsed from 20220620 dump.

  • Download size: 4.37 MiB

  • Dataset size: 2.39 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,781

wikipedia/20220620.io

  • Config description: Wikipedia dataset for io, parsed from 20220620 dump.

  • Download size: 15.45 MiB

  • Dataset size: 32.97 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 34,849

wikipedia/20220620.is

  • Config description: Wikipedia dataset for is, parsed from 20220620 dump.

  • Download size: 52.48 MiB

  • Dataset size: 80.43 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 77,629

wikipedia/20220620.it

  • Config description: Wikipedia dataset for it, parsed from 20220620 dump.

  • Download size: 3.32 GiB

  • Dataset size: 4.29 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 2,152,760

wikipedia/20220620.iu

  • Config description: Wikipedia dataset for iu, parsed from 20220620 dump.

  • Download size: 373.91 KiB

  • Dataset size: 202.76 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 711

wikipedia/20220620.ja

  • Config description: Wikipedia dataset for ja, parsed from 20220620 dump.

  • Download size: 3.53 GiB

  • Dataset size: 6.15 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,652,577

wikipedia/20220620.jam

  • Config description: Wikipedia dataset for jam, parsed from 20220620 dump.

  • Download size: 954.42 KiB

  • Dataset size: 1.03 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,747

wikipedia/20220620.jbo

  • Config description: Wikipedia dataset for jbo, parsed from 20220620 dump.

  • Download size: 1.21 MiB

  • Dataset size: 2.38 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,357

wikipedia/20220620.jv

  • Config description: Wikipedia dataset for jv, parsed from 20220620 dump.

  • Download size: 53.90 MiB

  • Dataset size: 65.44 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 88,980

wikipedia/20220620.ka

  • Config description: Wikipedia dataset for ka, parsed from 20220620 dump.

  • Download size: 181.18 MiB

  • Dataset size: 621.90 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 201,554

wikipedia/20220620.kaa

  • Config description: Wikipedia dataset for kaa, parsed from 20220620 dump.

  • Download size: 1.51 MiB

  • Dataset size: 1.90 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,281

wikipedia/20220620.kab

  • Config description: Wikipedia dataset for kab, parsed from 20220620 dump.

  • Download size: 3.93 MiB

  • Dataset size: 3.85 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,508

wikipedia/20220620.kbd

  • Config description: Wikipedia dataset for kbd, parsed from 20220620 dump.

  • Download size: 1.73 MiB

  • Dataset size: 2.75 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,634

wikipedia/20220620.kbp

  • Config description: Wikipedia dataset for kbp, parsed from 20220620 dump.

  • Download size: 1.43 MiB

  • Dataset size: 3.41 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,916

wikipedia/20220620.kg

  • Config description: Wikipedia dataset for kg, parsed from 20220620 dump.

  • Download size: 517.54 KiB

  • Dataset size: 307.28 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,304

wikipedia/20220620.ki

  • Config description: Wikipedia dataset for ki, parsed from 20220620 dump.

  • Download size: 460.30 KiB

  • Dataset size: 397.67 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,630

wikipedia/20220620.kj

  • Config description: Wikipedia dataset for kj, parsed from 20220620 dump.

  • Download size: 17.46 KiB

  • Dataset size: 4.93 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5

wikipedia/20220620.kk

  • Config description: Wikipedia dataset for kk, parsed from 20220620 dump.

  • Download size: 129.67 MiB

  • Dataset size: 442.12 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 275,494

wikipedia/20220620.kl

  • Config description: Wikipedia dataset for kl, parsed from 20220620 dump.

  • Download size: 556.04 KiB

  • Dataset size: 303.69 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 295

wikipedia/20220620.km

  • Config description: Wikipedia dataset for km, parsed from 20220620 dump.

  • Download size: 24.29 MiB

  • Dataset size: 93.46 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 13,695

wikipedia/20220620.kn

  • Config description: Wikipedia dataset for kn, parsed from 20220620 dump.

  • Download size: 80.71 MiB

  • Dataset size: 348.38 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 29,033

wikipedia/20220620.ko

  • Config description: Wikipedia dataset for ko, parsed from 20220620 dump.

  • Download size: 868.62 MiB

  • Dataset size: 1.24 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,277,154

wikipedia/20220620.koi

  • Config description: Wikipedia dataset for koi, parsed from 20220620 dump.

  • Download size: 2.41 MiB

  • Dataset size: 4.77 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,961

wikipedia/20220620.krc

  • Config description: Wikipedia dataset for krc, parsed from 20220620 dump.

  • Download size: 3.26 MiB

  • Dataset size: 4.28 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,345

wikipedia/20220620.ks

  • Config description: Wikipedia dataset for ks, parsed from 20220620 dump.

  • Download size: 2.88 MiB

  • Dataset size: 592.76 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,193

wikipedia/20220620.ksh

  • Config description: Wikipedia dataset for ksh, parsed from 20220620 dump.

  • Download size: 3.34 MiB

  • Dataset size: 2.99 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,450

wikipedia/20220620.ku

  • Config description: Wikipedia dataset for ku, parsed from 20220620 dump.

  • Download size: 27.76 MiB

  • Dataset size: 36.41 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 66,938

wikipedia/20220620.kv

  • Config description: Wikipedia dataset for kv, parsed from 20220620 dump.

  • Download size: 3.82 MiB

  • Dataset size: 8.70 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 6,852

wikipedia/20220620.kw

  • Config description: Wikipedia dataset for kw, parsed from 20220620 dump.

  • Download size: 3.43 MiB

  • Dataset size: 3.49 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,784

wikipedia/20220620.ky

  • Config description: Wikipedia dataset for ky, parsed from 20220620 dump.

  • Download size: 37.03 MiB

  • Dataset size: 154.22 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 80,738
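
The "Auto-cached" values on this page appear to track dataset size: configs listed as "No" are all larger than about 250 MiB, while smaller ones are "Yes" or "Only when shuffle_files=False". The sketch below checks that pattern against a few entries copied from this page. The 250 MiB limit is an assumption taken from the TFDS performance guide rather than from this page, and `parse_size` and `small_enough_to_auto_cache` are hypothetical helpers. It does not model the extra single-shard condition that separates "Yes" from "Only when shuffle_files=False".

```python
import re

# Binary size units as printed on this page.
_UNITS = {"bytes": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}

# Assumption: auto-caching is only considered below 250 MiB.
AUTO_CACHE_LIMIT_BYTES = 250 * 2**20


def parse_size(text: str) -> float:
    """Convert a string such as '154.22 MiB' or '107 bytes' to bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(bytes|KiB|MiB|GiB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def small_enough_to_auto_cache(dataset_size: str) -> bool:
    """True when the size is under the assumed auto-cache limit."""
    return parse_size(dataset_size) < AUTO_CACHE_LIMIT_BYTES


# (dataset size, auto-cached value) pairs copied from this page.
listed = {
    "ky": ("154.22 MiB", "Only when shuffle_files=False (train)"),
    "km": ("93.46 MiB", "Yes"),
    "my": ("256.54 MiB", "No"),
    "hu": ("1.35 GiB", "No"),
    "lrc": ("107 bytes", "Yes"),
}

for language, (size, flag) in listed.items():
    # Every listed "No" should be at or above the limit, everything else below it.
    assert small_enough_to_auto_cache(size) == (flag != "No"), language
```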

wikipedia/20220620.la

  • Config description: Wikipedia dataset for la, parsed from 20220620 dump.

  • Download size: 95.60 MiB

  • Dataset size: 136.08 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 137,048

wikipedia/20220620.lad

  • Config description: Wikipedia dataset for lad, parsed from 20220620 dump.

  • Download size: 3.52 MiB

  • Dataset size: 4.71 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,949

wikipedia/20220620.lb

  • Config description: Wikipedia dataset for lb, parsed from 20220620 dump.

  • Download size: 51.86 MiB

  • Dataset size: 82.00 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 67,716

wikipedia/20220620.lbe

  • Config description: Wikipedia dataset for lbe, parsed from 20220620 dump.

  • Download size: 1.78 MiB

  • Dataset size: 703.36 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,592

wikipedia/20220620.lez

  • Config description: Wikipedia dataset for lez, parsed from 20220620 dump.

  • Download size: 6.19 MiB

  • Dataset size: 9.21 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,746

wikipedia/20220620.lfn

  • Config description: Wikipedia dataset for lfn, parsed from 20220620 dump.

  • Download size: 4.10 MiB

  • Dataset size: 8.39 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,770

wikipedia/20220620.lg

  • Config description: Wikipedia dataset for lg, parsed from 20220620 dump.

  • Download size: 1.93 MiB

  • Dataset size: 4.29 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,611

wikipedia/20220620.li

  • Config description: Wikipedia dataset for li, parsed from 20220620 dump.

  • Download size: 15.81 MiB

  • Dataset size: 27.30 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 16,182

wikipedia/20220620.lij

  • Config description: Wikipedia dataset for lij, parsed from 20220620 dump.

  • Download size: 6.97 MiB

  • Dataset size: 9.75 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 12,267

wikipedia/20220620.lmo

  • Config description: Wikipedia dataset for lmo, parsed from 20220620 dump.

  • Download size: 25.73 MiB

  • Dataset size: 35.00 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 60,094

wikipedia/20220620.ln

  • Config description: Wikipedia dataset for ln, parsed from 20220620 dump.

  • Download size: 2.09 MiB

  • Dataset size: 1.76 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,407

wikipedia/20220620.lo

  • Config description: Wikipedia dataset for lo, parsed from 20220620 dump.

  • Download size: 5.48 MiB

  • Dataset size: 13.76 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,021

wikipedia/20220620.lrc

  • Config description: Wikipedia dataset for lrc, parsed from 20220620 dump.

  • Download size: 23.98 KiB

  • Dataset size: 107 bytes

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1

wikipedia/20220620.lt

  • Config description: Wikipedia dataset for lt, parsed from 20220620 dump.

  • Download size: 203.00 MiB

  • Dataset size: 306.02 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 231,492

wikipedia/20220620.ltg

  • Config description: Wikipedia dataset for ltg, parsed from 20220620 dump.

  • Download size: 926.60 KiB

  • Dataset size: 867.47 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,016

wikipedia/20220620.lv

  • Config description: Wikipedia dataset for lv, parsed from 20220620 dump.

  • Download size: 162.74 MiB

  • Dataset size: 200.32 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 113,939

wikipedia/20220620.mai

  • Config description: Wikipedia dataset for mai, parsed from 20220620 dump.

  • Download size: 12.25 MiB

  • Dataset size: 18.97 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,026

wikipedia/20220620.map-bms

  • Config description: Wikipedia dataset for map-bms, parsed from 20220620 dump.

  • Download size: 4.80 MiB

  • Dataset size: 4.67 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 14,128

wikipedia/20220620.mdf

  • Config description: Wikipedia dataset for mdf, parsed from 20220620 dump.

  • Download size: 4.01 MiB

  • Dataset size: 2.52 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,350

wikipedia/20220620.mg

  • Config description: Wikipedia dataset for mg, parsed from 20220620 dump.

  • Download size: 29.72 MiB

  • Dataset size: 67.45 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 132,584

wikipedia/20220620.mh

  • Config description: Wikipedia dataset for mh, parsed from 20220620 dump.

  • Download size: 28.61 KiB

  • Dataset size: 11.04 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8

wikipedia/20220620.mhr

  • Config description: Wikipedia dataset for mhr, parsed from 20220620 dump.

  • Download size: 6.45 MiB

  • Dataset size: 17.33 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 12,654

wikipedia/20220620.mi

  • Config description: Wikipedia dataset for mi, parsed from 20220620 dump.

  • Download size: 2.17 MiB

  • Dataset size: 3.58 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,458

wikipedia/20220620.min

  • Config description: Wikipedia dataset for min, parsed from 20220620 dump.

  • Download size: 32.67 MiB

  • Dataset size: 102.57 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 229,488

wikipedia/20220620.mk

  • Config description: Wikipedia dataset for mk, parsed from 20220620 dump.

  • Download size: 202.46 MiB

  • Dataset size: 558.56 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 173,105

wikipedia/20220620.ml

  • Config description: Wikipedia dataset for ml, parsed from 20220620 dump.

  • Download size: 164.14 MiB

  • Dataset size: 427.62 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 145,312

wikipedia/20220620.mn

  • Config description: Wikipedia dataset for mn, parsed from 20220620 dump.

  • Download size: 36.40 MiB

  • Dataset size: 82.76 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 26,307

wikipedia/20220620.mr

  • Config description: Wikipedia dataset for mr, parsed from 20220620 dump.

  • Download size: 69.63 MiB

  • Dataset size: 224.55 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 140,718

wikipedia/20220620.mrj

  • Config description: Wikipedia dataset for mrj, parsed from 20220620 dump.

  • Download size: 3.33 MiB

  • Dataset size: 8.32 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,823

wikipedia/20220620.ms

  • Config description: Wikipedia dataset for ms, parsed from 20220620 dump.

  • Download size: 285.64 MiB

  • Dataset size: 370.33 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 406,579

wikipedia/20220620.mt

  • Config description: Wikipedia dataset for mt, parsed from 20220620 dump.

  • Download size: 12.31 MiB

  • Dataset size: 19.98 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 6,106

wikipedia/20220620.mus

  • Config description: Wikipedia dataset for mus, parsed from 20220620 dump.

  • Download size: 15.06 KiB

  • Dataset size: 875 bytes

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2

wikipedia/20220620.mwl

  • Config description: Wikipedia dataset for mwl, parsed from 20220620 dump.

  • Download size: 9.27 MiB

  • Dataset size: 18.44 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,460

wikipedia/20220620.my

  • Config description: Wikipedia dataset for my, parsed from 20220620 dump.

  • Download size: 56.21 MiB

  • Dataset size: 256.54 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 105,760

wikipedia/20220620.myv

  • Config description: Wikipedia dataset for myv, parsed from 20220620 dump.

  • Download size: 11.17 MiB

  • Dataset size: 9.86 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,689

wikipedia/20220620.mzn

  • Config description: Wikipedia dataset for mzn, parsed from 20220620 dump.

  • Download size: 7.56 MiB

  • Dataset size: 11.76 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 19,262

wikipedia/20220620.na

  • Config description: Wikipedia dataset for na, parsed from 20220620 dump.

  • Download size: 707.01 KiB

  • Dataset size: 381.16 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,682

wikipedia/20220620.nah

  • Config description: Wikipedia dataset for nah, parsed from 20220620 dump.

  • Download size: 4.92 MiB

  • Dataset size: 3.43 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,798

wikipedia/20220620.nap

  • Config description: Wikipedia dataset for nap, parsed from 20220620 dump.

  • Download size: 5.43 MiB

  • Dataset size: 6.02 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 15,383

wikipedia/20220620.nds

  • Config description: Wikipedia dataset for nds, parsed from 20220620 dump.

  • Download size: 43.27 MiB

  • Dataset size: 87.59 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 91,692

wikipedia/20220620.nds-nl

  • Config description: Wikipedia dataset for nds-nl, parsed from 20220620 dump.

  • Download size: 8.36 MiB

  • Dataset size: 12.60 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 11,453

wikipedia/20220620.ne

  • Config description: Wikipedia dataset for ne, parsed from 20220620 dump.

  • Download size: 40.89 MiB

  • Dataset size: 93.50 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 32,590

wikipedia/20220620.new

  • Config description: Wikipedia dataset for new, parsed from 20220620 dump.

  • Download size: 17.36 MiB

  • Dataset size: 140.47 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 73,019

wikipedia/20220620.ng

  • Config description: Wikipedia dataset for ng, parsed from 20220620 dump.

  • Download size: 92.18 KiB

  • Dataset size: 66.12 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 21

wikipedia/20220620.nl

  • Config description: Wikipedia dataset for nl, parsed from 20220620 dump.

  • Download size: 1.66 GiB

  • Dataset size: 2.36 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 2,606,249

wikipedia/20220620.nn

  • Config description: Wikipedia dataset for nn, parsed from 20220620 dump.

  • Download size: 149.76 MiB

  • Dataset size: 222.98 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 239,952

wikipedia/20220620.no

  • Config description: Wikipedia dataset for no, parsed from 20220620 dump.

  • Download size: 707.72 MiB

  • Dataset size: 967.03 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 914,633

wikipedia/20220620.nov

  • Config description: Wikipedia dataset for nov, parsed from 20220620 dump.

  • Download size: 1.24 MiB

  • Dataset size: 837.95 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,612

wikipedia/20220620.nrm

  • Config description: Wikipedia dataset for nrm, parsed from 20220620 dump.

  • Download size: 1.99 MiB

  • Dataset size: 2.95 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,875

wikipedia/20220620.nso

  • Config description: Wikipedia dataset for nso, parsed from 20220620 dump.

  • Download size: 2.58 MiB

  • Dataset size: 2.21 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,569

wikipedia/20220620.nv

  • Config description: Wikipedia dataset for nv, parsed from 20220620 dump.

  • Download size: 5.35 MiB

  • Dataset size: 12.92 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 20,455

wikipedia/20220620.ny

  • Config description: Wikipedia dataset for ny, parsed from 20220620 dump.

  • Download size: 2.11 MiB

  • Dataset size: 1.38 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,042

wikipedia/20220620.oc

  • Config description: Wikipedia dataset for oc, parsed from 20220620 dump.

  • Download size: 77.75 MiB

  • Dataset size: 113.60 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 96,491

wikipedia/20220620.olo

  • Config description: Wikipedia dataset for olo, parsed from 20220620 dump.

  • Download size: 2.15 MiB

  • Dataset size: 2.79 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,895

wikipedia/20220620.om

  • Config description: Wikipedia dataset for om, parsed from 20220620 dump.

  • Download size: 1.47 MiB

  • Dataset size: 2.26 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,186

wikipedia/20220620.or

  • Config description: Wikipedia dataset for or, parsed from 20220620 dump.

  • Download size: 30.85 MiB

  • Dataset size: 64.33 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 31,228

wikipedia/20220620.os

  • Config description: Wikipedia dataset for os, parsed from 20220620 dump.

  • Download size: 14.61 MiB

  • Dataset size: 11.52 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 18,618

wikipedia/20220620.pa

  • Config description: Wikipedia dataset for pa, parsed from 20220620 dump.

  • Download size: 55.75 MiB

  • Dataset size: 150.59 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 48,283

wikipedia/20220620.pag

  • Config description: Wikipedia dataset for pag, parsed from 20220620 dump.

  • Download size: 1.73 MiB

  • Dataset size: 1.74 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,976

wikipedia/20220620.pam

  • Config description: Wikipedia dataset for pam, parsed from 20220620 dump.

  • Download size: 9.18 MiB

  • Dataset size: 7.35 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,860

wikipedia/20220620.pap

  • Config description: Wikipedia dataset for pap, parsed from 20220620 dump.

  • Download size: 2.22 MiB

  • Dataset size: 2.61 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,560

wikipedia/20220620.pcd

  • Config description: Wikipedia dataset for pcd, parsed from 20220620 dump.

  • Download size: 5.24 MiB

  • Dataset size: 5.26 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 5,503

wikipedia/20220620.pdc

  • Config description: Wikipedia dataset for pdc, parsed from 20220620 dump.

  • Download size: 1.20 MiB

  • Dataset size: 1.10 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,445

wikipedia/20220620.pfl

  • Config description: Wikipedia dataset for pfl, parsed from 20220620 dump.

  • Download size: 3.67 MiB

  • Dataset size: 3.60 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,984

wikipedia/20220620.pi

  • Config description: Wikipedia dataset for pi, parsed from 20220620 dump.

  • Download size: 683.35 KiB

  • Dataset size: 970.64 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,062

wikipedia/20220620.pih

  • Config description: Wikipedia dataset for pih, parsed from 20220620 dump.

  • Download size: 810.93 KiB

  • Dataset size: 254.49 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 929

wikipedia/20220620.pl

  • Config description: Wikipedia dataset for pl, parsed from 20220620 dump.

  • Download size: 2.17 GiB

  • Dataset size: 2.65 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,895,000

wikipedia/20220620.pms

  • Config description: Wikipedia dataset for pms, parsed from 20220620 dump.

  • Download size: 14.32 MiB

  • Dataset size: 31.32 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 67,584

wikipedia/20220620.pnb

  • Config description: Wikipedia dataset for pnb, parsed from 20220620 dump.

  • Download size: 89.96 MiB

  • Dataset size: 255.14 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 75,225

wikipedia/20220620.pnt

  • Config description: Wikipedia dataset for pnt, parsed from 20220620 dump.

  • Download size: 587.71 KiB

  • Dataset size: 638.65 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 541

wikipedia/20220620.ps

  • Config description: Wikipedia dataset for ps, parsed from 20220620 dump.

  • Download size: 32.21 MiB

  • Dataset size: 73.66 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 16,390

wikipedia/20220620.pt

  • Config description: Wikipedia dataset for pt, parsed from 20220620 dump.

  • Download size: 1.97 GiB

  • Dataset size: 2.45 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,558,114

wikipedia/20220620.qu

  • Config description: Wikipedia dataset for qu, parsed from 20220620 dump.

  • Download size: 12.93 MiB

  • Dataset size: 16.11 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 31,780

wikipedia/20220620.rm

  • Config description: Wikipedia dataset for rm, parsed from 20220620 dump.

  • Download size: 7.20 MiB

  • Dataset size: 17.15 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,927

wikipedia/20220620.rmy

  • Config description: Wikipedia dataset for rmy, parsed from 20220620 dump.

  • Download size: 575.79 KiB

  • Dataset size: 350.86 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 754

wikipedia/20220620.rn

  • Config description: Wikipedia dataset for rn, parsed from 20220620 dump.

  • Download size: 883.75 KiB

  • Dataset size: 406.46 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 753

wikipedia/20220620.ro

  • Config description: Wikipedia dataset for ro, parsed from 20220620 dump.

  • Download size: 585.89 MiB

  • Dataset size: 758.03 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 431,837

wikipedia/20220620.roa-rup

  • Config description: Wikipedia dataset for roa-rup, parsed from 20220620 dump.

  • Download size: 1.09 MiB

  • Dataset size: 1.18 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,336

wikipedia/20220620.roa-tara

  • Config description: Wikipedia dataset for roa-tara, parsed from 20220620 dump.

  • Download size: 6.41 MiB

  • Dataset size: 6.68 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 9,400

wikipedia/20220620.ru

  • Config description: Wikipedia dataset for ru, parsed from 20220620 dump.

  • Download size: 4.56 GiB

  • Dataset size: 9.00 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 3,006,938

wikipedia/20220620.rue

  • Config description: Wikipedia dataset for rue, parsed from 20220620 dump.

  • Download size: 6.29 MiB

  • Dataset size: 11.52 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 9,004

wikipedia/20220620.rw

  • Config description: Wikipedia dataset for rw, parsed from 20220620 dump.

  • Download size: 4.51 MiB

  • Dataset size: 4.34 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 3,779

wikipedia/20220620.sa

  • Config description: Wikipedia dataset for sa, parsed from 20220620 dump.

  • Download size: 16.11 MiB

  • Dataset size: 61.87 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 22,360

wikipedia/20220620.sah

  • Config description: Wikipedia dataset for sah, parsed from 20220620 dump.

  • Download size: 15.46 MiB

  • Dataset size: 42.08 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 18,245

wikipedia/20220620.sat

  • Config description: Wikipedia dataset for sat, parsed from 20220620 dump.

  • Download size: 13.03 MiB

  • Dataset size: 29.36 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,279

wikipedia/20220620.sc

  • Config description: Wikipedia dataset for sc, parsed from 20220620 dump.

  • Download size: 7.38 MiB

  • Dataset size: 11.67 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,466

wikipedia/20220620.scn

  • Config description: Wikipedia dataset for scn, parsed from 20220620 dump.

  • Download size: 12.22 MiB

  • Dataset size: 16.71 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 31,591

wikipedia/20220620.sco

  • Config description: Wikipedia dataset for sco, parsed from 20220620 dump.

  • Download size: 57.20 MiB

  • Dataset size: 46.30 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 40,637

wikipedia/20220620.sd

  • Config description: Wikipedia dataset for sd, parsed from 20220620 dump.

  • Download size: 19.65 MiB

  • Dataset size: 34.39 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 22,393

wikipedia/20220620.se

  • Config description: Wikipedia dataset for se, parsed from 20220620 dump.

  • Download size: 3.98 MiB

  • Dataset size: 3.40 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,682

wikipedia/20220620.sg

  • Config description: Wikipedia dataset for sg, parsed from 20220620 dump.

  • Download size: 367.62 KiB

  • Dataset size: 116.16 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 547

wikipedia/20220620.sh

  • Config description: Wikipedia dataset for sh, parsed from 20220620 dump.

  • Download size: 432.85 MiB

  • Dataset size: 831.01 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 3,940,273

wikipedia/20220620.si

  • Config description: Wikipedia dataset for si, parsed from 20220620 dump.

  • Download size: 46.16 MiB

  • Dataset size: 122.93 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 30,744

wikipedia/20220620.simple

  • Config description: Wikipedia dataset for simple, parsed from 20220620 dump.

  • Download size: 240.42 MiB

  • Dataset size: 250.39 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 212,585

wikipedia/20220620.sk

  • Config description: Wikipedia dataset for sk, parsed from 20220620 dump.

  • Download size: 293.23 MiB

  • Dataset size: 377.66 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 261,341

wikipedia/20220620.sl

  • Config description: Wikipedia dataset for sl, parsed from 20220620 dump.

  • Download size: 259.06 MiB

  • Dataset size: 401.87 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 213,134

wikipedia/20220620.sm

  • Config description: Wikipedia dataset for sm, parsed from 20220620 dump.

  • Download size: 928.26 KiB

  • Dataset size: 817.31 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,126

wikipedia/20220620.sn

  • Config description: Wikipedia dataset for sn, parsed from 20220620 dump.

  • Download size: 4.14 MiB

  • Dataset size: 7.20 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,578

wikipedia/20220620.so

  • Config description: Wikipedia dataset for so, parsed from 20220620 dump.

  • Download size: 11.42 MiB

  • Dataset size: 12.50 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 10,005

wikipedia/20220620.sq

  • Config description: Wikipedia dataset for sq, parsed from 20220620 dump.

  • Download size: 103.12 MiB

  • Dataset size: 171.60 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 118,180
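
The "Only when shuffle_files=False (train)" value above can be read as a size-based rule. The sketch below is an illustration only, not the library's implementation: it assumes that TFDS auto-caches a dataset when its total size is known and under 250 MiB, and when either `shuffle_files` is disabled or only a single shard is read. The threshold and the helper names `parse_size` and `may_auto_cache` are assumptions made for this example; check them against the TFDS version you use.

```python
import re

# Assumed auto-cache limit (250 MiB); not taken from this page.
AUTO_CACHE_MAX_BYTES = 250 * 1024**2

# Binary units as they appear in the "Dataset size" fields.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_size(text):
    """Convert a size string such as '171.60 MiB' to a number of bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(B|KiB|MiB|GiB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def may_auto_cache(dataset_size, shuffle_files, single_shard=False):
    """Hypothetical check mirroring the assumed auto-cache rule."""
    if parse_size(dataset_size) >= AUTO_CACHE_MAX_BYTES:
        return False  # too large to cache in memory under the assumed limit
    # Below the limit: cache if files are not shuffled or one shard is read.
    return (not shuffle_files) or single_shard


# Dataset sizes copied from the sq, sr and simple entries on this page.
print(may_auto_cache("171.60 MiB", shuffle_files=False))  # sq
print(may_auto_cache("171.60 MiB", shuffle_files=True))   # sq, shuffled
print(may_auto_cache("1.74 GiB", shuffle_files=False))    # sr
print(may_auto_cache("250.39 MiB", shuffle_files=False))  # simple
```

Under these assumptions, a config listed as "Yes" would correspond to a dataset small enough to be stored in a single shard, "No" to one at or above the size limit, and "Only when shuffle_files=False" to one below the limit but split across several shards.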

wikipedia/20220620.sr

  • Config description: Wikipedia dataset for sr, parsed from 20220620 dump.

  • Download size: 903.80 MiB

  • Dataset size: 1.74 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 3,153,469

wikipedia/20220620.srn

  • Config description: Wikipedia dataset for srn, parsed from 20220620 dump.

  • Download size: 655.74 KiB

  • Dataset size: 607.60 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,280

wikipedia/20220620.ss

  • Config description: Wikipedia dataset for ss, parsed from 20220620 dump.

  • Download size: 875.15 KiB

  • Dataset size: 509.74 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 587

wikipedia/20220620.st

  • Config description: Wikipedia dataset for st, parsed from 20220620 dump.

  • Download size: 2.26 MiB

  • Dataset size: 742.07 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 998

wikipedia/20220620.stq

  • Config description: Wikipedia dataset for stq, parsed from 20220620 dump.

  • Download size: 3.52 MiB

  • Dataset size: 4.72 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,543

wikipedia/20220620.su

  • Config description: Wikipedia dataset for su, parsed from 20220620 dump.

  • Download size: 27.31 MiB

  • Dataset size: 42.80 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 68,098

wikipedia/20220620.sv

  • Config description: Wikipedia dataset for sv, parsed from 20220620 dump.

  • Download size: 1.43 GiB

  • Dataset size: 2.06 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 4,276,131

wikipedia/20220620.sw

  • Config description: Wikipedia dataset for sw, parsed from 20220620 dump.

  • Download size: 41.82 MiB

  • Dataset size: 64.89 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 73,964

wikipedia/20220620.szl

  • Config description: Wikipedia dataset for szl, parsed from 20220620 dump.

  • Download size: 13.47 MiB

  • Dataset size: 18.72 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 56,591

wikipedia/20220620.ta

  • Config description: Wikipedia dataset for ta, parsed from 20220620 dump.

  • Download size: 191.30 MiB

  • Dataset size: 710.38 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 182,598

wikipedia/20220620.tcy

  • Config description: Wikipedia dataset for tcy, parsed from 20220620 dump.

  • Download size: 4.53 MiB

  • Dataset size: 9.75 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,850

wikipedia/20220620.te

  • Config description: Wikipedia dataset for te, parsed from 20220620 dump.

  • Download size: 126.77 MiB

  • Dataset size: 635.52 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 103,548

wikipedia/20220620.tet

  • Config description: Wikipedia dataset for tet, parsed from 20220620 dump.

  • Download size: 1.34 MiB

  • Dataset size: 1.36 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,638

wikipedia/20220620.tg

  • Config description: Wikipedia dataset for tg, parsed from 20220620 dump.

  • Download size: 49.23 MiB

  • Dataset size: 124.97 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 116,122

wikipedia/20220620.th

  • Config description: Wikipedia dataset for th, parsed from 20220620 dump.

  • Download size: 338.86 MiB

  • Dataset size: 904.68 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 270,315

wikipedia/20220620.ti

  • Config description: Wikipedia dataset for ti, parsed from 20220620 dump.

  • Download size: 811.06 KiB

  • Dataset size: 404.59 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 388

wikipedia/20220620.tk

  • Config description: Wikipedia dataset for tk, parsed from 20220620 dump.

  • Download size: 5.81 MiB

  • Dataset size: 11.62 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,558

wikipedia/20220620.tl

  • Config description: Wikipedia dataset for tl, parsed from 20220620 dump.

  • Download size: 67.10 MiB

  • Dataset size: 73.43 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 42,898

wikipedia/20220620.tn

  • Config description: Wikipedia dataset for tn, parsed from 20220620 dump.

  • Download size: 1.61 MiB

  • Dataset size: 1.47 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 864

wikipedia/20220620.to

  • Config description: Wikipedia dataset for to, parsed from 20220620 dump.

  • Download size: 886.72 KiB

  • Dataset size: 968.03 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,778

wikipedia/20220620.tpi

  • Config description: Wikipedia dataset for tpi, parsed from 20220620 dump.

  • Download size: 1.51 MiB

  • Dataset size: 438.38 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,731

wikipedia/20220620.tr

  • Config description: Wikipedia dataset for tr, parsed from 20220620 dump.

  • Download size: 761.13 MiB

  • Dataset size: 883.37 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 758,043

wikipedia/20220620.ts

  • Config description: Wikipedia dataset for ts, parsed from 20220620 dump.

  • Download size: 1.76 MiB

  • Dataset size: 735.07 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 730

wikipedia/20220620.tt

  • Config description: Wikipedia dataset for tt, parsed from 20220620 dump.

  • Download size: 105.25 MiB

  • Dataset size: 474.25 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 461,299

wikipedia/20220620.tum

  • Config description: Wikipedia dataset for tum, parsed from 20220620 dump.

  • Download size: 534.00 KiB

  • Dataset size: 572.43 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,833

wikipedia/20220620.tw

  • Config description: Wikipedia dataset for tw, parsed from 20220620 dump.

  • Download size: 2.37 MiB

  • Dataset size: 2.54 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 2,113

wikipedia/20220620.ty

  • Config description: Wikipedia dataset for ty, parsed from 20220620 dump.

  • Download size: 569.95 KiB

  • Dataset size: 281.62 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 1,342

wikipedia/20220620.tyv

  • Config description: Wikipedia dataset for tyv, parsed from 20220620 dump.

  • Download size: 5.10 MiB

  • Dataset size: 12.74 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 4,089

wikipedia/20220620.udm

  • Config description: Wikipedia dataset for udm, parsed from 20220620 dump.

  • Download size: 3.65 MiB

  • Dataset size: 6.27 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 6,336

wikipedia/20220620.ug

  • Config description: Wikipedia dataset for ug, parsed from 20220620 dump.

  • Download size: 8.14 MiB

  • Dataset size: 37.29 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 7,920

wikipedia/20220620.uk

  • Config description: Wikipedia dataset for uk, parsed from 20220620 dump.

  • Download size: 1.86 GiB

  • Dataset size: 4.16 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,778,278

wikipedia/20220620.ur

  • Config description: Wikipedia dataset for ur, parsed from 20220620 dump.

  • Download size: 192.55 MiB

  • Dataset size: 328.28 MiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 367,085

wikipedia/20220620.uz

  • Config description: Wikipedia dataset for uz, parsed from 20220620 dump.

  • Download size: 92.94 MiB

  • Dataset size: 145.17 MiB

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 214,458

wikipedia/20220620.ve

  • Config description: Wikipedia dataset for ve, parsed from 20220620 dump.

  • Download size: 323.40 KiB

  • Dataset size: 277.52 KiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 715

wikipedia/20220620.vec

  • Config description: Wikipedia dataset for vec, parsed from 20220620 dump.

  • Download size: 27.17 MiB

  • Dataset size: 34.31 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 77,580

wikipedia/20220620.vep

  • Config description: Wikipedia dataset for vep, parsed from 20220620 dump.

  • Download size: 7.74 MiB

  • Dataset size: 10.04 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,203

wikipedia/20220620.vi

  • Config description: Wikipedia dataset for vi, parsed from 20220620 dump.

  • Download size: 889.55 MiB

  • Dataset size: 1.43 GiB

  • Auto-cached (documentation): No

  • Splits:

Split Examples
'train' 1,501,068

wikipedia/20220620.vls

  • Config description: Wikipedia dataset for vls, parsed from 20220620 dump.

  • Download size: 7.32 MiB

  • Dataset size: 10.73 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split Examples
'train' 8,129