cawac
References:
Use the following command to load this dataset in TFDS:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:cawac')
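As a minimal sketch of what might follow the load command, the snippet below reads a few sentences from the corpus. It assumes the dataset exposes a split named `train` (this page does not list the split names) and that each example carries a `sentence` field; `decode_sentence` and `iter_sentences` are hypothetical helper names introduced only for illustration.

```python
def decode_sentence(example):
    """Return the 'sentence' field of one example as a Python str."""
    value = example["sentence"]
    # TFDS returns string features as bytes once converted to NumPy.
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return value


def iter_sentences(split="train", limit=3):
    """Yield up to `limit` sentences from the corpus.

    Assumption: a split called "train" exists; adjust if it does not.
    """
    # Imported lazily so the helper above can be used without TFDS installed.
    import tensorflow_datasets as tfds

    ds = tfds.load("huggingface:cawac", split=split)
    for example in tfds.as_numpy(ds.take(limit)):
        yield decode_sentence(example)


if __name__ == "__main__":
    for sentence in iter_sentences():
        print(sentence)
```

Passing `split` makes `tfds.load` return a single `tf.data.Dataset` rather than a dictionary of splits, which keeps the iteration code short.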
caWaC is a 780-million-token web corpus of Catalan built from the .cat top-level domain in late 2013.
- License: CC BY-SA 3.0
- Version: 0.0.0
- Splits:
- Features:
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
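The schema above describes a single string-valued field. A rough sketch of how a record could be checked against it, using only the standard library, is shown here; `matches_schema` is a hypothetical helper, not part of TFDS or Hugging Face Datasets, and it only handles the string `Value` type that appears on this page.

```python
import json

# The feature schema exactly as listed on this page.
FEATURES_JSON = """
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
"""


def matches_schema(record, features):
    """Return True if `record` has exactly the schema's fields and
    every string-typed Value field holds a Python str."""
    if set(record) != set(features):
        return False
    for name, spec in features.items():
        if spec["_type"] == "Value" and spec["dtype"] == "string":
            if not isinstance(record[name], str):
                return False
    return True


features = json.loads(FEATURES_JSON)
print(matches_schema({"sentence": "Bon dia a tothom."}, features))
```

A check like this can be useful when writing preprocessing code, since it fails early if a record is missing the `sentence` field or carries a non-string value.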
Last updated (UTC): 2022-09-21.