bprec

Références:

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'banking' 561
'cosmetics' 2384
'electro' 382
'tele' 2391
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

tout

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec/all')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'train' 5718
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

télé

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec/tele')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'train' 2391
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

électro

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec/electro')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'train' 382
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

produits de beauté

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec/cosmetics')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'train' 2384
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bancaire

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:bprec/banking')
  • Descriptif :
Dataset consisting of Polish language texts annotated to recognize brand-product relations.
  • Licence : Aucune licence connue
  • Version : 1.1.0
  • Fractionnements :
Diviser Exemples
'train' 561
  • Caractéristiques :
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ner": {
        "feature": {
            "source": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            },
            "target": {
                "from": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "text": {
                    "dtype": "string",
                    "id": null,
                    "_type": "Value"
                },
                "to": {
                    "dtype": "int32",
                    "id": null,
                    "_type": "Value"
                },
                "type": {
                    "num_classes": 10,
                    "names": [
                        "PRODUCT_NAME",
                        "PRODUCT_NAME_IMP",
                        "PRODUCT_NO_BRAND",
                        "BRAND_NAME",
                        "BRAND_NAME_IMP",
                        "VERSION",
                        "PRODUCT_ADJ",
                        "BRAND_ADJ",
                        "LOCATION",
                        "LOCATION_IMP"
                    ],
                    "names_file": null,
                    "id": null,
                    "_type": "ClassLabel"
                }
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}