
istella

  • Description:

The Istella datasets are three large-scale Learning-to-Rank datasets released by Istella. Each dataset consists of query-document pairs represented as feature vectors and corresponding relevance judgment labels.

The dataset contains three versions:

  • main ("Istella Letor"): Contains 10,454,629 query-document pairs.
  • s ("Istella-S Letor"): Contains 3,408,630 query-document pairs.
  • x ("Istella-X Letor"): Contains 26,791,447 query-document pairs.

You can specify whether to use the main, s or x version of the dataset as follows:

ds = tfds.load("istella/main")
ds = tfds.load("istella/s")
ds = tfds.load("istella/x")

If only istella is specified, the istella/main option is selected by default:

# This is the same as `tfds.load("istella/main")`
ds = tfds.load("istella")
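The naming rule above can be wrapped in a small helper. This is only a sketch: `istella_config` and `load_istella` are hypothetical names introduced here for illustration, and the `split="train"` default is an assumption. The only library call used is `tfds.load`, imported lazily so the name-resolution part can be tried without downloading anything.

```python
VERSIONS = ("main", "s", "x")  # the three versions listed above


def istella_config(version=None):
    """Return the TFDS dataset name for a version (hypothetical helper).

    Passing None mirrors `tfds.load("istella")`, which selects istella/main.
    """
    if version is None:
        return "istella/main"
    if version not in VERSIONS:
        raise ValueError(f"unknown istella version: {version!r}")
    return f"istella/{version}"


def load_istella(version=None, split="train"):
    """Load one split of the chosen version (hypothetical wrapper)."""
    # Imported lazily so istella_config() works without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(istella_config(version), split=split)
```

For example, `load_istella("s", split="vali")` would request the validation split of the s version; note that the main version lists no 'vali' split.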

  • Feature structure:

FeaturesDict({
    'feature_1': Tensor(shape=(None,), dtype=tf.float64),
    'feature_10': Tensor(shape=(None,), dtype=tf.float64),
    'feature_100': Tensor(shape=(None,), dtype=tf.float64),
    'feature_101': Tensor(shape=(None,), dtype=tf.float64),
    'feature_102': Tensor(shape=(None,), dtype=tf.float64),
    'feature_103': Tensor(shape=(None,), dtype=tf.float64),
    'feature_104': Tensor(shape=(None,), dtype=tf.float64),
    'feature_105': Tensor(shape=(None,), dtype=tf.float64),
    'feature_106': Tensor(shape=(None,), dtype=tf.float64),
    'feature_107': Tensor(shape=(None,), dtype=tf.float64),
    'feature_108': Tensor(shape=(None,), dtype=tf.float64),
    'feature_109': Tensor(shape=(None,), dtype=tf.float64),
    'feature_11': Tensor(shape=(None,), dtype=tf.float64),
    'feature_110': Tensor(shape=(None,), dtype=tf.float64),
    'feature_111': Tensor(shape=(None,), dtype=tf.float64),
    'feature_112': Tensor(shape=(None,), dtype=tf.float64),
    'feature_113': Tensor(shape=(None,), dtype=tf.float64),
    'feature_114': Tensor(shape=(None,), dtype=tf.float64),
    'feature_115': Tensor(shape=(None,), dtype=tf.float64),
    'feature_116': Tensor(shape=(None,), dtype=tf.float64),
    'feature_117': Tensor(shape=(None,), dtype=tf.float64),
    'feature_118': Tensor(shape=(None,), dtype=tf.float64),
    'feature_119': Tensor(shape=(None,), dtype=tf.float64),
    'feature_12': Tensor(shape=(None,), dtype=tf.float64),
    'feature_120': Tensor(shape=(None,), dtype=tf.float64),
    'feature_121': Tensor(shape=(None,), dtype=tf.float64),
    'feature_122': Tensor(shape=(None,), dtype=tf.float64),
    'feature_123': Tensor(shape=(None,), dtype=tf.float64),
    'feature_124': Tensor(shape=(None,), dtype=tf.float64),
    'feature_125': Tensor(shape=(None,), dtype=tf.float64),
    'feature_126': Tensor(shape=(None,), dtype=tf.float64),
    'feature_127': Tensor(shape=(None,), dtype=tf.float64),
    'feature_128': Tensor(shape=(None,), dtype=tf.float64),
    'feature_129': Tensor(shape=(None,), dtype=tf.float64),
    'feature_13': Tensor(shape=(None,), dtype=tf.float64),
    'feature_130': Tensor(shape=(None,), dtype=tf.float64),
    'feature_131': Tensor(shape=(None,), dtype=tf.float64),
    'feature_132': Tensor(shape=(None,), dtype=tf.float64),
    'feature_133': Tensor(shape=(None,), dtype=tf.float64),
    'feature_134': Tensor(shape=(None,), dtype=tf.float64),
    'feature_135': Tensor(shape=(None,), dtype=tf.float64),
    'feature_136': Tensor(shape=(None,), dtype=tf.float64),
    'feature_137': Tensor(shape=(None,), dtype=tf.float64),
    'feature_138': Tensor(shape=(None,), dtype=tf.float64),
    'feature_139': Tensor(shape=(None,), dtype=tf.float64),
    'feature_14': Tensor(shape=(None,), dtype=tf.float64),
    'feature_140': Tensor(shape=(None,), dtype=tf.float64),
    'feature_141': Tensor(shape=(None,), dtype=tf.float64),
    'feature_142': Tensor(shape=(None,), dtype=tf.float64),
    'feature_143': Tensor(shape=(None,), dtype=tf.float64),
    'feature_144': Tensor(shape=(None,), dtype=tf.float64),
    'feature_145': Tensor(shape=(None,), dtype=tf.float64),
    'feature_146': Tensor(shape=(None,), dtype=tf.float64),
    'feature_147': Tensor(shape=(None,), dtype=tf.float64),
    'feature_148': Tensor(shape=(None,), dtype=tf.float64),
    'feature_149': Tensor(shape=(None,), dtype=tf.float64),
    'feature_15': Tensor(shape=(None,), dtype=tf.float64),
    'feature_150': Tensor(shape=(None,), dtype=tf.float64),
    'feature_151': Tensor(shape=(None,), dtype=tf.float64),
    'feature_152': Tensor(shape=(None,), dtype=tf.float64),
    'feature_153': Tensor(shape=(None,), dtype=tf.float64),
    'feature_154': Tensor(shape=(None,), dtype=tf.float64),
    'feature_155': Tensor(shape=(None,), dtype=tf.float64),
    'feature_156': Tensor(shape=(None,), dtype=tf.float64),
    'feature_157': Tensor(shape=(None,), dtype=tf.float64),
    'feature_158': Tensor(shape=(None,), dtype=tf.float64),
    'feature_159': Tensor(shape=(None,), dtype=tf.float64),
    'feature_16': Tensor(shape=(None,), dtype=tf.float64),
    'feature_160': Tensor(shape=(None,), dtype=tf.float64),
    'feature_161': Tensor(shape=(None,), dtype=tf.float64),
    'feature_162': Tensor(shape=(None,), dtype=tf.float64),
    'feature_163': Tensor(shape=(None,), dtype=tf.float64),
    'feature_164': Tensor(shape=(None,), dtype=tf.float64),
    'feature_165': Tensor(shape=(None,), dtype=tf.float64),
    'feature_166': Tensor(shape=(None,), dtype=tf.float64),
    'feature_167': Tensor(shape=(None,), dtype=tf.float64),
    'feature_168': Tensor(shape=(None,), dtype=tf.float64),
    'feature_169': Tensor(shape=(None,), dtype=tf.float64),
    'feature_17': Tensor(shape=(None,), dtype=tf.float64),
    'feature_170': Tensor(shape=(None,), dtype=tf.float64),
    'feature_171': Tensor(shape=(None,), dtype=tf.float64),
    'feature_172': Tensor(shape=(None,), dtype=tf.float64),
    'feature_173': Tensor(shape=(None,), dtype=tf.float64),
    'feature_174': Tensor(shape=(None,), dtype=tf.float64),
    'feature_175': Tensor(shape=(None,), dtype=tf.float64),
    'feature_176': Tensor(shape=(None,), dtype=tf.float64),
    'feature_177': Tensor(shape=(None,), dtype=tf.float64),
    'feature_178': Tensor(shape=(None,), dtype=tf.float64),
    'feature_179': Tensor(shape=(None,), dtype=tf.float64),
    'feature_18': Tensor(shape=(None,), dtype=tf.float64),
    'feature_180': Tensor(shape=(None,), dtype=tf.float64),
    'feature_181': Tensor(shape=(None,), dtype=tf.float64),
    'feature_182': Tensor(shape=(None,), dtype=tf.float64),
    'feature_183': Tensor(shape=(None,), dtype=tf.float64),
    'feature_184': Tensor(shape=(None,), dtype=tf.float64),
    'feature_185': Tensor(shape=(None,), dtype=tf.float64),
    'feature_186': Tensor(shape=(None,), dtype=tf.float64),
    'feature_187': Tensor(shape=(None,), dtype=tf.float64),
    'feature_188': Tensor(shape=(None,), dtype=tf.float64),
    'feature_189': Tensor(shape=(None,), dtype=tf.float64),
    'feature_19': Tensor(shape=(None,), dtype=tf.float64),
    'feature_190': Tensor(shape=(None,), dtype=tf.float64),
    'feature_191': Tensor(shape=(None,), dtype=tf.float64),
    'feature_192': Tensor(shape=(None,), dtype=tf.float64),
    'feature_193': Tensor(shape=(None,), dtype=tf.float64),
    'feature_194': Tensor(shape=(None,), dtype=tf.float64),
    'feature_195': Tensor(shape=(None,), dtype=tf.float64),
    'feature_196': Tensor(shape=(None,), dtype=tf.float64),
    'feature_197': Tensor(shape=(None,), dtype=tf.float64),
    'feature_198': Tensor(shape=(None,), dtype=tf.float64),
    'feature_199': Tensor(shape=(None,), dtype=tf.float64),
    'feature_2': Tensor(shape=(None,), dtype=tf.float64),
    'feature_20': Tensor(shape=(None,), dtype=tf.float64),
    'feature_200': Tensor(shape=(None,), dtype=tf.float64),
    'feature_201': Tensor(shape=(None,), dtype=tf.float64),
    'feature_202': Tensor(shape=(None,), dtype=tf.float64),
    'feature_203': Tensor(shape=(None,), dtype=tf.float64),
    'feature_204': Tensor(shape=(None,), dtype=tf.float64),
    'feature_205': Tensor(shape=(None,), dtype=tf.float64),
    'feature_206': Tensor(shape=(None,), dtype=tf.float64),
    'feature_207': Tensor(shape=(None,), dtype=tf.float64),
    'feature_208': Tensor(shape=(None,), dtype=tf.float64),
    'feature_209': Tensor(shape=(None,), dtype=tf.float64),
    'feature_21': Tensor(shape=(None,), dtype=tf.float64),
    'feature_210': Tensor(shape=(None,), dtype=tf.float64),
    'feature_211': Tensor(shape=(None,), dtype=tf.float64),
    'feature_212': Tensor(shape=(None,), dtype=tf.float64),
    'feature_213': Tensor(shape=(None,), dtype=tf.float64),
    'feature_214': Tensor(shape=(None,), dtype=tf.float64),
    'feature_215': Tensor(shape=(None,), dtype=tf.float64),
    'feature_216': Tensor(shape=(None,), dtype=tf.float64),
    'feature_217': Tensor(shape=(None,), dtype=tf.float64),
    'feature_218': Tensor(shape=(None,), dtype=tf.float64),
    'feature_219': Tensor(shape=(None,), dtype=tf.float64),
    'feature_22': Tensor(shape=(None,), dtype=tf.float64),
    'feature_220': Tensor(shape=(None,), dtype=tf.float64),
    'feature_23': Tensor(shape=(None,), dtype=tf.float64),
    'feature_24': Tensor(shape=(None,), dtype=tf.float64),
    'feature_25': Tensor(shape=(None,), dtype=tf.float64),
    'feature_26': Tensor(shape=(None,), dtype=tf.float64),
    'feature_27': Tensor(shape=(None,), dtype=tf.float64),
    'feature_28': Tensor(shape=(None,), dtype=tf.float64),
    'feature_29': Tensor(shape=(None,), dtype=tf.float64),
    'feature_3': Tensor(shape=(None,), dtype=tf.float64),
    'feature_30': Tensor(shape=(None,), dtype=tf.float64),
    'feature_31': Tensor(shape=(None,), dtype=tf.float64),
    'feature_32': Tensor(shape=(None,), dtype=tf.float64),
    'feature_33': Tensor(shape=(None,), dtype=tf.float64),
    'feature_34': Tensor(shape=(None,), dtype=tf.float64),
    'feature_35': Tensor(shape=(None,), dtype=tf.float64),
    'feature_36': Tensor(shape=(None,), dtype=tf.float64),
    'feature_37': Tensor(shape=(None,), dtype=tf.float64),
    'feature_38': Tensor(shape=(None,), dtype=tf.float64),
    'feature_39': Tensor(shape=(None,), dtype=tf.float64),
    'feature_4': Tensor(shape=(None,), dtype=tf.float64),
    'feature_40': Tensor(shape=(None,), dtype=tf.float64),
    'feature_41': Tensor(shape=(None,), dtype=tf.float64),
    'feature_42': Tensor(shape=(None,), dtype=tf.float64),
    'feature_43': Tensor(shape=(None,), dtype=tf.float64),
    'feature_44': Tensor(shape=(None,), dtype=tf.float64),
    'feature_45': Tensor(shape=(None,), dtype=tf.float64),
    'feature_46': Tensor(shape=(None,), dtype=tf.float64),
    'feature_47': Tensor(shape=(None,), dtype=tf.float64),
    'feature_48': Tensor(shape=(None,), dtype=tf.float64),
    'feature_49': Tensor(shape=(None,), dtype=tf.float64),
    'feature_5': Tensor(shape=(None,), dtype=tf.float64),
    'feature_50': Tensor(shape=(None,), dtype=tf.float64),
    'feature_51': Tensor(shape=(None,), dtype=tf.float64),
    'feature_52': Tensor(shape=(None,), dtype=tf.float64),
    'feature_53': Tensor(shape=(None,), dtype=tf.float64),
    'feature_54': Tensor(shape=(None,), dtype=tf.float64),
    'feature_55': Tensor(shape=(None,), dtype=tf.float64),
    'feature_56': Tensor(shape=(None,), dtype=tf.float64),
    'feature_57': Tensor(shape=(None,), dtype=tf.float64),
    'feature_58': Tensor(shape=(None,), dtype=tf.float64),
    'feature_59': Tensor(shape=(None,), dtype=tf.float64),
    'feature_6': Tensor(shape=(None,), dtype=tf.float64),
    'feature_60': Tensor(shape=(None,), dtype=tf.float64),
    'feature_61': Tensor(shape=(None,), dtype=tf.float64),
    'feature_62': Tensor(shape=(None,), dtype=tf.float64),
    'feature_63': Tensor(shape=(None,), dtype=tf.float64),
    'feature_64': Tensor(shape=(None,), dtype=tf.float64),
    'feature_65': Tensor(shape=(None,), dtype=tf.float64),
    'feature_66': Tensor(shape=(None,), dtype=tf.float64),
    'feature_67': Tensor(shape=(None,), dtype=tf.float64),
    'feature_68': Tensor(shape=(None,), dtype=tf.float64),
    'feature_69': Tensor(shape=(None,), dtype=tf.float64),
    'feature_7': Tensor(shape=(None,), dtype=tf.float64),
    'feature_70': Tensor(shape=(None,), dtype=tf.float64),
    'feature_71': Tensor(shape=(None,), dtype=tf.float64),
    'feature_72': Tensor(shape=(None,), dtype=tf.float64),
    'feature_73': Tensor(shape=(None,), dtype=tf.float64),
    'feature_74': Tensor(shape=(None,), dtype=tf.float64),
    'feature_75': Tensor(shape=(None,), dtype=tf.float64),
    'feature_76': Tensor(shape=(None,), dtype=tf.float64),
    'feature_77': Tensor(shape=(None,), dtype=tf.float64),
    'feature_78': Tensor(shape=(None,), dtype=tf.float64),
    'feature_79': Tensor(shape=(None,), dtype=tf.float64),
    'feature_8': Tensor(shape=(None,), dtype=tf.float64),
    'feature_80': Tensor(shape=(None,), dtype=tf.float64),
    'feature_81': Tensor(shape=(None,), dtype=tf.float64),
    'feature_82': Tensor(shape=(None,), dtype=tf.float64),
    'feature_83': Tensor(shape=(None,), dtype=tf.float64),
    'feature_84': Tensor(shape=(None,), dtype=tf.float64),
    'feature_85': Tensor(shape=(None,), dtype=tf.float64),
    'feature_86': Tensor(shape=(None,), dtype=tf.float64),
    'feature_87': Tensor(shape=(None,), dtype=tf.float64),
    'feature_88': Tensor(shape=(None,), dtype=tf.float64),
    'feature_89': Tensor(shape=(None,), dtype=tf.float64),
    'feature_9': Tensor(shape=(None,), dtype=tf.float64),
    'feature_90': Tensor(shape=(None,), dtype=tf.float64),
    'feature_91': Tensor(shape=(None,), dtype=tf.float64),
    'feature_92': Tensor(shape=(None,), dtype=tf.float64),
    'feature_93': Tensor(shape=(None,), dtype=tf.float64),
    'feature_94': Tensor(shape=(None,), dtype=tf.float64),
    'feature_95': Tensor(shape=(None,), dtype=tf.float64),
    'feature_96': Tensor(shape=(None,), dtype=tf.float64),
    'feature_97': Tensor(shape=(None,), dtype=tf.float64),
    'feature_98': Tensor(shape=(None,), dtype=tf.float64),
    'feature_99': Tensor(shape=(None,), dtype=tf.float64),
    'label': Tensor(shape=(None,), dtype=tf.float64),
})
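The keys above are listed in lexicographic order (`feature_1`, `feature_10`, `feature_100`, ...), not numeric order. If a per-document feature matrix is needed, the keys have to be sorted by their numeric suffix first. The sketch below does this in plain Python on a tiny hand-made example. It assumes that each example holds one query and that the variable-length dimension of every feature indexes that query's documents; `ordered_feature_keys` and `to_matrix` are hypothetical helper names.

```python
import re


def feature_index(key):
    """Extract the integer suffix from a key such as 'feature_42'."""
    match = re.fullmatch(r"feature_(\d+)", key)
    if match is None:
        raise ValueError(f"not a feature key: {key!r}")
    return int(match.group(1))


def ordered_feature_keys(example):
    """Return the feature keys sorted numerically, skipping 'label'."""
    keys = [k for k in example if k.startswith("feature_")]
    return sorted(keys, key=feature_index)


def to_matrix(example):
    """Build a list of rows: one row per document, one column per feature."""
    keys = ordered_feature_keys(example)
    return [list(row) for row in zip(*(example[k] for k in keys))]


# Toy example with three features and two documents (values are made up).
toy = {
    "feature_1": [0.1, 0.2],
    "feature_10": [1.0, 2.0],
    "feature_2": [0.5, 0.6],
    "label": [3.0, 0.0],
}
matrix = to_matrix(toy)  # columns follow feature_1, feature_2, feature_10
```

With real data the per-feature values would be tensors or NumPy arrays rather than lists, so a stacking function from the array library in use would replace the `zip` step.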

  • Citation:

@article{10.1145/2987380,
  author = {Dato, Domenico and Lucchese, Claudio and Nardini, Franco Maria and Orlando, Salvatore and Perego, Raffaele and Tonellotto, Nicola and Venturini, Rossano},
  title = {Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees},
  year = {2016},
  publisher = {ACM},
  address = {New York, NY, USA},
  volume = {35},
  number = {2},
  issn = {1046-8188},
  url = {https://doi.org/10.1145/2987380},
  doi = {10.1145/2987380},
  journal = {ACM Transactions on Information Systems},
  articleno = {15},
  numpages = {31},
}

istella/main (default config)

  • Download size: 1.20 GiB

  • Dataset size: 1.40 GiB

  • Splits:

Split     Examples
'test'    9,799
'train'   23,219

istella/s

  • Download size: 450.26 MiB

  • Dataset size: 728.40 MiB

  • Splits:

Split     Examples
'test'    6,562
'train'   19,245
'vali'    7,211

istella/x

  • Download size: 4.42 GiB

  • Dataset size: 2.06 GiB

  • Splits:

Split     Examples
'test'    2,000
'train'   6,000
'vali'    2,000
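
The split tables count examples, while the description counts query-document pairs. Dividing one by the other gives a rough idea of how many documents each example carries. The sketch below does that arithmetic using only the numbers on this page. It assumes that the pair counts cover all splits of a version combined and that each example corresponds to one query; neither is stated explicitly here.

```python
# Query-document pair counts from the description.
PAIRS = {"main": 10_454_629, "s": 3_408_630, "x": 26_791_447}

# Example counts per split from the tables above.
SPLITS = {
    "main": {"test": 9_799, "train": 23_219},
    "s": {"test": 6_562, "train": 19_245, "vali": 7_211},
    "x": {"test": 2_000, "train": 6_000, "vali": 2_000},
}


def total_examples(version):
    """Sum the example counts over all splits of one version."""
    return sum(SPLITS[version].values())


def avg_pairs_per_example(version):
    """Average number of query-document pairs per example (assumed per query)."""
    return PAIRS[version] / total_examples(version)


for version in PAIRS:
    print(
        f"istella/{version}: {total_examples(version):,} examples, "
        f"about {avg_pairs_per_example(version):.1f} pairs per example"
    )
```

Under these assumptions, main and s contain the same total number of examples (33,018) and differ in how many documents are kept per example, while x has far fewer examples with far more documents each.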