Registration is open for TensorFlow Dev Summit 2020 Learn more


The PlantVillage dataset consists of 54303 healthy and unhealthy leaf images divided into 38 categories by species and disease.

NOTE: The original dataset is not available from the original source (, therefore we get the unaugmented dataset from a paper that used that dataset and republished it. Moreover, we dropped images with Background_without_leaves label, because these were not present in the original dataset.

Original paper URL: Dataset URL:


    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=38),


Split Examples
ALL 54,303
TRAIN 54,303


Supervised keys (for as_supervised=True)

('image', 'label')


  author    = {David P. Hughes and
               Marcel Salath{'{e}}},
  title     = {An open access repository of images on plant health to enable the
               development of mobile disease diagnostics through machine
               learning and crowdsourcing},
  journal   = {CoRR},
  volume    = {abs/1511.08060},
  year      = {2015},
  url       = {},
  archivePrefix = {arXiv},
  eprint    = {1511.08060},
  timestamp = {Mon, 13 Aug 2018 16:48:21 +0200},
  biburl    = {},
  bibsource = {dblp computer science bibliography,}