chexpert

Description:

CheXpert is a large dataset of chest X-rays, and an accompanying competition for automated chest X-ray interpretation, that features uncertainty labels and radiologist-labeled reference standard evaluation sets. It consists of 224,316 chest radiographs of 65,240 patients; the radiographic examinations and the associated radiology reports were collected retrospectively from Stanford Hospital. Each report was labeled for the presence of 14 observations as positive, negative, or uncertain. The 14 observations were chosen based on their prevalence in the reports and their clinical relevance.

The CheXpert dataset must be downloaded separately after reading and agreeing to a Research Use Agreement. To do so, please follow the instructions on the website, https://stanfordmlgroup.github.io/competitions/chexpert/

Additional Documentation: Explore on Papers With Code
Homepage: https://stanfordmlgroup.github.io/competitions/chexpert/
Source code: tfds.image_classification.Chexpert
Versions:
- 3.1.0 (default): No release notes.
Download size: Unknown size
Dataset size: Unknown size
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
You must register and agree to the user agreement on the dataset page: https://stanfordmlgroup.github.io/competitions/chexpert/ Afterwards, place the CheXpert-v1.0-small directory in manual_dir. It should contain the subdirectories train/ and valid/ with the images, as well as the files train.csv and valid.csv.
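
The layout described above can be checked before loading. The following is a minimal sketch, not part of TFDS: the helper names `missing_chexpert_entries` and `load_chexpert` are hypothetical, and the split name `train` is an assumption, since the splits table on this page lists no entries. `tfds.load` and `tfds.download.DownloadConfig` are the standard TFDS entry points.

```python
from pathlib import Path

# Entries that CheXpert-v1.0-small should contain, per the instructions above.
REQUIRED_ENTRIES = ("train", "valid", "train.csv", "valid.csv")


def missing_chexpert_entries(manual_dir):
    """Return the expected entries that are absent under manual_dir (hypothetical helper)."""
    root = Path(manual_dir).expanduser() / "CheXpert-v1.0-small"
    missing = []
    for name in REQUIRED_ENTRIES:
        path = root / name
        # Names ending in .csv must be files; the others must be directories.
        ok = path.is_file() if name.endswith(".csv") else path.is_dir()
        if not ok:
            missing.append(name)
    return missing


def load_chexpert(manual_dir="~/tensorflow_datasets/downloads/manual", split="train"):
    """Load one split, pointing TFDS at the manually downloaded data.

    ASSUMPTION: the split name "train" exists; this page does not list splits.
    """
    import tensorflow_datasets as tfds  # imported lazily; only needed here

    manual_dir = str(Path(manual_dir).expanduser())
    missing = missing_chexpert_entries(manual_dir)
    if missing:
        raise FileNotFoundError(f"CheXpert-v1.0-small is missing: {missing}")
    config = tfds.download.DownloadConfig(manual_dir=manual_dir)
    return tfds.load(
        "chexpert",
        split=split,
        download_and_prepare_kwargs={"download_config": config},
    )
```

Calling `missing_chexpert_entries("~/tensorflow_datasets/downloads/manual")` returns an empty list when the directory is laid out as expected, and otherwise names the missing entries in the order listed in `REQUIRED_ENTRIES`.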
Auto-cached (documentation): Unknown
Splits:

Split	Examples

Feature structure:

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image_view': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'label': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=4)),
    'name': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(None, None, 3)	uint8
image_view	ClassLabel		int64
label	Sequence(ClassLabel)	(None,)	int64
name	Text		string
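
The `label` feature is a sequence of integer class ids, one per observation, which is hard to read directly. The sketch below pairs each id with an observation name and a class name. It rests on two assumptions that this page does not confirm: that the sequence has one entry per observation in the order of the label columns of train.csv, and that the class names are supplied by the caller (for example, read from the dataset info). The helper `describe_labels` is hypothetical.

```python
# The 14 CheXpert observations. ASSUMPTION: this order mirrors the label
# columns of train.csv; confirm against the CSV header before relying on it.
OBSERVATIONS = (
    "No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity",
    "Lung Lesion", "Edema", "Consolidation", "Pneumonia", "Atelectasis",
    "Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture",
    "Support Devices",
)


def describe_labels(label_ids, class_names, observations=OBSERVATIONS):
    """Map each observation to the class name of its integer label (hypothetical helper)."""
    if len(label_ids) != len(observations):
        raise ValueError(
            f"expected {len(observations)} labels, got {len(label_ids)}"
        )
    # int() accepts plain ints as well as NumPy integer scalars.
    return {obs: class_names[int(i)] for obs, i in zip(observations, label_ids)}
```

The class names are best read from the dataset itself rather than hard-coded, for instance with `tfds.builder("chexpert").info.features["label"].feature.names`, and then passed as `class_names` together with `example["label"]` converted to a list or NumPy array.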

Supervised keys (See as_supervised doc): ('image', 'label')
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe): Missing.
Citation:

@article{DBLP:journals/corr/abs-1901-07031,
  author    = {Jeremy Irvin and Pranav Rajpurkar and Michael Ko and Yifan Yu and Silviana Ciurea{-}Ilcus and Chris Chute and Henrik Marklund and Behzad Haghgoo and Robyn L. Ball and Katie Shpanskaya and Jayne Seekins and David A. Mong and Safwan S. Halabi and Jesse K. Sandberg and Ricky Jones and David B. Larson and Curtis P. Langlotz and Bhavik N. Patel and Matthew P. Lungren and Andrew Y. Ng},
  title     = {CheXpert: {A} Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison},
  journal   = {CoRR},
  volume    = {abs/1901.07031},
  year      = {2019},
  url       = {http://arxiv.org/abs/1901.07031},
  archivePrefix = {arXiv},
  eprint    = {1901.07031},
  timestamp = {Fri, 01 Feb 2019 13:39:59 +0100},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1901-07031},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}