• Description:

CheXpert is a large dataset of chest X-rays, with an accompanying competition for automated chest X-ray interpretation. It features uncertainty labels and radiologist-labeled reference-standard evaluation sets. The dataset consists of 224,316 chest radiographs of 65,240 patients; the radiographic examinations and the associated radiology reports were retrospectively collected from Stanford Hospital. Each report was labeled for the presence of 14 observations as positive, negative, or uncertain. We chose the 14 observations based on their prevalence in the reports and their clinical relevance.

The CheXpert dataset must be downloaded separately after reading and agreeing to a Research Use Agreement. To do so, follow the instructions on the CheXpert website.
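
Once the files have been downloaded by hand, a loader has to be pointed at them. The following is a minimal sketch, assuming the dataset is registered in TensorFlow Datasets under the name "chexpert" and exposes a "train" split; neither is stated above. The function name `load_chexpert` and the example directory are hypothetical.

```python
from pathlib import Path


def load_chexpert(manual_dir, split="train"):
    """Load CheXpert from a manually downloaded copy (hypothetical helper).

    ASSUMPTIONS: the TFDS dataset name "chexpert" and the split name
    "train" are not given in the text above; adjust them as needed.
    """
    manual_dir = Path(manual_dir).expanduser()
    # Fail early with a clear message if the manual download is missing.
    if not manual_dir.is_dir():
        raise FileNotFoundError(
            f"Manual download directory not found: {manual_dir}. "
            "Download CheXpert after accepting the Research Use Agreement."
        )

    # Imported lazily so the path check above works without TFDS installed.
    import tensorflow_datasets as tfds

    # Tell TFDS where the manually downloaded files live.
    download_config = tfds.download.DownloadConfig(manual_dir=str(manual_dir))
    return tfds.load(
        "chexpert",
        split=split,
        download_and_prepare_kwargs={"download_config": download_config},
    )
```

A call such as `load_chexpert("~/data/chexpert")` would then prepare the dataset from the local copy; the directory layout TFDS expects inside `manual_dir` is not covered here.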

  • Feature structure:
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image_view': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'label': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=4)),
    'name': Text(shape=(), dtype=string),
  • Feature documentation:
Feature     Class                 Shape            Dtype   Description
image       Image                 (None, None, 3)  uint8
image_view  ClassLabel                             int64
label       Sequence(ClassLabel)  (None,)          int64
name        Text                                   string
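
The 'label' feature is a sequence of integer class ids, so it has to be mapped back to class names before it can be read. The sketch below shows one way to do that in plain Python. The class counts come from the feature structure above, but the class names and their ordering in `HYPOTHETICAL_LABEL_NAMES` are assumptions; the real names should be read from the dataset's own metadata.

```python
from typing import Dict, List, Sequence

# From the feature structure: 'label' is Sequence(ClassLabel(num_classes=4)).
NUM_LABEL_CLASSES = 4

# ASSUMPTION: placeholder names and ordering, not taken from the listing.
HYPOTHETICAL_LABEL_NAMES = ("uncertain", "positive", "negative", "unmentioned")


def decode_labels(
    label_ids: Sequence[int],
    names: Sequence[str] = HYPOTHETICAL_LABEL_NAMES,
) -> List[str]:
    """Map integer class ids from the 'label' feature to class names."""
    if len(names) != NUM_LABEL_CLASSES:
        raise ValueError(f"expected {NUM_LABEL_CLASSES} names, got {len(names)}")
    decoded = []
    for class_id in label_ids:
        # Reject ids outside the range allowed by num_classes.
        if not 0 <= class_id < NUM_LABEL_CLASSES:
            raise ValueError(f"class id {class_id} is out of range")
        decoded.append(names[class_id])
    return decoded


def count_labels(
    label_ids: Sequence[int],
    names: Sequence[str] = HYPOTHETICAL_LABEL_NAMES,
) -> Dict[str, int]:
    """Count how often each class name occurs in one example's labels."""
    counts = {name: 0 for name in names}
    for name in decode_labels(label_ids, names):
        counts[name] += 1
    return counts
```

Passing the names as a parameter keeps the helper usable once the actual class names are known, without changing the decoding logic.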
