bigearthnet

Description:

BigEarthNet is a large-scale Sentinel-2 benchmark archive consisting of 590,326 Sentinel-2 image patches. Each patch covers 1.2 x 1.2 km on the ground, and its size in pixels depends on the channel resolution. It is a multi-label dataset with 43 imbalanced labels.

To construct BigEarthNet, 125 Sentinel-2 tiles acquired between June 2017 and May 2018 over 10 European countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg, Portugal, Serbia, Switzerland) were initially selected. All tiles were atmospherically corrected with the Sentinel-2 Level 2A product generation and formatting tool (sen2cor) and then divided into 590,326 non-overlapping image patches. Each image patch was annotated with multiple land-cover classes (i.e., multi-labels) taken from the CORINE Land Cover database of the year 2018 (CLC 2018).

Bands and pixel resolution in meters:

B01: Coastal aerosol; 60m
B02: Blue; 10m
B03: Green; 10m
B04: Red; 10m
B05: Vegetation red edge; 20m
B06: Vegetation red edge; 20m
B07: Vegetation red edge; 20m
B08: NIR; 10m
B09: Water vapor; 60m
B11: SWIR; 20m
B12: SWIR; 20m
B8A: Narrow NIR; 20m
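
The per-band image sizes follow from the 1.2 km patch side and the band resolutions listed above. As a quick check, the sketch below divides one by the other; the constant and function names are illustrative and not part of TFDS.

```python
# Each BigEarthNet patch covers 1.2 km x 1.2 km on the ground (from the description).
PATCH_SIDE_METERS = 1200

# Ground resolution per band in meters, copied from the band list.
BAND_RESOLUTION_METERS = {
    "B01": 60, "B02": 10, "B03": 10, "B04": 10,
    "B05": 20, "B06": 20, "B07": 20, "B08": 10,
    "B09": 60, "B11": 20, "B12": 20, "B8A": 20,
}


def patch_side_pixels(resolution_meters: int) -> int:
    """Number of pixels along one side of a patch at the given resolution."""
    return PATCH_SIDE_METERS // resolution_meters


band_shapes = {
    band: (patch_side_pixels(res), patch_side_pixels(res))
    for band, res in BAND_RESOLUTION_METERS.items()
}

print(band_shapes["B02"])  # (120, 120)
print(band_shapes["B05"])  # (60, 60)
print(band_shapes["B01"])  # (20, 20)
```

These values match the tensor shapes documented for the bigearthnet/all config: 120 x 120 for 10m bands, 60 x 60 for 20m bands, and 20 x 20 for 60m bands.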

License: Community Data License Agreement - Permissive, Version 1.0.

URL: http://bigearth.net/

Homepage: http://bigearth.net
Source code: tfds.datasets.bigearthnet.Builder
Versions:
- 1.0.0 (default): New split API (https://tensorflow.org/datasets/splits)
Download size: 65.22 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	590,326
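
Since only a 'train' split is provided, a validation subset has to be carved out by the user. A minimal sketch, assuming the TFDS split slicing syntax from the split API linked above; the helper name holdout_splits and the 80/20 ratio are illustrative choices, not part of the dataset.

```python
def holdout_splits(train_percent: int) -> tuple[str, str]:
    """Return two TFDS split strings that cut 'train' into disjoint parts."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    return f"train[:{train_percent}%]", f"train[{train_percent}%:]"


train_split, val_split = holdout_splits(80)
print(train_split, val_split)  # train[:80%] train[80%:]

# Passing the strings to TFDS (needs tensorflow_datasets and the prepared data):
# import tensorflow_datasets as tfds
# train_ds, val_ds = tfds.load("bigearthnet", split=[train_split, val_split])
```

The slices are taken by position, so this gives no guarantee about geographic separation between the two subsets.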

Citation:

@article{Sumbul2019BigEarthNetAL,
  title={BigEarthNet: A Large-Scale Benchmark Archive For Remote Sensing Image Understanding},
  author={Gencer Sumbul and Marcela Charfuelan and Beg{\"u}m Demir and Volker Markl},
  journal={CoRR},
  year={2019},
  volume={abs/1902.06148}
}

bigearthnet/rgb (default config)

Config description: Sentinel-2 RGB channels
Dataset size: 14.07 GiB
Feature structure:

FeaturesDict({
    'filename': Text(shape=(), dtype=string),
    'image': Image(shape=(120, 120, 3), dtype=uint8),
    'labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=43)),
    'metadata': FeaturesDict({
        'acquisition_date': Text(shape=(), dtype=string),
        'coordinates': FeaturesDict({
            'lrx': int64,
            'lry': int64,
            'ulx': int64,
            'uly': int64,
        }),
        'projection': Text(shape=(), dtype=string),
        'tile_source': Text(shape=(), dtype=string),
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
filename	Text		string
image	Image	(120, 120, 3)	uint8
labels	Sequence(ClassLabel)	(None,)	int64
metadata	FeaturesDict
metadata/acquisition_date	Text		string
metadata/coordinates	FeaturesDict
metadata/coordinates/lrx	Tensor		int64
metadata/coordinates/lry	Tensor		int64
metadata/coordinates/ulx	Tensor		int64
metadata/coordinates/uly	Tensor		int64
metadata/projection	Text		string
metadata/tile_source	Text		string

Supervised keys (See as_supervised doc): ('image', 'labels')
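
Because 'labels' is a variable-length sequence of class ids, examples usually need a fixed-size target before they can be batched. A minimal sketch, assuming a multi-hot vector over the 43 classes is the desired target; the helper name multi_hot is illustrative and not part of TFDS.

```python
NUM_CLASSES = 43  # from the ClassLabel feature: num_classes=43


def multi_hot(label_ids, num_classes=NUM_CLASSES):
    """Turn a sequence of class ids into a fixed-length 0/1 vector."""
    vector = [0] * num_classes
    for label_id in label_ids:
        if not 0 <= label_id < num_classes:
            raise ValueError(f"label id {label_id} is outside [0, {num_classes})")
        vector[label_id] = 1
    return vector


encoded = multi_hot([0, 5, 42])
print(len(encoded), sum(encoded))  # 43 3

# Inside a tf.data pipeline the same idea is often written as
# tf.reduce_max(tf.one_hot(labels, 43), axis=0); treat that as a suggestion
# to verify against your TensorFlow version rather than a tested recipe.
```
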
Figure (tfds.show_examples): [Visualization]

bigearthnet/all

Config description: 13 Sentinel-2 channels (the feature structure below lists 12 bands; B10 is not included)
Dataset size: 176.63 GiB
Feature structure:

FeaturesDict({
    'B01': Tensor(shape=(20, 20), dtype=float32),
    'B02': Tensor(shape=(120, 120), dtype=float32),
    'B03': Tensor(shape=(120, 120), dtype=float32),
    'B04': Tensor(shape=(120, 120), dtype=float32),
    'B05': Tensor(shape=(60, 60), dtype=float32),
    'B06': Tensor(shape=(60, 60), dtype=float32),
    'B07': Tensor(shape=(60, 60), dtype=float32),
    'B08': Tensor(shape=(120, 120), dtype=float32),
    'B09': Tensor(shape=(20, 20), dtype=float32),
    'B11': Tensor(shape=(60, 60), dtype=float32),
    'B12': Tensor(shape=(60, 60), dtype=float32),
    'B8A': Tensor(shape=(60, 60), dtype=float32),
    'filename': Text(shape=(), dtype=string),
    'labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=43)),
    'metadata': FeaturesDict({
        'acquisition_date': Text(shape=(), dtype=string),
        'coordinates': FeaturesDict({
            'lrx': int64,
            'lry': int64,
            'ulx': int64,
            'uly': int64,
        }),
        'projection': Text(shape=(), dtype=string),
        'tile_source': Text(shape=(), dtype=string),
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
B01	Tensor	(20, 20)	float32
B02	Tensor	(120, 120)	float32
B03	Tensor	(120, 120)	float32
B04	Tensor	(120, 120)	float32
B05	Tensor	(60, 60)	float32
B06	Tensor	(60, 60)	float32
B07	Tensor	(60, 60)	float32
B08	Tensor	(120, 120)	float32
B09	Tensor	(20, 20)	float32
B11	Tensor	(60, 60)	float32
B12	Tensor	(60, 60)	float32
B8A	Tensor	(60, 60)	float32
filename	Text		string
labels	Sequence(ClassLabel)	(None,)	int64
metadata	FeaturesDict
metadata/acquisition_date	Text		string
metadata/coordinates	FeaturesDict
metadata/coordinates/lrx	Tensor		int64
metadata/coordinates/lry	Tensor		int64
metadata/coordinates/ulx	Tensor		int64
metadata/coordinates/uly	Tensor		int64
metadata/projection	Text		string
metadata/tile_source	Text		string

Supervised keys (See as_supervised doc): None
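
With no supervised keys defined for this config, an (input, label) pair has to be assembled by hand, and the bands must first be brought to a common grid because their shapes differ. The sketch below is one possible approach, assuming nearest-neighbor upsampling to the 120 x 120 grid is acceptable; the names upsample_nearest and to_supervised are hypothetical, and plain nested lists stand in for the real tensors.

```python
TARGET_SIDE = 120  # side length of the 10m bands (B02, B03, B04, B08)


def upsample_nearest(band, target_side=TARGET_SIDE):
    """Nearest-neighbor upsampling of a square 2-D band by an integer factor."""
    side = len(band)
    if side == 0 or target_side % side != 0:
        raise ValueError("target_side must be a positive multiple of the band side")
    factor = target_side // side
    # Repeat every value `factor` times along a row, and every row `factor` times.
    return [
        [value for value in row for _ in range(factor)]
        for row in band
        for _ in range(factor)
    ]


def to_supervised(example, band_names, target_side=TARGET_SIDE):
    """Return (bands, labels): equally sized 2-D bands plus the label ids."""
    bands = [upsample_nearest(example[name], target_side) for name in band_names]
    return bands, example["labels"]


# Tiny synthetic stand-in for one example; real bands are 120, 60, or 20 pixels wide.
toy_example = {
    "B02": [[0.0] * 4 for _ in range(4)],
    "B05": [[1.0, 2.0], [3.0, 4.0]],
    "labels": [3, 17],
}
bands, labels = to_supervised(toy_example, ["B02", "B05"], target_side=4)
print(bands[1][0])  # [1.0, 1.0, 2.0, 2.0]
print(labels)       # [3, 17]
```

In a real pipeline a tensor resize operation would normally replace the list comprehension; the pure-Python version is only meant to make the resampling step explicit.
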
Figure (tfds.show_examples): Not supported.