penguins

  • Description:

Measurements for three penguin species observed in the Palmer Archipelago, Antarctica.

These data were collected from 2007 to 2009 by Dr. Kristen Gorman with the Palmer Station Long Term Ecological Research Program, part of the US Long Term Ecological Research Network. The data were originally imported from the Environmental Data Initiative (EDI) Data Portal and are available under a CC0 license ("No Rights Reserved") in accordance with the Palmer Station Data Policy. This copy was imported from Allison Horst's GitHub repository.

  • Citation:

@Manual{,
  title = {palmerpenguins: Palmer Archipelago (Antarctica) penguin data},
  author = {Allison Marie Horst and Alison Presmanes Hill and Kristen B Gorman},
  year = {2020},
  note = {R package version 0.1.0},
  doi = {10.5281/zenodo.3960218},
  url = {https://allisonhorst.github.io/palmerpenguins/},
}

penguins/processed (default config)

  • Config description: penguins/processed is a drop-in replacement for the iris dataset. It contains 4 normalised numerical features presented as a single tensor and has no missing values; the class label (species) is presented as an integer (n = 334).

  • Download size: 25.05 KiB

  • Dataset size: 17.61 KiB

  • Splits:

Split Examples
'train' 334
  • Feature structure:
FeaturesDict({
    'features': Tensor(shape=(4,), dtype=tf.float32),
    'species': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
})
  • Feature documentation:
Feature   Class         Shape  Dtype       Description
          FeaturesDict
features  Tensor        (4,)   tf.float32
species   ClassLabel           tf.int64
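
Because this config ships only a 'train' split, a held-out set has to be carved out by the reader. The sketch below is one way to do that with the TFDS split-slicing syntax; the helper names `holdout_splits` and `load_processed` and the 80/20 ratio are illustrative assumptions, not part of the dataset.

```python
def holdout_splits(train_percent=80):
    """Build TFDS slice expressions that divide the single 'train' split.

    Hypothetical helper: the 80/20 default is an arbitrary choice.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be strictly between 0 and 100")
    return f"train[:{train_percent}%]", f"train[{train_percent}%:]"


def load_processed(train_percent=80):
    """Load penguins/processed as (train, held-out) tf.data.Dataset objects.

    tensorflow_datasets is imported lazily so the helper above can be
    used without TensorFlow installed.
    """
    import tensorflow_datasets as tfds

    train_split, test_split = holdout_splits(train_percent)
    train_ds, test_ds = tfds.load(
        "penguins/processed", split=[train_split, test_split]
    )
    # Each element is a dict with a 'features' tensor of shape (4,)
    # and an integer 'species' label, as listed in the feature structure.
    return train_ds, test_ds
```

Calling `load_processed()` downloads the data on first use; each returned dataset can then be batched and iterated like any other `tf.data.Dataset`.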

penguins/simple

  • Config description: penguins/simple has been processed from the raw dataset, with simplified class labels derived from text fields and missing values marked as NaN/NA. It retains only 7 significant features (n = 344).

  • Download size: 13.20 KiB

  • Dataset size: 56.10 KiB

  • Splits:

Split Examples
'train' 344
  • Feature structure:
FeaturesDict({
    'body_mass_g': tf.float32,
    'culmen_depth_mm': tf.float32,
    'culmen_length_mm': tf.float32,
    'flipper_length_mm': tf.float32,
    'island': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sex': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'species': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
})
  • Feature documentation:
Feature            Class         Shape  Dtype       Description
                   FeaturesDict
body_mass_g        Tensor               tf.float32
culmen_depth_mm    Tensor               tf.float32
culmen_length_mm   Tensor               tf.float32
flipper_length_mm  Tensor               tf.float32
island             ClassLabel           tf.int64
sex                ClassLabel           tf.int64
species            ClassLabel           tf.int64
  • Supervised keys (See as_supervised doc): ({'body_mass_g': 'body_mass_g', 'culmen_depth_mm': 'culmen_depth_mm', 'culmen_length_mm': 'culmen_length_mm', 'flipper_length_mm': 'flipper_length_mm', 'island': 'island', 'sex': 'sex', 'species': 'species'}, 'species')
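
Note that the supervised input dictionary listed above also contains 'species', which is the label itself. The sketch below shows one possible way to load the config with `as_supervised=True`, drop the label from the inputs, and spot records with NaN measurements. The helper names (`drop_label`, `has_missing`, `load_simple_supervised`) are hypothetical, and `has_missing` assumes every value in the dictionary is a numeric scalar.

```python
import math

LABEL_KEY = "species"


def drop_label(features, label_key=LABEL_KEY):
    """Return a copy of the feature dict without the label entry."""
    return {k: v for k, v in features.items() if k != label_key}


def has_missing(features):
    """True if any value is NaN.

    Assumes numeric scalars only (Python or NumPy floats and ints),
    which matches the 7 features of penguins/simple. Missing values
    encoded as a class index are not detected here.
    """
    return any(math.isnan(float(v)) for v in features.values())


def load_simple_supervised():
    """Load penguins/simple as (inputs, label) pairs without the label in inputs.

    tensorflow_datasets is imported lazily so the helpers above can be
    used on plain dictionaries without TensorFlow installed.
    """
    import tensorflow_datasets as tfds

    ds = tfds.load("penguins/simple", split="train", as_supervised=True)
    # Elements are (feature_dict, label) tuples, so map receives two arguments.
    return ds.map(lambda features, label: (drop_label(features), label))
```

`has_missing` is meant for examples converted to NumPy or Python values, for instance via `tfds.as_numpy`; inside a `tf.data` pipeline an equivalent check would need TensorFlow ops instead.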

penguins/raw

  • Config description: penguins/raw is the original, unprocessed copy from @allisonhorst, containing all 17 features, presented either as numeric types or as raw text (n = 344).

  • Download size: 49.72 KiB

  • Dataset size: 164.51 KiB

  • Splits:

Split Examples
'train' 344
  • Feature structure:
FeaturesDict({
    'Body Mass (g)': tf.float32,
    'Clutch Completion': Text(shape=(), dtype=tf.string),
    'Comments': Text(shape=(), dtype=tf.string),
    'Culmen Depth (mm)': tf.float32,
    'Culmen Length (mm)': tf.float32,
    'Date Egg': Text(shape=(), dtype=tf.string),
    'Delta 13 C (o/oo)': tf.float32,
    'Delta 15 N (o/oo)': tf.float32,
    'Flipper Length (mm)': tf.float32,
    'Individual ID': Text(shape=(), dtype=tf.string),
    'Island': Text(shape=(), dtype=tf.string),
    'Region': Text(shape=(), dtype=tf.string),
    'Sample Number': tf.int32,
    'Sex': Text(shape=(), dtype=tf.string),
    'Species': Text(shape=(), dtype=tf.string),
    'Stage': Text(shape=(), dtype=tf.string),
    'studyName': Text(shape=(), dtype=tf.string),
})
  • Feature documentation:
Feature              Class         Shape  Dtype       Description
                     FeaturesDict
Body Mass (g)        Tensor               tf.float32
Clutch Completion    Text                 tf.string
Comments             Text                 tf.string
Culmen Depth (mm)    Tensor               tf.float32
Culmen Length (mm)   Tensor               tf.float32
Date Egg             Text                 tf.string
Delta 13 C (o/oo)    Tensor               tf.float32
Delta 15 N (o/oo)    Tensor               tf.float32
Flipper Length (mm)  Tensor               tf.float32
Individual ID        Text                 tf.string
Island               Text                 tf.string
Region               Text                 tf.string
Sample Number        Tensor               tf.int32
Sex                  Text                 tf.string
Species              Text                 tf.string
Stage                Text                 tf.string
studyName            Text                 tf.string
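
Text features in this config arrive as byte strings once examples are converted to NumPy. As a rough sketch, the code below decodes them and keeps only the seven columns whose names correspond to those in penguins/simple. The name mapping is inferred from the two feature lists and is an assumption, as are the helper names; the text values stay as raw text and are not converted to the integer class labels used by penguins/simple.

```python
# Assumed correspondence between raw and simple column names,
# inferred from the two feature lists (not confirmed by the source).
RAW_TO_SIMPLE = {
    "Body Mass (g)": "body_mass_g",
    "Culmen Depth (mm)": "culmen_depth_mm",
    "Culmen Length (mm)": "culmen_length_mm",
    "Flipper Length (mm)": "flipper_length_mm",
    "Island": "island",
    "Sex": "sex",
    "Species": "species",
}


def decode_text(value):
    """Decode bytes to str; leave numeric values untouched."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def select_simple_columns(raw_example):
    """Keep the seven mapped columns, renamed, with text decoded."""
    return {
        new: decode_text(raw_example[old]) for old, new in RAW_TO_SIMPLE.items()
    }


def iterate_raw():
    """Yield renamed, decoded records from penguins/raw.

    tensorflow_datasets is imported lazily so the helpers above can be
    used on plain dictionaries without TensorFlow installed.
    """
    import tensorflow_datasets as tfds

    ds = tfds.load("penguins/raw", split="train")
    for example in tfds.as_numpy(ds):
        yield select_simple_columns(example)
```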