The batting averages of 18 Major League Baseball players through their first 45 at-bats of the 1970 season, along with their batting averages for the remainder of the season.

The data has been modified from the table in the paper, as used for case studies using Stan and PyMC3, by adding columns explicitly listing the number of at-bats early in the season, as well as at-bats and hits for the full season.
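The added columns are presumably related to the original ones by simple arithmetic. The sketch below assumes that the full-season totals are the early-season counts plus the remainder-of-season counts, and that each average is hits divided by at-bats. The function `derive_columns` and its `remaining_hits` argument are hypothetical (the dataset has no remaining-hits column), and the numbers are illustrative, not taken from the dataset.

```python
def derive_columns(at_bats, hits, remaining_at_bats, remaining_hits):
    """Build one row with the dataset's numeric column names.

    Assumption: season totals = early-season counts + remainder counts.
    `remaining_hits` is a hypothetical input, not a dataset column.
    """
    season_at_bats = at_bats + remaining_at_bats
    season_hits = hits + remaining_hits
    return {
        "At-Bats": at_bats,
        "Hits": hits,
        "BattingAverage": hits / at_bats,
        "RemainingAt-Bats": remaining_at_bats,
        "RemainingAverage": remaining_hits / remaining_at_bats,
        "SeasonAt-Bats": season_at_bats,
        "SeasonHits": season_hits,
        "SeasonAverage": season_hits / season_at_bats,
    }


# Illustrative numbers only (not a real player from the dataset).
row = derive_columns(at_bats=45, hits=15, remaining_at_bats=400, remaining_hits=100)
```

Under these assumptions, the early-season average and the season average differ only through the remainder-of-season counts, which is why listing the at-bats explicitly makes the table easier to use for modeling.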

  • Splits:
| Split | Examples |
|---|---|
| 'train' | 18 |
  • Feature structure:
    'At-Bats': tf.int32,
    'BattingAverage': tf.float32,
    'FirstName': tf.string,
    'Hits': tf.int32,
    'LastName': tf.string,
    'RemainingAt-Bats': tf.int32,
    'RemainingAverage': tf.float32,
    'SeasonAt-Bats': tf.int32,
    'SeasonAverage': tf.float32,
    'SeasonHits': tf.int32,
  • Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| At-Bats | Tensor | | tf.int32 | |
| BattingAverage | Tensor | | tf.float32 | |
| FirstName | Tensor | | tf.string | |
| Hits | Tensor | | tf.int32 | |
| LastName | Tensor | | tf.string | |
| RemainingAt-Bats | Tensor | | tf.int32 | |
| RemainingAverage | Tensor | | tf.float32 | |
| SeasonAt-Bats | Tensor | | tf.int32 | |
| SeasonAverage | Tensor | | tf.float32 | |
| SeasonHits | Tensor | | tf.int32 | |
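As a minimal sketch of how the feature list can be used outside TensorFlow, the snippet below checks that a plain Python dict carries the ten features with compatible types. The mapping from tf dtypes to Python types (tf.int32 to `int`, tf.float32 to `float`, tf.string to `bytes` or `str`) is an assumption, and `EXPECTED_TYPES` and `schema_problems` are hypothetical names, not part of any library.

```python
# Expected Python-side types for each feature listed above.
# Assumption: tf.int32 -> int, tf.float32 -> float, tf.string -> bytes or str.
EXPECTED_TYPES = {
    "At-Bats": int,
    "BattingAverage": float,
    "FirstName": (bytes, str),
    "Hits": int,
    "LastName": (bytes, str),
    "RemainingAt-Bats": int,
    "RemainingAverage": float,
    "SeasonAt-Bats": int,
    "SeasonAverage": float,
    "SeasonHits": int,
}


def schema_problems(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in record:
            problems.append(f"missing feature: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            problems.append(f"wrong type for {name}: {type(record[name]).__name__}")
    for name in record:
        if name not in EXPECTED_TYPES:
            problems.append(f"unexpected feature: {name}")
    return problems
```

A check like this is mostly useful when the records have been exported to another format (for example CSV or JSON) and the hyphenated feature names need to survive the round trip.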
  • Citation:
@article{efron1975data,
  title={Data analysis using Stein's estimator and its generalizations},
  author={Efron, Bradley and Morris, Carl},
  journal={Journal of the American Statistical Association},
  volume={70},
  number={350},
  pages={311--319},
  year={1975},
  publisher={Taylor \& Francis}
}