
In a *regression* problem, we aim to predict the value of a continuous quantity, like a price or a probability. Contrast this with a *classification* problem, where we aim to select a class from a list of classes (for example, recognizing whether a picture contains an apple or an orange).

This notebook uses the classic Auto MPG dataset and builds a model to predict the fuel efficiency of late-1970s and early-1980s automobiles. To do this, we'll provide the model with a description of many automobiles from that time period. This description includes attributes like cylinders, displacement, horsepower, and weight.

This example uses the `tf.keras` API; see this guide for details.

```
# Use seaborn for pairplot
!pip install -q seaborn
# Use some functions from tensorflow_docs
!pip install -q git+https://github.com/tensorflow/docs
```


```
import pathlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
```

```
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
print(tf.__version__)
```

2.3.0

```
import tensorflow_docs as tfdocs
import tensorflow_docs.plots
import tensorflow_docs.modeling
```

## The Auto MPG dataset

The dataset is available from the UCI Machine Learning Repository.

### Get the data

First download the dataset.

```
dataset_path = keras.utils.get_file("auto-mpg.data", "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
dataset_path
```

```
Downloading data from http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data
32768/30286 [================================] - 0s 4us/step

'/home/kbuilder/.keras/datasets/auto-mpg.data'
```

Import it using pandas:

```
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                'Acceleration', 'Model Year', 'Origin']

dataset = pd.read_csv(dataset_path, names=column_names,
                      na_values='?', comment='\t',
                      sep=' ', skipinitialspace=True)

dataset.tail()
```

### Clean the data

The dataset contains a few unknown values.

```
dataset.isna().sum()
```

```
MPG             0
Cylinders       0
Displacement    0
Horsepower      6
Weight          0
Acceleration    0
Model Year      0
Origin          0
dtype: int64
```

To keep this initial tutorial simple, drop those rows.

```
dataset = dataset.dropna()
```

The `"Origin"` column is really categorical, not numeric. So convert that to a one-hot:

```
dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})
```

```
dataset = pd.get_dummies(dataset, prefix='', prefix_sep='')
dataset.tail()
```

### Split the data into train and test

Now split the dataset into a training set and a test set.

We will use the test set in the final evaluation of our model.

```
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
```

### Inspect the data

Have a quick look at the joint distribution of a few pairs of columns from the training set.

```
sns.pairplot(train_dataset[["MPG", "Cylinders", "Displacement", "Weight"]], diag_kind="kde")
```

<seaborn.axisgrid.PairGrid at 0x7fa7241d2be0>

Also look at the overall statistics:

```
train_stats = train_dataset.describe()
train_stats.pop("MPG")
train_stats = train_stats.transpose()
train_stats
```

### Split features from labels

Separate the target value, or "label", from the features. This label is the value that you will train the model to predict.

```
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')
```

### Normalize the data

Look again at the `train_stats` block above and note how different the ranges of each feature are.

It is good practice to normalize features that use different scales and ranges. Although the model *might* converge without feature normalization, it makes training more difficult, and it makes the resulting model dependent on the choice of units used in the input.

```
def norm(x):
  return (x - train_stats['mean']) / train_stats['std']

normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
```

This normalized data is what we will use to train the model.
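As a quick sanity check, normalizing with the training mean and standard deviation should leave the training features with roughly zero mean and unit variance in each column. The numpy sketch below illustrates this with stand-in random data, not the Auto MPG frame itself:

```python
import numpy as np

# Stand-in for the training features (314 rows, 9 columns, like this dataset).
rng = np.random.default_rng(0)
features = rng.normal(loc=100.0, scale=15.0, size=(314, 9))

# Same recipe as norm() above: subtract the training mean, divide by the std.
mean = features.mean(axis=0)
std = features.std(axis=0, ddof=1)  # pandas' describe() reports the ddof=1 std
normed = (features - mean) / std

assert np.allclose(normed.mean(axis=0), 0.0, atol=1e-12)
assert np.allclose(normed.std(axis=0, ddof=1), 1.0, atol=1e-12)
```

The same `mean` and `std` are reused for the test set, which is why only the training statistics appear in `norm()`.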

## The model

### Build the model

Let's build our model. Here, we'll use a `Sequential` model with two densely connected hidden layers, and an output layer that returns a single, continuous value. The model-building steps are wrapped in a function, `build_model`, since we'll create a second model later on.

```
def build_model():
  model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
  ])

  optimizer = tf.keras.optimizers.RMSprop(0.001)

  model.compile(loss='mse',
                optimizer=optimizer,
                metrics=['mae', 'mse'])
  return model
```

```
model = build_model()
```

### Inspect the model

Use the `.summary` method to print a simple description of the model:

```
model.summary()
```

```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 64)                640
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65
=================================================================
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0
_________________________________________________________________
```

Now try out the model. Take a batch of `10` examples from the training data and call `model.predict` on it.

```
example_batch = normed_train_data[:10]
example_result = model.predict(example_batch)
example_result
```

```
array([[0.12610257],
       [0.14759323],
       [0.08230598],
       [0.23927972],
       [0.2432359 ],
       [0.06868052],
       [0.31277505],
       [0.3320441 ],
       [0.03968456],
       [0.1880572 ]], dtype=float32)
```

It seems to be working, and it produces a result of the expected shape and type.
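The shape expectation can be made explicit: a single-output regression model predicting on a batch of 10 examples should return a `(10, 1)` float array. A numpy sketch, with a stand-in array in place of the real `model.predict` output:

```python
import numpy as np

# Stand-in for model.predict(example_batch): 10 rows, one prediction each.
example_result = np.array([[0.126], [0.148], [0.082], [0.239], [0.243],
                           [0.069], [0.313], [0.332], [0.040], [0.188]],
                          dtype=np.float32)

# One row per input example, one column per model output.
assert example_result.shape == (10, 1)
assert example_result.dtype == np.float32
```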

### Train the model

Train the model for 1000 epochs, and record the training and validation metrics in the `history` object.

```
EPOCHS = 1000

history = model.fit(
  normed_train_data, train_labels,
  epochs=EPOCHS, validation_split=0.2, verbose=0,
  callbacks=[tfdocs.modeling.EpochDots()])
```

```
Epoch: 0, loss:557.0610,  mae:22.3910,  mse:557.0610,  val_loss:540.2362,  val_mae:22.0212,  val_mse:540.2362,
....................................................................................................
Epoch: 100, loss:5.9442,  mae:1.7259,  mse:5.9442,  val_loss:9.2284,  val_mae:2.4212,  val_mse:9.2284,
....................................................................................................
Epoch: 200, loss:5.2884,  mae:1.5659,  mse:5.2884,  val_loss:9.2644,  val_mae:2.3198,  val_mse:9.2644,
....................................................................................................
Epoch: 300, loss:4.6195,  mae:1.4223,  mse:4.6195,  val_loss:9.1833,  val_mae:2.3297,  val_mse:9.1833,
....................................................................................................
Epoch: 400, loss:4.3211,  mae:1.3692,  mse:4.3211,  val_loss:9.4166,  val_mae:2.3007,  val_mse:9.4166,
....................................................................................................
Epoch: 500, loss:3.9686,  mae:1.3067,  mse:3.9686,  val_loss:9.2298,  val_mae:2.3665,  val_mse:9.2298,
....................................................................................................
Epoch: 600, loss:3.6994,  mae:1.2568,  mse:3.6994,  val_loss:8.7640,  val_mae:2.2697,  val_mse:8.7640,
....................................................................................................
Epoch: 700, loss:3.3848,  mae:1.1908,  mse:3.3848,  val_loss:9.1926,  val_mae:2.3320,  val_mse:9.1926,
....................................................................................................
Epoch: 800, loss:2.9672,  mae:1.1176,  mse:2.9672,  val_loss:9.4478,  val_mae:2.3530,  val_mse:9.4478,
....................................................................................................
Epoch: 900, loss:2.7960,  mae:1.0726,  mse:2.7960,  val_loss:9.8546,  val_mae:2.3783,  val_mse:9.8546,
....................................................................................................
```

Visualize the model's training progress using the stats stored in the `history` object.

```
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()
```

```
plotter = tfdocs.plots.HistoryPlotter(smoothing_std=2)
```

```
plotter.plot({'Basic': history}, metric="mae")
plt.ylim([0, 10])
plt.ylabel('MAE [MPG]')
```

Text(0, 0.5, 'MAE [MPG]')

```
plotter.plot({'Basic': history}, metric="mse")
plt.ylim([0, 20])
plt.ylabel('MSE [MPG^2]')
```

Text(0, 0.5, 'MSE [MPG^2]')

This graph shows little improvement, or even degradation, in the validation error after about 100 epochs. Let's update the `model.fit` call to automatically stop training when the validation score doesn't improve. We'll use an *EarlyStopping callback* that tests a training condition every epoch. If a set number of epochs elapses without improvement, it automatically stops the training.

You can learn more about this callback here.
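The patience mechanism itself is simple enough to sketch in plain Python. This is a simplified illustration of the idea, not Keras's actual implementation, and the `val_losses` values are made up:

```python
# Stop once `patience` consecutive epochs pass without the monitored
# value improving on the best seen so far.
val_losses = [10.0, 9.0, 8.5, 8.6, 8.4, 8.5, 8.6, 8.7, 8.8]
patience = 3

best = float('inf')
wait = 0
stopped_epoch = None
for epoch, val_loss in enumerate(val_losses):
    if val_loss < best:
        best = val_loss   # improvement: remember it and reset the counter
        wait = 0
    else:
        wait += 1         # no improvement this epoch
        if wait >= patience:
            stopped_epoch = epoch
            break

# Training stops at epoch 7, three epochs after the best value (8.4) at epoch 4.
```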

```
model = build_model()

# The patience parameter is the number of epochs to check for improvement
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)

early_history = model.fit(normed_train_data, train_labels,
                          epochs=EPOCHS, validation_split=0.2, verbose=0,
                          callbacks=[early_stop, tfdocs.modeling.EpochDots()])
```

```
Epoch: 0, loss:554.9516,  mae:22.2809,  mse:554.9516,  val_loss:543.1448,  val_mae:21.9780,  val_mse:543.1448,
.........................................................................
```

```
plotter.plot({'Early Stopping': early_history}, metric="mae")
plt.ylim([0, 10])
plt.ylabel('MAE [MPG]')
```

Text(0, 0.5, 'MAE [MPG]')

The graph shows that on the validation set, the average error is usually around +/- 2 MPG. Is this good? We'll leave that decision up to you.
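One way to judge an error of about 2 MPG is to compare it with the spread of the labels themselves. The numpy sketch below uses made-up numbers in roughly the range of this dataset, not the actual test labels:

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.uniform(10.0, 45.0, size=78)             # MPG labels span roughly 10-45
predictions = labels + rng.normal(0.0, 2.5, size=78)  # off by about 2 MPG on average

mae = np.mean(np.abs(predictions - labels))
label_range = labels.max() - labels.min()

# An MAE near 2 MPG against a ~35 MPG label range is a relative error
# on the order of 5%.
print(f'MAE: {mae:.2f} MPG ({mae / label_range:.1%} of the label range)')
```

Whether a ~5% relative error is acceptable depends entirely on what the predictions will be used for.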

Let's see how well the model generalizes by using the **test** set, which we did not use when training the model. This tells us how well we can expect the model to predict when we use it in the real world.

```
loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=2)
print("Testing set Mean Abs Error: {:5.2f} MPG".format(mae))
```

```
3/3 - 0s - loss: 5.9308 - mae: 1.8407 - mse: 5.9308
Testing set Mean Abs Error:  1.84 MPG
```

### Make predictions

Finally, predict MPG values using data in the testing set:

```
test_predictions = model.predict(normed_test_data).flatten()
a = plt.axes(aspect='equal')
plt.scatter(test_labels, test_predictions)
plt.xlabel('True Values [MPG]')
plt.ylabel('Predictions [MPG]')
lims = [0, 50]
plt.xlim(lims)
plt.ylim(lims)
_ = plt.plot(lims, lims)
```

It looks like our model predicts reasonably well. Let's take a look at the error distribution.

```
error = test_predictions - test_labels
plt.hist(error, bins=25)
plt.xlabel("Prediction Error [MPG]")
_ = plt.ylabel("Count")
```

It's not quite Gaussian, but we might expect that because the number of samples is very small.

## Conclusion

This notebook introduced a few techniques to handle a regression problem.

- Mean Squared Error (MSE) is a common loss function used for regression problems (different loss functions are used for classification problems).
- Similarly, evaluation metrics used for regression differ from classification. A common regression metric is Mean Absolute Error (MAE).
- When numeric input data features have values with different ranges, each feature should be scaled independently to the same range.
- If there is not much training data, one technique is to prefer a small network with few hidden layers to avoid overfitting.
- Early stopping is a useful technique to prevent overfitting.
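The first two points can be pinned down with a small worked example (plain numpy, using made-up values):

```python
import numpy as np

y_true = np.array([20.0, 30.0, 25.0])  # actual MPG values
y_pred = np.array([22.0, 27.0, 25.0])  # model predictions

# MSE: mean of squared errors -> (4 + 9 + 0) / 3
mse = np.mean((y_pred - y_true) ** 2)
# MAE: mean of absolute errors -> (2 + 3 + 0) / 3
mae = np.mean(np.abs(y_pred - y_true))

# MSE penalizes the 3 MPG miss more heavily than MAE does, which is one
# reason it is a common training loss, while MAE is easier to read as a
# metric in the units of the label.
```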

```
#
# Copyright (c) 2017 François Chollet
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
```