{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "FhGuhbZ6M5tl"
},
"source": [
"##### Copyright 2018 The TensorFlow Authors."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "AwOEIRJC6Une"
},
"outputs": [],
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "KyPEtTqk6VdG"
},
"outputs": [],
"source": [
"#@title MIT License\n",
"#\n",
"# Copyright (c) 2017 François Chollet\n",
"#\n",
"# Permission is hereby granted, free of charge, to any person obtaining a\n",
"# copy of this software and associated documentation files (the \"Software\"),\n",
"# to deal in the Software without restriction, including without limitation\n",
"# the rights to use, copy, modify, merge, publish, distribute, sublicense,\n",
"# and/or sell copies of the Software, and to permit persons to whom the\n",
"# Software is furnished to do so, subject to the following conditions:\n",
"#\n",
"# The above copyright notice and this permission notice shall be included in\n",
"# all copies or substantial portions of the Software.\n",
"#\n",
"# THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n",
"# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n",
"# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL\n",
"# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n",
"# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\n",
"# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\n",
"# DEALINGS IN THE SOFTWARE."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EIdT9iu_Z4Rb"
},
"source": [
"# Basic regression: Predict fuel efficiency"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AHp3M9ZmrIxj"
},
"source": [
"In a *regression* problem, the aim is to predict the output of a continuous value, like a price or a probability. Contrast this with a *classification* problem, where the aim is to select a class from a list of classes (for example, where a picture contains an apple or an orange, recognizing which fruit is in the picture).\n",
"\n",
"This tutorial uses the classic [Auto MPG](https://archive.ics.uci.edu/ml/datasets/auto+mpg) dataset and demonstrates how to build models to predict the fuel efficiency of the late-1970s and early 1980s automobiles. To do this, you will provide the models with a description of many automobiles from that time period. This description includes attributes like cylinders, displacement, horsepower, and weight.\n",
"\n",
"This example uses the Keras API. (Visit the Keras [tutorials](https://www.tensorflow.org/tutorials/keras) and [guides](https://www.tensorflow.org/guide/keras) to learn more.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "moB4tpEHxKB3"
},
"outputs": [],
"source": [
"# Use seaborn for pairplot.\n",
"!pip install -q seaborn"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1rRo8oNqZ-Rj"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"# Make NumPy printouts easier to read.\n",
"np.set_printoptions(precision=3, suppress=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9xQKvCJ85kCQ"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"print(tf.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "F_72b0LCNbjx"
},
"source": [
"## The Auto MPG dataset\n",
"\n",
"The dataset is available from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/).\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gFh9ne3FZ-On"
},
"source": [
"### Get the data\n",
"First download and import the dataset using pandas:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CiX2FI4gZtTt"
},
"outputs": [],
"source": [
"url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'\n",
"column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',\n",
" 'Acceleration', 'Model Year', 'Origin']\n",
"\n",
"raw_dataset = pd.read_csv(url, names=column_names,\n",
" na_values='?', comment='\\t',\n",
" sep=' ', skipinitialspace=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2oY3pMPagJrO"
},
"outputs": [],
"source": [
"dataset = raw_dataset.copy()\n",
"dataset.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3MWuJTKEDM-f"
},
"source": [
"### Clean the data\n",
"\n",
"The dataset contains a few unknown values:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JEJHhN65a2VV"
},
"outputs": [],
"source": [
"dataset.isna().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9UPN0KBHa_WI"
},
"source": [
"Drop those rows to keep this initial tutorial simple:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4ZUDosChC1UN"
},
"outputs": [],
"source": [
"dataset = dataset.dropna()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8XKitwaH4v8h"
},
"source": [
"The `\"Origin\"` column is categorical, not numeric. So the next step is to one-hot encode the values in the column with [pd.get_dummies](https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html).\n",
"\n",
"Note: You can set up the `tf.keras.Model` to do this kind of transformation for you but that's beyond the scope of this tutorial. Check out the [Classify structured data using Keras preprocessing layers](../structured_data/preprocessing_layers.ipynb) or [Load CSV data](../load_data/csv.ipynb) tutorials for examples."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gWNTD2QjBWFJ"
},
"outputs": [],
"source": [
"dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ulXz4J7PAUzk"
},
"outputs": [],
"source": [
"dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='')\n",
"dataset.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cuym4yvk76vU"
},
"source": [
"### Split the data into training and test sets\n",
"\n",
"Now, split the dataset into a training set and a test set. You will use the test set in the final evaluation of your models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qn-IGhUE7_1H"
},
"outputs": [],
"source": [
"train_dataset = dataset.sample(frac=0.8, random_state=0)\n",
"test_dataset = dataset.drop(train_dataset.index)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J4ubs136WLNp"
},
"source": [
"### Inspect the data\n",
"\n",
"Review the joint distribution of a few pairs of columns from the training set.\n",
"\n",
"The top row suggests that the fuel efficiency (MPG) is a function of all the other parameters. The other rows indicate they are functions of each other."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "oRKO_x8gWKv-"
},
"outputs": [],
"source": [
"sns.pairplot(train_dataset[['MPG', 'Cylinders', 'Displacement', 'Weight']], diag_kind='kde')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gavKO_6DWRMP"
},
"source": [
"Let's also check the overall statistics. Note how each feature covers a very different range:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yi2FzC3T21jR"
},
"outputs": [],
"source": [
"train_dataset.describe().transpose()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Db7Auq1yXUvh"
},
"source": [
"### Split features from labels\n",
"\n",
"Separate the target value—the \"label\"—from the features. This label is the value that you will train the model to predict."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "t2sluJdCW7jN"
},
"outputs": [],
"source": [
"train_features = train_dataset.copy()\n",
"test_features = test_dataset.copy()\n",
"\n",
"train_labels = train_features.pop('MPG')\n",
"test_labels = test_features.pop('MPG')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mRklxK5s388r"
},
"source": [
"## Normalization\n",
"\n",
"In the table of statistics it's easy to see how different the ranges of each feature are:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "IcmY6lKKbkw8"
},
"outputs": [],
"source": [
"train_dataset.describe().transpose()[['mean', 'std']]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-ywmerQ6dSox"
},
"source": [
"It is good practice to normalize features that use different scales and ranges.\n",
"\n",
"One reason this is important is because the features are multiplied by the model weights. So, the scale of the outputs and the scale of the gradients are affected by the scale of the inputs.\n",
"\n",
"Although a model *might* converge without feature normalization, normalization makes training much more stable.\n",
"\n",
"Note: There is no advantage to normalizing the one-hot features—it is done here for simplicity. For more details on how to use the preprocessing layers, refer to the [Working with preprocessing layers](https://www.tensorflow.org/guide/keras/preprocessing_layers) guide and the [Classify structured data using Keras preprocessing layers](../structured_data/preprocessing_layers.ipynb) tutorial."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aFJ6ISropeoo"
},
"source": [
"### The Normalization layer\n",
"\n",
"The `tf.keras.layers.Normalization` is a clean and simple way to add feature normalization into your model.\n",
"\n",
"The first step is to create the layer:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JlC5ooJrgjQF"
},
"outputs": [],
"source": [
"normalizer = tf.keras.layers.Normalization(axis=-1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XYA2Ap6nVOha"
},
"source": [
"Then, fit the state of the preprocessing layer to the data by calling `Normalization.adapt`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CrBbbjbwV91f"
},
"outputs": [],
"source": [
"normalizer.adapt(np.array(train_features))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oZccMR5yV9YV"
},
"source": [
"Calculate the mean and variance, and store them in the layer:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GGn-ukwxSPtx"
},
"outputs": [],
"source": [
"print(normalizer.mean.numpy())"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oGWKaF9GSRuN"
},
"source": [
"When the layer is called, it returns the input data, with each feature independently normalized:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2l7zFL_XWIRu"
},
"outputs": [],
"source": [
"first = np.array(train_features[:1])\n",
"\n",
"with np.printoptions(precision=2, suppress=True):\n",
" print('First example:', first)\n",
" print()\n",
" print('Normalized:', normalizer(first).numpy())"
]
},
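{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of what the layer computes when it is called, based on the behavior described in the Keras documentation for `Normalization` and ignoring a small numerical-stability safeguard, each feature $x_j$ is shifted and scaled by the mean and variance that `adapt` stored for that feature. The symbols $\\mu_j$ and $\\sigma_j^2$ are notation introduced here, not names from the Keras API:\n",
"\n",
"$$\\hat{x}_j = \\frac{x_j - \\mu_j}{\\sqrt{\\sigma_j^2}}$$\n",
"\n",
"After this transformation, every feature has approximately zero mean and unit variance on the training set, which is why the normalized values printed above are all of a similar magnitude."
]
},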
{
"cell_type": "markdown",
"metadata": {
"id": "6o3CrycBXA2s"
},
"source": [
"## Linear regression\n",
"\n",
"Before building a deep neural network model, start with linear regression using one and several variables."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lFby9n0tnHkw"
},
"source": [
"### Linear regression with one variable\n",
"\n",
"Begin with a single-variable linear regression to predict `'MPG'` from `'Horsepower'`.\n",
"\n",
"Training a model with `tf.keras` typically starts by defining the model architecture. Use a `tf.keras.Sequential` model, which [represents a sequence of steps](https://www.tensorflow.org/guide/keras/sequential_model).\n",
"\n",
"There are two steps in your single-variable linear regression model:\n",
"\n",
"- Normalize the `'Horsepower'` input features using the `tf.keras.layers.Normalization` preprocessing layer.\n",
"- Apply a linear transformation ($y = mx+b$) to produce 1 output using a linear layer (`tf.keras.layers.Dense`).\n",
"\n",
"The number of _inputs_ can either be set by the `input_shape` argument, or automatically when the model is run for the first time."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Xp3gAFn3TPv8"
},
"source": [
"First, create a NumPy array made of the `'Horsepower'` features. Then, instantiate the `tf.keras.layers.Normalization` and fit its state to the `horsepower` data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1gJAy0fKs1TS"
},
"outputs": [],
"source": [
"horsepower = np.array(train_features['Horsepower'])\n",
"\n",
"horsepower_normalizer = layers.Normalization(input_shape=[1,], axis=None)\n",
"horsepower_normalizer.adapt(horsepower)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4NVlHJY2TWlC"
},
"source": [
"Build the Keras Sequential model:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c0sXM7qLlKfZ"
},
"outputs": [],
"source": [
"horsepower_model = tf.keras.Sequential([\n",
" horsepower_normalizer,\n",
" layers.Dense(units=1)\n",
"])\n",
"\n",
"horsepower_model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eObQu9fDnXGL"
},
"source": [
"This model will predict `'MPG'` from `'Horsepower'`.\n",
"\n",
"Run the untrained model on the first 10 'Horsepower' values. The output won't be good, but notice that it has the expected shape of `(10, 1)`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UfV1HS6bns-s"
},
"outputs": [],
"source": [
"horsepower_model.predict(horsepower[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CSkanJlmmFBX"
},
"source": [
"Once the model is built, configure the training procedure using the Keras `Model.compile` method. The most important arguments to compile are the `loss` and the `optimizer`, since these define what will be optimized (`mean_absolute_error`) and how (using the `tf.keras.optimizers.Adam`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JxA_3lpOm-SK"
},
"outputs": [],
"source": [
"horsepower_model.compile(\n",
" optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),\n",
" loss='mean_absolute_error')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Z3q1I9TwnRSC"
},
"source": [
"Use Keras `Model.fit` to execute the training for 100 epochs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-iSrNy59nRAp"
},
"outputs": [],
"source": [
"%%time\n",
"history = horsepower_model.fit(\n",
" train_features['Horsepower'],\n",
" train_labels,\n",
" epochs=100,\n",
" # Suppress logging.\n",
" verbose=0,\n",
" # Calculate validation results on 20% of the training data.\n",
" validation_split = 0.2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tQm3pc0FYPQB"
},
"source": [
"Visualize the model's training progress using the stats stored in the `history` object:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YCAwD_y4AdC3"
},
"outputs": [],
"source": [
"hist = pd.DataFrame(history.history)\n",
"hist['epoch'] = history.epoch\n",
"hist.tail()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9E54UoZunqhc"
},
"outputs": [],
"source": [
"def plot_loss(history):\n",
" plt.plot(history.history['loss'], label='loss')\n",
" plt.plot(history.history['val_loss'], label='val_loss')\n",
" plt.ylim([0, 10])\n",
" plt.xlabel('Epoch')\n",
" plt.ylabel('Error [MPG]')\n",
" plt.legend()\n",
" plt.grid(True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yYsQYrIZyqjz"
},
"outputs": [],
"source": [
"plot_loss(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CMNrt8X2ebXd"
},
"source": [
"Collect the results on the test set for later:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "kDZ8EvNYrDtx"
},
"outputs": [],
"source": [
"test_results = {}\n",
"\n",
"test_results['horsepower_model'] = horsepower_model.evaluate(\n",
" test_features['Horsepower'],\n",
" test_labels, verbose=0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "F0qutYAKwoda"
},
"source": [
"Since this is a single variable regression, it's easy to view the model's predictions as a function of the input:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xDS2JEtOn9Jn"
},
"outputs": [],
"source": [
"x = tf.linspace(0.0, 250, 251)\n",
"y = horsepower_model.predict(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rttFCTU8czsI"
},
"outputs": [],
"source": [
"def plot_horsepower(x, y):\n",
" plt.scatter(train_features['Horsepower'], train_labels, label='Data')\n",
" plt.plot(x, y, color='k', label='Predictions')\n",
" plt.xlabel('Horsepower')\n",
" plt.ylabel('MPG')\n",
" plt.legend()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7l9ZiAOEUNBL"
},
"outputs": [],
"source": [
"plot_horsepower(x, y)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Yk2RmlqPoM9u"
},
"source": [
"### Linear regression with multiple inputs"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PribnwDHUksC"
},
"source": [
"You can use an almost identical setup to make predictions based on multiple inputs. This model still does the same $y = mx+b$ except that $m$ is a matrix and $x$ is a vector.\n",
"\n",
"Create a two-step Keras Sequential model again with the first layer being `normalizer` (`tf.keras.layers.Normalization(axis=-1)`) you defined earlier and adapted to the whole dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ssnVcKg7oMe6"
},
"outputs": [],
"source": [
"linear_model = tf.keras.Sequential([\n",
" normalizer,\n",
" layers.Dense(units=1)\n",
"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IHlx6WeIWyAr"
},
"source": [
"When you call `Model.predict` on a batch of inputs, it produces `units=1` outputs for each example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "DynfJV18WiuT"
},
"outputs": [],
"source": [
"linear_model.predict(train_features[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hvHKH3rPXHmq"
},
"source": [
"When you call the model, its weight matrices will be built—check that the `kernel` weights (the $m$ in $y=mx+b$) have a shape of `(9, 1)`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "DwJ4Fq0RXBQf"
},
"outputs": [],
"source": [
"linear_model.layers[1].kernel"
]
},
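{
"cell_type": "markdown",
"metadata": {},
"source": [
"Written out for a single example, and assuming the nine feature columns produced by the preprocessing above, the multiple-input model can be sketched as follows. The symbols $\\tilde{x}$, $W$, and $\\hat{y}$ are notation introduced here for illustration; $W$ plays the role of $m$ in $y = mx+b$:\n",
"\n",
"$$\\hat{y} = \\tilde{x}^\\top W + b, \\qquad \\tilde{x} \\in \\mathbb{R}^{9},\\quad W \\in \\mathbb{R}^{9 \\times 1},\\quad b \\in \\mathbb{R}$$\n",
"\n",
"Here $\\tilde{x}$ is the normalized feature vector. The `Dense` layer therefore holds 9 kernel weights plus 1 bias, for a total of 10 trainable parameters."
]
},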
{
"cell_type": "markdown",
"metadata": {
"id": "eINAc6rZXzOt"
},
"source": [
"Configure the model with Keras `Model.compile` and train with `Model.fit` for 100 epochs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "A0Sv_Ybr0szp"
},
"outputs": [],
"source": [
"linear_model.compile(\n",
" optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),\n",
" loss='mean_absolute_error')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EZoOYORvoTSe"
},
"outputs": [],
"source": [
"%%time\n",
"history = linear_model.fit(\n",
" train_features,\n",
" train_labels,\n",
" epochs=100,\n",
" # Suppress logging.\n",
" verbose=0,\n",
" # Calculate validation results on 20% of the training data.\n",
" validation_split = 0.2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EdxiCbiNYK2F"
},
"source": [
"Using all the inputs in this regression model achieves a much lower training and validation error than the `horsepower_model`, which had one input:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4sWO3W0koYgu"
},
"outputs": [],
"source": [
"plot_loss(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NyN49hIWe_NH"
},
"source": [
"Collect the results on the test set for later:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "jNC3D1DGsGgK"
},
"outputs": [],
"source": [
"test_results['linear_model'] = linear_model.evaluate(\n",
" test_features, test_labels, verbose=0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SmjdzxKzEu1-"
},
"source": [
"## Regression with a deep neural network (DNN)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DT_aHPsrzO1t"
},
"source": [
"In the previous section, you implemented two linear models for single and multiple inputs.\n",
"\n",
"Here, you will implement single-input and multiple-input DNN models.\n",
"\n",
"The code is basically the same except the model is expanded to include some \"hidden\" non-linear layers. The name \"hidden\" here just means not directly connected to the inputs or outputs."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6SWtkIjhrZwa"
},
"source": [
"These models will contain a few more layers than the linear model:\n",
"\n",
"* The normalization layer, as before (with `horsepower_normalizer` for a single-input model and `normalizer` for a multiple-input model).\n",
"* Two hidden, non-linear, `Dense` layers with the ReLU (`relu`) activation function nonlinearity.\n",
"* A linear `Dense` single-output layer.\n",
"\n",
"Both models will use the same training procedure, so the `compile` method is included in the `build_and_compile_model` function below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c26juK7ZG8j-"
},
"outputs": [],
"source": [
"def build_and_compile_model(norm):\n",
" model = keras.Sequential([\n",
" norm,\n",
" layers.Dense(64, activation='relu'),\n",
" layers.Dense(64, activation='relu'),\n",
" layers.Dense(1)\n",
" ])\n",
"\n",
" model.compile(loss='mean_absolute_error',\n",
" optimizer=tf.keras.optimizers.Adam(0.001))\n",
" return model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6c51caebbc0d"
},
"source": [
"### Regression using a DNN and a single input"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xvu9gtxTZR5V"
},
"source": [
"Create a DNN model with only `'Horsepower'` as input and `horsepower_normalizer` (defined earlier) as the normalization layer:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cGbPb-PHGbhs"
},
"outputs": [],
"source": [
"dnn_horsepower_model = build_and_compile_model(horsepower_normalizer)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sj49Og4YGULr"
},
"source": [
"This model has quite a few more trainable parameters than the linear models:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ReAD0n6MsFK-"
},
"outputs": [],
"source": [
"dnn_horsepower_model.summary()"
]
},
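{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a worked count, based only on the layer sizes in `build_and_compile_model`: a `Dense` layer with $n_{in}$ inputs and $n_{out}$ units has $n_{in} \\cdot n_{out} + n_{out}$ trainable parameters (weights plus biases). The symbols $n_{in}$ and $n_{out}$ are notation introduced here. For the single-input DNN this gives:\n",
"\n",
"$$\\underbrace{1 \\cdot 64 + 64}_{128} + \\underbrace{64 \\cdot 64 + 64}_{4160} + \\underbrace{64 \\cdot 1 + 1}_{65} = 4353$$\n",
"\n",
"By comparison, the single-input linear model has only $1 \\cdot 1 + 1 = 2$ trainable parameters. The statistics stored by the normalization layer are not trained, so they are not included in these counts."
]
},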
{
"cell_type": "markdown",
"metadata": {
"id": "0-qWCsh6DlyH"
},
"source": [
"Train the model with Keras `Model.fit`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sD7qHCmNIOY0"
},
"outputs": [],
"source": [
"%%time\n",
"history = dnn_horsepower_model.fit(\n",
" train_features['Horsepower'],\n",
" train_labels,\n",
" validation_split=0.2,\n",
" verbose=0, epochs=100)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dArGGxHxcKjN"
},
"source": [
"This model does slightly better than the linear single-input `horsepower_model`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NcF6UWjdCU8T"
},
"outputs": [],
"source": [
"plot_loss(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TG1snlpR2QCK"
},
"source": [
"If you plot the predictions as a function of `'Horsepower'`, you should notice how this model takes advantage of the nonlinearity provided by the hidden layers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "hPF53Rem14NS"
},
"outputs": [],
"source": [
"x = tf.linspace(0.0, 250, 251)\n",
"y = dnn_horsepower_model.predict(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rsf9rD8I17Wq"
},
"outputs": [],
"source": [
"plot_horsepower(x, y)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WxCJKIUpe4io"
},
"source": [
"Collect the results on the test set for later:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bJjM0dU52XtN"
},
"outputs": [],
"source": [
"test_results['dnn_horsepower_model'] = dnn_horsepower_model.evaluate(\n",
" test_features['Horsepower'], test_labels,\n",
" verbose=0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "S_2Btebp2e64"
},
"source": [
"### Regression using a DNN and multiple inputs"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aKFtezDldLSf"
},
"source": [
"Repeat the previous process using all the inputs. The model's performance slightly improves on the validation dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c0mhscXh2k36"
},
"outputs": [],
"source": [
"dnn_model = build_and_compile_model(normalizer)\n",
"dnn_model.summary()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CXDENACl2tuW"
},
"outputs": [],
"source": [
"%%time\n",
"history = dnn_model.fit(\n",
" train_features,\n",
" train_labels,\n",
" validation_split=0.2,\n",
" verbose=0, epochs=100)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-9Dbj0fX23RQ"
},
"outputs": [],
"source": [
"plot_loss(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hWoVYS34fJPZ"
},
"source": [
"Collect the results on the test set:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-bZIa96W3c7K"
},
"outputs": [],
"source": [
"test_results['dnn_model'] = dnn_model.evaluate(test_features, test_labels, verbose=0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uiCucdPLfMkZ"
},
"source": [
"## Performance"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rDf1xebEfWBw"
},
"source": [
"Since all models have been trained, you can review their test set performance:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "e5_ooufM5iH2"
},
"outputs": [],
"source": [
"pd.DataFrame(test_results, index=['Mean absolute error [MPG]']).T"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DABIVzsCf-QI"
},
"source": [
"These results match the validation error observed during training."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ft603OzXuEZC"
},
"source": [
"### Make predictions\n",
"\n",
"You can now make predictions with the `dnn_model` on the test set using Keras `Model.predict` and review the loss:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Xe7RXH3N3CWU"
},
"outputs": [],
"source": [
"test_predictions = dnn_model.predict(test_features).flatten()\n",
"\n",
"a = plt.axes(aspect='equal')\n",
"plt.scatter(test_labels, test_predictions)\n",
"plt.xlabel('True Values [MPG]')\n",
"plt.ylabel('Predictions [MPG]')\n",
"lims = [0, 50]\n",
"plt.xlim(lims)\n",
"plt.ylim(lims)\n",
"_ = plt.plot(lims, lims)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "19wyogbOSU5t"
},
"source": [
"It appears that the model predicts reasonably well.\n",
"\n",
"Now, check the error distribution:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "f-OHX4DiXd8x"
},
"outputs": [],
"source": [
"error = test_predictions - test_labels\n",
"plt.hist(error, bins=25)\n",
"plt.xlabel('Prediction Error [MPG]')\n",
"_ = plt.ylabel('Count')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KSyaHUfDT-mZ"
},
"source": [
"If you're happy with the model, save it for later use with `Model.save`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4-WwLlmfT-mb"
},
"outputs": [],
"source": [
"dnn_model.save('dnn_model.keras')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Benlnl8UT-me"
},
"source": [
"If you reload the model, it gives identical output:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dyyyj2zVT-mf"
},
"outputs": [],
"source": [
"reloaded = tf.keras.models.load_model('dnn_model.keras')\n",
"\n",
"test_results['reloaded'] = reloaded.evaluate(\n",
" test_features, test_labels, verbose=0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "f_GchJ2tg-2o"
},
"outputs": [],
"source": [
"pd.DataFrame(test_results, index=['Mean absolute error [MPG]']).T"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vgGQuV-yqYZH"
},
"source": [
"## Conclusion\n",
"\n",
"This notebook introduced a few techniques to handle a regression problem. Here are a few more tips that may help:\n",
"\n",
"- Mean squared error (MSE) (`tf.keras.losses.MeanSquaredError`) and mean absolute error (MAE) (`tf.keras.losses.MeanAbsoluteError`) are common loss functions used for regression problems. MAE is less sensitive to outliers. Different loss functions are used for classification problems.\n",
"- Similarly, evaluation metrics used for regression differ from classification.\n",
"- When numeric input data features have values with different ranges, each feature should be scaled independently to the same range.\n",
"- Overfitting is a common problem for DNN models, though it wasn't a problem for this tutorial. Visit the [Overfit and underfit](overfit_and_underfit.ipynb) tutorial for more help with this."
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "regression.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}