- Install TensorFlow: pip install tensorflow.
- Import TensorFlow: import tensorflow as tf.
- Load and preprocess data: Normalize the data, extract features, and split it into training, validation, and testing sets.
- Build the ML model: Define the architecture of your model. model = tf.keras.Sequential([tf.keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)), tf.keras.layers.Dropout(0.2), tf.keras.layers.Dense(10, activation='softmax')])
- Compile the model: Configure the optimizer, loss function, and metrics. model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
- Train the model: Fit the model on the training data. model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))
- Hyperparameter tuning: Adjust the number of epochs, batch size, learning rate, and other hyperparameters to improve the model.
- Evaluate the model: Once training is complete, evaluate the model on the test set. test_loss, test_accuracy = model.evaluate(X_test, y_test)
- Make predictions: Use the trained model to make predictions on new data. predictions = model.predict(new_data)
A Beginner’s Guide to TensorFlow: Building a Machine Learning Model
The “divide and conquer” rule can make solving big and complicated tasks a breeze. Imagine we have a big puzzle and access to good friends who are willing to help. Each friend can solve a small part of the puzzle, and these small parts are combined once finished. This approach solves the problem in smaller steps to obtain the final result. In machine learning, solving a large task can consume a lot of time and computing resources. TensorFlow breaks a complex machine learning task into smaller tasks that can be processed in parallel, which makes the overall process more efficient.
TensorFlow is an open-source framework for implementing machine learning models. The rise of deep neural networks has increased demand for TensorFlow in domains such as natural language processing and image and audio recognition. If you’re new to TensorFlow and want to train and evaluate ML and DNN models, you’re in the right place. This blog provides a step-by-step guide to building machine learning (ML) and deep neural network (DNN) models using this powerful framework.
Getting started with TensorFlow#
A tensor is a multi-dimensional data structure representing the input, output, and intermediate data in TensorFlow computations. In TensorFlow, a graph represents a computation with nodes and edges. Nodes represent operations, and edges represent the data flowing between these operations.
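To make the idea of a multi-dimensional structure concrete, here is a small plain-Python sketch (not TensorFlow code) that treats nested lists as tensors and reports their shape. The helper name shape_of is made up for illustration, and it assumes the nesting is regular.

```python
def shape_of(value):
    """Return the shape of a regularly nested list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

scalar = 5                                   # rank 0: a single number
vector = [1.0, 2.0, 3.0]                     # rank 1: a list of numbers
matrix = [[0, 0], [0, 1], [1, 0], [1, 1]]    # rank 2: four rows of two values

for name, tensor in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    shape = shape_of(tensor)
    print(name, "shape:", shape, "rank:", len(shape))
```

The number of entries in the shape is the tensor's rank, so a scalar has an empty shape and a matrix has two dimensions.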
Building an ML model#
The machine learning models in this guide are neural networks: interconnected nodes organized into layers. Each node takes input from multiple nodes in the previous layer. Let’s take an example of a simple graph with an “add” operation as shown below:
In our simple example that depicts the “add” operation, layer one consists of two constants, a and b, and layer two consists of a single operation node, c. The edges represent a flow of data between nodes.
The first layer’s output becomes the second layer’s input, and this process continues until the final layer, which produces the network’s output.
Let’s see how we can create this simple graph in TensorFlow:
    import tensorflow.compat.v1 as tf  # TF1-style graph API; also available in TensorFlow 2.x

    # Create a graph
    graph = tf.Graph()

    # Add operations to the graph
    with graph.as_default():
        # Create nodes/operations
        a = tf.constant(5, name='a')
        b = tf.constant(10, name='b')
        c = tf.add(a, b, name='c')

    # Run the graph in a session
    with tf.Session(graph=graph) as sess:
        # Run the session to execute the operations
        result = sess.run(c)
        print("Result:", result)

- Line 1: Imports TensorFlow through its compat.v1 module, which provides the graph-and-session API used in this guide and also works on TensorFlow 2.x.
- Line 4: Creates a graph using the tf.Graph() function.
- Lines 9–10: Defines two nodes, a and b, using the tf.constant() function to hold the constant values 5 and 10, respectively.
- Line 11: Creates the c node using the tf.add() function to add the values of a and b.
- Line 14: Creates a session using tf.Session(graph=graph).
- Line 16: Runs the session to execute the operations within the graph.
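Conceptually, a graph separates describing a computation from running it. The toy evaluator below is a plain-Python sketch of that idea under simplifying assumptions, not how TensorFlow is implemented. The Node class and the constant, add, and run helpers are hypothetical names chosen for this illustration.

```python
class Node:
    """A graph node: an operation plus the nodes that feed into it."""

    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op                # callable that computes this node's value
        self.inputs = list(inputs)  # incoming edges

def constant(value, name):
    # A constant is an operation with no inputs that returns a fixed value.
    return Node(name, lambda: value)

def add(x, y, name):
    return Node(name, lambda a, b: a + b, inputs=[x, y])

def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    args = [run(parent) for parent in node.inputs]
    return node.op(*args)

# Building the graph computes nothing yet.
a = constant(5, "a")
b = constant(10, "b")
c = add(a, b, "c")

# Only running the output node triggers the computation.
result = run(c)
print("Result:", result)  # Result: 15
```

Deferring the work until run is called mirrors the role the session plays in the TensorFlow version.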
Creating a simple neural network#
Let’s build a simple neural network to implement an XOR logic gate with the following structure:
- An input layer with two nodes representing the two inputs of the XOR gate.
- A hidden layer with two nodes. This layer is called the hidden layer since it is not directly observable or accessible in terms of the input or output of the network.
- An output layer with one node representing the network output.
The input-output relation of an XOR logic gate is depicted in the table below:
XOR Gate: Input-Output Relation
Input 1 | Input 2 | Output |
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
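The table can be reproduced in a few lines of Python with the bitwise XOR operator, which is a quick way to generate the targets the network is expected to learn.

```python
# Enumerate every input pair and compute XOR with Python's ^ operator.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [a ^ b for a, b in inputs]

for (a, b), out in zip(inputs, outputs):
    print(a, b, "->", out)
```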
Let’s build our neural network in TensorFlow:
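    import tensorflow.compat.v1 as tf  # TF1-style graph API; also available in TensorFlow 2.x
    import numpy as np
    tf.disable_v2_behavior()  # needed on TensorFlow 2.x so placeholders and sessions work
    # Step 1: Prepare the training data
    x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)  # Input features
    y_train = np.array([[0], [1], [1], [0]], dtype=np.float32)  # Target outputs

    # Step 2: Define the model architecture
    input_dim = 2
    hidden_dim = 2
    output_dim = 1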
- Lines 5–6: Defines the input-output relation for the XOR logic gate.
- Lines 9–11: Defines the number of nodes for input, hidden, and output layers.
The resulting graph looks as follows:
Assigning weights and bias#
Now that we have a basic structure of the XOR neural network, we start by assigning weights and biases to the network. Weights are numerical values associated with the graph edges, and each one determines how strongly its connection influences the output. Biases are parameters associated with each node, excluding the nodes in the input layer. They introduce an offset to the weighted sum of a node’s inputs.
Let’s assign weights and biases to our neural network:
    # Define placeholders for input and output
    x = tf.placeholder(tf.float32, shape=[None, input_dim], name='x')
    y = tf.placeholder(tf.float32, shape=[None, output_dim], name='y')

    # Define variables for weights and biases of the hidden layer
    W_hidden = tf.Variable(tf.random_normal([input_dim, hidden_dim]), name='W_hidden')
    b_hidden = tf.Variable(tf.zeros([hidden_dim]), name='b_hidden')

    # Define variables for weights and biases of the output layer
    W_output = tf.Variable(tf.random_normal([hidden_dim, output_dim]), name='W_output')
    b_output = tf.Variable(tf.zeros([output_dim]), name='b_output')

- Lines 6–7: Defines weights and biases of the hidden layer.
- Lines 10–11: Defines weights and biases of the output layer.
Note: The initial weights assigned here are randomly chosen from a normal distribution, and all biases are zeros. We will see later how to update the values of weights and biases as we train our model.
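As a rough plain-Python sketch of this initialization, the snippet below uses the standard random module rather than TensorFlow to build parameters with the same shapes. The helper random_normal_matrix is a hypothetical name, and the standard normal distribution (mean 0, standard deviation 1) is assumed.

```python
import random

input_dim, hidden_dim, output_dim = 2, 2, 1

def random_normal_matrix(rows, cols):
    """Matrix of samples drawn from a normal distribution (mean 0, std 1)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

W_hidden = random_normal_matrix(input_dim, hidden_dim)   # one weight per input-to-hidden edge
b_hidden = [0.0] * hidden_dim                            # one bias per hidden node
W_output = random_normal_matrix(hidden_dim, output_dim)  # one weight per hidden-to-output edge
b_output = [0.0] * output_dim                            # one bias per output node

n_params = (input_dim * hidden_dim + hidden_dim
            + hidden_dim * output_dim + output_dim)
print("Trainable parameters:", n_params)  # 4 + 2 + 2 + 1 = 9
```

Counting the entries shows how small this network is: nine numbers are all that training has to adjust.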
The resultant network with weights and biases looks as follows:
As an example, the output of the first node in the hidden layer will be calculated as follows:
$$\text{hidden}_1 = \text{input}_1 \times w_1 + \text{input}_2 \times w_2 + b_1$$
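Plugging in numbers makes the formula concrete. The inputs, weights, and bias below are arbitrary illustrative values, not values from a trained network.

```python
# Arbitrary illustrative values (assumptions, not trained parameters).
input_1, input_2 = 1.0, 0.0
w1, w2 = 0.5, -0.25
b1 = 0.25

# Weighted sum of the inputs plus the bias of the first hidden node.
hidden_1 = input_1 * w1 + input_2 * w2 + b1
print("hidden_1 =", hidden_1)  # 1.0 * 0.5 + 0.0 * (-0.25) + 0.25 = 0.75
```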
Applying the activation function#
An activation function is applied to a node’s weighted sum of inputs plus its bias to produce the node’s output. XOR is a non-linear operation, so the network needs non-linearity to learn it, and the activation function is what introduces it.
For the hidden layer, we will use the sigmoid function as the activation function. It provides a smooth, continuous non-linear transformation and is written mathematically as follows:
$$S(\text{hidden}_1) = \frac{1}{1 + e^{-\text{hidden}_1}}$$
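A direct plain-Python translation of the sigmoid formula, offered as a sketch for building intuition rather than as TensorFlow's implementation, shows how it squashes any real input into the range between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-5.0, 0.0, 5.0):
    print(z, "->", round(sigmoid(z), 4))
```

An input of 0 maps to exactly 0.5, large negative inputs approach 0, and large positive inputs approach 1.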
For the output layer, we apply a linear transformation to the hidden layer outputs using the output weights and bias, as follows:
$$\text{output} = \text{hidden}_1 \times w_5 + \text{hidden}_2 \times w_6 + b_3$$
Let’s apply the sigmoid activation and the linear output transformation:
    # Step 3: Apply activation functions

    # Apply sigmoid activation to hidden layer
    hidden_layer = tf.nn.sigmoid(tf.matmul(x, W_hidden) + b_hidden)
    # Output layer
    output_layer = tf.matmul(hidden_layer, W_output) + b_output

- Line 4: Applies the sigmoid activation function to the nodes in the hidden layer.
- Line 6: Applies the linear transformation to calculate the network output.
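To see that this architecture (two inputs, two sigmoid hidden nodes, one linear output) is expressive enough for XOR, the sketch below runs the same forward pass in plain Python with hand-picked weights. These weights are an assumption chosen for illustration, with the first hidden node behaving roughly like OR and the second like AND. A trained network will generally arrive at different values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hand-picked parameters (illustrative assumptions, not learned values).
W_hidden = [[20.0, 20.0],   # weights from input 1 to hidden nodes 1 and 2
            [20.0, 20.0]]   # weights from input 2 to hidden nodes 1 and 2
b_hidden = [-10.0, -30.0]   # hidden 1 acts like OR, hidden 2 acts like AND
W_output = [1.0, -1.0]      # output = hidden 1 - hidden 2
b_output = 0.0

def forward(x):
    """Sigmoid hidden layer followed by a linear output layer."""
    hidden = [
        sigmoid(sum(x[i] * W_hidden[i][j] for i in range(2)) + b_hidden[j])
        for j in range(2)
    ]
    return sum(h * w for h, w in zip(hidden, W_output)) + b_output

x_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
predictions = [forward(x) for x in x_train]
print([round(p) for p in predictions])  # [0, 1, 1, 0]
```

Subtracting the AND-like node from the OR-like node leaves a value near 1 only when exactly one input is on, which is the XOR behavior.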
Defining the loss function and optimizer#
Now that we have completed one forward pass through the network, we need to check whether the network’s output matches the desired output. To do so, we define a loss function. A loss function measures the dissimilarity between the predicted and expected output, which lets us quantify the model’s performance. We will use the mean squared error (MSE) as the loss function. It calculates the average squared difference between the predicted and expected output values.
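The MSE calculation is short enough to write out by hand. This plain-Python sketch uses a hypothetical helper, mean_squared_error, and made-up predictions to show the arithmetic.

```python
def mean_squared_error(predicted, expected):
    """Average of the squared differences between two equal-length lists."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)

expected = [0.0, 1.0, 1.0, 0.0]    # XOR targets
predicted = [0.5, 0.5, 0.5, 0.5]   # made-up guesses, always halfway between 0 and 1

print(mean_squared_error(predicted, expected))  # 0.25
print(mean_squared_error(expected, expected))   # 0.0 for a perfect match
```

A model that always answers 0.5 is off by 0.5 on every sample, so its loss is 0.25. A lower loss means the predictions are closer to the targets.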
Based on the output of the loss function, we now need to update the model’s parameters, that is, its weights and biases. We will use the gradient descent optimization algorithm to find the optimum values of the weights and biases. Gradient descent repeatedly adjusts the parameters in the direction of the steepest descent of the loss function.
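The update rule is easiest to see on a toy problem with a single parameter. The sketch below minimizes a made-up loss, (w - 3)^2, whose minimum is known to be at w = 3. The loss, starting point, and step count are assumptions for illustration only.

```python
def loss(w):
    return (w - 3.0) ** 2         # minimized at w = 3

def gradient(w):
    return 2.0 * (w - 3.0)        # derivative of the loss with respect to w

w = 0.0                           # arbitrary starting point
learning_rate = 0.1

for step in range(100):
    w = w - learning_rate * gradient(w)   # step against the gradient

print("w =", round(w, 4), "loss =", round(loss(w), 8))
```

Each step moves w a fraction of the way toward the minimum, and the same rule is applied to every weight and bias of a network at once.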
Let’s implement the loss function and the optimization algorithm into our machine learning model:
    # Step 4: Define the loss function and optimizer
    loss = tf.reduce_mean(tf.square(output_layer - y))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss)

- Line 2: Defines the loss as the mean squared error between output_layer and y. Here, y is the expected output and output_layer is the predicted output.
- Lines 3–4: Defines the optimization algorithm with the learning_rate parameter, which sets the step size taken during each parameter update. A larger value of learning_rate can result in faster convergence but also risks overshooting the optimal value. Similarly, a smaller value of learning_rate results in slower convergence but is expected to provide a more precise result.
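The trade-off can be observed on a toy loss, w^2, whose minimum is at 0. The sketch below runs the same number of gradient descent steps with three made-up learning rates. The function name final_distance and the specific rates are assumptions chosen to show slow, fast, and overshooting behavior.

```python
def final_distance(learning_rate, steps=20, w=1.0):
    """Run gradient descent on loss(w) = w**2 and return |w| at the end."""
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w   # the gradient of w**2 is 2*w
    return abs(w)

for lr in (0.01, 0.4, 1.1):
    print("learning_rate =", lr, "-> distance from optimum:", final_distance(lr))
```

With the smallest rate, w is still far from 0 after 20 steps. The middle rate gets very close. The largest rate overshoots on every step and ends up farther away than where it started.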
Training the model#
Now that everything is in place, it’s time to train our model using the training data defined in x_train and y_train:
    # Step 1: Prepare the training data
    x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)  # Input features
    y_train = np.array([[0], [1], [1], [0]], dtype=np.float32)  # Target outputs

    # Step 5: Train the model

    num_epochs = 1000
    batch_size = 4

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # Initialize variables

        for epoch in range(num_epochs):
            # Generate random mini-batches
            indices = np.random.choice(len(x_train), batch_size, replace=False)
            x_batch = x_train[indices]
            y_batch = y_train[indices]

            # Run optimization operation
            _, current_loss = sess.run([train_op, loss], feed_dict={x: x_batch, y: y_batch})

- Lines 2–3: Defines the training data that represents the input-output relation for the XOR logic gate.
- Lines 7–8: Defines the training parameters num_epochs and batch_size.
- Lines 13–20: Performs the training by passing the training data to our model in batches of size batch_size. This procedure is performed num_epochs times.
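One detail worth noticing: because batch_size equals the number of training samples and the indices are drawn without replacement, every batch here is simply the full dataset in a shuffled order. The sketch below shows the same sampling in plain Python, with random.sample from the standard library standing in for np.random.choice with replace=False.

```python
import random

x_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = [[0], [1], [1], [0]]
batch_size = 4

# Draw batch_size distinct indices, i.e. sampling without replacement.
indices = random.sample(range(len(x_train)), batch_size)
x_batch = [x_train[i] for i in indices]
y_batch = [y_train[i] for i in indices]

print("indices:", indices)
print("covers every sample:", sorted(indices) == list(range(len(x_train))))
```

Using the same indices for x_batch and y_batch keeps each input paired with its target. A smaller batch_size would make each step see only part of the data.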
Let’s visualize the training procedure using the following illustration:
Validating the model#
Now it’s time to validate our model by checking whether the predicted output matches the expected output. We can feed the training data to the model once more and compare the model’s output to the expected output:
        # Make predictions (still inside the tf.Session() block, after the training loop)
        predicted_output = sess.run(output_layer, feed_dict={x: x_train})
        print("Predicted outputs:", predicted_output)
When training succeeds, the predicted output shows that our neural network has converged to the expected output of an XOR logic gate: the first and last predicted outputs are close to 0, and the second and third predicted outputs are close to 1. Because the weights are initialized randomly, the exact values vary from run to run.
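Since the network produces real numbers rather than exact 0s and 1s, one simple way to compare against the truth table is to threshold the outputs at 0.5. The predicted values below are made up for illustration and are not the result of an actual training run.

```python
expected = [0, 1, 1, 0]

# Made-up example outputs for illustration only; real values vary between runs.
predicted_output = [[0.03], [0.97], [0.96], [0.05]]

# Treat anything at or above 0.5 as logical 1.
predicted_labels = [1 if row[0] >= 0.5 else 0 for row in predicted_output]
accuracy = sum(p == e for p, e in zip(predicted_labels, expected)) / len(expected)

print("labels:", predicted_labels)   # [0, 1, 1, 0]
print("accuracy:", accuracy)         # 1.0
```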
Next steps#
This blog has briefly introduced TensorFlow and how we can build a machine learning model using tensors.
We encourage you to explore more activation and loss functions and practice building more complex machine learning models. You can also check out the following courses on Educative to learn machine learning:
Applied Machine Learning: Industry Case Study with TensorFlow
In this course, you'll work on an industry-level machine learning project based on predicting weekly retail sales given different factors. You will learn the most efficient techniques used to train and evaluate scalable machine learning models. After completing this course, you will be able to take on industry-level machine learning projects, from data analysis to creating efficient models and providing results and insights. The code for this course is built around the TensorFlow framework, which is one of the premier frameworks for industry machine learning, and the Python pandas library for data analysis. Basic knowledge of Python and TensorFlow are prerequisites. To get some experience with TensorFlow, try our course: Machine Learning for Software Engineers. This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.
Become a Machine Learning Engineer
Start your journey to becoming a machine learning engineer by mastering the fundamentals of coding with Python. Learn machine learning techniques, data manipulation, and visualization. As you progress, you'll explore object-oriented programming and the machine learning process, gaining hands-on experience with machine learning algorithms and tools like scikit-learn. Tackle practical projects, including predicting auto insurance payments and customer segmentation using K-means clustering. Finally, explore the deep learning models with convolutional neural networks and apply your skills to an AI-powered image colorization project.
Machine Learning with NumPy, pandas, scikit-learn, and More
If you're a software engineer looking to add machine learning to your skillset, this is the place to start. This course will teach you to write useful code and create impactful machine learning applications immediately. From the start, you'll be given all the tools that you need to create industry-level machine learning projects. Rather than reading through dense theory, you’ll learn practical skills and gain actionable insights. Topics covered include data analysis/visualization, feature engineering, supervised learning, unsupervised learning, and deep learning. All of these topics are taught using industry-standard frameworks: NumPy, pandas, scikit-learn, XGBoost, TensorFlow, and Keras. Basic knowledge of Python is a prerequisite to this course. This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.
Frequently Asked Questions
How do you make a ML model using TensorFlow?