Deploy a Machine Learning (ML) Model on Top of Docker
Let’s first go over some basic terminology used in this task so that the steps are easier to follow.
What is Machine Learning (ML)?
Machine learning is a subset of Artificial Intelligence (AI) that gives machines the ability to learn automatically from large amounts of data and to improve through experience. In other words, Machine Learning is the practice of getting machines to solve problems by learning patterns from data rather than by following explicitly programmed rules.
How does Machine Learning Work?
A Machine Learning algorithm is trained on a training data set to create a model. When new input data is given to the model, it makes a prediction based on what it learned during training.
The prediction is evaluated for accuracy. If the accuracy is acceptable, the model is deployed. If it is not, the algorithm is trained again, repeatedly if needed, with an augmented training data set.
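The train, evaluate, and retrain cycle described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the one-dimensional threshold classifier, the data points, and the 0.9 accuracy target are all made up for the example and are not part of this task.

```python
def train(samples):
    """Fit a 1-D threshold classifier: the midpoint between the two class means."""
    class0 = [x for x, label in samples if label == 0]
    class1 = [x for x, label in samples if label == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2


def predict(threshold, x):
    """Predict class 1 when x lies above the learned threshold, else class 0."""
    return 1 if x > threshold else 0


def accuracy(threshold, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, label in samples if predict(threshold, x) == label)
    return correct / len(samples)


def train_until_acceptable(training_set, extra_batches, test_set, target=0.9):
    """Retrain with augmented data until the accuracy target is met or data runs out."""
    data = list(training_set)
    batches = list(extra_batches)
    rounds = 0
    while True:
        rounds += 1
        model = train(data)
        score = accuracy(model, test_set)
        if score >= target or not batches:
            return model, score, rounds
        data.extend(batches.pop(0))  # augment the training set and try again


# Hypothetical data: (feature value, label)
training_set = [(0.0, 0), (5.0, 1)]
extra_batches = [[(4.5, 0), (9.0, 1)]]
test_set = [(2.0, 0), (4.0, 0), (6.0, 1), (8.0, 1)]

model, score, rounds = train_until_acceptable(training_set, extra_batches, test_set)
print(f"threshold={model}, accuracy={score}, training rounds={rounds}")
```

Here the first round is not accurate enough on the test set, so the training data is augmented and the model is trained a second time before it would be considered ready to deploy.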
Types Of Machine Learning
A machine can learn to solve a problem using any one of the following three approaches: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning is a technique in which we teach or train the machine using data that is well labeled.
To understand Supervised Learning, let’s consider an analogy. As kids, we all needed guidance to solve math problems. Our teachers helped us understand what addition is and how it is done.
Similarly, we can think of supervised learning as a type of Machine Learning that involves a guide. The labeled data set is the teacher that trains the machine to understand patterns in the data. The labeled data set is nothing but the training data set.
So, in the image above, we’re feeding the machine images of Tom and Jerry, and the goal is for the machine to identify and classify the images into two groups (Tom images and Jerry images).
The training data set that is fed to the model is labeled; that is, we’re telling the machine, ‘this is how Tom looks and this is Jerry’. By doing so, we’re training the machine using labeled data. In Supervised Learning, there is a well-defined training phase done with the help of labeled data.
Unsupervised learning involves training by using unlabeled data and allowing the model to act on that information without guidance.
Think of unsupervised learning as a smart kid who learns without any guidance. In this type of Machine Learning, the model is not fed labeled data; that is, the model has no clue that ‘this image is Tom and this is Jerry’. It figures out patterns and the differences between Tom and Jerry on its own by taking in large amounts of data.
For example, it identifies prominent features of Tom, such as pointy ears and bigger size, and understands that this image is of type 1. Similarly, it finds such features in Jerry and knows that this image is of type 2.
Therefore, it groups the images into two different classes without knowing who Tom or Jerry is.
Reinforcement Learning is a part of Machine Learning where an agent is put in an environment and learns to behave in it by performing certain actions and observing the rewards it gets from those actions.
To understand this, let’s imagine that we were dropped off on an isolated island! Initially, we would all panic, but as time passes, we would learn how to live on the island. We would explore the environment and understand the climate conditions, the type of food that grows there, the dangers of the island, etc.
This is exactly how Reinforcement Learning works. It involves an agent (us, stuck on the island) that is put in an unknown environment (the island), where it must learn by observing and performing actions that result in rewards.
Reinforcement Learning is mainly used in advanced Machine Learning areas such as self-driving cars, AlphaGo, etc.
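To make the agent, environment, action, and reward idea a bit more concrete, here is a minimal sketch using tabular Q-learning, one common Reinforcement Learning technique. Everything in it is an assumption made for illustration: the environment is a tiny corridor of five positions, the agent starts at position 0, and the only reward is given for reaching the last position.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values for a corridor where only the last state gives a reward."""
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action]; action 0 moves left, action 1 moves right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _episode in range(episodes):
        state = 0
        for _step in range(100):
            action = rng.randrange(2)  # explore by picking a random action
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # Q-learning update: move the estimate toward reward + discounted future value
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


q_table = train_q_table()
# The learned behaviour: pick the action with the higher value in each non-goal state
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

After enough exploration, the action values favour moving right in every position, because that is the only way to reach the reward.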
What Problems Can Machine Learning Solve?
There are three main categories of problems that can be solved using Machine Learning:
In a regression problem, the output is a continuous quantity. For example, if we want to predict the speed of a car given the distance, it is a regression problem. Regression problems can be solved by using Supervised Learning algorithms like Linear Regression.
In a classification problem, the output is a categorical value. Classifying emails into two classes, spam and non-spam, is a classification problem that can be solved by using Supervised Learning classification algorithms such as Support Vector Machines, Naive Bayes, Logistic Regression, K Nearest Neighbor, etc.
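As a small illustration of classification with labeled data, here is a sketch of K Nearest Neighbor written in plain Python. The two features (number of links and number of exclamation marks in an email) and all the data points are hypothetical, chosen only to show how labeled examples drive the prediction.

```python
import math
from collections import Counter


def knn_predict(training_data, point, k=3):
    """Classify a point by majority vote among its k nearest labeled neighbours."""
    nearest = sorted(training_data, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled emails: ((links, exclamation marks), label)
training_data = [
    ((8, 5), "spam"),
    ((7, 6), "spam"),
    ((9, 4), "spam"),
    ((1, 0), "non-spam"),
    ((0, 1), "non-spam"),
    ((2, 1), "non-spam"),
]

print(knn_predict(training_data, (6, 5)))  # spam
print(knn_predict(training_data, (1, 1)))  # non-spam
```

The labels in the training data play the role of the teacher: a new email simply receives the label that most of its closest known examples carry.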
What is Clustering?
This type of problem involves assigning the input to two or more clusters based on feature similarity. For example, clustering viewers into similar groups based on their interests, age, geography, etc., can be done by using Unsupervised Learning algorithms like K-Means Clustering.
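Here is a minimal sketch of the K-Means idea on one feature, written in plain Python. The viewer ages and the two starting centroids are made-up values for illustration; note that no labels are given to the algorithm.

```python
def nearest_centroid(value, centroids):
    """Index of the centroid closest to the given value."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d(values, initial_centroids, max_iterations=100):
    """Alternate between assigning points to centroids and recomputing the centroids."""
    centroids = list(initial_centroids)
    labels = []
    for _iteration in range(max_iterations):
        labels = [nearest_centroid(v, centroids) for v in values]
        new_centroids = []
        for c in range(len(centroids)):
            members = [v for v, label in zip(values, labels) if label == c]
            # Keep the old centroid if no point was assigned to it
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical viewer ages, grouped into two clusters
ages = [18, 21, 19, 45, 50, 48]
labels, centroids = kmeans_1d(ages, initial_centroids=[ages[0], ages[-1]])
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```

The algorithm ends up with a younger group and an older group purely from the similarity of the values, without ever being told what the groups mean.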
What is Docker?
Docker is an open platform for developing, shipping, and running applications by using containers. Docker enables us to separate our applications from our infrastructure so we can deliver software quickly. With Docker, we can manage our infrastructure in the same ways as we manage our applications. So, by taking advantage of Docker’s methodologies for shipping, testing, and deploying code quickly, we can significantly reduce the delay between writing code and running it in production.
In a way, we can say that Docker is a bit like a virtual machine. But unlike a virtual machine, rather than creating a whole virtual operating system, it allows applications to use the same Linux kernel as the system that they’re running on and only requires applications to be shipped with things not already running on the host computer. This gives a significant performance boost and reduces the size of the application.
What is a Container?
Containers offer a logical packaging mechanism in which applications can be abstracted from the environment in which they actually run. This decoupling allows container-based applications to be deployed easily and consistently, regardless of whether the target environment is a private data centre, the public cloud, or even a developer’s personal laptop. Containerization provides a clean separation of concerns, as developers focus on their application logic and dependencies.
For those coming from virtualized environments, containers are often compared with virtual machines (VMs). We might already be familiar with VMs: a guest operating system such as Linux or Windows runs on top of a host operating system with virtualized access to the underlying hardware. Like virtual machines, containers allow us to package our application together with libraries and other dependencies, providing isolated environments for running our software services.
As we can see in the image below, the similarities end here, as containers offer a far more lightweight unit compared to VMs for developers and IT teams to work with, carrying a lot of benefits.
What is a Docker Image?
A Docker image is a read-only template that contains a set of instructions for creating a container that can run on the Docker platform. It includes the elements needed to run an application as a container, such as code, config files, environment variables, libraries, and the runtime.
It provides a convenient way to package up applications and pre-configured server environments, which we can use for our own private use or share publicly with other Docker users. If the image is deployed to a Docker environment it can then be executed as a Docker container. The docker run command will create a container from a given image.
Docker images are also reusable assets that can be deployed on any host.
Now, let’s start the practical:
Task Description 📄
👉 Pull the CentOS container image from Docker Hub and create a new container
👉 Install Python on top of the Docker container
👉 In the container, copy or create the machine learning model that you created in Jupyter Notebook
First of all, we need to make sure that Docker is installed on our system. For the following task, we’ll assume that the Docker Community Edition (CE) is installed.
Docker CE is available for all major platforms, including macOS, Windows, and Linux. I’m going to use Linux as my Docker host; that is, I will install Docker on top of a Red Hat Enterprise Linux 8 system. The specific steps needed to install Docker CE on Linux can be found at https://www.linuxtechi.com/install-docker-ce-centos-8-rhel-8/
Furthermore, you should make sure to create a free account at https://hub.docker.com, so that you can use this account to sign in to the Docker Desktop application.
Once Docker is installed on our system, we can start by entering the following commands in the terminal:
✒️ Command to see the version of Docker : docker version
✒️ Command to see detailed system-wide information about the Docker installation : docker info
Start the Docker service
Command used to start the service : systemctl start docker
Command used to see the status of the service : systemctl status docker
To make the service start automatically every time the host OS boots, use the command : systemctl enable docker
Pull the Docker Image from the Docker Hub
For the list of already existing Docker images, go to hub.docker.com
For this task, we will be using the CentOS Docker image, version 7. You can also use another version of the CentOS Docker image.
Command to see the list of Docker images present on your local machine : docker images
Now, we have to pull the image from Docker Hub so that we can launch our container from it. For this, we use docker pull image_name:version, which here is docker pull centos:7
Now, we have to launch a container from the downloaded image, i.e. centos:7, and for this we use the command : docker run -it centos:7
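If we want to script the pull and run steps instead of typing them, one possible sketch in Python is shown below. It only assembles the same docker commands and prints them; actually running them is optional. The container name ml_container is a hypothetical name picked for this example, passed through the --name option of docker run so that the container can be referred to by name later.

```python
import subprocess


def docker_commands(image="centos:7", container_name="ml_container"):
    """Build the pull and run commands as argument lists (nothing is executed here)."""
    return [
        ["docker", "pull", image],
        ["docker", "run", "-it", "--name", container_name, image],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises an error if the docker command fails
            subprocess.run(command, check=True)


commands = docker_commands()
run_all(commands)  # dry run: only prints the commands
```

Calling run_all(commands, dry_run=False) on a host where Docker is installed and the service is running would execute the two commands in order.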
Install Python on top of the Docker container
Step 1: Run the command yum install python3 to install Python on top of the container.
Now, to confirm that Python is installed successfully, run the command python3.
Create Machine Learning Model
As we saw at the start, there are three main categories of problems that can be solved using Machine Learning: regression, classification, and clustering.
So, today we will solve a regression problem, that is, we will create the machine learning model using linear regression.
Now, let’s first understand: What is Linear Regression?
Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a statistical method that is commonly used for predictive analysis. Linear regression makes predictions for continuous (real-valued, numeric) variables such as sales, salary, age, product price, etc.
The overall idea of regression is to examine two things:
- Does a set of predictor variables do a good job in predicting an outcome (dependent) variable?
- Which variables in particular are significant predictors of the outcome variable, and in what way do they (as indicated by the magnitude and sign of the beta estimates) impact the outcome variable?
These regression estimates are used to explain the relationship between one dependent variable and one or more independent variables.
The simplest form of the regression equation, with one dependent and one independent variable, is simple linear regression, defined by the formula y = b + w*x
where, y = estimated dependent variable or target variable,
b = intercept (bias), the value of y when x = 0,
w = coefficient of x (weight),
x = score on the independent variable or predictor variable.
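To see how b and w in y = b + w*x can be estimated from data, here is a minimal sketch of the ordinary least-squares fit for one predictor, written in plain Python. The data points are made up for illustration, and this is not necessarily how the model in this task was trained; a library such as scikit-learn does the same job for us.

```python
def fit_linear_regression(xs, ys):
    """Estimate intercept b and weight w of y = b + w*x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return b, w


def predict(b, w, x):
    """Apply the fitted line to a new value of the predictor."""
    return b + w * x


# Hypothetical data that lies exactly on the line y = 1 + 2*x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

b, w = fit_linear_regression(xs, ys)
print(b, w)              # 1.0 2.0
print(predict(b, w, 6))  # 13.0
```

The weight w is the slope of the fitted line, and the intercept b shifts the line up or down so that it passes through the mean of the data.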
✒️Copying the model inside the container: docker cp source_file container_name:destination_file
Now, we also have to install some dependent libraries inside the container, which are given below
pip3 install numpy
pip3 install pandas
pip3 install scikit-learn
pip3 install joblib
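The overall flow is: train the model in Jupyter Notebook, save it to a file, copy that file into the container, then load it there and predict. The sketch below shows that save, load, and predict cycle for the parameters of y = b + w*x. It is a stand-in under assumptions: it uses json from the Python standard library so that it runs without extra packages, whereas a scikit-learn model would typically be saved and loaded with joblib.dump and joblib.load. The file name linear_model.json and the parameter values are hypothetical.

```python
import json
import os
import tempfile


def save_model(path, intercept, weight):
    """Write the learned parameters to a file (done where the model is trained)."""
    with open(path, "w") as f:
        json.dump({"intercept": intercept, "weight": weight}, f)


def load_model(path):
    """Read the parameters back (done inside the container after copying the file)."""
    with open(path) as f:
        return json.load(f)


def predict(model, x):
    """Apply y = b + w*x using the loaded parameters."""
    return model["intercept"] + model["weight"] * x


# Hypothetical file location and parameter values
model_path = os.path.join(tempfile.gettempdir(), "linear_model.json")
save_model(model_path, intercept=1.0, weight=2.0)

restored = load_model(model_path)
prediction = predict(restored, 5.0)
print(prediction)  # 11.0
```

The important point is that only the saved model file has to travel into the container; the training data and the notebook can stay on the host.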
With the libraries installed, the ML model runs successfully inside the container.
We have now successfully deployed our ML model on top of a Docker container.
In the coming days, I am going to publish more articles on different topics related to ML/DL and other tools and technologies, so do follow me on Medium.
If you have any queries, comment below or DM me on LinkedIn.