
Introduction

In this lesson, we will build our first machine learning algorithm using plain Python. You will also learn about several classes of machine learning algorithms and common machine learning terminology. By the end of the lesson, you will begin to understand what machine learning is really about.

Intuition

Let's build a program that guesses the number you're thinking of, using nothing more than your corrections to its output. The computer will guess some number and show it to you. You will then tell the computer whether the number you're thinking of is higher or lower. The computer will guess again and repeat the process. As this process repeats, the computer's guesses will get closer and closer until it finds the correct number. Now that you understand the intuition behind the program, let's build it step by step.

First, the computer will guess a number. We can do this with Python's built-in random module. We will assume that the number is always between 0 and 100, and we will count how many guesses the computer makes before finding the correct number. This counter, called tries, starts at 0.

PYTHON
import random

guess = random.randint(0, 100)
tries = 0

Now we will build the skeleton of the program. The computer will keep guessing until it guesses the correct number. We can implement this by creating an infinite loop and asking the user whether the guess is correct. If it is, we break out of the loop. We also need branches for when the guess is incorrect; these tell the computer whether the correct number is higher or lower.

PYTHON
import random

guess = random.randint(0, 100)
tries = 0

# This is known as the training loop in ML since we're updating our parameter "guess" in this loop
while True:
    tries += 1
    print("My guess is: ", guess)
    print('''Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : ''', end='')
    comp = int(input())
    # Input validation
    if comp not in [1, 2, 3]:
        print("Input is invalid")
        continue

    if comp == 1:
        print("I did it!")
        print("Total tries: ", tries)
        break
    elif comp == 2:
        # Guess a higher number
        pass  # update logic is added in the next step
    elif comp == 3:
        # Guess a lower number
        pass  # update logic is added in the next step

All that remains is the logic for updating the guess variable. We could simply increment or decrement guess by one, but that would be painfully slow: the computer could need up to 100 tries. We can do better by moving the guess in larger steps. Let's call this step jump (since the guess jumps toward the answer).

PYTHON
jump = 20

# to decrement
guess -= jump

# to increment
guess += jump

Notice, however, that if we keep jump fixed, something will go very wrong during the game. Can you see the problem? Once the guess overshoots the answer, it will bounce back and forth and never settle on it! We somehow need to shrink jump over the iterations. However, as long as we keep jumping in the same direction toward the answer, there is no need to shrink the jump length, since that would only slow down convergence.

So what can we do here? The solution is to keep track of the jump direction. If the jump direction changes, then we know that we jumped too far. In that case, we jump back in the other direction with a smaller step. Let's see it in code:

PYTHON
import random

guess = random.randint(0, 100)
tries = 0
jump = 20
last_comp = 0 # Keeps track of the direction
decrement_jump = 2

while True:
    tries += 1
    print("My guess is: ", guess)
    print('''Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : ''', end='')
    comp = int(input())
    # Input validation
    if comp not in [1, 2, 3]:
        print("Input is invalid")
        continue

    # Oops, we jumped too far. So decrease jump
    if last_comp != comp:
        jump /= decrement_jump

    if comp == 1:
        print("I did it!")
        print("Total tries: ", tries)
        break
    elif comp == 2:
        guess += jump
    elif comp == 3:
        guess -= jump

    last_comp = comp

The program is now complete! Run it, and the computer should be able to guess your number after a few tries. You can see example output below in which the number was 20:

TEXT
My guess is: 99
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 89.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 79.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 69.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 59.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 49.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 39.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 29.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 19.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 2
My guess is: 24.0
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 21.5
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 22.75
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 22.125
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 21.5
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 20.875
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 20.25
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 19.625
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 2
My guess is: 19.9375
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 2
My guess is: 20.25
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 3
My guess is: 20.09375
Is this guess correct?
    1: Yes!
    2: The number is higher
    3: The number is lower

        : 1
I did it!
Total tries: 20

Understanding Machine Learning

You just built a complete machine learning algorithm! A machine learning program keeps an internal state and updates that state toward a better, more accurate answer as you feed it data to learn from. Let's map the standard machine learning concepts onto this program.

1. Internal State/Weight: The guess variable is the internal state that is updated whenever you provide new data for the program to learn from. Every machine learning algorithm maintains some internal state during learning and prediction. Some deep learning models even have gigabytes of weights (we will learn more about weights soon).

2. Loss: Although our program does not compute an explicit loss value or loss function, jump can be thought of as a loss value. The further we are from the answer, the bigger the loss. We can see that the loss value (jump) decreases over time, which means the algorithm is getting better at predicting the correct number. The main objective of every machine learning algorithm is to minimize the loss value as much as possible.

3. Learning Rate: The learning rate is the parameter that defines how fast the machine learning algorithm learns. The rate can't be too high or too low. In our program, decrement_jump can be thought of as the learning rate. If the learning rate is too high, we will never converge and will only oscillate around the minimum. If the learning rate is too low, the algorithm will take a long time to converge.

4. Optimizer/Optimizing Algorithm: At this point, we know that a machine learning algorithm has data, weights, a loss, and a learning rate. Remember, the main objective is to optimize the loss. To do that, we need to update the weights somehow with the help of the loss value and the learning rate. This is the optimizer's job. In our program, the incrementing and decrementing code inside the conditionals, together with the code that uses last_comp to shrink the jump, is the optimizing algorithm. We will learn about different optimizers and gradient descent in another lesson; the sketch after this list shows how these pieces fit together.

5. Convergence Threshold: Our program must meet a certain criterion to exit the while loop (i.e., the computer has guessed the number, or a value close enough to it). This is the convergence criterion. Most of the time, a value is preset such that when the loss becomes smaller than that value, the convergence criterion is met and the machine learning algorithm stops learning. That preset value is known as the convergence threshold.

6. Learning Iteration: With every iteration of the while loop, we provide one piece of data (guess higher or lower), or in general one sample or batch of data, to the model. The model then calculates the loss from the given data and its weights. Each time the algorithm updates its weights, one learning iteration has passed.

7. Epoch: If an algorithm is given a large dataset, it usually has to go over the data several times to keep improving. When all of the data is provided up front, this is known as offline learning. In offline learning, each full pass over the data is one epoch.

8. Accuracy: After training a model (as we did above), we have to test it to determine how effective the algorithm is. We can do this by measuring the model's accuracy. Accuracy can be defined differently depending on the objective. In our guessing game, we could measure how far off a guess is with the relative error (answer − guess) / answer × 100%; the smaller this value, the more accurate the guess. Most of the time, accuracy is reported as a percentage.

9. Activation Function: Our guessing game has no activation function, but its job is to post-process the output of a machine learning model to keep it in an appropriate range. A model's raw output can be too big, too small, negative, or otherwise undesirable; in those situations, activation functions keep the output within bounds.
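To see how these terms fit together outside the guessing game, here is a minimal, hypothetical sketch that fits a single weight with plain gradient descent. All names (w, learning_rate, convergence_threshold) are illustrative, and the setup (one data point, squared-error loss) is chosen only to keep the example tiny.

PYTHON
# One training example: we want w * x to approximate y (the "true" weight is 4.0).
x, y = 3.0, 12.0

w = 0.0                        # 1. internal state / weight
learning_rate = 0.05           # 3. learning rate
convergence_threshold = 1e-6   # 5. convergence threshold
iteration = 0                  # 6. learning iteration counter

while True:                    # the training loop
    iteration += 1
    prediction = w * x
    loss = (prediction - y) ** 2          # 2. loss (squared error)

    if loss < convergence_threshold:      # 5. stop once the loss is small enough
        break

    gradient = 2 * (prediction - y) * x   # direction in which the loss grows
    w -= learning_rate * gradient         # 4. optimizer step (gradient descent)

print("learned w =", w, "after", iteration, "iterations")

Running this prints a weight close to 4.0 after a handful of iterations, just as our guessing game converged on the hidden number.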

Are you sure you're getting this? Fill in the missing part by typing it in.

In machine learning, optimizers work to decrease the ____ function.

Write the missing line below.

The Data

Data is the most important component in machine learning, since every machine learning algorithm's task is to learn from data. An algorithm is first trained on some data, and the resulting model is then used to make predictions on unknown data. In this sense, data is divided into two parts. Many online resources misinform readers about training, testing, and cross-validation data; some even say that a model is validated with the test data, which is wrong. After this lesson, you will have a solid idea of how data is really divided and used in machine learning algorithms.

Let's look at an example of vehicle data where we attempt to estimate the price of a car based on the features of the car.

(Figure: a sample vehicle dataset in which the "price" column, highlighted in green, is the label and every other column is a feature.)

  • Training Data: The training data is the data that we know everything about. We know what the output should look like, so we can calculate the loss of our model from this data. We can also calculate the accuracy of the model from the training data. This is the data that is used in the training loop of a machine learning algorithm. In the example above, all the features along with the prices are the training data.

  • Test Data: This is the data that we want to understand using our machine learning algorithm. We need to predict the result of this unknown data with our model. Suppose a model learned from the given training data of vehicles. If you have a list of features for a new set of cars that you do not know the price of, then that data is the test data.

Thus, if you train on all of the training data, you will have no solid way to estimate the accuracy of the model: it may work on seen data but break on unseen data, and you cannot compute accuracy on the test data because you do not know the prices of those vehicles. To solve this, the training data is divided into two parts.

  • Actual Train Data: This is the actual training data that is used in the training loop. In most cases, this is 80% or 75% of all the training data.
  • Cross-Validation Data: This portion is kept separate from the actual train data and is not used in the training loop, so after training it is still unseen by the model. We can then validate the model on both seen and unseen data drawn from the training data. Most of the time, this is 20% or 25% of all the training data (a minimal sketch of this split follows below).
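The sketch below shows one way such an 80/20 split could look in plain Python. The records list is a synthetic stand-in for labeled vehicle rows, and the variable names (actual_train_data, cross_validation_data) are illustrative, not from any particular library.

PYTHON
import random

# Synthetic stand-ins for labeled rows (features plus a known price).
records = [(i, f"vehicle_{i}") for i in range(100)]

random.shuffle(records)                  # shuffle so the split is not ordered
split = int(0.8 * len(records))          # 80% / 20% split point

actual_train_data = records[:split]      # used inside the training loop
cross_validation_data = records[split:]  # held out; unseen during training

print(len(actual_train_data), len(cross_validation_data))  # 80 20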

Dataset columns are also classified according to how they are used:

  • Feature: These are the attributes or columns that the machine learning model analyzes and learns from. In the vehicle dataset, all the columns except "price" are features.

  • Label: These are the attributes or columns that the machine learning model will try to predict. This label is used to calculate loss and accuracy. In the vehicle dataset, the column "price" is the label.

To illustrate the whole scenario with the vehicle dataset, look at the figure above: the prices shown in green are the label, while the rest of the columns are features. We will learn more about different types of features, such as numeric and categorical features, in another lesson.
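To make the feature/label distinction concrete, here is a small sketch assuming a hypothetical list of vehicle rows stored as dictionaries; the column names are invented for illustration and do not come from a real dataset.

PYTHON
# Hypothetical vehicle rows; column names are invented for illustration.
vehicles = [
    {"year": 2015, "mileage": 45000, "engine_l": 1.6, "price": 7200},
    {"year": 2018, "mileage": 22000, "engine_l": 2.0, "price": 15400},
    {"year": 2012, "mileage": 98000, "engine_l": 1.4, "price": 3900},
]

# Every column except "price" is a feature; "price" is the label.
features = [{k: v for k, v in row.items() if k != "price"} for row in vehicles]
labels = [row["price"] for row in vehicles]

print(features[0])  # {'year': 2015, 'mileage': 45000, 'engine_l': 1.6}
print(labels)       # [7200, 15400, 3900]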

Machine Learning Classes

There are several kinds of machine learning algorithms depending on the objective and provided data. We will discuss some of them below:

Supervised Learning & Unsupervised Learning

Supervised and unsupervised learning are the two most common classes of machine learning algorithms. If the provided data is labeled, the task is supervised learning. If the algorithm does not need or use labels, it is unsupervised learning.

The above model for the vehicle dataset is considered supervised learning if you want to predict the price or any other feature based on the rest of the attributes. The same data can be used for unsupervised learning if you want to cluster similar vehicles or detect outlier vehicles.

Usually, unsupervised learning is a little harder than supervised learning, since it is difficult to measure the loss, or goodness, of a model when there is no label to compare against. When clustering a dataset, for example, you can use the distance between the means of two clusters, or the inverse of the distance between points within the same cluster, as a measure of the "goodness" of the model. For now, we will focus on supervised learning methods. Later in the series, however, we will go through unsupervised methods and applications such as data clustering, outlier detection, data augmentation, and data synthesis.
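As a small, hypothetical illustration of that clustering criterion, the sketch below computes the distance between the means of two clusters of one-dimensional points, plus how tightly each cluster's points sit around their own mean; the numbers are made up.

PYTHON
# Unlabeled 1-D data already split into two hypothetical clusters (no labels involved).
cluster_a = [1.0, 1.5, 2.0, 2.5]
cluster_b = [8.0, 8.5, 9.0, 9.5]

mean_a = sum(cluster_a) / len(cluster_a)
mean_b = sum(cluster_b) / len(cluster_b)

# One simple "goodness" signal: how far apart the cluster means are (bigger is better).
between_cluster_distance = abs(mean_a - mean_b)

# Another: how tightly points sit around their own mean (smaller is better).
spread_a = sum(abs(p - mean_a) for p in cluster_a) / len(cluster_a)
spread_b = sum(abs(p - mean_b) for p in cluster_b) / len(cluster_b)

print(between_cluster_distance, spread_a, spread_b)  # 7.0 0.5 0.5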

Instance-Based Learning & Model-Based Learning

Machine learning can also be divided into two categories based on the learning process and how weights are managed.

Think of the machine learning program you implemented at the beginning of this lesson. Is the weight guess saved somewhere for future use? Can the model do better next time, when you think of a different number? No. This is called instance-based learning. In instance-based learning, the algorithm works only on the current instance of data; it depends entirely on that data and cannot make predictions later without it.

On the other hand, some machine learning algorithms can save their weights for later use. These kinds of algorithms, called model-based learning, can relearn and improve if more data is provided. When making predictions, they no longer depend on the training data. All deep learning algorithms are model-based.

Instance-based learning algorithms are usually faster to train than model-based ones. A k-nearest neighbors (k-NN) model, for example, is typically much faster to train than a deep learning model, because it mostly just stores the data.
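Here is a rough sketch of the difference, using a 1-nearest-neighbor predictor (instance-based: it must keep the training data around at prediction time) versus a single learned weight (model-based: only the weight needs to be saved). The data and the fitting rule are invented purely for illustration.

PYTHON
training_x = [1.0, 2.0, 3.0, 4.0]
training_y = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x; made-up numbers

# Instance-based: prediction needs the stored training examples themselves.
def nearest_neighbor_predict(x):
    closest = min(range(len(training_x)), key=lambda i: abs(training_x[i] - x))
    return training_y[closest]

# Model-based: after fitting, only the weight w has to be kept (it could be saved to disk).
w = sum(x * y for x, y in zip(training_x, training_y)) / sum(x * x for x in training_x)

def linear_model_predict(x):
    return w * x

print(nearest_neighbor_predict(2.4))  # looks up the stored data -> 3.9
print(linear_model_predict(2.4))      # uses only w -> about 4.9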

Offline Learning & Online Learning

Machine learning algorithms can be divided into two more categories depending on how the data arrives: they can learn while the data is given live, or they can learn after receiving the complete dataset all at once.

In the number guessing program above, the data is given to the computer one piece at a time. In the sample run, the data looked like this at each step:

TEXT
Step 1: the guess 99 is bigger than the number
Step 2: the guess 89 is bigger than the number
Step 3: the guess 79 is bigger than the number
Step 4: the guess 69 is bigger than the number
Step 5: the guess 59 is bigger than the number
...

This type of learning algorithm is known as an online learning process. The data is given live while the training loop is running.

However, if you were to build a dog-vs-cat classifier, many images of dogs and cats would be given to the learning model all at once. The model runs for several epochs over all the images and is then ready to predict on a test image. This kind of learning is offline, or batch, learning, since we provide the whole batch of data at once.
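The difference can be sketched schematically as two training loops; update_weights here is a placeholder for whatever learning rule is used, not a real API.

PYTHON
# Schematic only: update_weights stands in for whatever learning rule you use.
def update_weights(weights, example):
    return weights  # placeholder update

# Online learning: each example is consumed as it arrives, then discarded.
def train_online(stream, weights):
    for example in stream:              # e.g. one user response at a time
        weights = update_weights(weights, example)
    return weights

# Offline (batch) learning: the full dataset is available up front
# and is revisited for several epochs.
def train_offline(dataset, weights, epochs=3):
    for _ in range(epochs):             # one epoch = one full pass over the data
        for example in dataset:
            weights = update_weights(weights, example)
    return weights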

Let's test your knowledge. Click the correct answer from the options.

Which machine learning algorithm does not need labeled data?

Click the option that best answers the question.

  • Supervised ML
  • Unsupervised ML
  • Instance-based ML
  • Model-based ML
  • Offline Model
  • Online Model

Conclusion

This lesson is important for your understanding of machine learning because it presented the fundamentals of the field. In the next lesson, we will go through some of the core mathematics behind machine learning. We recommend going through the example algorithm multiple times and printing different variables in the training loop to understand the whole process more thoroughly.

One Pager Cheat Sheet

  • By building a simple guessing game in Python and learning common machine learning classes and terminology, this lesson provides a fundamental introduction to machine learning.
  • We use random to generate an initial guess, then use an infinite loop to repeatedly update it, tracking the direction and size of each jump until the computer guesses the user's number.
  • Understanding the internal state (weights), loss, learning rate, optimizer, convergence threshold, learning iteration, epoch, accuracy, and activation function is the foundation for building a machine learning algorithm that optimizes an objective function.
  • Optimizers are the part of a machine learning algorithm that updates the weights to make the loss smaller and the predictions more accurate.
  • Data is divided into training and test data, and columns are divided into features and labels; the training data is further split into actual train data and cross-validation data so the model's accuracy can be checked on both seen and unseen data.
  • Machine Learning has two main classes: Supervised Learning, which uses labeled data, and Unsupervised Learning, which does not require labels and for which the "goodness" of a model is harder to measure.
  • Machine learning can be divided into two categories; instance-based learning, which does not save weights for later use, and model-based learning, which saves weights for future use and can be improved upon with additional data.
  • Machine learning algorithms can be divided into two categories: online learning, where data is given live while the training loop is running, and offline, or batch, learning, where data is given all at once.
  • Unsupervised ML is a type of Machine Learning algorithm that discovers patterns or similarities within data sets without the need for labeled data.
  • Understanding the fundamentals of machine learning is essential to succeeding with machine learning, and it is recommended to go through the example algorithm multiple times to gain a better understanding.