
What is a KNN classifier?

Throughout this course, I’ve been focussing on Neural Networks. They are an excellent general-purpose tool, but there is a whole host of other Machine Learning algorithms you can use. To go beyond this introductory material, you will need to dedicate a fair amount of time to building your tool chest, just like any master craftsman. One other tool I will explain here is the KNN classifier, or K-nearest neighbor classifier.

KNN classifiers are relatively easy to understand, but their predictive power on their own is limited. In combination with a neural network, though, as part of a transfer learning model, they can be handy and effortless to use.

How does it work?

Given a set of labeled data points, we can predict the label of a new data point by just looking at the other examples that are the closest to it.

For example, imagine we have this clustering of dots, where some are labeled as solid and some as hollow. If there were a new dot at the star’s position, what would its likely label be?

Figure 1. Which label is the black star, solid or hollow?

We could calculate the distance between the new dot and every other dot and then copy the label of the closest one. In this case, though, the new dot would probably be mislabelled: although it’s close to a solid dot, it looks like part of the hollow dot cluster.

A better solution might be to take the most common label among, say, the closest 5 dots or the closest 10 dots.

Figure 2. Pick the most common label from the closest 5 dots
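The voting idea can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the coordinates, the labels and the function name knn_predict are all made up for this example, and math.dist assumes Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(points, labels, query, k):
    """Return the most common label among the k points closest to query."""
    # Distance from the query to every labeled point, paired with that point's label.
    distances = [(math.dist(p, query), label) for p, label in zip(points, labels)]
    # Keep only the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up dots: one solid dot sits right next to the star, inside a hollow cluster.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.3), (1.1, 1.4), (1.45, 1.5),
          (3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]
labels = ["hollow", "hollow", "hollow", "hollow", "solid",
          "solid", "solid", "solid"]
star = (1.4, 1.4)

print(knn_predict(points, labels, star, k=1))  # solid: copies the single closest dot
print(knn_predict(points, labels, star, k=5))  # hollow: the cluster outvotes it
```

With only the closest dot the star is labeled solid, while a vote over the closest 5 dots labels it hollow, which matches the intuition that it belongs to the hollow cluster.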

The K-factor

The number of neighboring examples we compare against is the K in KNN classification.

There is usually an optimum value of K for each data set. You can find it by taking a labeled data set and splitting it into a training set and a validation set, then trying increasing values of K and seeing how well each one predicts the labels in the validation set.

You may end up with a result like this, which indicates that a value of 7 is suitable for this data set.

Figure 3. Example of the validation error with different values of K

Choose the K value with the lowest error.
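As a rough sketch of that search, the snippet below tries a few values of K and keeps the one with the lowest validation error. The data is invented for illustration (each cluster contains one deliberately mislabeled point), and the helper names validation_error and best_k are hypothetical, not part of any library.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def validation_error(train, validation, k):
    """Fraction of validation points whose predicted label is wrong."""
    wrong = sum(1 for point, label in validation
                if knn_predict(train, point, k) != label)
    return wrong / len(validation)


def best_k(train, validation, candidates):
    """Return (k, error) for the candidate K with the lowest validation error."""
    errors = {k: validation_error(train, validation, k) for k in candidates}
    k = min(errors, key=errors.get)  # ties go to the first (smallest) candidate
    return k, errors[k]


# Made-up training set: two clusters, each with one noisy (mislabeled) point.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((0.5, 0.5), "b"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"), ((5.5, 5.5), "a")]
validation = [((0.4, 0.6), "a"), ((5.6, 5.4), "b"),
              ((0.2, 0.1), "a"), ((5.9, 5.1), "b")]

for k in (1, 3, 5):
    print(k, validation_error(train, validation, k))
print(best_k(train, validation, [1, 3, 5]))
```

On this toy data K=1 is fooled by the noisy points, while larger values of K are not. A real data set would need far more validation points for the error curve to be meaningful.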

Calculating distance between data points

In our example above, we used a simple 2D graph with data points. It’s easy for the mind to see the distance between two data points in a 2D world. What about 3D, 5D or 1000D? It gets harder to visualize, but the calculation is just the same: Euclidean distance[1].
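To make the "same calculation in any number of dimensions" point concrete, here is a small sketch of Euclidean distance: the square root of the sum of the squared differences along each dimension. The function name euclidean_distance is just an illustrative choice.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of features."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared difference along every dimension, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance((0, 0), (3, 4)))            # 2D: 5.0
print(euclidean_distance((0, 0, 0), (1, 2, 2)))      # 3D: 3.0
print(euclidean_distance([0] * 1000, [1] * 1000))    # 1000D: square root of 1000
```

The code never needs to know how many dimensions it is dealing with; it simply loops over however many coordinates the two points have.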

Another way to look at the data in our dot example above is as a data set with two features: an x value and a y value.

If we want to use KNN for other types of problems, the data points need to be mappable onto points in a Euclidean space. That means the features of each data point need to be numbers.

One example of a problem you can use KNN with is the Iris Flowers dataset[2]. This data set involves predicting the specific flower species of the genus Iris from different measurements of Iris flowers.

The data set has 150 data points, where each data point has:

  1. Sepal length in cm.

  2. Sepal width in cm.

  3. Petal length in cm.

  4. Petal width in cm.

  5. Class

So each data point has 4 features and 1 class. Each of the features is a number (a length in cm), so the data point can be mapped into a 4-dimensional Euclidean space, and you can calculate the distance between one data point and any other in that space.
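One way to picture such a data point in code is as a record with four numeric fields and a class, where only the four numbers take part in the distance calculation. The Flower type and the measurements below are illustrative assumptions, not rows copied from the real data set.

```python
import math
from typing import NamedTuple


class Flower(NamedTuple):
    sepal_length: float  # cm
    sepal_width: float   # cm
    petal_length: float  # cm
    petal_width: float   # cm
    species: str         # the class (label), not used for distance


def features(flower):
    """The 4 numeric features as a point in 4D Euclidean space."""
    return flower[:4]


# Illustrative measurements only.
a = Flower(5.0, 3.5, 1.5, 0.2, "setosa")
b = Flower(6.0, 3.0, 4.5, 1.5, "versicolor")

# Distance between the two flowers, using all four measurements at once.
print(math.dist(features(a), features(b)))
```

The class is kept alongside the features because it is the label a KNN classifier would vote on once the nearest neighbors have been found.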

Summary

This was a quick lecture to cover the concept of the KNN classifier. KNN classifiers are machine learning models that are simple to understand and simple to implement, but their predictive power on their own is limited. Used in conjunction with a neural network in a transfer learning model, they can become much more powerful. In the next lecture, we will use a KNN classifier as the new machine learning model that sits on top of a decapitated neural network.


