[6762 views]

There can be various methods of solving the same problem with the help of algorithms. It depends on different factors like time complexity, space complexity and so on. K-nearest Neighbors is another algorithm that belongs to the supervised machine learning algorithm category. Letâ€™s find out for what is used for.

It is a non-parametric method and a machine learning algorithm that is useful in solving predictive problems related to classification as well as regression. It is used especially in pattern recognition. Moreover, it is known as a lazy learning algorithm because of the absence of specialized training phase use of data for training classification. This algorithm has applications in the banking system, politics, and computing of credit rating.

Lazy learning is also called instance based learning which is based on the dataset memorization. This algorithm is best known for its simplicity and easy to learn features. It is worth to know that the kNN algorithm make the use of local neighborhood for obtaining a prediction.

You will need a distance function for the comparison of examples similarity. Some of the popular distance measures used in kNN are- Euclidean distance, Manhattan distance, Hamming distance, Minkowski distance, cosine and so on. Euclidean distance is the most used among them. It is applied where input variables are similar.

In kNN, k represents the total numbers of nearest neighbors used for classification or prediction of a test sample. The process of choosing the right value of k is known as parameter tuning.

- Choose a value for K
- Take the K nearest neighbor of the new data point as per the Euclidean distance
- Begin counting the number of data points in all the given categories and provide a new data point to that category where you find most numbers of neighbors.

- Download the iris dataset or any other that you want
- Provide column names to the dataset
- Read this dataset to the panda deframe
- The data will be processed with the following code:
- The next step is to divide the dataset into train and test split
- Data will be scaled as:
- Perform training of the model with KNeighborsClassifier class of sklearn
- Lastly, make prediction
- Print result by using:

- We do the same here by importing python packages
- Download Iris dataset
- Add column names
- Read the dataset to the panda dataframe
- Import KNeighborsRegressor from sklearn
- Lastly, find MSE with the following code and get the output by running it

I hope that you have understood the ways of using and implementing K-nearest neighbor algorithm.