K-nearest neighbors, or KNN for short, is an algorithm that classifies data based on how close it lies to labeled data points. It is considered a simple algorithm because the calculation involves only two stages: computing the distance from the data being tested (the test data) to every training data point, and then finding the k smallest distances to determine the label, usually by majority vote among those k nearest neighbors.
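The two stages described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the use of Euclidean distance, the majority vote, and the sample points are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Stage 1: compute the distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Stage 2: keep the k smallest distances and vote on the label.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two small clusters labeled "a" and "b".
train_points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_labels = ["a", "a", "b", "b"]

print(knn_predict(train_points, train_labels, (1.2, 1.4), k=3))
print(knn_predict(train_points, train_labels, (5.5, 5.0), k=1))
```

With k=1 the function reduces to picking the single minimum distance; larger values of k make the result less sensitive to one unusual neighbor.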
What are the data in KNN?
Test data is data set aside for testing the KNN algorithm. It is not involved in training and can be thought of as unknown data, because the algorithm has never seen it before. There is usually less test data than training data.
The data involved in the training process of the KNN algorithm is called training data. There is usually much more of it than test data. KNN treats this data differently from most other machine learning algorithms, especially other classification algorithms. The training data is not processed during training; in other words, KNN has no real training process. Instead, the training data is simply stored as vectors in a vector space, and these stored vectors are what KNN compares new data against.
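To make the roles of training and test data concrete, here is a small sketch, assuming Euclidean distance and majority voting. The class name `SimpleKNN`, its `fit` and `predict` methods, and the sample data are hypothetical names chosen for illustration. Note that `fit` performs no calculation at all: it only stores the training vectors.

```python
import math
from collections import Counter


class SimpleKNN:
    """Illustrative KNN classifier: 'training' only stores the data."""

    def __init__(self, k=3):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # No model is built here; the training vectors are kept as-is.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # All the real work happens at prediction time.
        ranked = sorted(
            zip(self.points, self.labels),
            key=lambda pair: math.dist(pair[0], query),
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Hypothetical labeled data: more samples for training, fewer for testing.
data = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.3), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.3), "b"),
    ((1.1, 1.1), "a"), ((5.1, 5.1), "b"),
]
training_data = data[:6]  # stored by the classifier
test_data = data[6:]      # never seen by the classifier before testing

model = SimpleKNN(k=3).fit(
    [point for point, _ in training_data],
    [label for _, label in training_data],
)

# Accuracy: the fraction of test points whose predicted label is correct.
correct = sum(model.predict(point) == label for point, label in test_data)
accuracy = correct / len(test_data)
print(accuracy)
```

The test labels are used only to check the predictions, never to make them.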
Why is KNN called a lazy algorithm?
The KNN algorithm is well known as a lazy algorithm because it uses the training data directly in the process of determining the labels of test data. Other machine learning algorithms, by contrast, use the training data to build a model first.
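For contrast, here is a sketch of an algorithm that is not lazy. It uses a nearest-centroid classifier, which is a different technique from KNN and is chosen here only as a simple example of building a model from the training data. The function names `train_centroids` and `predict_with_centroids` and the sample data are assumptions made for illustration.

```python
import math


def train_centroids(points, labels):
    """Eager training: reduce the training data to one mean point per label."""
    sums = {}
    counts = {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in acc)
        for label, acc in sums.items()
    }


def predict_with_centroids(centroids, query):
    # Prediction uses only the model (the centroids), not the training data.
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


# Hypothetical toy data.
points = [(1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (7.0, 7.0)]
labels = ["a", "a", "b", "b"]

centroids = train_centroids(points, labels)  # the model is built once, up front
print(centroids)
print(predict_with_centroids(centroids, (2.0, 1.0)))
```

Once the centroids are computed, the original training points could be discarded. KNN cannot do this, because it needs every stored training point each time it labels new data.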