Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.



9. K-Nearest Neighbors (KNN)

Lesson 9 of 21 in the free Machine Learning notes on Siksha Sarovar, written by Rohit Jangra.

What is KNN?

K-Nearest Neighbors (KNN) is a simple, versatile supervised learning algorithm used for both classification and regression. It predicts the output for a new point from the training points that lie closest to it, an idea summed up by the saying "Show me who your friends are, and I'll tell you who you are."

How KNN Works

  1. Select K: Choose the number of neighbors (K).
  2. Calculate Distance: Compute the distance between the query point and every point in the training set, using a metric such as Euclidean or Manhattan distance.
  3. Find Neighbors: Identify the K nearest neighbors.
  4. Vote (Classification): Assign the class that is most common among the neighbors.
  5. Average (Regression): Assign the average value of the neighbors.
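
The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (euclidean, manhattan, k_nearest, knn_classify, knn_regress) and the toy data are made up for this example, and the sketch does no special handling of tied votes.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_nearest(train_X, train_y, query, k, distance=euclidean):
    """Return the targets of the k training points closest to the query."""
    # Step 2: distance from the query to every training point.
    scored = [(distance(x, query), y) for x, y in zip(train_X, train_y)]
    # Step 3: keep the k smallest distances.
    scored.sort(key=lambda pair: pair[0])
    return [y for _, y in scored[:k]]


def knn_classify(train_X, train_y, query, k, distance=euclidean):
    """Step 4: majority vote among the k nearest labels."""
    labels = k_nearest(train_X, train_y, query, k, distance)
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k, distance=euclidean):
    """Step 5: mean of the k nearest target values."""
    values = k_nearest(train_X, train_y, query, k, distance)
    return sum(values) / len(values)


# Toy data (hypothetical): two well-separated clusters in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]
targets = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]

# Step 1: choose K, then query a point near the first cluster.
predicted_label = knn_classify(points, labels, (2.0, 2.0), k=3)
predicted_value = knn_regress(points, targets, (2.0, 2.0), k=3)
```

Passing distance=manhattan to either function swaps the metric without changing anything else, which is why the distance is kept as a parameter here.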

Choosing K

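  • Small K: The prediction follows individual points closely, so it is sensitive to noise and outliers (overfitting).
  • Large K: Gives a smoother decision boundary, but can miss local patterns (underfitting).
  • Rule of Thumb: K ≈ sqrt(N), where N is the number of training samples. Treat this as a starting point and tune K on held-out data; an odd K avoids tied votes in two-class problems.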
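
One way to see the effect of K is to try several values and score each one. The sketch below is a hedged illustration only: rule_of_thumb_k, classify and leave_one_out_accuracy are hypothetical helper names, the data is invented (two clusters plus one deliberately mislabeled point), and leave-one-out scoring is just one simple validation scheme among several.

```python
import math
from collections import Counter


def rule_of_thumb_k(n_samples):
    """Starting guess K ~ sqrt(N), nudged to an odd number to reduce ties."""
    k = max(1, round(math.sqrt(n_samples)))
    return k if k % 2 == 1 else k + 1


def classify(train_X, train_y, query, k):
    """Plain KNN majority vote using Euclidean distance."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k):
    """Predict each point from all the others and report the hit rate."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += classify(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


# Toy data (hypothetical): two clusters plus one mislabeled "noisy" point.
X = [(1, 1), (1, 2), (2, 1), (2, 2), (6, 6), (6, 7), (7, 6), (7, 7), (1.5, 1.5)]
y = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]  # the last label is noise

start_k = rule_of_thumb_k(len(X))
scores = {k: leave_one_out_accuracy(X, y, k) for k in (1, 3, 5, 7)}
```

On this toy data the middle values (K = 3 and K = 5) score higher than both K = 1, which is misled by the noisy point, and K = 7, where the vote is swamped by the other cluster.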

Pros & Cons

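  • Pros: Simple to understand and implement; no explicit training phase, because the model just stores the training data (lazy learner).
  • Cons: Prediction is computationally expensive on large datasets, because the distance to every stored point must be calculated; sensitive to feature scale, so features should be scaled first.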
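
The sensitivity to scale can be shown with a small sketch. It assumes min-max scaling, which is only one common choice (standardization is another), and the helper names min_max_scale and nearest_index as well as the rows of data are invented for illustration.

```python
import math


def min_max_scale(rows):
    """Rescale every column to the range [0, 1] (constant columns become 0)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def nearest_index(rows, query_index):
    """Index of the row closest to rows[query_index], excluding itself."""
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: math.dist(rows[i], rows[query_index]))


# Hypothetical rows: (age in years, income in rupees).
people = [
    (25, 50_000),  # 0: the query person
    (26, 58_000),  # 1: almost the same age, somewhat different income
    (60, 51_000),  # 2: very different age, almost the same income
    (40, 90_000),  # 3: different on both features
]

raw_neighbor = nearest_index(people, 0)
scaled_neighbor = nearest_index(min_max_scale(people), 0)
```

Without scaling, income dominates the distance and person 2 is picked as the nearest neighbor; after scaling both features to the same range, person 1 is the closest.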