Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Privacy Policy | Terms of Service | Contact Siksha Sarovar | About Siksha Sarovar

v4.0.9 · PWA
Siksha Sarovar logo
Siksha Sarovar
Your Learning Universe

Siksha Sarovar is a free e-learning platform for coding courses, BCA university notes and competitive exam preparation. Optional Google sign-in saves your learning progress across devices.

Initializing knowledge base…
Compiling modules 0%

8. Kernel PCA

Lesson 10 of 22 in the free Machine Learning II notes on Siksha Sarovar, written by Rohit Jangra.

8. Kernel PCA

Kernel PCA extends PCA to non-linearly separable data by first mapping inputs to a high-dimensional feature space using a kernel function, then applying PCA in that space — without ever explicitly computing the mapping (the kernel trick).

The Kernel Trick

The kernel function computes inner products in feature space implicitly: K(x_i, x_j) = phi(x_i)^T * phi(x_j)

This allows us to work in possibly infinite-dimensional feature spaces at the cost of only n x n kernel matrix computations.

Common Kernels

KernelFormulaUse Case
Linearx^T * ySame as standard PCA
Polynomial(x^T * y + c)^dPolynomial features
RBF (Gaussian)exp(-x-y^2 / (2*sigma^2))General non-linear
Sigmoidtanh(alpha x^T y + c)Neural network-like

Algorithm

  1. Compute centered kernel matrix K (n x n), K_ij = K(x_i, x_j)
  2. Center in feature space: K_c = K - 1_nK - K1_n + 1_nK1_n
  3. Eigendecompose: K_c alpha = n lambda * alpha
  4. Normalize eigenvectors: alpha_i <- alpha_i / sqrt(lambda_i)
  5. Project new point: z_k(x) = sum_i alpha_{i,k} * K(x_i, x)

Bandwidth Selection for RBF

The sigma hyperparameter (bandwidth) critically determines the feature space:

  • Large sigma: nearly linear (features barely separated)
  • Small sigma: every point is its own cluster (overfitting)
  • Rule of thumb: sigma = median pairwise distance / sqrt(2)

Common Pitfalls

  • Kernel matrix is n x n — expensive for large datasets (O(n^2) memory, O(n^3) time)
  • Choosing the right kernel and its hyperparameters requires cross-validation
  • Kernel PCA components are not easily interpretable in original feature space

Exam-Ready Summary

  • Kernel PCA: apply kernel trick to compute inner products in feature space, then run PCA
  • Requires only the kernel matrix K — never explicitly computes phi(x)
  • RBF kernel is the most popular and handles general non-linear structure
  • Memory: O(n^2) for kernel matrix — not scalable to very large datasets
  • The sigma (bandwidth) hyperparameter controls the smoothness of the kernel feature space