Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

9. Factor Analysis

Lesson 11 of 22 in the free Machine Learning II notes on Siksha Sarovar, written by Rohit Jangra.

Factor Analysis (FA) is a probabilistic generative model that explains observed variables as linear combinations of a small number of latent factors plus unique noise. It originated in psychometrics (Spearman, 1904). Unlike PCA, FA explicitly models measurement noise, giving each observed variable its own noise variance.

Generative Model

x = W * z + mu + epsilon

Where:

  • x: observed p-dimensional variable
  • z: latent factor (k-dimensional, Gaussian: z ~ N(0, I))
  • W: factor loading matrix (p x k)
  • epsilon: unique noise (epsilon ~ N(0, Psi), Psi diagonal)
  • mu: mean of observed variable
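To make the model concrete, here is a minimal NumPy sketch that draws samples from it. The dimensions, seed and parameter values are arbitrary choices for illustration, not values from these notes; the array psi holds the diagonal of Psi.

```python
import numpy as np

rng = np.random.default_rng(0)

p, k, n = 5, 2, 200_000            # observed dims, latent dims, sample count (illustrative)
W = rng.normal(size=(p, k))        # factor loading matrix (p x k)
mu = rng.normal(size=p)            # mean of the observed variable
psi = rng.uniform(0.2, 1.0, p)     # diagonal of Psi: one unique variance per variable

z = rng.normal(size=(n, k))                    # z ~ N(0, I), one row per sample
eps = rng.normal(size=(n, p)) * np.sqrt(psi)   # epsilon ~ N(0, Psi)
X = z @ W.T + mu + eps                         # x = W z + mu + epsilon

# The sample covariance should be close to W W^T + Psi for large n.
model_cov = W @ W.T + np.diag(psi)
sample_cov = np.cov(X, rowvar=False)
max_abs_err = np.abs(sample_cov - model_cov).max()
print("largest covariance deviation:", max_abs_err)
```

Because z and epsilon are independent, the covariance of x is the sum of the covariance of W z (which is W W^T) and the covariance of epsilon (which is Psi), and the printed deviation shrinks as n grows.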

FA vs PCA

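Aspect           | Factor Analysis                         | PCA
-----------------|-----------------------------------------|----------------------------------------------------
Model            | Probabilistic generative                | Deterministic projection
Noise            | Explicitly modeled (Psi)                | Not explicitly modeled
Uniqueness       | Loadings identified only up to rotation | Components unique up to sign (distinct eigenvalues)
Goal             | Explain correlations via factors        | Maximize explained variance
Interpretability | Factors often interpretable             | Components are orthogonal axes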
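The difference in goal and noise handling can be seen in a small numerical illustration. The sketch below assumes a hypothetical one-factor model in which four variables share a factor and a fifth variable is pure noise with a large variance; all numbers are made up for the example.

```python
import numpy as np

# Hypothetical one-factor model: variables 0-3 share a factor, variable 4 is pure noise.
w = np.array([0.9, 0.8, 0.7, 0.6, 0.0])     # factor loadings (p x 1, stored as a vector)
psi = np.array([0.3, 0.3, 0.3, 0.3, 5.0])   # unique variances; variable 4 is very noisy

Sigma = np.outer(w, w) + np.diag(psi)       # model covariance: W W^T + Psi

# PCA on this covariance: eigenvectors ordered by explained variance.
eigvals, eigvecs = np.linalg.eigh(Sigma)    # eigh returns eigenvalues in ascending order
top_pc = eigvecs[:, -1]                     # direction of maximum variance

print("largest eigenvalue:       ", eigvals[-1])
print("first principal component:", np.round(top_pc, 3))
print("factor loadings:          ", w)
```

The first principal component points along the noisy fifth variable because PCA chases total variance, while in the FA model that variance sits in Psi and the loading of the fifth variable is zero.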

Parameter Estimation

Parameters W and Psi are estimated by maximum likelihood using the EM algorithm:

  • E-step: Compute posterior of latent factors given observed data
  • M-step: Update W and Psi to maximize expected log-likelihood

The observed data covariance is modeled as: Sigma = W * W^T + Psi
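The two steps can be sketched in NumPy as follows. This is a minimal illustration of the standard EM updates for factor analysis, not a production implementation: the function name fit_factor_analysis, the fixed iteration count, the small random initialization and the floor on psi are all assumptions made for this example, and the data are simulated.

```python
import numpy as np

def fit_factor_analysis(X, k, n_iter=500, seed=0):
    """EM for factor analysis (illustrative sketch with a fixed iteration count)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    mu = X.mean(axis=0)                      # ML estimate of mu is the sample mean
    Xc = X - mu
    S = Xc.T @ Xc / n                        # sample covariance (divides by n)
    W = rng.normal(scale=0.1, size=(p, k))   # small random initial loadings
    psi = np.diag(S).copy()                  # initial unique variances

    for _ in range(n_iter):
        # E-step: the posterior of z given x is Gaussian with mean beta @ (x - mu).
        Sigma = W @ W.T + np.diag(psi)
        beta = np.linalg.solve(Sigma, W).T                # W^T Sigma^{-1}, shape (k, p)
        Ezz = np.eye(k) - beta @ W + beta @ S @ beta.T    # average of E[z z^T | x]

        # M-step: maximize the expected log-likelihood over W, then Psi.
        W = S @ beta.T @ np.linalg.inv(Ezz)
        psi = np.diag(S - W @ beta @ S)
        psi = np.maximum(psi, 1e-6)          # assumption: floor to keep variances positive
    return mu, W, psi

# Simulated data from a known two-factor model (values chosen for illustration).
rng = np.random.default_rng(1)
W_true = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
psi_true = np.array([0.3, 0.4, 0.5, 0.3, 0.4, 0.5])
n = 20_000
Z = rng.normal(size=(n, 2))
X = Z @ W_true.T + rng.normal(size=(n, 6)) * np.sqrt(psi_true)

mu_hat, W_hat, psi_hat = fit_factor_analysis(X, k=2)
S = np.cov(X, rowvar=False, bias=True)
fit_err = np.abs(W_hat @ W_hat.T + np.diag(psi_hat) - S).max()
print("estimated unique variances:", np.round(psi_hat, 2))
print("largest covariance misfit: ", fit_err)
```

The estimated loadings W_hat need not match W_true column by column, since any rotation of the factors gives the same covariance; the unique variances and the product W_hat W_hat^T are the quantities that can be compared directly.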

Rotation Methods

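FA is only identifiable up to orthogonal rotation: for any orthogonal matrix R, the rotated loadings W * R give the same covariance, because (W * R) * (W * R)^T = W * W^T. The likelihood cannot distinguish between these solutions, so a rotation is chosen after fitting to make the loadings easier to interpret:

  • Varimax: An orthogonal rotation that maximizes the variance of squared loadings (each factor loads heavily on few variables)
  • Oblimin: An oblique rotation that allows correlated factors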
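As an illustration of varimax, the sketch below uses the widely used SVD-based iteration. The function names, tolerance and example loadings are assumptions for this example; the loadings are a simple two-factor structure that has been deliberately mixed by a 30 degree rotation.

```python
import numpy as np

def varimax(W, max_iter=500, tol=1e-8):
    """Orthogonal varimax rotation via the common SVD-based iteration (sketch)."""
    p, k = W.shape
    R = np.eye(k)                                # current rotation matrix
    objective = 0.0
    for _ in range(max_iter):
        L = W @ R                                # currently rotated loadings
        target = L**3 - L * (L**2).mean(axis=0)  # ascent direction for the criterion
        U, s, Vt = np.linalg.svd(W.T @ target)
        R = U @ Vt                               # closest orthogonal matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break                                # no meaningful improvement
        objective = new_objective
    return W @ R, R

def varimax_criterion(L):
    """Sum over factors of the variance of squared loadings."""
    return (L**2).var(axis=0).sum()

# Simple structure (illustrative values), then mixed by a 30 degree rotation.
W_simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                     [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
W_mixed = W_simple @ Q

W_rot, R = varimax(W_mixed)
print("criterion before:", varimax_criterion(W_mixed))
print("criterion after: ", varimax_criterion(W_rot))
print(np.round(W_rot, 3))
```

The rotated loadings reproduce the same W * W^T as the mixed ones, so the fit to the data is unchanged; only the description of the factors becomes simpler. Columns may come back in a different order or with flipped signs.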

Common Pitfalls

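  • FA requires specifying k (the number of factors) in advance; choose it with a scree plot or an information criterion such as BIC
  • The EM algorithm may converge to a local optimum; run it from several random initializations and keep the fit with the highest likelihood
  • Heywood cases: estimated unique variances that are zero or negative, which often indicate model misspecification (for example, too many factors) or too little data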
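For the first pitfall, the numbers behind a scree plot are simply the eigenvalues of the sample correlation matrix in decreasing order. The sketch below computes them on simulated two-factor data (all values are made up for illustration). It also counts eigenvalues above 1, the Kaiser rule, which is a common heuristic that these notes do not cover and is shown here only as a rough cross-check.

```python
import numpy as np

# Simulated data from a two-factor model (illustrative values).
rng = np.random.default_rng(2)
W_true = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
psi_true = np.array([0.3, 0.4, 0.5, 0.3, 0.4, 0.5])
n = 5_000
Z = rng.normal(size=(n, 2))
X = Z @ W_true.T + rng.normal(size=(n, 6)) * np.sqrt(psi_true)

# Eigenvalues of the correlation matrix, largest first: the y-values of a scree plot.
corr = np.corrcoef(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
for i, ev in enumerate(eigvals, start=1):
    print(f"component {i}: eigenvalue {ev:.3f}")

# Kaiser rule (heuristic, not from these notes): count eigenvalues above 1.
n_above_one = int((eigvals > 1.0).sum())
print("eigenvalues above 1:", n_above_one)
```

Plotting eigvals against the component index gives the scree plot; the "elbow" where the values level off suggests a value for k, here after the second component.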

Exam-Ready Summary

  • Factor Analysis: generative model where observed = linear combination of latent factors + noise
  • Key difference from PCA: FA explicitly models unique noise per variable
  • FA assumes data covariance = W*W^T + Psi (low-rank + diagonal)
  • Parameters estimated by EM (maximum likelihood)
  • Rotation (varimax, oblimin) improves factor interpretability after fitting