Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Seaborn: Statistical Visualization

Lesson 28 of 37 in the free Data Science notes on Siksha Sarovar, written by Rohit Jangra.

Definition: Seaborn is a Python visualization library built on top of Matplotlib that provides a high-level interface for creating attractive, informative statistical graphics. It is designed to work seamlessly with Pandas DataFrames and makes complex plots easy with minimal code.

import seaborn as sns

---

Matplotlib vs Seaborn

| Feature | Matplotlib | Seaborn |
| --- | --- | --- |
| Level | Low-level (more code) | High-level (less code) |
| Aesthetics | Basic (needs customization) | Beautiful by default |
| Statistical Integration | Manual | Built-in (regression, distributions) |
| DataFrame Support | Basic | Native (pass column names directly) |
| Plot Variety | General-purpose | Statistical-focused |
| Customization | Extremely flexible | Moderate (uses Matplotlib underneath) |
| Use Case | Custom plots, subplots | Quick EDA, statistical analysis |
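
To make the comparison concrete, here is a minimal sketch that draws the same grouped scatter plot both ways. The DataFrame and its column names (hours, score, group) are made up for illustration and are not part of the original notes.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 58, 65, 70, 78, 85],
    "group": ["A", "B", "A", "B", "A", "B"],
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(10, 4))

# Matplotlib: loop over the groups and draw each one explicitly.
for name, part in df.groupby("group"):
    ax_mpl.scatter(part["hours"], part["score"], label=name)
ax_mpl.set_xlabel("hours")
ax_mpl.set_ylabel("score")
ax_mpl.legend(title="group")
ax_mpl.set_title("Matplotlib")

# Seaborn: pass column names; grouping, colors, axis labels and legend
# are derived from the DataFrame.
sns.scatterplot(data=df, x="hours", y="score", hue="group", ax=ax_sns)
ax_sns.set_title("Seaborn")

# Call plt.show() here when running as a script.
```

Both panels show the same data; the Seaborn call replaces the loop and the manual labelling, while the Matplotlib version exposes every step for fine-grained control.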

---

Types of Seaborn Plots

Relational Plots (Relationships between variables)

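| Plot | Function | Best For |
| --- | --- | --- |
| Scatter Plot | sns.scatterplot(data=df, x="x_col", y="y_col") | Relationship between two continuous variables |
| Line Plot | sns.lineplot(data=df, x="x_col", y="y_col") | Trends over time |
| Relplot | sns.relplot(data=df, x="x_col", y="y_col", hue="group_col") | Figure-level scatter or line plot with faceting |

Note: in seaborn 0.12 and later, every argument except data must be passed by keyword, so x and y are written as x="x_col", y="y_col" rather than positionally.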

Distribution Plots

| Plot | Function | Best For |
| --- | --- | --- |
| Histogram | sns.histplot(data, bins=30) | Distribution shape |
| KDE Plot | sns.kdeplot(data) | Smooth density estimate |
| Displot | sns.displot(data, kde=True) | Figure-level histogram with a KDE overlay |
| Box Plot | sns.boxplot(data=df, x="cat_col", y="num_col") | Distribution summary with outliers |
| Violin Plot | sns.violinplot(data=df, x="cat_col", y="num_col") | Distribution shape + density |

Categorical Plots

| Plot | Function | Best For |
| --- | --- | --- |
| Bar Plot | sns.barplot(data=df, x="cat_col", y="num_col") | Mean of a variable per category |
| Count Plot | sns.countplot(data=df, x="cat_col") | Count of observations per category |
| Strip Plot | sns.stripplot(data=df, x="cat_col", y="num_col") | Individual data points by category |
| Swarm Plot | sns.swarmplot(data=df, x="cat_col", y="num_col") | Non-overlapping strip plot |
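
A small sketch of the categorical functions, assuming a hypothetical table of marks per department (the data is invented for the example):

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical marks per department, invented for illustration only.
df = pd.DataFrame({
    "dept": ["CS", "CS", "CS", "IT", "IT", "EE", "EE"],
    "marks": [70, 80, 90, 60, 65, 75, 85],
})

fig, (ax_bar, ax_count, ax_strip) = plt.subplots(1, 3, figsize=(12, 4))

# Bar plot: bar height is the mean of marks for each department.
sns.barplot(data=df, x="dept", y="marks", ax=ax_bar)

# Count plot: bar height is the number of rows for each department.
sns.countplot(data=df, x="dept", ax=ax_count)

# Strip plot: every individual observation, grouped by department.
sns.stripplot(data=df, x="dept", y="marks", ax=ax_strip)

# Call plt.show() here when running as a script.
```

Note the difference between the first two panels: barplot aggregates a numeric column, while countplot needs only the categorical column.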

Matrix Plots

| Plot | Function | Best For |
| --- | --- | --- |
| Heatmap | sns.heatmap(corr_matrix, annot=True) | Correlation matrix visualization |
| Cluster Map | sns.clustermap(data) | Hierarchical clustering heatmap |

Regression Plots

| Plot | Function | Best For |
| --- | --- | --- |
| Reg Plot | sns.regplot(data=df, x="x_col", y="y_col") | Scatter + regression line |
| LM Plot | sns.lmplot(data=df, x="x_col", y="y_col", hue="group_col") | Regression with faceting |
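
The two regression functions can be sketched on synthetic data as follows. The linear relationship, the noise level and the batch column are assumptions chosen only so that there is something to fit.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: score grows roughly linearly with hours, plus noise.
rng = np.random.default_rng(1)
hours = np.linspace(0, 10, 50)
df = pd.DataFrame({
    "hours": hours,
    "score": 40 + 5 * hours + rng.normal(0, 4, 50),
    "batch": ["morning", "evening"] * 25,  # hypothetical grouping column
})

# Axes-level: scatter points plus a fitted line with a confidence band.
ax = sns.regplot(data=df, x="hours", y="score")

# Figure-level: the same fit, repeated in one panel per batch.
g = sns.lmplot(data=df, x="hours", y="score", col="batch")
```

regplot returns a single Axes, whereas lmplot returns a FacetGrid, which is what makes the col, row and hue faceting possible.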

---

Key Seaborn Features

1. Hue (Color Grouping)

Add a third variable using color: sns.scatterplot(x='Age', y='Score', hue='Gender', data=df)
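
A runnable version of the call above might look like this. The column names Age, Score and Gender come from the example in the text; the values are invented.

```python
import pandas as pd
import seaborn as sns

# Hypothetical student records, invented for illustration only.
df = pd.DataFrame({
    "Age": [18, 19, 20, 21, 22, 23],
    "Score": [55, 62, 58, 71, 66, 80],
    "Gender": ["F", "M", "F", "M", "F", "M"],
})

# Each distinct value in the hue column gets its own color and legend entry.
ax = sns.scatterplot(data=df, x="Age", y="Score", hue="Gender")

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```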

2. Built-in Themes

sns.set_theme(style="darkgrid")   # darkgrid, whitegrid, dark, white, ticks
sns.set_palette("pastel")         # Color palette
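
As a small sketch of how themes behave: set_theme changes Matplotlib's global settings, while axes_style can be used as a context manager to switch style for a single figure. The plotted numbers are arbitrary.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Global theme: affects every figure created after this call.
sns.set_theme(style="whitegrid", palette="pastel")
grid_on = plt.rcParams["axes.grid"]  # whitegrid switches the grid on

# Temporary style: only figures created inside the block use "dark".
with sns.axes_style("dark"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])

# Outside the block, the global whitegrid theme applies again.
print(grid_on, plt.rcParams["axes.grid"])
```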

3. Color Palettes

| Palette | Type | Best For |
| --- | --- | --- |
| "deep" | Qualitative | Default, distinct categories |
| "pastel" | Qualitative | Soft, presentation-friendly |
| "coolwarm" | Diverging | Correlation heatmaps |
| "viridis" | Sequential | Ordered data |
| "Set2" | Qualitative | Colorblind-friendly |
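
Palettes can also be inspected directly, which helps when choosing one. The sketch below uses sns.color_palette to fetch a few discrete colors and a continuous colormap; the choice of three colors is arbitrary.

```python
import seaborn as sns

# Discrete palette: a list of RGB tuples, here the first three "Set2" colors.
colors = sns.color_palette("Set2", 3)
print(colors.as_hex())

# Continuous colormaps for sequential and diverging data.
sequential = sns.color_palette("viridis", as_cmap=True)
diverging = sns.color_palette("coolwarm", as_cmap=True)

# A colormap maps a number between 0 and 1 to an RGBA color.
print(sequential(0.5))
```

Seaborn also ships a palette named "colorblind", which is another option when accessibility matters.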

---

Correlation Heatmap (Most Common in EDA)

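import matplotlib.pyplot as plt
import seaborn as sns

# df is assumed to be an existing pandas DataFrame
corr = df.corr(numeric_only=True)   # Correlation matrix of the numeric columns (pandas 1.5+)
sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
plt.title("Feature Correlation Heatmap")
plt.show()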

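This is one of the most widely used visualizations in Exploratory Data Analysis. Each cell shows the correlation coefficient between two numeric features (Pearson by default in df.corr()): values near +1 indicate a strong positive relationship, values near -1 a strong negative one, and values near 0 little linear relationship.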
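
Reading a heatmap by eye gets harder as the number of features grows, so it can help to list the strongly correlated pairs as well. The following is a minimal sketch on synthetic data; the column names and the 0.9 threshold are assumptions for illustration, not a recommended cutoff.

```python
from itertools import combinations

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic features: two of them carry the same information.
rng = np.random.default_rng(42)
height_cm = rng.normal(170, 10, 200)
df = pd.DataFrame({
    "height_cm": height_cm,
    "height_in": height_cm / 2.54,       # redundant copy in other units
    "noise": rng.normal(0, 1, 200),      # unrelated to height
})

corr = df.corr()  # all columns are numeric here
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)

# Collect each pair of distinct features whose |correlation| exceeds 0.9.
strong = {
    (a, b): corr.loc[a, b]
    for a, b in combinations(corr.columns, 2)
    if abs(corr.loc[a, b]) > 0.9
}
print(strong)
```

Here only the two height columns are flagged, which suggests that one of them could be dropped without losing information.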

---

Seaborn in the Data Science Workflow

| Stage | How Seaborn Helps |
| --- | --- |
| EDA | Quickly visualize distributions, correlations, outliers |
| Feature Selection | Heatmaps reveal correlated features |
| Model Evaluation | Plot predicted vs actual values |
| Presentation | Beautiful plots for non-technical stakeholders |
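
For the model evaluation stage, a predicted-vs-actual plot could be sketched as below. No real model is involved: the predicted values are simulated by adding noise to the actual ones, purely as a stand-in for model output.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Simulated evaluation results (stand-in for a real model's predictions).
rng = np.random.default_rng(7)
actual = rng.uniform(0, 100, 80)
results = pd.DataFrame({
    "actual": actual,
    "predicted": actual + rng.normal(0, 8, 80),
})

ax = sns.scatterplot(data=results, x="actual", y="predicted")

# Dashed diagonal: points on this line are perfect predictions.
low = results.min().min()
high = results.max().max()
ax.plot([low, high], [low, high], linestyle="--", color="gray")
```

Points far from the dashed line are the cases where the prediction misses by the most.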

Summary

  • Seaborn builds on Matplotlib to provide beautiful statistical visualizations.
  • It integrates natively with Pandas DataFrames.
  • Key plots: scatter, box, violin, heatmap, pairplot, and regression plots.
  • The hue parameter adds a third dimension (usually categorical) using color.
  • Correlation heatmaps are essential for feature selection in machine learning.