Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.




Data Visualisation and Analytics — Free Notes & Tutorial

Free DVA (Data Visualization & Analytics) notes for BCA with PYQ papers at SikshaSarovar.

This Data Visualisation and Analytics course is part of Siksha Sarovar and is 100% free for students in India — no sign-up required to read. It contains 32 structured lessons with examples, and pairs with our free online compiler and AI tutor.

What you will learn

  • Data visualization
  • Analytics
  • Charts
  • Dashboards

Course content (32 lessons)

  1. 1.1 Unit 1 Overview — Unit 1: Overview of Data Visualisation and Analytics This unit introduces the fundamentals of data visualization, its importance, and various techniques to represent data…
  2. 1.2 Analytics Fundamentals — Analytics: Basic Nomenclature 1. What is Analytics? Analytics is the systematic, computational process of collecting, cleaning, analyzing, and interpreting data to discover useful…
  3. 1.3 Analytics Process Model — Analytics Process Model & Professional Roles 1. The Analytics Process Model The analytics process is a structured, iterative approach to solving problems using data. Study Deep:…
  4. 1.4 Analytical Models — Analytical Models: Requirements & Types 1. What is an Analytical Model? An analytical model is a mathematical or statistical representation of a real-world process. Study Deep:…
  5. 1.5 Data Collection & Sampling — Data Collection, Sampling & Distributions 1. Data Collection Sources Data collection is the foundation. The quality and type of data collected determines everything downstream.…
  6. 1.6 Data Quality & Outliers — Data Quality: Missing Values & Outliers 1. Missing Values Handling missing data is critical because most statistical methods and machine learning algorithms cannot process…
  7. 1.7 Standardization — Standardization (Feature Scaling) 1. Why Scale Data? Variables often have different units and magnitudes. Scaling puts all features on a level playing field. Study Deep: When…
  8. 1.8 Categorization & Segmentation — Categorization vs. Segmentation 1. Categorization Categorization is the process of assigning data points into predefined, manually specified groups based on explicit rules. Study…
  9. 2.1 Unit 2 Overview — Unit 2: Statistical Methods & Hypothesis Testing This unit covers the fundamental statistical techniques required for data analysis, tailored for BCA and computer science…
  10. 2.2 Probability Distributions in Depth — Probability Distributions: The Theory of Data Patterns 1. Mathematical Foundation A probability distribution is a mathematical function that describes the likelihood of obtaining…
  11. 2.3 Advanced Sampling Theory — Sampling Theory: Bridging Sample and Population 1. The Core Objective In data analytics, we rarely have access to the entire Population (N). Instead, we take a Sample (n) to…
  12. 2.4 Rigorous Hypothesis Testing — Hypothesis Testing: The Decision Framework 1. The Philosophical Framework Hypothesis testing operates like a criminal trial: "Innocent until proven guilty." Study Deep:…
  13. 2.5 Parametric Tests: Deep Dive — Parametric Tests: Z and T Distributions 1. What makes a test "Parametric"? Parametric tests assume that the underlying population follows a specific probability distribution…
  14. 2.6 The Mathematics of p-Values — p-Values: Evidence and Interpretation 1. Mathematical Definition of a p-Value The p-value is the exact probability of obtaining a test statistic at least as extreme as the one…
  15. 2.7 Confidence Intervals & Precision — Confidence Intervals: Quantifying Uncertainty 1. Point Estimates vs. Interval Estimates - Point Estimate: A single number calculated from a sample (e.g., sample mean x̄ = 45 ). It…
  16. 2.8 Non-Parametric: Chi-Square Test — Chi-Square (χ²): Analyzing Categorical Data 1. Parametric vs. Non-Parametric When data violates normal distribution assumptions, or when dealing with nominal/ordinal categorical…
  17. 2.9 Correlation & Linear Regression — Correlation & Regression: Modeling Relationships 1. Pearson Correlation Coefficient (r) Quantifies the linear relationship between two continuous variables. Study Deep: Adjusted…
  18. 2.10 Analysis of Variance (ANOVA) — ANOVA: Comparing Multiple Groups Rigorously 1. The Problem with Multiple T-Tests If testing 3 algorithms (A, B, C), running 3 T-tests results in Family-wise Error Rate Inflation.…
  19. 2.11 Statistical Paradoxes in Analytics — Statistical Paradoxes: When Math Defies Logic 1. Simpson’s Paradox A phenomenon where a trend appears in isolated subgroups of data, but disappears or reverses when the groups are…
  20. 3.1 Unit 3 Overview — Unit 3: Data Visualization with Python This unit focuses on the practical application of visualization libraries. You will learn to create static, animated, and interactive plots…
  21. 3.2 Matplotlib Basics — Data Visualization with Matplotlib: The Foundation 1. Introduction to Architecture Matplotlib is the foundational library for visualization in Python. Study Deep: Tufte's Data-Ink…
  22. 3.3 Advanced Matplotlib — Advanced Matplotlib: Styling, Subplots & Layouts 1. Subplots: One Figure, Multiple Graphs Often you want to compare different views side-by-side. We use plt.subplots() . Parameter…
  23. 3.4 Seaborn: Interface & Distributions — Study Deep: Kernel Density Estimation (KDE) A Histogram is sensitive to "bin size"—change the bin width, and the shape changes. The Solution: KDE . It smooths the data using a…
  24. 3.5 Seaborn: Categorical & Styling — Seaborn: Categorical Data & Aesthetics 1. Visualizing Categorical Data ( catplot ) When one variable is a category (e.g., "Day of Week") and the other is numerical (e.g., "Total…
  25. 4.1 Unit 4 Overview — Unit 4: GUI Programming & Database Access This final unit bridges the gap between analysis and application. You will learn to build user-friendly interfaces using Tkinter and…
  26. 4.2 GUI Programming with Tkinter — GUI Programming: Creating User Interfaces with Tkinter 1. Introduction to GUI (Graphical User Interface) A GUI allows users to interact with a program using visual elements like…
  27. 4.3 Advanced GUI Widgets — Advanced Tkinter: Selection, Menus, and Dialogs 1. Tkinter Variable Types Tkinter uses special variable classes to track widget state. They automatically update the GUI when…
  28. 4.4 Database Connectivity & SQL — Database Access in Python: The DB-API 1. Introduction to DB-API Python provides a standard interface called DB-API 2.0 (PEP 249) for interacting with databases. This means the…
  29. 4.5 CRUD Operations — Implementing CRUD in Python 1. Creating a Table Column Constraints: Constraint Meaning Example :--- :--- :--- PRIMARY KEY Unique identifier for each row id INTEGER PRIMARY KEY…
  30. MID Term Important Questions — MID Term Important Questions Section A – Short Answer Questions 1. Define Data Analytics. 2. What do you mean by Basic Nomenclature in Analytics? 3. Explain the Analytics Process…
  31. PYQ: End Term June 2024
  32. PYQ: End Term May/June 2025

1.1 Unit 1 Overview

Unit 1: Overview of Data Visualisation and Analytics

This unit introduces the fundamentals of data visualization, its importance, and various techniques to represent data effectively. We will explore how raw data is transformed into meaningful insights through a systematic process. By the end of this unit, you will understand the full analytics pipeline — from data collection to decision-making — and the mathematical tools used at each stage.

Topics Covered in This Unit

| # | Topic | Description | Key Concepts |
| :--- | :--- | :--- | :--- |
| 1.2 | Analytics Fundamentals | Core terminology, types of data, and the four analytics types | Structured vs. Unstructured Data, Descriptive to Prescriptive Analytics |
| 1.3 | Analytics Process Model | Step-by-step data analysis pipeline and professional roles | CRISP-DM, Data Engineer vs. Data Scientist |
| 1.4 | Analytical Models | Mathematical models for prediction and classification | Classification, Regression, Clustering, Time-Series |
| 1.5 | Data Collection & Sampling | How to gather representative data | Probability vs. Non-Probability Sampling, Central Limit Theorem |
| 1.6 | Data Quality & Outliers | Handling imperfect data | MCAR/MNAR Missingness, Z-Score & IQR Outlier Detection |
| 1.7 | Standardization | Scaling features for fair comparison | Min-Max Normalization, Z-Score Standardization, Robust Scaling |
| 1.8 | Categorization & Segmentation | Grouping data: rule-based vs. data-driven | K-Means Clustering, Silhouette Score |

Visual Overview

The visual overview below summarizes the key concepts covered in this unit, including the analytics process, types of data, and the role of visualization in decision-making.

(Refer to the image below for a structural breakdown)

1.2 Analytics Fundamentals

Analytics: Basic Nomenclature

1. What is Analytics?

Analytics is the systematic, computational process of collecting, cleaning, analyzing, and interpreting data to discover useful patterns, trends, and insights that help in decision-making. It transforms raw data into actionable intelligence using statistical methods, algorithms, and domain knowledge.

Study Deep: The DIKW Pyramid Logic

The DIKW Pyramid (Data, Information, Knowledge, Wisdom) represents the structural hierarchy of how we process raw facts into strategic decisions.

  1. Data: The raw, atomic facts (e.g., "102").
  2. Information: Data with context (e.g., "102 is the temperature in Fahrenheit").
  3. Knowledge: Information with experience (e.g., "102°F means the patient has a high fever").
  4. Wisdom: Knowledge with judgment (e.g., "Administer paracetamol and monitor the patient").

2. Data vs. Information vs. Knowledge

Understanding this hierarchy is fundamental:

| Concept | Definition | Example | Characteristics |
| :--- | :--- | :--- | :--- |
| Data | Raw, unprocessed facts and figures without context | 45, "Red", 12-07-2025 | Objective, unorganized, meaningless alone |
| Information | Data that has been processed, organized, and given context | "The red car was sold on 12-07-2025 for $45,000" | Contextual, organized, answers Who/What/When |
| Knowledge | Information combined with experience and judgment | "Red cars sell 20% faster in summer; stock more for Q2" | Actionable, experience-driven, answers How/Why |
| Wisdom | Applying knowledge ethically and strategically | "We should focus marketing on red cars in spring to maximize summer sales" | Strategic, forward-looking, answers "What's best?" |

This hierarchy is known as the DIKW Pyramid (Data → Information → Knowledge → Wisdom).

3. Types of Data

Data can be classified along multiple dimensions. The three foundational categories are:

| Feature | Structured Data | Unstructured Data | Semi-Structured Data |
| :--- | :--- | :--- | :--- |
| Format | Highly organized, fixed schema | No predefined format | Partially organized (tags/markers) |
| Storage | Relational Databases (SQL), Spreadsheets | Data Lakes, NoSQL, File Systems | JSON, XML, Email (header + body) |
| Examples | Student records, bank transactions, inventory | Emails, social media posts, videos, images | JSON API responses, HTML pages, log files |
| Ease of Analysis | Easy: direct queries with SQL | Difficult: requires NLP, Computer Vision | Moderate: requires parsing |
| % of All Data | ~20% | ~80% | Varies |
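
The difference in "Ease of Analysis" between the three categories can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the student names, marks, and field names are invented for the example and are not taken from any real dataset.

```python
import csv
import io
import json
import re

# Structured: fixed schema, every row has the same columns.
structured = "roll_no,name,marks\n1,Asha,82\n2,Ravi,74\n"
rows = list(csv.DictReader(io.StringIO(structured)))
average_marks = sum(int(row["marks"]) for row in rows) / len(rows)

# Semi-structured: self-describing keys, but fields may be missing or nested.
semi_structured = (
    '[{"name": "Asha", "marks": 82},'
    ' {"name": "Ravi", "contact": {"city": "Rohtak"}}]'
)
records = json.loads(semi_structured)
marks_found = [rec["marks"] for rec in records if "marks" in rec]

# Unstructured: free text with no schema; values are extracted heuristically.
unstructured = "Asha scored 82 in the test, while Ravi got 74."
numbers_in_text = [int(n) for n in re.findall(r"\d+", unstructured)]

print(average_marks)    # 78.0
print(marks_found)      # [82]
print(numbers_in_text)  # [82, 74]
```

The structured input can be queried directly by column name, the semi-structured input needs a check for missing keys, and the free text needs a pattern that only guesses which numbers are marks.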

Data can also be classified by measurement scale:

  • Nominal: Categories without order (e.g., Color: Red, Blue, Green).
  • Ordinal: Categories with a meaningful order but no defined, equal spacing between levels (e.g., Rating: Low, Medium, High).
  • Interval: Numeric with equal intervals but no true zero (e.g., Temperature in °C: 0°C ≠ "no heat").
  • Ratio: Numeric with equal intervals AND a true zero (e.g., Weight: 0 kg = no weight).
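
One practical consequence of the measurement scale is which summary statistics are meaningful. The sketch below uses Python's standard `statistics` module on small made-up samples; the values and the `levels` ordering are assumptions chosen for illustration.

```python
from statistics import mean, mode

# Nominal: only counting and the mode are meaningful.
colors = ["Red", "Blue", "Red", "Green"]
most_common_color = mode(colors)

# Ordinal: order is meaningful, so a median level makes sense; a mean does not.
levels = ["Low", "Medium", "High"]  # defines the ordering of the categories
ratings = ["Low", "High", "Medium", "High", "Medium"]
ranks = sorted(levels.index(r) for r in ratings)
median_rating = levels[ranks[len(ranks) // 2]]  # middle value of an odd-length list

# Interval: differences are meaningful, ratios are not (0 °C is not "no heat").
temps_c = [10.0, 20.0, 30.0]
mean_temp = mean(temps_c)
temp_difference = temps_c[1] - temps_c[0]

# Ratio: a true zero makes ratios meaningful ("twice as heavy").
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_color)  # Red
print(median_rating)      # Medium
print(mean_temp)          # 20.0
print(temp_difference)    # 10.0
print(weight_ratio)       # 2.0
```

Note that saying 20 °C is "twice as hot" as 10 °C would be wrong, while saying 80 kg is twice as heavy as 40 kg is valid.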

4. The Four Types of Analytics

Analytics is categorized into four types, progressing in both complexity and business value:

| Type | Core Question | Techniques | Example | Value Level |
| :--- | :--- | :--- | :--- | :--- |
| Descriptive | What happened? | Averages, percentages, dashboards, charts | "Sales dropped by 10% last month" | Low (Hindsight) |
| Diagnostic | Why did it happen? | Drill-down, data discovery, correlations, root cause analysis | "Sales dropped because a competitor launched a cheaper product" | Medium (Insight) |
| Predictive | What is likely to happen? | Regression, forecasting, ML models, time-series analysis | "Sales are likely to drop another 5% next month" | High (Foresight) |
| Prescriptive | What should we do? | Optimization, simulation, decision trees, A/B testing | "Lower prices by 15% to regain market share" | Very High (Action) |

Analytics Maturity Model: Most organizations start at Descriptive and progressively adopt more advanced types. Only ~3% of enterprises fully leverage Prescriptive Analytics.
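
To make the four questions concrete, here is a minimal sketch that runs each type of analytics on the same tiny sales series. All figures are hypothetical, and the demand response used in the prescriptive step (each 1% price cut lifts units sold by 2%) is an assumption made purely for illustration, not a real elasticity.

```python
from statistics import mean

# Hypothetical monthly figures, invented for illustration only.
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 98.0, 97.0, 95.0, 94.0, 90.0]            # units sold
competitor_price = [50.0, 49.0, 48.0, 47.0, 46.0, 42.0]  # rival's price


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Descriptive: what happened? Percentage change in the latest month.
last_change_pct = (sales[-1] - sales[-2]) / sales[-2] * 100

# Diagnostic: why did it happen? Check how sales move with the rival's price.
r = pearson_r(competitor_price, sales)

# Predictive: what is likely to happen? Extend the fitted trend to month 7.
slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

# Prescriptive: what should we do? Compare candidate price cuts under the
# ASSUMED demand response and pick the one with the highest simulated revenue.
BASE_PRICE = 100.0
candidate_cuts_pct = [0, 5, 10, 15]


def simulated_revenue(cut_pct):
    units = forecast_month_7 * (1 + 0.02 * cut_pct)
    return BASE_PRICE * (1 - cut_pct / 100) * units


best_cut_pct = max(candidate_cuts_pct, key=simulated_revenue)

print(f"Descriptive: sales changed {last_change_pct:.1f}% last month")
print(f"Diagnostic: correlation with competitor price r = {r:.2f}")
print(f"Predictive: forecast for month 7 = {forecast_month_7:.1f} units")
print(f"Prescriptive: best simulated price cut = {best_cut_pct}%")
```

A high correlation in the diagnostic step only suggests a possible cause; confirming it would need drill-down or root cause analysis, as the Techniques column indicates.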

5. Key Terms Glossary

| Term | Definition | Example |
| :--- | :--- | :--- |
| Dataset | A collection of related data organized in rows and columns | A table of student marks |
| Variable (Feature) | A characteristic that can vary across observations | Age, Height, Income |
| Observation (Record) | A single row in a dataset representing one entity | One student's complete data |
| Insight | A valuable, actionable conclusion drawn from analysis | "Customers buy more on weekends" |
| KPI (Key Performance Indicator) | A measurable value that shows progress toward a goal | Monthly Revenue, Customer Churn Rate |
| Metric | A quantifiable measure used to track performance | Average Order Value, Click-Through Rate |
| Dimension | A categorical attribute used to slice data | Region, Product Category, Time Period |
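
The glossary terms map naturally onto a small piece of code. The sketch below is one possible illustration in plain Python; the orders, the `TARGET_AOV` threshold, and the helper name `average_order_value` are all hypothetical.

```python
from collections import defaultdict

# Dataset: a collection of observations (rows) sharing the same variables.
dataset = [
    {"order_id": 1, "region": "North", "amount": 200.0},
    {"order_id": 2, "region": "South", "amount": 100.0},
    {"order_id": 3, "region": "North", "amount": 400.0},
    {"order_id": 4, "region": "South", "amount": 300.0},
]

observation = dataset[0]       # Observation: one row, one entity
variables = list(observation)  # Variables: the characteristics recorded per row


def average_order_value(rows):
    """Metric: a quantifiable measure, here the mean order amount."""
    return sum(row["amount"] for row in rows) / len(rows)


overall_aov = average_order_value(dataset)

# Dimension: a categorical attribute ("region") used to slice the metric.
rows_by_region = defaultdict(list)
for row in dataset:
    rows_by_region[row["region"]].append(row)
aov_by_region = {
    region: average_order_value(rows) for region, rows in rows_by_region.items()
}

# KPI: a metric tracked against a goal (the target is a hypothetical value).
TARGET_AOV = 240.0
kpi_met = overall_aov >= TARGET_AOV

print(variables)      # ['order_id', 'region', 'amount']
print(overall_aov)    # 250.0
print(aov_by_region)  # {'North': 300.0, 'South': 200.0}
print(kpi_met)        # True
```

An insight would be the conclusion a person draws from these numbers, for example that North orders are larger on average and may deserve more attention.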

Frequently asked questions

Is the Data Visualisation and Analytics course really free?

Yes. The entire Data Visualisation and Analytics course on Siksha Sarovar is free to read with no account required. You can optionally sign in with Google to save your progress.

Do I get a certificate for Data Visualisation and Analytics?

Yes — finish the lessons and pass the quiz to earn a free, verifiable certificate you can share on LinkedIn or with recruiters.

Can I run code while learning?

Yes. The built-in online compiler runs C, C++, Python, Java, PHP, JavaScript, C# and SQL directly in your browser — no installation needed.