Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

2. Introduction to Data Science

Lesson 2 of 21 in the free Machine Learning notes on Siksha Sarovar, written by Rohit Jangra.

What is Data Science?

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from noisy, structured, and unstructured data. It brings together statistics, data analysis, informatics, and their related methods to understand and analyze real-world phenomena using data.

Data Science vs Machine Learning

Although the two terms are often used interchangeably, they are distinct:

  • Data Science is the broad umbrella that covers the entire data lifecycle (collection, cleaning, analysis, visualization, and modeling).
  • Machine Learning is one tool within data science, focused specifically on building models that learn patterns from data in order to make predictions.
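
To make the distinction concrete, here is a minimal sketch in Python. The numbers (hours studied versus exam score) and the function names `clean`, `fit_line`, and `predict` are made up for illustration. Cleaning and summarizing the data stand for general data science work, while fitting a least-squares line is the machine learning step that produces a predictive model.

```python
from statistics import mean

# Hypothetical raw records: (hours_studied, exam_score); None marks a missing value.
raw = [(1, 52), (2, 55), (None, 60), (3, 61), (4, 64), (5, None), (6, 70)]


def clean(records):
    """Data science step: drop rows that have a missing value."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Machine learning step: ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


data = clean(raw)
slope, intercept = fit_line(data)


def predict(hours):
    """Use the fitted line to estimate a score for an unseen input."""
    return slope * hours + intercept


print(f"rows kept after cleaning: {len(data)}")
print(f"mean score (descriptive summary): {mean(y for _, y in data):.1f}")
print(f"predicted score for 7 hours (model output): {predict(7):.1f}")
```

Only `fit_line` and `predict` are machine learning here; everything around them (collecting, cleaning, summarizing, reporting) belongs to the wider data science workflow.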

Applications of Data Science

Data Science is applied across many industries:

Industry      | Application             | Example
Healthcare    | Disease Prediction      | Predicting diabetes risk based on patient history.
E-Commerce    | Recommendation Systems  | "Customers who bought this also bought..." (Amazon).
Finance       | Fraud Detection         | Identifying unusual credit card transactions in real-time.
Logistics     | Route Optimization      | Optimizing delivery routes to save fuel and time (UPS/FedEx).
Entertainment | Content Personalization | Netflix maximizing watch time by suggesting relevant shows.
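
As an illustration of the recommendation example, the sketch below counts how often items appear together in the same order and suggests the most frequent companions. It is a toy co-occurrence approach with invented order data and hypothetical function names (`build_co_counts`, `also_bought`). It is not a description of how Amazon or any other real system works.

```python
from collections import Counter
from itertools import permutations

# Hypothetical order history: each tuple is one customer's basket.
orders = [
    ("laptop", "mouse", "laptop bag"),
    ("laptop", "mouse"),
    ("laptop", "mouse", "mouse pad"),
    ("laptop", "laptop bag"),
    ("mouse", "mouse pad"),
]


def build_co_counts(baskets):
    """Count, for every item, how often each other item shares a basket with it."""
    co_counts = {}
    for basket in baskets:
        unique_items = sorted(set(basket))  # ignore duplicates within one basket
        for item, other in permutations(unique_items, 2):
            co_counts.setdefault(item, Counter())[other] += 1
    return co_counts


def also_bought(item, co_counts, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    return [other for other, _ in co_counts.get(item, Counter()).most_common(top_n)]


co_counts = build_co_counts(orders)
print(also_bought("laptop", co_counts))     # companions of "laptop"
print(also_bought("mouse pad", co_counts))  # companions of "mouse pad"
```

Production recommenders typically also account for item popularity, user history, and much larger data, so this should be read only as the core idea behind "also bought" lists.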

The Data Science Lifecycle

  1. Capture: Data Acquisition, Data Entry, Signal Reception.
  2. Maintain: Data Warehousing, Data Cleansing, Staging, Processing.
  3. Process: Data Mining, Clustering/Classification, Data Modeling.
  4. Analyze: Exploratory/Confirmatory Analysis, Predictive Analysis, Regression, Text Mining.
  5. Communicate: Data Reporting, Data Visualization, Business Intelligence.
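
The five stages above can be sketched as a small pipeline in Python. This is a simplified illustration: the temperature data is invented, the function names simply mirror the stage names, and each function stands in for what would be a much larger activity in practice (for example, grouping by city stands in for the "Process" stage rather than real data mining or modeling).

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw export; a real project might read from a file, API, or sensor.
RAW_CSV = """city,temperature_c
Delhi,31.5
Delhi,
Mumbai,29.0
Delhi,33.5
Mumbai,30.0
"""


def capture(text):
    """1. Capture: acquire the raw records."""
    return list(csv.DictReader(io.StringIO(text)))


def maintain(rows):
    """2. Maintain: cleanse the data (drop empty readings, convert types)."""
    return [
        {"city": row["city"], "temperature_c": float(row["temperature_c"])}
        for row in rows
        if row["temperature_c"]
    ]


def process(rows):
    """3. Process: organize the readings by city."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city"]].append(row["temperature_c"])
    return dict(groups)


def analyze(groups):
    """4. Analyze: compute the mean temperature per city."""
    return {city: mean(values) for city, values in groups.items()}


def communicate(summary):
    """5. Communicate: turn the results into a short text report."""
    return "\n".join(f"{city}: {avg:.1f} C" for city, avg in sorted(summary.items()))


report = communicate(analyze(process(maintain(capture(RAW_CSV)))))
print(report)
```

Keeping each stage as its own function makes it easy to swap one out, such as replacing the text report with a chart, without touching the others.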