1.3 Analytics Process Model

Lesson 3 of 32 in the free Data Visualisation and Analytics notes on Siksha Sarovar, written by Rohit Jangra.

Analytics Process Model & Professional Roles

Study Deep: The "Cleaning" Bottleneck

In professional analytics, the most important phase is often the most tedious: Data Pre-processing.

  • The 80/20 Rule: A commonly cited rule of thumb is that data professionals spend about 80% of their time cleaning and organizing data, and only about 20% on actual analysis or modeling.
  • Why? Real-world data is "dirty": it has missing values, inconsistent formats, and human entry errors. A model trained on bad data produces bad results (GIGO: Garbage In, Garbage Out).
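The three kinds of "dirt" named above can be illustrated with a minimal sketch using only the Python standard library. The records, the field names and the `clean` helper are all hypothetical, and filling missing ages with the median is just one possible strategy, not a recommendation from these notes.

```python
from statistics import median

# Hypothetical raw records showing the problems named above:
# inconsistent formats, a missing value, and a duplicated entry.
raw_records = [
    {"name": " Asha ", "city": "delhi", "age": "21"},
    {"name": "Ravi", "city": "DELHI", "age": ""},      # missing age
    {"name": "Meena", "city": "Mumbai ", "age": "35"},
    {"name": "asha", "city": "Delhi", "age": "21"},    # same person as row 1
]


def clean(records):
    """Normalize text fields, drop duplicates, and fill missing ages."""
    seen = set()
    cleaned = []
    for row in records:
        # Inconsistent formats: trim whitespace and unify capitalization.
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        # Missing values: an empty string becomes None for now.
        age = int(row["age"]) if row["age"].strip() else None
        # Duplicates: skip rows that match an earlier row after normalization.
        key = (name, city, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": age})

    # Fill missing ages with the median of the known ages (an assumption).
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill_value = median(known_ages) if known_ages else None
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill_value
    return cleaned


cleaned_records = clean(raw_records)
for record in cleaned_records:
    print(record)
```

Even on four rows, most of the code deals with formatting and gaps rather than analysis, which is the point of the 80/20 rule of thumb.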

1. The Analytics Process Model

The analytics process is a structured, iterative approach to solving problems using data. While different frameworks exist (like CRISP-DM), most follow these 7 core stages:

| Stage | Goal | Key Activities |
|---|---|---|
| 1. Business Understanding | Define the problem | Set objectives, identify stakeholders, define success metrics |
| 2. Data Collection | Gather raw data | Query databases, scraping, sensor logs, surveys |
| 3. Data Pre-processing | Clean and organize | Handle missing values, remove duplicates, feature scaling |
| 4. Exploratory Data Analysis (EDA) | Understand patterns | Summarize stats, visualize distributions, find correlations |
| 5. Model Building | Predict/Classify | Select algorithms (Regression, K-Means), train the model |
| 6. Evaluation | Test accuracy | Check metrics (RMSE, Accuracy), tune hyperparameters |
| 7. Decision Making | Deployment | Present insights, build dashboards, automate decisions |
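The two Evaluation metrics named in Stage 6 can be computed by hand, which may make it clearer what they measure. The sketch below is a minimal standard-library version. The function names and all sample values are made up for illustration, and in practice a library implementation would normally be used instead.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error for regression models (lower is better)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def accuracy(actual, predicted):
    """Fraction of correct predictions for classification models."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Made-up regression example: true prices vs. model predictions.
true_prices = [100.0, 150.0, 200.0]
predicted_prices = [110.0, 140.0, 200.0]
regression_error = rmse(true_prices, predicted_prices)

# Made-up classification example: true labels vs. model predictions.
true_labels = ["spam", "ham", "spam", "ham"]
predicted_labels = ["spam", "ham", "ham", "ham"]
classification_accuracy = accuracy(true_labels, predicted_labels)

print(f"RMSE: {regression_error:.3f}")
print(f"Accuracy: {classification_accuracy:.2f}")
```

RMSE applies when the model predicts a number, while accuracy applies when it predicts a category, so the choice of metric follows from the goal set in Stage 5.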

2. Professional Roles in Analytics

The analytics pipeline requires different skills at different stages.

| Role | Primary Focus | Tools |
|---|---|---|
| Data Engineer | Collection & Storage (Stage 2) | SQL, Hadoop, Spark, AWS/Azure |
| Data Analyst | Pre-processing & EDA (Stages 3-4) | Excel, SQL, Tableau, Power BI |
| Data Scientist | Modeling & Evaluation (Stages 5-6) | Python, R, Statistics, Machine Learning |
| ML Engineer | Deployment & Scaling (Stage 7) | Docker, Kubernetes, MLOps |

3. The Analytics Lifecycle (CRISP-DM)

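Cross-Industry Standard Process for Data Mining (CRISP-DM) is one of the most widely used industry frameworks for analytics projects. It defines six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.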

  • Key Feature: It is non-linear. For example, if Evaluation (Stage 6 of the model in Section 1) shows poor results, you might go back to Data Pre-processing (Stage 3) to improve the data quality before building the model again.
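The feedback loop described above can be sketched as a small program. Everything here is hypothetical: the data, the `preprocess`, `build_model` and `evaluate` helpers, the plausibility caps and the target error. The "model" is deliberately trivial (it always predicts the mean) so that the loop itself stays visible.

```python
import math


def preprocess(values, max_value):
    """Stage 3: drop missing entries and values above a plausibility cap."""
    return [v for v in values if v is not None and v <= max_value]


def build_model(values):
    """Stage 5: a trivial 'model' that always predicts the mean."""
    return sum(values) / len(values)


def evaluate(model, holdout):
    """Stage 6: RMSE of the constant prediction on held-out data."""
    return math.sqrt(sum((v - model) ** 2 for v in holdout) / len(holdout))


# Made-up measurements; 950 and 400 stand in for data-entry errors.
raw = [10, 12, None, 11, 950, 9, 13, 400]
holdout = [10, 11, 12]
target_rmse = 2.0  # assumed success metric from Stage 1

history = []
for cap in (1000, 500, 100):  # cleaning settings, loosest to strictest
    data = preprocess(raw, cap)
    model = build_model(data)
    score = evaluate(model, holdout)
    history.append((cap, round(score, 2)))
    if score <= target_rmse:
        break  # good enough: continue to Stage 7
    # Otherwise, go back to Stage 3 with a stricter cap.

for cap, rounded_score in history:
    print(f"cap={cap}: RMSE={rounded_score}")
```

In this sketch the first two passes fail the evaluation and send the work back to pre-processing. Only the third pass, with the entry errors removed, meets the target.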