Analytics: Basic Nomenclature
1. What is Analytics?
Analytics is the systematic, computational process of collecting, cleaning, analyzing, and interpreting data to discover useful patterns, trends, and insights that help in decision-making. It transforms raw data into actionable intelligence using statistical methods, algorithms, and domain knowledge.
Deep Dive: The DIKW Pyramid
The DIKW Pyramid (Data, Information, Knowledge, Wisdom) represents the structural hierarchy of how we process raw facts into strategic decisions.
- Data: The raw, atomic facts (e.g., "102").
- Information: Data with context (e.g., "102 is the temperature in Fahrenheit").
- Knowledge: Information with experience (e.g., "102°F means the patient has a high fever").
- Wisdom: Knowledge with judgment (e.g., "Administer paracetamol and monitor the patient").
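The four levels above can be read as a small pipeline in which each step adds something the previous level lacked. The following is a minimal sketch, not a clinical rule: the function names and the 100.4°F fever threshold are assumptions introduced only for illustration.

```python
# Hypothetical fever threshold in °F, assumed for illustration only.
FEVER_THRESHOLD_F = 100.4

def to_information(value: float) -> dict:
    """Data -> Information: attach context (what is measured, and in which unit)."""
    return {"measure": "body temperature", "value": value, "unit": "°F"}

def to_knowledge(info: dict) -> str:
    """Information -> Knowledge: interpret the reading with a domain rule."""
    return "fever" if info["value"] >= FEVER_THRESHOLD_F else "normal"

def to_wisdom(assessment: str) -> str:
    """Knowledge -> Wisdom: choose an action based on the interpretation."""
    if assessment == "fever":
        return "administer paracetamol and monitor the patient"
    return "no action needed"

raw = 102                         # Data: a bare number
info = to_information(raw)        # Information: the number with context
assessment = to_knowledge(info)   # Knowledge: what the number means
action = to_wisdom(assessment)    # Wisdom: what to do about it

print(info, assessment, action, sep="\n")
```

Each function consumes only the output of the level below it, which mirrors how the pyramid stacks.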
2. Data vs. Information vs. Knowledge vs. Wisdom
Understanding this hierarchy is fundamental:
| Concept | Definition | Example | Characteristics |
|---|---|---|---|
| Data | Raw, unprocessed facts and figures without context | 45, "Red", 12-07-2025 | Objective, unorganized, meaningless alone |
| Information | Data that has been processed, organized, and given context | "The red car was sold on 12-07-2025 for $45,000" | Contextual, organized, answers Who/What/When |
| Knowledge | Information combined with experience and judgment | "Red cars sell 20% faster in summer; stock more for Q2" | Actionable, experience-driven, answers How/Why |
| Wisdom | Applying knowledge ethically and strategically | "We should focus marketing on red cars in spring to maximize summer sales" | Strategic, forward-looking, answers "What's best?" |
This hierarchy is known as the DIKW Pyramid (Data → Information → Knowledge → Wisdom).
3. Types of Data
Data can be classified along multiple dimensions. By degree of structure, the three foundational categories are:
| Feature | Structured Data | Unstructured Data | Semi-Structured Data |
|---|---|---|---|
| Format | Highly organized, fixed schema | No predefined format | Partially organized (tags/markers) |
| Storage | Relational databases (SQL), spreadsheets | Data lakes, object stores, file systems | Document-oriented NoSQL stores, JSON/XML files |
| Examples | Student records, bank transactions, inventory | Emails, social media posts, videos, images | JSON API responses, HTML pages, log files |
| Ease of Analysis | Easy: direct queries with SQL | Difficult: requires NLP, Computer Vision | Moderate: requires parsing |
| Share of All Data (commonly cited estimate) | ~20% | ~80% | Varies |
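To make the "requires parsing" step for semi-structured data concrete, here is a minimal sketch that flattens a JSON payload into rows with a fixed schema. The payload, the field names, and the `flatten` helper are all hypothetical and exist only for this example.

```python
import json

# Hypothetical API response: keys describe the data, but fields are optional
# and nested, so there is no fixed schema yet.
payload = """
[
  {"id": 1, "name": "Asha", "contact": {"email": "asha@example.com"}},
  {"id": 2, "name": "Ravi", "tags": ["vip"]}
]
"""

COLUMNS = ["id", "name", "email"]  # the fixed schema we want to end up with

def flatten(record: dict) -> dict:
    """Map one nested record onto the fixed schema; missing fields become None."""
    return {
        "id": record.get("id"),
        "name": record.get("name"),
        "email": record.get("contact", {}).get("email"),
    }

rows = [flatten(record) for record in json.loads(payload)]

for row in rows:
    print([row[column] for column in COLUMNS])
```

After flattening, every row has the same columns, so the result could be loaded into a relational table and queried like any other structured data.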
Data can also be classified by measurement scale:
- Nominal: Categories without order (e.g., Color: Red, Blue, Green).
- Ordinal: Categories with a meaningful order, but with unknown or unequal intervals between them (e.g., Rating: Low, Medium, High).
- Interval: Numeric with equal intervals but no true zero (e.g., Temperature in °C: 0°C ≠ "no heat").
- Ratio: Numeric with equal intervals AND a true zero (e.g., Weight: 0 kg = no weight).
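The measurement scale determines which summary statistics are meaningful. The sketch below uses small made-up samples and Python's standard `statistics` module; the variable names and the rank mapping for the ratings are assumptions for illustration.

```python
from statistics import mean, median, mode

colors = ["Red", "Blue", "Red", "Green"]             # nominal
ratings = ["Low", "High", "Medium", "High", "Low"]   # ordinal
temps_c = [10.0, 20.0, 30.0]                         # interval
weights_kg = [40.0, 80.0]                            # ratio

# Nominal: categories can only be counted, so the mode is the usable summary.
most_common_color = mode(colors)

# Ordinal: order is meaningful, so map labels to ranks and take the median.
RANK = {"Low": 1, "Medium": 2, "High": 3}
LABEL = {rank: label for label, rank in RANK.items()}
median_rating = LABEL[median(RANK[r] for r in ratings)]

# Interval: differences are meaningful, so the mean is valid...
mean_temp = mean(temps_c)
# ...but ratios are not: 20°C is not "twice as hot" as 10°C.
# Converting to Kelvin (which has a true zero) shows the actual ratio.
kelvin_ratio = (20.0 + 273.15) / (10.0 + 273.15)

# Ratio: a true zero makes "twice as heavy" a meaningful statement.
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_color, median_rating, mean_temp)
print(round(kelvin_ratio, 3), weight_ratio)
```

The Kelvin conversion is the key contrast: the same two temperatures that look like a 2:1 ratio in °C differ by only a few percent on a scale with a true zero.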
4. The Four Types of Analytics
Analytics is categorized into four types, progressing in both complexity and business value:
| Type | Core Question | Techniques | Example | Value Level |
|---|---|---|---|---|
| Descriptive | What happened? | Averages, percentages, dashboards, charts | "Sales dropped by 10% last month" | Low (Hindsight) |
| Diagnostic | Why did it happen? | Drill-down, data discovery, correlations, root cause analysis | "Sales dropped because a competitor launched a cheaper product" | Medium (Insight) |
| Predictive | What is likely to happen? | Regression, forecasting, ML models, time-series analysis | "Sales are likely to drop another 5% next month" | High (Foresight) |
| Prescriptive | What should we do? | Optimization, simulation, decision trees, A/B testing | "Lower prices by 15% to regain market share" | Very High (Action) |
Analytics Maturity Model: Most organizations start at Descriptive and progressively adopt more advanced types. Only about 3% of enterprises are estimated to fully leverage Prescriptive Analytics.
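The four types can be illustrated on one tiny dataset. The following is a sketch under strong assumptions: the price and sales figures are invented, a straight-line demand model is assumed, and the helper `predict_units` is hypothetical. It is meant to show how each question builds on the previous one, not how a production analysis would be done.

```python
from math import sqrt

# Hypothetical monthly data: our price and units sold (all numbers invented).
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
units = [240.0, 220.0, 200.0, 180.0, 160.0]

# Descriptive: what happened? Percent change in units over the last month.
pct_change = (units[-1] - units[-2]) / units[-2] * 100

# Diagnostic: why? Pearson correlation between price and units sold.
mean_p = sum(prices) / len(prices)
mean_u = sum(units) / len(units)
sxy = sum((p - mean_p) * (u - mean_u) for p, u in zip(prices, units))
sxx = sum((p - mean_p) ** 2 for p in prices)
syy = sum((u - mean_u) ** 2 for u in units)
correlation = sxy / sqrt(sxx * syy)

# Predictive: what is likely to happen? Least-squares line units = a + b * price.
slope = sxy / sxx
intercept = mean_u - slope * mean_p

def predict_units(price: float) -> float:
    """Predicted units sold at a given price under the fitted line."""
    return intercept + slope * price

forecast_at_15 = predict_units(15.0)

# Prescriptive: what should we do? Pick the observed price with the
# highest predicted revenue (price * predicted units).
best_price = max(prices, key=lambda p: p * predict_units(p))

print(f"Descriptive:  units changed {pct_change:.1f}% last month")
print(f"Diagnostic:   price/units correlation = {correlation:.2f}")
print(f"Predictive:   forecast at price 15 = {forecast_at_15:.0f} units")
print(f"Prescriptive: best price = {best_price}")
```

A strong correlation only suggests a cause; a real diagnostic step would also rule out other explanations, such as the competitor launch in the table's example.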
5. Key Terms Glossary
| Term | Definition | Example |
|---|---|---|
| Dataset | A collection of related data organized in rows and columns | A table of student marks |
| Variable (Feature) | A characteristic that can vary across observations | Age, Height, Income |
| Observation (Record) | A single row in a dataset representing one entity | One student's complete data |
| Insight | A valuable, actionable conclusion drawn from analysis | "Customers buy more on weekends" |
| KPI (Key Performance Indicator) | A metric chosen to show progress toward a specific business goal, usually tracked against a target | Monthly Revenue, Customer Churn Rate |
| Metric | A quantifiable measure used to track performance | Average Order Value, Click-Through Rate |
| Dimension | A categorical attribute used to slice data | Region, Product Category, Time Period |
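Several of these terms can be seen together in a few lines of code. This is a minimal sketch with an invented order table; the field names, the `average_order_value` helper, and the target value are hypothetical.

```python
from collections import defaultdict

# Dataset: each dict is one observation (record); each key is a variable (feature).
orders = [
    {"order_id": 1, "region": "North", "amount": 120.0},
    {"order_id": 2, "region": "South", "amount": 80.0},
    {"order_id": 3, "region": "North", "amount": 100.0},
    {"order_id": 4, "region": "South", "amount": 60.0},
]

def average_order_value(records: list) -> float:
    """Metric: total order amount divided by the number of orders."""
    return sum(r["amount"] for r in records) / len(records)

# Dimension: "region" is a categorical attribute used to slice the dataset.
by_region = defaultdict(list)
for order in orders:
    by_region[order["region"]].append(order)

aov_by_region = {
    region: average_order_value(records) for region, records in by_region.items()
}
overall_aov = average_order_value(orders)

# KPI: the same metric, tracked against a goal (hypothetical target).
TARGET_AOV = 85.0
kpi_met = overall_aov >= TARGET_AOV

print(aov_by_region, overall_aov, kpi_met)
```

Slicing by region shows a gap that the overall average hides, which is the kind of observation that can turn into an insight.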