Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

2.4 Distribution Models: Scaling Big Data

Lesson 13 of 36 in the free Big Data-1 notes on Siksha Sarovar, written by Rohit Jangra.

2.4.1 Sharding: Horizontal Partitioning

Sharding is the process of splitting a large dataset across multiple database servers (shards).

  • How it works: A "Sharding Key" decides which server gets which data (e.g., users with last names A-M go to Server 1).
  • Benefit: Storage and processing power grow roughly linearly as you add servers, provided the data is spread evenly across them.
  • Challenge: "Hot Shards". If most users' last names start with 'S', Server 2 (N-Z) will be overloaded while Server 1 sits nearly idle.
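
The routing rule and the hot-shard problem described above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample surnames and the two-server setup are assumptions made for the example. The hash-based router is a commonly used alternative to range-based keys and is not part of the notes above.

```python
from collections import Counter
import hashlib

def range_shard(last_name: str) -> int:
    """Range-based sharding key: A-M -> Server 1, everything else -> Server 2."""
    first = last_name[0].upper()
    return 1 if "A" <= first <= "M" else 2

def hash_shard(last_name: str, num_shards: int = 2) -> int:
    """Hash-based alternative (an assumption, not from the notes):
    a stable hash of the whole name picks the server."""
    digest = hashlib.sha256(last_name.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards + 1

# Hypothetical sample where most surnames start with 'S'.
names = ["Sharma", "Singh", "Saini", "Sethi", "Sood", "Suri", "Arora", "Mehta"]

range_load = Counter(range_shard(n) for n in names)
hash_load = Counter(hash_shard(n) for n in names)

print("range-based load per server:", dict(range_load))
print("hash-based load per server: ", dict(hash_load))
```

With the range-based key, six of the eight sample users land on Server 2, which is the hot shard. A hash-based key tends to spread a large set of keys more evenly, at the cost of losing efficient range queries such as "all users from A to C".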

2.4.2 Replication Models

Replication involves making multiple copies of the data on different servers.

1. Master-Slave Replication

  • Master: All Writes go here.
  • Slaves: They copy from the Master. Used for Reads.
  • Pros: Excellent for read-heavy apps (e.g., a news site). If a Slave fails, others serve the data.
  • Cons: If the Master fails, the DB is "ReadOnly": the Slaves can still serve Reads, but no Writes are accepted until a new Master is elected.
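
The behaviour listed above can be modelled with a small in-memory sketch. The class and method names here are hypothetical, and the copy from Master to Slaves happens immediately for simplicity. Real databases usually replicate with some delay, and "election" is reduced here to promoting the first live Slave.

```python
class Node:
    """One database server holding a full copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

class MasterSlaveCluster:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._next = 0  # round-robin counter for reads

    def write(self, key, value):
        # All Writes go to the Master.
        if not self.master.alive:
            raise RuntimeError("no Master available: cluster is read-only")
        self.master.data[key] = value
        for slave in self.slaves:      # Slaves copy from the Master
            if slave.alive:
                slave.data[key] = value

    def read(self, key):
        # Reads are spread over the Slaves that are still alive.
        live = [s for s in self.slaves if s.alive]
        if not live:
            raise RuntimeError("no live Slave to read from")
        node = live[self._next % len(live)]
        self._next += 1
        return node.data.get(key)

    def promote(self):
        # Stand-in for an election: the first live Slave becomes Master.
        live = [s for s in self.slaves if s.alive]
        if not live:
            raise RuntimeError("no live Slave to promote")
        self.slaves.remove(live[0])
        self.master = live[0]

cluster = MasterSlaveCluster(Node("master"), [Node("s1"), Node("s2"), Node("s3")])
cluster.write("headline", "v1")

cluster.slaves[0].alive = False                      # one Slave fails
read_after_slave_failure = cluster.read("headline")  # others still serve it

cluster.master.alive = False                         # the Master fails
try:
    cluster.write("headline", "v2")
    write_rejected = False
except RuntimeError:
    write_rejected = True
read_while_read_only = cluster.read("headline")

cluster.promote()                                    # new Master takes over
cluster.write("headline", "v2")
read_after_promotion = cluster.read("headline")
```

The sketch shows both sides of the trade-off: losing a Slave does not affect Reads, while losing the Master blocks every Write until a promotion happens.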

2. Peer-to-Peer Replication

  • Concept: Every server can accept both Reads and Writes.
  • Pros: No single point of failure (High Availability). Excellent for global write-heavy apps.
  • Cons: Data inconsistency. Server A might write X to a record while Server B writes Y to the same record simultaneously, and the conflict then has to be resolved.
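
The conflict described above can be reproduced with two in-memory peers. This is a minimal sketch: the Peer class, the explicit timestamps and the "last write wins" rule are assumptions chosen for the example. Last write wins is only one of several possible resolution strategies and is not prescribed by the notes.

```python
class Peer:
    """A server that accepts both Reads and Writes."""
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        self.store[key] = (timestamp, value)

    def read(self, key):
        return self.store[key][1]

def sync(a, b):
    """Exchange data between two peers.
    For each key, the version with the highest timestamp wins."""
    for key in set(a.store) | set(b.store):
        versions = [p.store[key] for p in (a, b) if key in p.store]
        winner = max(versions)  # compares timestamp first, then value
        a.store[key] = winner
        b.store[key] = winner

server_a, server_b = Peer("A"), Peer("B")

# Both servers accept a write to the same record at almost the same time.
server_a.write("cart", "X", timestamp=100)
server_b.write("cart", "Y", timestamp=101)

inconsistent_before_sync = server_a.read("cart") != server_b.read("cart")

sync(server_a, server_b)
value_on_a = server_a.read("cart")
value_on_b = server_b.read("cart")
```

After the sync both peers agree on "Y", but the write of "X" is silently discarded. That lost update is the price of this simple resolution rule.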

Distribution Strategy   | Scaling Type | Availability
Single Server           | Scale Up     | Low
Sharding Only           | Scale Out    | High (Capacity) / Low (Fault Tolerance)
Replication Only        | Scale Up     | High (Read availability)
Sharding + Replication  | Scale Out    | Best of both worlds: High (Capacity) and High (Fault Tolerance)
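
The combined strategy in the last row can be sketched as a grid of servers: a hash of the key picks the shard, and every server inside that shard keeps a copy. The constants, function names and the `down` set are hypothetical, and every replica is written immediately for simplicity.

```python
import hashlib

NUM_SHARDS = 3
REPLICAS_PER_SHARD = 2

# cluster[shard][replica] is the local key-value store of one server.
cluster = [[{} for _ in range(REPLICAS_PER_SHARD)] for _ in range(NUM_SHARDS)]
down = set()  # (shard, replica) pairs that have failed

def shard_for(key: str) -> int:
    """Sharding: a stable hash of the key selects one shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def put(key, value):
    """Replication: write to every copy inside the chosen shard."""
    for replica in cluster[shard_for(key)]:
        replica[key] = value

def get(key):
    """Read from the first copy of the shard that is still up."""
    shard = shard_for(key)
    for i, replica in enumerate(cluster[shard]):
        if (shard, i) not in down:
            return replica.get(key)
    raise RuntimeError("all replicas of this shard are down")

for i in range(30):
    put(f"user{i}", f"profile-{i}")

# Each server stores only its own shard's keys, not all 30.
keys_per_shard = [len(replicas[0]) for replicas in cluster]

shard = shard_for("user7")
down.add((shard, 0))                         # one server of the shard fails
value_with_one_replica_down = get("user7")   # served by the other copy

down.add((shard, 1))                         # the second copy fails as well
try:
    get("user7")
    data_unavailable = False
except RuntimeError:
    data_unavailable = True
```

Sharding keeps each server's share of the data small, and replication keeps a shard readable when one of its servers fails. Data becomes unavailable only when every copy of the same shard is down.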