Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Privacy Policy | Terms of Service | Contact Siksha Sarovar | About Siksha Sarovar

v4.0.9 · PWA
Siksha Sarovar logo
Siksha Sarovar
Your Learning Universe

Siksha Sarovar is a free e-learning platform for coding courses, BCA university notes and competitive exam preparation. Optional Google sign-in saves your learning progress across devices.

Initializing knowledge base…
Compiling modules 0%

Elements of Parallel Computing

Lesson 19 of 30 in the free Cloud Computing notes on Siksha Sarovar, written by Rohit Jangra.

Elements of Parallel Computing

Before writing a single line of parallel code, you must understand the building blocks: concurrency, parallelism, granularity, speedup, and efficiency. These concepts let you reason about whether a problem is parallelizable and how much gain to expect.

Concurrency vs. Parallelism

These two terms are frequently confused:

  • Concurrency is the composition of independently-executing processes or tasks. It is a structural property — the program is designed so that multiple things could run at the same time. A single-core CPU can be concurrent (interleaving tasks via context switching).
  • Parallelism is the simultaneous execution of multiple computations. It is a runtime property that requires multiple physical processors or cores.
"Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." — Rob Pike (Go language designer)

A web server handling 1,000 requests concurrently on 4 cores is both concurrent (structured for independent tasks) and parallel (4 tasks literally run at the same moment).

Granularity

Granularity describes the ratio of computation to communication in a parallel task.

GranularityCharacteristicsExample
Fine-grainedMany small tasks, frequent communicationGPU pixel shaders
Medium-grainedModerate task size, some sync neededOpenMP loop parallelism
Coarse-grainedFew large tasks, rare communicationMPI processes on HPC cluster

Choosing the wrong granularity wastes time: fine-grained tasks on a slow network drown in communication overhead; coarse-grained tasks on a shared-memory machine underutilize cores.

Speedup

Speedup measures how much faster a parallel program runs compared to its sequential version:

Speedup(N) = T(1) / T(N)

Where T(1) = sequential execution time, T(N) = parallel execution time on N processors.

  • Linear speedup: Speedup(N) = N (ideal, rarely achieved)
  • Super-linear speedup: Speedup(N) > N (possible when parallel version benefits from cache effects)
  • Sub-linear speedup: Speedup(N) < N (typical, due to overhead and serial sections)

Efficiency

Efficiency normalizes speedup by the number of processors:

Efficiency(N) = Speedup(N) / N

An efficiency of 1.0 (100%) means every processor is perfectly utilized. In practice, efficiencies of 0.7–0.9 are considered good for large-scale systems.

Parallelism Lifecycle

Dependencies: The Enemy of Parallelism

Not all computations can be parallelized. Data dependencies (one task needs the result of another) create sequential bottlenecks. Techniques to reduce dependencies:

  • Loop transformations (loop unrolling, tiling)
  • Pipelining (overlap stages)
  • Speculative execution (guess a result, correct if wrong)

Real-World Example

Apache Spark decomposes a data processing job into a DAG (Directed Acyclic Graph) of tasks. Each stage is a coarse-grained parallel operation (e.g., map over all partitions). Within each executor (JVM process), fine-grained thread-level parallelism may be used. Spark's scheduler analyzes dependencies to determine which tasks can run concurrently — exactly the principles described here.