Apache Spark Class: Job Execution Model

Or Bar Ilan
Jan 22, 2021

How Apache Spark works behind the scenes.

Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark was developed to overcome the drawbacks of the Hadoop MapReduce cluster computing paradigm:

  • Hadoop MapReduce uses disk-based processing, which slows down the entire computation. Compared to MapReduce, Apache Spark does far less reading from and writing to disk and runs tasks as multiple threads.
  • Hadoop MapReduce supports only batch processing, while Apache Spark supports stream processing too.
  • Apache Spark supports multiple languages and integrations with other popular products (Hadoop MapReduce supports only Java).

We use Apache Spark for machine learning tasks, ETL or SQL batch jobs over large data sets, streaming jobs with real-time data, complex session analysis, and so on.

So, How Does Apache Spark Work?

Apache Spark uses a master/slave architecture that consists of a driver, which runs on the master node. As a first step, the client submits the Spark user application code, and the driver implicitly converts the transformations and actions contained in the user code into a logical plan, a Directed Acyclic Graph (DAG).

In addition, the driver creates a SparkContext, which is the entry point for all Spark features and allows the Spark user application to access the Spark cluster with the help of the resource manager.
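As a rough, minimal sketch (the application name and master URL below are placeholders, not anything from the original article), this is how a driver program typically obtains a SparkContext through a SparkSession:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of a driver program's entry point.
// "local[*]" is a placeholder master URL for local testing; on a real cluster
// the master (YARN, Kubernetes, standalone, ...) is usually set at submit time.
val spark = SparkSession.builder()
  .appName("job-execution-demo")   // hypothetical application name
  .master("local[*]")
  .getOrCreate()

// The SparkContext is the entry point for RDD operations and cluster access.
val sc = spark.sparkContext
```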

A DAG consists of vertices and edges, where the vertices represent RDDs and the edges represent the operations to be applied to those RDDs. When an action is called, the DAG is submitted to the DAG Scheduler, which splits the graph into stages of tasks (stage boundaries are placed at wide transformations); the result is called the physical execution plan.

Partition: a logical chunk of your RDD/Dataset. Data is split into partitions so that each executor can operate on a single part, enabling parallelization (each partition can be processed by a single executor core).

Task: a single operation (like .map or .filter) applied to a single partition.

Stage: a sequence of tasks that can all be run together in parallel without a shuffle.
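To make these definitions concrete, here is a hedged sketch (continuing with the `sc` handle from the earlier snippet): narrow transformations such as map stay inside one stage, a wide transformation such as reduceByKey introduces a shuffle and therefore a stage boundary, and nothing runs until an action is called.

```scala
// Narrow transformations (map, filter) work partition-by-partition and stay
// inside a single stage; no data moves between executors.
val words = sc.parallelize(Seq("spark", "hadoop", "spark", "dag"), numSlices = 4)
val pairs = words.map(word => (word, 1))   // narrow: same stage

// A wide transformation (reduceByKey) needs a shuffle, so the DAG Scheduler
// cuts a new stage at this point.
val counts = pairs.reduceByKey(_ + _)      // wide: stage boundary

// Transformations are lazy -- only the action below makes the driver submit
// the DAG, which the DAG Scheduler then splits into stages of tasks.
println(counts.toDebugString)              // prints the lineage (roughly the DAG)
counts.collect()                           // action: triggers the job
```

The indentation in the printed lineage roughly marks where a shuffle, and hence a new stage, begins.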

After creating the physical execution plan, the driver creates the physical execution units, called tasks, under each stage. Finally, the tasks are bundled and sent to the cluster.
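As a quick, hedged illustration of that bundling (reusing `counts` from the previous sketch), the number of tasks in a stage follows the number of partitions of the RDD it processes:

```scala
// Each partition becomes one task in the stage that processes it, so the
// final stage of the job above runs roughly this many tasks.
println(counts.getNumPartitions)
```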

Executor: a JVM process that runs on a worker node (it actually runs the tasks on the data partitions). Each task is executed as a single thread in an executor. When an executor has only one core, it can perform one task at a time; if it has 2 cores, it can perform 2 tasks at a time, and so on. That said, it is possible to run multiple threads on one executor core, so with context switching or time sharing we can execute more than one task per executor core at a time.
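As a sketch only, the number of executors and the cores per executor are normally supplied when the application is launched (for example through spark-submit or the session builder). The builder form below uses arbitrary illustrative values, and how they are honoured depends on the cluster manager:

```scala
import org.apache.spark.sql.SparkSession

// Hedged sketch: request 4 executors, each with 2 cores and 4 GB of memory.
// With 2 cores, an executor can run 2 tasks concurrently (one per core).
val configuredSpark = SparkSession.builder()
  .appName("executor-config-demo")          // hypothetical application name
  .config("spark.executor.instances", "4")
  .config("spark.executor.cores", "2")
  .config("spark.executor.memory", "4g")
  .getOrCreate()
```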

Taken from: https://thumbs.gfycat.com/AbandonedCelebratedAmericanratsnake-size_restricted.gif

At this point, the driver requests resources from the cluster manager (which launches executors on the worker nodes on behalf of the driver). Next, the driver sends tasks to the executors based on data placement. When the executors start, they register themselves with the driver, so the driver has a complete view of all executors (at any point in time while the application is running, the driver program monitors the set of executors that are running).
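For illustration only (and assuming the `spark` session from the first sketch), the driver's view of the executors that have registered with it can be inspected through the status tracker API:

```scala
// Hedged sketch: list the executors the driver currently knows about.
// Each entry corresponds to an executor that has registered with the driver.
spark.sparkContext.statusTracker.getExecutorInfos.foreach { info =>
  println(s"executor host: ${info.host}, running tasks: ${info.numRunningTasks}")
}
```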

When the job is done, the result is returned to the driver and from there to the client.
