Spark deployment modes
Besides running a Spark application in local mode (used mainly for testing), Spark applications can run under several cluster managers:
- Standalone cluster: we can deploy a standalone cluster very quickly, with only a few steps of configuration. I wrote a tutorial here: How to install Apache Spark 1.2.1 as a Standalone cluster
- Apache Mesos: a cluster manager developed at UC Berkeley
- Hadoop YARN: a popular cluster manager, introduced with Hadoop MapReduce version 2
Running applications in Cluster mode
At a high level, the process of running an application on Spark is as follows:
- First, the user submits an application using `spark-submit`, with options such as:
  - `--master`: URL of the cluster manager (YARN / Mesos / Standalone cluster)
  - `--deploy-mode`: `cluster` or `client`. This option specifies where the Driver Program will run: either on a Worker Node inside the cluster (cluster mode; the ApplicationMaster in YARN) or on an external client machine (client mode)
  - `application-jar`: the packaged code of the Driver Program
- The Driver Program runs and creates a `SparkContext`. Next, it registers the application with the `ClusterManager`. The Driver Program, with its pre-configured parameters, then asks the `ClusterManager` for resources, in the form of a list of `Executors`. The `ClusterManager` is in charge of allocating the available resources among the submitted applications. Finally, the Driver Program sends the application code to the allocated `Executors` to run.
- You can easily notice that different Applications run in different `Executors` (or, put another way, `Tasks` from different Applications run in different JVMs).
- So, within an application, the Driver Program schedules its own `Tasks`. It communicates with its `Executors` to obtain the status of the `Jobs`. What I mean by "schedule" here is: run the `Tasks` in order, and re-submit `Tasks` that fail. The results of running the `Tasks` can be sent back to the Driver (e.g. `rdd.collect()`, …) or written to storage (e.g. `rdd.saveAsTextFile(...)`).
- A `WorkerNode` is a machine that hosts one or multiple Workers.
- A `Worker` is launched in a single JVM (one process) and has access to a configured share of the node's resources, e.g. the number of cores and the amount of RAM per Worker. A Worker can spawn and own multiple `Executors`.
- An `Executor` is also launched in a single JVM (one process); it is created by a Worker and is in charge of running multiple `Tasks`.
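Putting the submission options above together, a typical invocation looks like the following sketch (the class name, jar path, and master URL are placeholders, not values from this post):

```shell
# Submit an application to a standalone cluster, running the Driver
# Program inside the cluster (cluster mode). Substitute your own
# class, jar, and master URL.
spark-submit \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  path/to/my-app.jar
```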
In more detail:
- After creating a `SparkContext` (`sc`), our application can create some RDDs, apply transformations to them and then trigger an action. At that point, `sc` immediately submits a Job to the `DAGScheduler`. The `DAGScheduler` applies some optimizations here. One notable step is splitting the DAG (Directed Acyclic Graph) into Stages (sets of `Tasks`) by analyzing the dependency types between the RDDs (see the previous post here - pipelining narrow dependencies). The `DAGScheduler` then decides what to run: the sets of `Tasks`, organized into stages and partitions.
- The `Tasks` are then scheduled onto the acquired `Executors` by the driver-side `TaskScheduler`, according to resource and locality constraints (it decides where each `Task` runs).
- Within a `Task`, the lineage DAG of the corresponding stage is serialized together with the closures (functions) of the transformations, then sent to and executed on the scheduled `Executors`.
- The `Executors` run the `Tasks` in multiple threads and finally send the results back to the Driver for aggregation.
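The scheduling flow above can be sketched in plain Python (no Spark needed; every name here is an illustrative stand-in, not Spark's real API): a toy DAGScheduler cuts the lineage at wide (shuffle) dependencies to form stages, and a toy Executor runs each stage's Tasks in threads, re-submitting failed Tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy linear lineage: (operation, parent, dependency type).
# Narrow dependencies are pipelined inside one stage; a wide
# (shuffle) dependency forces a stage boundary, mirroring how the
# DAGScheduler cuts the DAG into sets of Tasks.
lineage = [
    ("read",      None,      "narrow"),
    ("map",       "read",    "narrow"),
    ("groupBy",   "map",     "wide"),     # shuffle => new stage
    ("mapValues", "groupBy", "narrow"),
]

def split_into_stages(lineage):
    """Cut the lineage at every wide dependency (the DAGScheduler step)."""
    stages, current = [], []
    for op, _parent, dep in lineage:
        if dep == "wide" and current:
            stages.append(current)
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages

def run_task(stage_ops, partition):
    """A Task: one stage's pipelined ops applied to one partition."""
    return partition, "->".join(stage_ops)

def run_stage(stage_ops, num_partitions, max_retries=2):
    """Run a stage's Tasks in threads (one toy Executor), re-submitting
    failed Tasks up to max_retries times (the TaskScheduler step)."""
    results = {}
    attempts = {p: 0 for p in range(num_partitions)}
    pending = list(range(num_partitions))
    with ThreadPoolExecutor(max_workers=4) as pool:
        while pending:
            futures = {p: pool.submit(run_task, stage_ops, p) for p in pending}
            pending = []
            for p, fut in futures.items():
                try:
                    part, out = fut.result()
                    results[part] = out           # result back to the "driver"
                except Exception:
                    attempts[p] += 1
                    if attempts[p] > max_retries:
                        raise                     # give up after retries
                    pending.append(p)             # re-submit the failed Task
    return results

stages = split_into_stages(lineage)
print(stages)            # [['read', 'map'], ['groupBy', 'mapValues']]
for stage in stages:     # stages run in order; Tasks within a stage in threads
    print(run_stage(stage, num_partitions=3))
```

Note how the two pipelined narrow transformations collapse into one stage, while the shuffle opens a new one; that is the same boundary Spark uses to decide what a single Task can compute.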
Master - Workers
Compared with Hadoop MapReduce
Hadoop MapReduce runs each `Task` in its own process; when a `Task` completes, that process goes away (unless JVM reuse is turned on). These `Tasks` can also run concurrently in threads, but only through advanced techniques such as the `MultithreadedMapper` class.
In Spark, by contrast, many `Tasks` run concurrently by default, in multiple threads inside a single process (an `Executor`). This process sticks with the Application even when no jobs are running.
In fact, we can summarize the difference between their execution models as follows:
- Each MapReduce Executor is short-lived and expected to run only one large task.
- Each Spark Executor is long-running and expected to run many small tasks.
- Thus, be careful when porting source code from MapReduce to Spark, because Spark requires thread-safe execution.
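The thread-safety caveat can be demonstrated without Spark. Because an Executor runs many Tasks as threads in one process, any state they share must be synchronized; the `Counter` below is a purely illustrative stand-in for such shared state:

```python
import threading

# A Spark Executor runs many Tasks as threads in ONE process, so any
# state they share must be synchronized. (In the MapReduce model each
# Task owned its whole process, so this pattern was safe by default.)
class Counter:
    """Illustrative shared state touched by many concurrent Tasks."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add_unsafe(self, n):
        for _ in range(n):
            self.value += 1          # read-modify-write: NOT atomic

    def add_safe(self, n):
        for _ in range(n):
            with self._lock:         # serialize the update
                self.value += 1

def run_in_threads(fn, n_threads=8, per_thread=10_000):
    """Simulate one Executor running n_threads Tasks concurrently."""
    threads = [threading.Thread(target=fn, args=(per_thread,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

unsafe, safe = Counter(), Counter()
run_in_threads(unsafe.add_unsafe)    # may silently lose updates
run_in_threads(safe.add_safe)
print(safe.value)                    # always 80000
```

Code that was correct as a single-threaded MapReduce task body can fall into the `add_unsafe` pattern once it shares an Executor JVM with other Tasks.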
So, what are the benefits of Spark's implementation?
- It isolates different applications in different JVMs, and isolates task scheduling: each driver schedules its own tasks. [1]
- Launching a new process costs much more than launching a new thread: creating a thread is roughly 10-100 times faster than creating a process. Additionally, running many small `Tasks` rather than a few big `Tasks` reduces the impact of data skew.
- It is very suitable for interactive applications, which need quick computations.
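A quick simulation (plain Python, with made-up illustrative numbers) shows why many small Tasks soften data skew: with a few coarse tasks, one skewed partition dominates the wall-clock time, while the same total work cut into many small tasks spreads evenly across executor threads.

```python
import heapq

def makespan(task_costs, n_slots):
    """Greedily schedule tasks (longest first) onto n_slots executor
    threads; return the wall-clock time (makespan) of the job."""
    slots = [0.0] * n_slots          # current load of each thread
    heapq.heapify(slots)
    for cost in sorted(task_costs, reverse=True):
        earliest = heapq.heappop(slots)      # least-loaded thread
        heapq.heappush(slots, earliest + cost)
    return max(slots)

# 100 units of work; one skewed partition holds 40 of them.
few_big    = [40, 20, 20, 20]        # 4 coarse tasks
many_small = [4] * 10 + [1] * 60     # same 100 units, finer cuts

print(makespan(few_big, n_slots=4))     # 40.0 -- bound by the skewed task
print(makespan(many_small, n_slots=4))  # 25.0 -- perfectly balanced
```

With four executor threads the ideal time is 100 / 4 = 25 units; the fine-grained split reaches it, while the coarse split is stuck waiting on the 40-unit straggler.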
Besides these gains, however, there are also some drawbacks:
- Applications cannot share data (mostly the RDDs held in a `SparkContext`) without writing it to external storage. [1]
- Resource allocation can be inefficient: for its entire lifetime, an Application holds the full, fixed amount of resources that it requested from the `ClusterManager` and was granted, even while it is not running any `Task`.
- The dynamic resource allocation feature, introduced in Spark 1.2, addresses this issue: if a subset of the resources allocated to an application becomes idle, it can be returned to the cluster's pool of resources and acquired by other applications. [2]
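In practice this feature is switched on through configuration; a minimal `spark-defaults.conf` sketch might look like the following (the min/max/timeout values are illustrative, not recommendations):

```
# spark-defaults.conf -- enable dynamic resource allocation
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true   # shuffle data must survive executor removal
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.maxExecutors         20
spark.dynamicAllocation.executorIdleTimeout  60s
```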