Native integration between Apache Spark and Volcano Kubernetes scheduler

  1. Local Kubernetes cluster on openEuler OS
  2. Apache Spark on Kubernetes
  3. Apache Spark 3.3.0-SNAPSHOT on Kubernetes
  4. Native integration between Apache Spark and Volcano Kubernetes scheduler (this article)

What is Volcano?

Volcano is a Kubernetes-native batch system for high-performance workloads, and the first and only official container batch scheduling project accepted by the Cloud Native Computing Foundation (CNCF). Volcano supports popular computing frameworks such as Spark, TensorFlow, PyTorch, Flink, Argo, MindSpore, and PaddlePaddle, and can schedule computing resources on different architectures, such as x86, Arm, and Kunpeng.

What does Volcano provide?

  • Gang scheduling — a.k.a. all-or-nothing scheduling. Volcano tries to reserve enough resources to execute all tasks of a job at once, or postpones their execution until the required resources become available. It is used for scenarios that require multi-process collaboration.
  • Queue scheduling — hierarchical pools with min/max resource quotas
  • Fair-share scheduling — a scheduling algorithm that makes sure unrelated queues with the same weight get equal access to the underlying resources
  • Preemption scheduling — resource preemption between jobs in a queue, or between tasks in a job
  • Topology-based scheduling — an algorithm that computes the priority of tasks and nodes based on the affinity and anti-affinity configuration between tasks within a job
  • Reclaim — an algorithm that reclaims resources for a queue when a new task is added to it and the resources allocated to the queue are no longer sufficient
  • Backfill — an algorithm that re-schedules (smaller) jobs as soon as the required resources are freed, improving overall throughput

That is, depending on your application’s requirements you could use Volcano to reserve specific resources for it, to prioritize it over other applications, or to prioritize specific tasks over others.
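To make these features more concrete, here is a minimal sketch of a Volcano Job manifest that combines gang scheduling (minAvailable) with queue scheduling; the job name, queue, and container image are hypothetical:

```yaml
# Hypothetical Volcano Job: minAvailable enables gang scheduling, i.e.
# Volcano starts the job only when all 3 task pods can run at once.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: gang-demo              # hypothetical name
spec:
  schedulerName: volcano
  minAvailable: 3              # the all-or-nothing threshold
  queue: default               # submit into a resource pool (queue scheduling)
  tasks:
  - replicas: 3
    name: worker
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: worker
          image: busybox       # hypothetical image
          command: ["sleep", "60"]
```

Note that the Spark integration described below does not require writing such a Job manifest yourself; Spark creates the equivalent PodGroup for you.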

Spark + Volcano integration

Before Spark 3.3.0, the only way to use Spark with the Volcano scheduler was the Spark on Kubernetes operator — a Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes. The idea behind it is that you install a Kubernetes CRD (Custom Resource Definition) that glues Spark with Volcano.

With Spark 3.3.0 you can take advantage of Volcano’s facilities by adding a few additional configuration properties to your spark-submit command!

  1. The first step is to install Volcano on your Kubernetes cluster:

For x86_64:

$ kubectl apply -f

For aarch64:

$ kubectl apply -f

Note: version 1.5.1 is the minimum required for Spark 3.3.0, but any newer version is also fine!

2. The next step is to add a new Spark config property:

--conf spark.kubernetes.scheduler.name=volcano

This property instructs Spark to set volcano as the value of spec.schedulerName for the pods, e.g. their YAML would contain:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduling.k8s.io/group-name: spark-eb75b0d69737441c9b3d0e784692baee-podgroup
  labels:
    spark-app-name: spark-on-k8s-app
    spark-app-selector: spark-eb75b0d69737441c9b3d0e784692baee
    spark-role: driver
    spark-version: 3.3.0-SNAPSHOT
  name: spark-on-k8s-app-3582607fbfefa69e-driver
  namespace: spark-on-k8s
spec:
  schedulerName: volcano

Note: unrelated fields have been removed for brevity!

3. The next properties we need to set are:

--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep

VolcanoFeatureStep is responsible for adding a metadata annotation to all pods (driver and executors) so that Volcano recognizes them as manageable. See metadata > annotations in the YAML above.

In addition, this feature step creates a PodGroup so that all the pods are managed together (gang scheduling). The name of this PodGroup will be my-application-id-podgroup, and it will be created in the same Kubernetes namespace as all other Spark resources (property spark.kubernetes.namespace), but it won’t set any resource quotas by itself!

$ kubectl get podgroup -n spark-on-k8s
NAME                                              AGE
spark-eb75b0d69737441c9b3d0e784692baee-podgroup   32m
$ kubectl get podgroup/spark-eb75b0d69737441c9b3d0e784692baee-podgroup -n spark-on-k8s -o yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  creationTimestamp: "2022-03-25T07:17:09Z"
  generation: 36
  name: spark-eb75b0d69737441c9b3d0e784692baee-podgroup
  namespace: spark-on-k8s
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Pod
    name: spark-on-k8s-app-3582607fbfefa69e-driver
    uid: b7b15cc4-a01a-4d84-83cc-5a7d64e79baa
  resourceVersion: "897169"
  uid: f1737089-6562-40fe-be05-13e9c022670d
spec:
  minMember: 1
  minResources:
    cpu: "5"
    memory: 7Gi
  priorityClassName: medium
  queue: my-queue
status:
  conditions:
  - lastTransitionTime: "2022-03-25T07:49:37Z"
    reason: tasks in gang are ready to be scheduled
    status: "True"
    transitionID: 4055ec95-d1e2-4c98-9dbd-8214d3929c24
    type: Scheduled
  phase: Running
  succeeded: 1

To make use of Volcano’s features you should provide a PodGroup template with:

--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/my-podgroup-template.yaml

The template might contain any valid Volcano PodGroup settings.

Here is a sample template that sets minimum required CPU and memory resources, a PriorityClass and a Queue.
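A template along these lines, with values matching the PodGroup output shown above, could look like this (the concrete numbers and names are illustrative):

```yaml
# my-podgroup-template.yaml (illustrative values)
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
spec:
  minResources:               # minimum resources to reserve for the gang
    cpu: "5"
    memory: 7Gi
  priorityClassName: medium   # must reference a pre-created PriorityClass
  queue: my-queue             # must reference a pre-created Volcano Queue
```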

The PriorityClass and the Queue are cluster-wide Kubernetes resources, so they have to be pre-created by a user with the respective permissions.

Priority classes:

$ kubectl apply -f priority-classes.yaml
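As a sketch, a PriorityClass matching the medium class referenced in the PodGroup above could be defined like this (the value and description are illustrative):

```yaml
# priority-classes.yaml (illustrative): defines the "medium" PriorityClass
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: medium
value: 100                  # relative priority; larger values win
globalDefault: false
description: Medium priority class for Spark jobs
```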

Volcano Queue:

$ kubectl apply -f my-queue.yaml
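Similarly, a sketch of a Volcano Queue matching the my-queue referenced in the PodGroup could be (the weight and capability values are illustrative):

```yaml
# my-queue.yaml (illustrative): a Volcano Queue with a capped capacity
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: my-queue
spec:
  weight: 1                 # fair-share weight relative to other queues
  capability:               # cap on the resources this queue may consume
    cpu: "8"
    memory: 16Gi
```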

Congratulations! You have successfully run your first Spark job with Volcano!


