Apache Spark™ K8s Operator
Apache Spark™ K8s Operator is a subproject of Apache Spark that extends the Kubernetes resource manager to manage Apache Spark applications and clusters via the operator pattern.
Requirements
- Apache Spark 3.5+
- Kubernetes 1.30+ cluster
- Helm 3.0+
Install Helm Chart
$ helm repo add spark https://apache.github.io/spark-kubernetes-operator
$ helm repo update
$ helm install spark spark/spark-kubernetes-operator
$ helm list
NAME   NAMESPACE   REVISION   UPDATED                                STATUS     CHART                             APP VERSION
spark  default     1          2025-05-14 11:55:22.341181 -0700 PDT   deployed   spark-kubernetes-operator-0.1.0   0.1.0
Run Spark Pi App
$ kubectl apply -f https://apache.github.io/spark-kubernetes-operator/pi.yaml
sparkapplication.spark.apache.org/pi created
$ kubectl get sparkapp
NAME   CURRENT STATE      AGE
pi     ResourceReleased   4m10s
$ kubectl delete sparkapp pi
sparkapplication.spark.apache.org "pi" deleted
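The pi.yaml manifest applied above defines a SparkApplication custom resource. As a rough structural sketch only — the apiVersion and spec field names below are illustrative assumptions, not verified against the operator's CRD schema; consult the linked pi.yaml for the authoritative fields:

```yaml
# Hypothetical SparkApplication skeleton. apiVersion and all spec fields
# are assumptions for illustration; see the operator's pi.yaml example
# for the real schema.
apiVersion: spark.apache.org/v1alpha1
kind: SparkApplication
metadata:
  name: pi
spec:
  # application entry point and Spark configuration go here
  ...
```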
Run Spark Connect Server (a long-running app)
$ kubectl apply -f https://apache.github.io/spark-kubernetes-operator/spark-connect-server.yaml
sparkapplication.spark.apache.org/spark-connect-server created
$ kubectl get sparkapp
NAME                   CURRENT STATE    AGE
spark-connect-server   RunningHealthy   14h
$ kubectl delete sparkapp spark-connect-server
sparkapplication.spark.apache.org "spark-connect-server" deleted
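While the server is running, a client can attach to it over the Spark Connect protocol (default port 15002). A hedged sketch — the Kubernetes service name below is an assumption; check the service actually created by spark-connect-server.yaml before forwarding:

```shell
# Forward the Spark Connect port locally. The service name
# "spark-connect-server" is an assumption; verify with `kubectl get svc`.
$ kubectl port-forward svc/spark-connect-server 15002:15002

# In another terminal, attach a client (requires a local Spark 3.5+ install).
$ pyspark --remote sc://localhost:15002
```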
Run Spark Cluster
$ kubectl apply -f https://raw.githubusercontent.com/apache/spark-kubernetes-operator/refs/tags/v0.1.0/examples/prod-cluster-with-three-workers.yaml
sparkcluster.spark.apache.org/prod created
$ kubectl get sparkcluster
NAME   CURRENT STATE    AGE
prod   RunningHealthy   10s
$ kubectl delete sparkcluster prod
sparkcluster.spark.apache.org "prod" deleted
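The example applied above defines a SparkCluster custom resource (a master plus three workers, per the file name). A structural sketch only — the apiVersion and spec contents are illustrative assumptions; see the linked example YAML for the real schema:

```yaml
# Hypothetical SparkCluster skeleton. apiVersion and spec contents are
# assumptions for illustration; see the linked
# prod-cluster-with-three-workers.yaml for the authoritative fields.
apiVersion: spark.apache.org/v1alpha1
kind: SparkCluster
metadata:
  name: prod
spec:
  # master/worker sizing and Spark configuration go here
  ...
```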
Clean Up
Check for existing Spark applications and clusters; if any exist, delete them.
$ kubectl get sparkapp
No resources found in default namespace.
$ kubectl get sparkcluster
No resources found in default namespace.
Remove the Helm chart and CRDs.
$ helm uninstall spark
$ kubectl delete crd sparkapplications.spark.apache.org
$ kubectl delete crd sparkclusters.spark.apache.org