Overview
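At work I needed to submit Spark applications from inside a container; this note records the container configuration involved.
The settings are passed mainly through environment variables to the launch script that runs inside the container.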
Spark configuration
| Property Name | Description |
|---|---|
| spark.driver.bindAddress | Container IP address, or simply 0.0.0.0 |
| spark.driver.host | Host machine IP address |
| spark.driver.port | Port the driver listens on |
| spark.ui.port | Port of the application dashboard (web UI) |
| spark.blockManager.port | Port the block manager listens on |
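
The mapping from environment variables to the properties in the table above could be sketched as follows. This is only an illustration: the real launch script is not shown in this note, the function name `spark_conf` is hypothetical, and `SPARK_DRIVER_BIND_ADDRESS` is an assumed optional variable that falls back to 0.0.0.0. The other variable names follow the Docker and Kubernetes examples in this note.

```shell
# Hypothetical helper for a container launch script (a sketch, not the author's script).
# Prints one "property=value" line per Spark property from the table.
spark_conf() {
  # ${VAR:?message} aborts with an error when a required variable is unset or empty.
  # SPARK_DRIVER_BIND_ADDRESS is an assumed, optional variable; 0.0.0.0 is the default.
  printf '%s\n' \
    "spark.driver.bindAddress=${SPARK_DRIVER_BIND_ADDRESS:-0.0.0.0}" \
    "spark.driver.host=${SPARK_DRIVER_HOST:?SPARK_DRIVER_HOST is required}" \
    "spark.driver.port=${SPARK_DRIVER_PORT:?SPARK_DRIVER_PORT is required}" \
    "spark.ui.port=${SPARK_UI_PORT:?SPARK_UI_PORT is required}" \
    "spark.blockManager.port=${SPARK_BLOCKMGR_PORT:?SPARK_BLOCKMGR_PORT is required}"
}

# Possible use at the end of a bash launch script:
#   args=()
#   while IFS= read -r kv; do args+=(--conf "$kv"); done < <(spark_conf)
#   exec spark-submit "${args[@]}" "$@"
```

Emitting one property per line keeps the helper easy to inspect and lets the caller decide how to turn the lines into `--conf` arguments.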
Docker

    docker run \
      -ti \
      --rm \
      -p 5000-5010:5000-5010 \
      -e SPARK_DRIVER_PORT=5001 \
      -e SPARK_UI_PORT=5002 \
      -e SPARK_BLOCKMGR_PORT=5003 \
      -e SPARK_DRIVER_HOST="host.domain" \
      spark-driver

Kubernetes
Note: the following approach applies only when the Spark cluster runs outside the Kubernetes cluster.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: spark-driver-config
    data:
      SPARK_CONF_DIR: "/etc/spark/conf"
      HADOOP_CONF_DIR: "/etc/hadoop/conf"
      HADOOP_USER_NAME: hadoop
      SPARK_DRIVER_PORT: "5001"
      SPARK_UI_PORT: "5002"
      SPARK_BLOCKMGR_PORT: "5003"
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: spark-driver-deploy
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: spark-driver
      template:
        metadata:
          labels:
            app: spark-driver
        spec:
          containers:
            - name: spark-driver-container
              image: spark-driver
              imagePullPolicy: Always
              command: ["/bin/start.sh"]
              ports:
                - name: app
                  containerPort: 3000
                # hostPort maps the container port directly to the same port
                # on the node the Pod is scheduled to
                - name: spark-driver
                  containerPort: 5001
                  hostPort: 5001
                - name: spark-ui
                  containerPort: 5002
                  hostPort: 5002
                - name: spark-blockmgr
                  containerPort: 5003
                  hostPort: 5003
              env:
                # Inject the IP of the node hosting the Pod into the
                # container's environment through the Downward API
                - name: SPARK_DRIVER_HOST
                  valueFrom:
                    fieldRef:
                      fieldPath: status.hostIP
              envFrom:
                - configMapRef:
                    name: spark-driver-config
          volumes:
            - name: spark-config-volume
              configMap:
                defaultMode: 0744
                name: spark-driver-conf
            - name: hadoop-config-volume
              configMap:
                defaultMode: 0744
                name: hadoop-conf
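
Because the driver host comes from the Downward API and the ports come from the ConfigMap, a launch script may want to fail fast when any of them is missing or malformed. A minimal sketch of such a check, assuming hypothetical function names `is_port` and `check_driver_env` (the actual `/bin/start.sh` is not shown in this note):

```shell
# Hypothetical pre-flight check for a container launch script (a sketch only).

# Succeeds when $1 is a decimal number in the range 1..65535.
is_port() {
  case "$1" in
    ''|*[!0-9]*|??????*) return 1 ;;  # empty, contains a non-digit, or longer than 5 digits
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Succeeds when the host is non-empty and all three ports are valid.
# Every problem found is reported on stderr before returning 1.
check_driver_env() {
  rc=0
  [ -n "${SPARK_DRIVER_HOST:-}" ] || { echo "SPARK_DRIVER_HOST is empty" >&2; rc=1; }
  is_port "${SPARK_DRIVER_PORT:-}" || { echo "SPARK_DRIVER_PORT is not a valid port" >&2; rc=1; }
  is_port "${SPARK_UI_PORT:-}" || { echo "SPARK_UI_PORT is not a valid port" >&2; rc=1; }
  is_port "${SPARK_BLOCKMGR_PORT:-}" || { echo "SPARK_BLOCKMGR_PORT is not a valid port" >&2; rc=1; }
  return "$rc"
}
```

A launch script could call `check_driver_env || exit 1` before invoking `spark-submit`, so that a misconfigured Pod stops at startup instead of failing later when executors try to connect back to the driver.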