MostafaOjaghi edited this page Jul 29, 2019 · 5 revisions

Spark 2.4.3

Download

Download Spark and extract it to the path where you want to install it.

Prerequisites

Add entries in hosts file (master and slaves)

Edit hosts file.

$ sudo vim /etc/hosts

Now add entries of master and slaves in hosts file.

<MASTER-IP> master
<SLAVE01-IP> slave1
<SLAVE02-IP> slave2
<SLAVE03-IP> slave3
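
The entries can also be staged in a scratch file and appended in one step; a minimal sketch, where the IP addresses are placeholder assumptions and hosts.cluster is a hypothetical staging file you review before touching /etc/hosts:

```shell
# Sketch: stage the cluster entries, then append to /etc/hosts after review.
# The IP addresses below are placeholders -- substitute your own.
cat > hosts.cluster <<'EOF'
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.13 slave3
EOF
# After checking hosts.cluster, apply it with:
#   sudo sh -c 'cat hosts.cluster >> /etc/hosts'
```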

Configure SSH (only master)

Generate key pairs:

$ ssh-keygen -t rsa -P ""

Copy the public key to each slave to enable passwordless SSH (repeat the command for every slave):

$ ssh-copy-id -p port remote_username@server_ip_address
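
Repeating the copy for every host can be scripted; a minimal sketch, assuming a hypothetical user named sparkuser and the hostnames defined in /etc/hosts (DRY_RUN=1 only prints the commands, set DRY_RUN=0 to actually copy the keys):

```shell
# Sketch: copy the public key to every cluster host.
# 'sparkuser' is an assumed username -- replace with your own.
# DRY_RUN=1 prints the commands instead of running them, so the
# loop can be checked before any keys are copied.
DRY_RUN=${DRY_RUN:-1}
for host in master slave1 slave2 slave3; do
  if [ "$DRY_RUN" = "1" ]; then
    echo "ssh-copy-id sparkuser@$host"
  else
    ssh-copy-id "sparkuser@$host"
  fi
done
```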

Verify passwordless SSH access to all the slaves:

$ ssh slave1
$ ssh slave2
$ ssh slave3

Install

Spark Master Configuration

Do the following procedures only in master.

Edit spark-env.sh

Move to the Spark conf folder and create a copy of the spark-env.sh template:

$ cd /usr/local/spark/conf
$ cp spark-env.sh.template spark-env.sh

Now edit the configuration file spark-env.sh.

$ sudo vim spark-env.sh

And set the following parameters.

export SPARK_MASTER_HOST='<MASTER-IP>'
export JAVA_HOME=<Path_of_JAVA_installation>

Note: JAVA_HOME must be set on the slaves as well.
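
The two settings can also be appended non-interactively; a sketch, where the IP and the JAVA_HOME path are placeholder assumptions (the path shown is typical for OpenJDK 8 on Ubuntu):

```shell
# Sketch: append the master settings to spark-env.sh in one step.
# Run from the Spark conf directory, or set SPARK_ENV to the full path.
# The IP and JAVA_HOME values below are placeholders -- use your own.
SPARK_ENV=${SPARK_ENV:-spark-env.sh}
cat >> "$SPARK_ENV" <<'EOF'
export SPARK_MASTER_HOST='192.168.1.10'
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
EOF
```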

Add Workers

Edit the slaves configuration file in /usr/local/spark/conf.

$ sudo vim slaves

And add the following entries.

slave1
slave2
slave3
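
Equivalently, the file can be written in one command; a sketch that assumes you are in the Spark conf directory (the slaves file is simply one worker hostname per line):

```shell
# Sketch: write the slaves file, one worker hostname per line.
# Run from the Spark conf directory, e.g. /usr/local/spark/conf.
printf '%s\n' slave1 slave2 slave3 > slaves
```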

Start Spark Cluster

To start the Spark cluster, run the following commands on the master.

$ cd /usr/local/spark
$ ./sbin/start-all.sh

To stop the Spark cluster, run the following commands on the master.

$ cd /usr/local/spark
$ ./sbin/stop-all.sh

Check whether services have been started

To check the running daemons, use $ jps on each node. The master should list a Master process and each slave a Worker process.

Spark Web UI

Browse the Spark web UI to see the worker nodes, running applications, and cluster resources.

Spark Master UI

http://<MASTER-IP>:8080/

Spark Application UI

http://<MASTER-IP>:4040/ (Spark's default application UI port; available only while an application is running)

Links

https://medium.com/ymedialabs-innovation/apache-spark-on-a-multi-node-cluster-b75967c8cb2b
