Hadoop
NOTE: The compatible Hadoop version is 2.7.7
NOTE: The compatible HBase version is 1.2.4
On every node, create a hadoop user with sudo access:
adduser hadoop
usermod -aG sudo hadoop
Then append an entry for each node to /etc/hosts, one per line:
<server-ip> <hostname>
For example, this is for the master node:
127.0.0.1 localhost.localdomain localhost
master-ip server.domain.com server
slave1-ip slave1
slave2-ip slave2
slave3-ip slave3
master-ip master
Download the Hadoop 2.7.7 release from the Apache releases page with wget and extract it with tar xvf.
Then move the extracted folder to /usr/local/hadoop.
Then grant the hadoop user ownership of the folder with chown:
chown -R hadoop:hadoop /usr/local/hadoop
Very important: from now on, switch to the hadoop user with su hadoop.
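The download and install steps above can be sketched as a single sequence (the Apache archive URL is an assumption — pick whatever mirror the releases page gives you):

```shell
# Sketch of the download/extract/move/chown steps for Hadoop 2.7.7.
# The archive URL is an assumption; substitute your preferred mirror.
HADOOP_VERSION=2.7.7
wget "https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz"
tar xvf "hadoop-${HADOOP_VERSION}.tar.gz"
sudo mv "hadoop-${HADOOP_VERSION}" /usr/local/hadoop
sudo chown -R hadoop:hadoop /usr/local/hadoop
```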
Add the following to ~/.bashrc, changing /usr/lib/jvm/jdk1.8.0_211 to your Java home directory:
export J2SDKDIR="/usr/lib/jvm/jdk1.8.0_211"
export J2REDIR="/usr/lib/jvm/jdk1.8.0_211/jre"
export JAVA_HOME="/usr/lib/jvm/jdk1.8.0_211"
export DERBY_HOME="/usr/lib/jvm/jdk1.8.0_211/db"
export HADOOP_HOME="/usr/local/hadoop"
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/db/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
WARNING: You must source ~/.bashrc and hadoop-env.sh after changing them.
In $HADOOP_HOME/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
Set the following in $HADOOP_HOME/etc/hadoop/hadoop-env.sh (again replacing the path with your Java home directory):
export JAVA_HOME="/usr/lib/jvm/jdk1.8.0_211"
export HDFS_NAMENODE_USER="hadoop"
export HDFS_DATANODE_USER="hadoop"
export HDFS_SECONDARYNAMENODE_USER="hadoop"
export YARN_RESOURCEMANAGER_USER="hadoop"
export YARN_NODEMANAGER_USER="hadoop"
In $HADOOP_HOME/etc/hadoop/hdfs-site.xml on the master:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- the namenode property below is only for the master -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
In $HADOOP_HOME/etc/hadoop/hdfs-site.xml on the slaves:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- the datanode property below is only for the slaves -->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
In $HADOOP_HOME/etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.jobtracker.address</name>
<value>master:54311</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
In $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
</configuration>
On the master, format the namenode:
hdfs namenode -format
If you are using a firewall, you'll need to open ports 9000, 54311, 50070 (HDFS web UI), and 8088 (YARN web UI).
Do this on each node. Before starting, run su hadoop and make sure the prompt shows hadoop@server.
Then do the following:
ssh-keygen -t rsa -P ""
Check that the key was generated:
cat ~/.ssh/id_rsa.pub
Copy this key.
Go to all other nodes, run nano ~/.ssh/authorized_keys, and paste the key there.
You can also copy the key with:
ssh-copy-id hadoop@hostname.example.com
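Since every node needs the key, the ssh-copy-id step can be looped over all of them (the hostnames here are the example names used earlier; substitute your own):

```shell
# Push the hadoop user's public key to every node in one pass.
# Hostnames are the example names from this guide - adjust as needed.
for host in master slave1 slave2 slave3; do
  ssh-copy-id "hadoop@$host"
done
```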
On master open this file:
nano ~/.ssh/config
and write these:
Host master
HostName [hostname]
User hadoop
IdentityFile ~/.ssh/id_rsa
Host slave1
HostName [hostname]
User hadoop
IdentityFile ~/.ssh/id_rsa
Host slave2
HostName [hostname]
User hadoop
IdentityFile ~/.ssh/id_rsa
Host slave3
HostName [hostname]
User hadoop
IdentityFile ~/.ssh/id_rsa
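The Host blocks above all follow one pattern, so they can also be generated with a small loop (a sketch; the host list is an assumption, and each HostName should be replaced with the node's real hostname):

```shell
# Generate ~/.ssh/config entries for a list of nodes.
# CONFIG_FILE can be overridden, e.g. to preview the output elsewhere.
CONFIG_FILE="${CONFIG_FILE:-$HOME/.ssh/config}"
mkdir -p "$(dirname "$CONFIG_FILE")"
: > "$CONFIG_FILE"                       # start from an empty file
for host in master slave1 slave2 slave3; do
  {
    echo "Host $host"
    echo "  HostName $host"              # replace with the real hostname
    echo "  User hadoop"
    echo "  IdentityFile ~/.ssh/id_rsa"
  } >> "$CONFIG_FILE"
done
```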
Repeat the same pattern for any additional nodes.
On the master node:
nano $HADOOP_HOME/etc/hadoop/slaves
Write:
localhost
hadoop-worker-01-server-ip
hadoop-worker-02-server-ip
hadoop-worker-03-server-ip
and:
nano $HADOOP_HOME/etc/hadoop/masters
Write:
master
NOTE: Create the masters file on all nodes.
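Equivalently, the two files can be written from the shell (a sketch: the worker names below reuse the example hostnames from this guide, and the path falls back to a scratch directory if HADOOP_HOME is unset):

```shell
# Write the slaves and masters files under $HADOOP_HOME/etc/hadoop.
# Falls back to /tmp/hadoop when HADOOP_HOME is not set (assumption).
CONF_DIR="${HADOOP_HOME:-/tmp/hadoop}/etc/hadoop"
mkdir -p "$CONF_DIR"
printf '%s\n' localhost slave1 slave2 slave3 > "$CONF_DIR/slaves"
printf 'master\n' > "$CONF_DIR/masters"
```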
NOTE: It may be necessary to add the line below to hadoop-env.sh:
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
In hadoop-env.sh there is the HADOOP_SSH_OPTS environment variable (it passes extra options to the ssh commands the start/stop scripts use to reach the workers); if your nodes use a non-standard SSH port, you can try setting it like so:
export HADOOP_SSH_OPTS="-p <num>"
Also not certain about this one, but the same option exists in hbase-env.sh:
export HBASE_SSH_OPTS="-p <num>"
Once done setting all the configs, restart the Hadoop services. ([Alireza]: Don't use these; they are deprecated.)
stop-all.sh
start-all.sh
[Alireza]: Use these instead:
start-dfs.sh
start-yarn.sh
stop-dfs.sh
stop-yarn.sh
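After starting the daemons, it's worth verifying that everything came up. A minimal check (jps ships with the JDK, hdfs dfsadmin with Hadoop; the expected process lists assume the master/slave layout from this guide):

```shell
start-dfs.sh
start-yarn.sh
jps                    # master should list NameNode, SecondaryNameNode, ResourceManager
                       # each slave should list DataNode, NodeManager
hdfs dfsadmin -report  # shows live datanodes and their capacity
```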
The first and best reference guide (it targets Ubuntu 18.04):
https://linuxconfig.org/how-to-install-hadoop-on-ubuntu-18-04-bionic-beaver-linux