Steps to install the Hadoop 2.2.0 (YARN) stable release on a single-node cluster

1. Prerequisites:

  1. Java 6 or later

  2. A dedicated Unix user (hadoop) for Hadoop

  3. SSH configured (passwordless login to localhost)

  4. The Hadoop 2.x tarball ( )
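The list above can be sanity-checked from the shell before going further. This is only a sketch: it looks for the java, ssh, and tar commands on the PATH and reports what it finds; adjust the list to your system.

```shell
# Pre-flight check for the prerequisites above (sketch only).
for cmd in java ssh tar; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: MISSING"
  fi
done
```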


2. Installation

 
$ tar -xvzf hadoop-2.2.0.tar.gz
$ mv hadoop-2.2.0 /home/hadoop/yarn/hadoop-2.2.0
$ cd /home/hadoop/yarn
$ sudo chown -R hadoop:hadoop hadoop-2.2.0
$ sudo chmod -R 755 hadoop-2.2.0


3. Set Up Environment Variables in .bashrc (later steps use $YARN_HOME)

# Setup for Hadoop 2.2.0.
export HADOOP_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_MAPRED_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_COMMON_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_HDFS_HOME=$HOME/yarn/hadoop-2.2.0
export YARN_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_CONF_DIR=$HOME/yarn/hadoop-2.2.0/etc/hadoop

After adding these lines at the bottom of the .bashrc file, reload it:
$ source .bashrc

4. Create Hadoop Data Directories

# Two directories, one for the NameNode and one for the DataNode.
$ mkdir -p $HOME/yarn/yarn_data/hdfs/namenode
$ mkdir -p $HOME/yarn/yarn_data/hdfs/datanode

5. Configuration

# Base directory.
$ cd $YARN_HOME

$ vi etc/hadoop/yarn-site.xml
Add the following inside the <configuration> tag:
# etc/hadoop/yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
 
$ vi etc/hadoop/core-site.xml
Add the following inside the <configuration> tag:
# etc/hadoop/core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
$ vi etc/hadoop/hdfs-site.xml
Add the following inside the <configuration> tag:
# etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/hadoop/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/hadoop/yarn/yarn_data/hdfs/datanode</value>
</property>
$ vi etc/hadoop/mapred-site.xml
If this file does not exist, create it (in 2.2.0 you can copy etc/hadoop/mapred-site.xml.template) and add the following inside the <configuration> tag:
# etc/hadoop/mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
 
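Every file in this step follows the same <property> name/value pattern. As a small aid, a shell helper (emit_property is a hypothetical name, not part of Hadoop) can print such a block and avoid hand-editing typos:

```shell
# Hypothetical helper: print one Hadoop <property> block for a name/value pair.
emit_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# For example, the core-site.xml entry from above:
emit_property fs.default.name hdfs://localhost:9000
```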

6. Format the NameNode (one-time process)

# Command for formatting the NameNode.
$ bin/hadoop namenode -format

7. Starting the HDFS and MapReduce Processes

# HDFS (NameNode & DataNode).
$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemon.sh start datanode

# MR (ResourceManager, NodeManager & JobHistoryServer).
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemon.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver

8. Verifying Installation

$ jps
# Console output.
22844 Jps
28711 DataNode
29281 JobHistoryServer
28887 ResourceManager
29022 NodeManager
28180 NameNode
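The check above can be scripted. This sketch hard-codes the sample jps output from the listing; in practice you would capture it with jps_out=$(jps):

```shell
# Check jps-style output for the five expected daemons.
# Sample output hard-coded here; in practice use: jps_out=$(jps)
jps_out='22844 Jps
28711 DataNode
29281 JobHistoryServer
28887 ResourceManager
29022 NodeManager
28180 NameNode'

for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
  if printf '%s\n' "$jps_out" | grep -q " $d$"; then
    echo "$d: running"
  else
    echo "$d: NOT running"
  fi
done
```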

Running the Wordcount Example Program

$ mkdir input
$ cat > input/file
This is word count example
using hadoop 2.2.0
Add input directory to HDFS:
$ bin/hdfs dfs -copyFromLocal input /input
Note: bin/hadoop fs -copyFromLocal input /input is the older equivalent form.
Run wordcount example jar provided in HADOOP_HOME:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output
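To see what the job computes without Hadoop, here is a plain-shell equivalent of wordcount on the same two-line input (a sketch of the semantics only, using standard Unix tools):

```shell
# Recreate the sample input file from above.
mkdir -p input
printf 'This is word count example\nusing hadoop 2.2.0\n' > input/file

# Split on whitespace, then count occurrences of each word;
# this is the same (word, count) result the MapReduce job produces.
tr -s ' ' '\n' < input/file | sort | uniq -c | sort -rn
```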
Check the output in the NameNode web interface at http://localhost:50070 by browsing the HDFS file system to the /output directory.

9. Stopping the Processes

# Commands.
$ sbin/hadoop-daemon.sh stop namenode
$ sbin/hadoop-daemon.sh stop datanode
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/yarn-daemon.sh stop nodemanager
$ sbin/mr-jobhistory-daemon.sh stop historyserver