Steps to Install Hadoop 2.2.0 Release (YARN) on a Single-Node Cluster
Once the word count example at the end has run, check its output in the web interface by browsing the HDFS directory for the /output folder.
1. Prerequisites:
Java 6
Dedicated Unix user (hadoop) for Hadoop
SSH configured
Hadoop 2.x tarball
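Before moving on, it can help to confirm that the prerequisite tools are actually on the PATH. The following is a minimal sketch, not part of the original steps; the function names have_cmd and check_prereqs are hypothetical helpers introduced here for illustration, and it only checks that the commands exist, not their versions.

```shell
# Return 0 if the command named in $1 is available on PATH, 1 otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per command name given as an argument.
# Returns 0 if every command was found, 1 if any is missing.
# (Hypothetical helper; the command names passed in are up to you.)
check_prereqs() {
    status=0
    for cmd in "$@"; do
        if have_cmd "$cmd"; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            status=1
        fi
    done
    return "$status"
}
```

With the functions defined in the current shell, one possible use is:
$ check_prereqs java ssh tar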
2. Installation
$ tar -xvzf hadoop-2.2.0.tar.gz
$ mkdir -p /home/hadoop/yarn
$ mv hadoop-2.2.0 /home/hadoop/yarn/hadoop-2.2.0
$ cd /home/hadoop/yarn
$ sudo chown -R hadoop:hadoop hadoop-2.2.0
$ sudo chmod -R 755 hadoop-2.2.0
3. Set Up Environment Variables in .bashrc (optional)
# Setup for Hadoop 2.0.
export HADOOP_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_MAPRED_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_COMMON_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_HDFS_HOME=$HOME/yarn/hadoop-2.2.0
export YARN_HOME=$HOME/yarn/hadoop-2.2.0
export HADOOP_CONF_DIR=$HOME/yarn/hadoop-2.2.0/etc/hadoop
After adding these lines at the bottom of the .bashrc file, reload it:
$ source ~/.bashrc
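Later steps rely on these variables (for example, cd $YARN_HOME), so a quick check that they are all set can save some confusion. This is a small sketch under the assumption that the variable names are plain identifiers; unset_vars is a hypothetical helper, not something Hadoop ships.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
# Assumes each argument is a valid shell variable name.
unset_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use, which prints nothing when everything is set:
$ unset_vars HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME HADOOP_CONF_DIR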
4. Create Hadoop Data Directories
# Two directories, one for the NameNode and one for the DataNode.
$ mkdir -p $HOME/yarn/yarn_data/hdfs/namenode
$ mkdir -p $HOME/yarn/yarn_data/hdfs/datanode
5. Configuration
# Base directory.
$ cd $YARN_HOME
$ vi etc/hadoop/yarn-site.xml
Add the following contents inside the configuration tag:
# etc/hadoop/yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
$ vi etc/hadoop/core-site.xml
Add the following contents inside the configuration tag:
# etc/hadoop/core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
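$ vi etc/hadoop/hdfs-site.xml
Add the following contents inside the configuration tag:
# etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/yarn/yarn_data/hdfs/datanode</value>
</property>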
$ vi etc/hadoop/mapred-site.xml
If this file does not exist, create it and paste the content provided below:
# etc/hadoop/mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
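When mapred-site.xml has to be created from scratch, the property needs the enclosing configuration tag around it. The sketch below writes a complete minimal file; write_mapred_site is a hypothetical helper, the XML declaration line is an assumption about the usual layout of Hadoop config files, and the function overwrites any existing mapred-site.xml in the target directory.

```shell
# Write a minimal mapred-site.xml into the directory given as $1.
# The single property is mapreduce.framework.name=yarn from this step;
# the XML header and <configuration> wrapper are the assumed file layout.
# WARNING: replaces an existing mapred-site.xml in that directory.
write_mapred_site() {
    conf_dir=$1
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}
```

One possible use from the base directory:
$ write_mapred_site etc/hadoop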
6. Format NameNode (One-time Process)
# Command for formatting the NameNode.
$ bin/hadoop namenode -format
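Because formatting is a one-time step, a small guard can help avoid re-running it by accident. The sketch below assumes that a formatted NameNode directory contains a current/VERSION file; namenode_formatted is a hypothetical helper introduced for illustration.

```shell
# Return 0 if the NameNode directory given as $1 looks already formatted,
# i.e. it contains current/VERSION (an assumption about the on-disk layout);
# return 1 otherwise.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}
```

One possible use, which formats only when the directory does not look formatted yet:
$ namenode_formatted $HOME/yarn/yarn_data/hdfs/namenode || bin/hadoop namenode -format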
7. Starting HDFS Processes and Map-Reduce Processes
# HDFS (NameNode & DataNode).
$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemon.sh start datanode
# MR (Resource Manager, Node Manager & Job History Server).
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemon.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver
8. Verifying Installation
$ jps
# Console Output.
22844 Jps
28711 DataNode
29281 JobHistoryServer
28887 ResourceManager
29022 NodeManager
28180 NameNode
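Rather than reading the jps listing by eye, the check can be scripted. This is a rough sketch that assumes jps prints one "pid name" pair per line, as in the console output for this step; missing_daemons is a hypothetical helper and the list of expected names is taken from that output.

```shell
# Read jps-style output ("<pid> <name>" per line) on stdin and print
# each expected daemon name that does not appear in it.
# Prints nothing when all five daemons are present.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
        # -x matches whole lines only, so NameNode does not match
        # a longer name that merely contains it.
        if ! printf '%s\n' "$running" | grep -x "$daemon" >/dev/null; then
            echo "$daemon"
        fi
    done
}
```

One possible use:
$ jps | missing_daemons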
Running the Word Count Example Program
$ mkdir input
$ cat > input/file
This is word count example
using hadoop 2.2.0
(Press Ctrl+D to end the input.)
Add the input directory to HDFS:
$ bin/hdfs dfs -copyFromLocal input /input
Note: bin/hadoop dfs -copyFromLocal input /input also works, but it is deprecated in Hadoop 2.x.
Run the wordcount example jar provided in HADOOP_HOME:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output
Check the output in the NameNode web interface: http://localhost:50070
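To get a feel for what the job should produce for the two-line sample file, the same counting can be approximated locally with standard tools. This is only a sketch: local_wordcount is a hypothetical helper, and it assumes the example splits on whitespace, is case-sensitive, and writes tab-separated "word count" lines sorted by word.

```shell
# Approximate the wordcount job locally: split stdin on whitespace,
# count each distinct token, and print "word<TAB>count" sorted by word
# in byte order (LC_ALL=C).
local_wordcount() {
    tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -c |
        awk '{printf "%s\t%s\n", $2, $1}'
}
```

One possible use against the local copy of the input:
$ local_wordcount < input/file
The result can then be compared with the job output, which on a default setup would typically be readable with bin/hdfs dfs -cat /output/part-r-00000 (the part file name is an assumption about the reducer output naming).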
9. Stopping the Daemons
# Commands.
$ sbin/hadoop-daemon.sh stop namenode
$ sbin/hadoop-daemon.sh stop datanode
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/yarn-daemon.sh stop nodemanager
$ sbin/mr-jobhistory-daemon.sh stop historyserver
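The start and stop sequences in this guide differ only in the action word, so they can be generated from one place. The sketch below is a dry-run helper: daemon_commands is a hypothetical name, and it only prints the five commands in the order used here without executing anything.

```shell
# Print the per-daemon commands for the given action ("start" or "stop")
# in the order used in this guide. Nothing is executed.
daemon_commands() {
    action=$1
    case "$action" in
        start|stop) ;;
        *) echo "usage: daemon_commands start|stop" >&2; return 2 ;;
    esac
    echo "sbin/hadoop-daemon.sh $action namenode"
    echo "sbin/hadoop-daemon.sh $action datanode"
    echo "sbin/yarn-daemon.sh $action resourcemanager"
    echo "sbin/yarn-daemon.sh $action nodemanager"
    echo "sbin/mr-jobhistory-daemon.sh $action historyserver"
}
```

To review the commands first and then run them from the base directory, one option is:
$ daemon_commands stop
$ daemon_commands stop | sh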