hadoop on ARM (part 2)

Posted: July 10, 2012 in linaro, server

If you’ve viewed part1, be sure to go back and have a relook as I had an error in some of the xml conf files.

Be aware this is a work in progress. What I have here works, but the steps to setup the cluster could use some polish and optimization for ease of use.

While hadoop running on one node is slightly interesting. hadoop running across several ARM nodes, now that’s more like it. It’s time to add in additional nodes. In my case I’m going to have a 4 node hadoop cluster made up of 3 TI panda boards and 1 freescale imx53. Let’s walk through the steps to get that up and running.  At this end of this exercise, there’s a great opportunity to have a hadoop-server image which is mostly setup.

Network Topology

In hadoop you must specify one machine as the master. The rest will be slaves. So across the collection of machines, let’s first get the hostnames all setup and organized. You’ll want to also get all the machines setup with static ip addresses.

One way to setup a static ip address is to edit /etc/network/interfaces. All my machines are on a 192.168.1.x network. You’ll have to make adjustments as are appropriate for your setup. Note in this example the line involving dhcp is commented out:

auto lo
iface lo inet loopback
auto eth0
#iface eth0 inet dhcp
iface eth0 inet static
  address 192.168.1.15
  netmask 255.255.255.0
  gateway 192.168.1.1

Next we’ll update the /etc/hosts file with entries for master and all the slaves across all the machines. Edit /etc/hosts on the respective machines with their ip address to name mappings. Here’s an example from the system that is named master. Note the respective system name appears on first line:

127.0.0.1       localhost master
::1             localhost ip6-localhost ip6-loopback
fe00::0         ip6-localnet
ff00::0         ip6-mcastprefix
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters
192.168.1.14 master
192.168.1.15 slave1
192.168.1.16 slave2
192.168.1.17 slave3

Next it’s time to update the /etc/hostname file on all the systems with it’s name. On my system named master  for instance it’s as simple as this:

master

With that complete, go back to my previous post and repeat those steps for each node. However there are some exceptions.

  1. If you issue start-all.sh make sure you shut down the node with end-all.sh
  2. Reboot each node after you’ve completed.

All done? Good! Now we’re ready to connect the nodes so that work will be dispatched across the cluster. First we net to get hduser’s pub ssh key onto all the slaves. As the hduser on your master node, issue the following for each slave node:

master$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slave1

Afterwards test that you can ssh from the master to each of the slave nodes. It’s extremely important this works.

master $ slogin slave1

Multinode hadoop configuration

Now it’s time to configure the various services that will run. Just like in the first blog post, you can have your master node run as both a slave node and a master node, or you can have it run just as a master node. Up to you!

On the master node or nodes, it’s their job to run NodeNode and JobTracker. On the slave nodes, it’s their job to run DataNode and TaskTracker. We’ll now configure this.

On the master machine as root edit /usr/local/hadoop/conf/masters and change to the name of your master machine and localhost:

localhost
master

Now on the master machine we are going to tell it what nodes are it’s slaves. This is done by listing the names of the machines on /usr/local/hadoop/conf/slaves. If you want the master machine to also serve as a slave then you need to list it. If you don’t want the master to be a slave then you should remove the entry for localhost and make sure the name of the master isn’t listed. You have to update this file on all machines in the cluster.

localhost
master
slave1
slave2
slave3

Now for all machines in the cluster, we need to update /usr/local/hadoop/conf/core-site.xml. Specifically look for

<value>hdfs://localhost:54310</value>

and change localhost to the name of your master node.

<value>hdfs://master:54310</value>

Now we’re going to update /usr/local/hadoop/conf/mapred-site.xml again and all machines which specifies where the JobReducer is run. This runs on our master node. Look for

<value>localhost:54311</value>

and change to

<value>master:54311</value>

Next on all nodes edit /usr/local/hadoop/conf/hdfs-site.xml to adjust the value for dfs.replication. This value needs to be equal to or less then the number of slave nodes you have in your cluster. Specifically it controls how many nodes data has to be copied to before it the job starts. The default for the file if unchanged is 3. If the number is larger than the number of slave nodes that you have, your jobs will experience errors. Here’s how mine is setup which is acceptable since I have a total of 4 nodes.

 <value>3</value>

Next on the master node, su – hduser and issue the following to format the hdfs file system.

master ~$ hadoop namenode -format

Starting the cluster daemons

Now it’s time to start the cluster. Here’s the high level order:

  1. HDFS daemons are started, this starts the NameNode daemon on the master node
  2. DataNode daemons are started on all slaves
  3. JobTracker is started on master
  4. TaskTracker daemons are started on all slaves

Here’s the commands in practice. On master as hduser:

hduser@master:~$ start-dfs.h

Presuming things successfully start you can run jps and see the following:

hduser@master:~$ jps
3658 Jps
3203 DataNode
3536 SecondaryNameNode
2920 NameNode

If you were to also run jps on your slave nodes you’d notice that DataNode is also running there.

Now we will start the MapReducer and JobTracker on master. This is done with:

hduser@master:~$ start-mapred.sh 

Presuming you don’t encounter any errors, your cluster should be fully up and running. In my next blog post I’ll do some runs with the cluster and do some performance measurements.
 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s