Installing Tachyon (In-Memory File System) As A Cluster

This post shows how to set up and configure Tachyon as a cluster.

Quick Pointers:

We are using Tachyon version 0.6.4.

1 – Layout

We will have one master and 5 workers (the master also doubles as a worker!).

2 – Process

  1. configure files on master
  2. push config files from master to all workers
  3. start Tachyon cluster from master

3 – Installation Pre-Requisites

3.1 –  Linux Machines

We have provisioned Linux machines on Amazon AWS.  Here is our configuration:

  • OS : CentOS 6.4 (AMI image ???)
  • instance type : m3.large
  • CPU : 2 cores
  • Memory : 7G

We will refer to the machines as:

  • master / worker1
  • worker2
  • worker3
  • worker4
  • worker5

3.2 – Setting up Machines for Password-less SSH

Only do this if you don't have an existing SSH setup between the machines.  The master should be able to log in to all workers via SSH.

Step 1 : Login to master

Step 2 : Create SSH key

$ ssh-keygen
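If you prefer to skip the interactive prompts, you can generate the key non-interactively (the key path below is the default; adjust it if yours differs):

$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa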

Step 3 : Set up password-less login

$ ssh-copy-id worker2

Enter the credentials when prompted.

If you cannot log in with username/password credentials but must log in with a key (as is the case with fresh EC2 instances), you can use this command instead:

$ cat ~/.ssh/id_rsa.pub | ssh -i your-key worker1 "mkdir -p ~/.ssh && cat >>  ~/.ssh/authorized_keys"
The command above assumes that you have temporarily copied your-key, which you used to log in to the master instance, to its home directory. Remove the key after you are done.

Now repeat this for all the remaining workers:

$ ssh-copy-id worker3
..... and so on for worker4 and worker5

You may have to add the key manually for the 'master' host, the one on which you do the work. Then

$ ssh localhost

will also work.
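A minimal sketch of that manual step, assuming the default key location:

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys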

3.3 – Testing SSH-password-less login

Step 1 :
Let's create a $HOME/hosts file with the following contents.  We want to list the IP addresses of all hosts in the cluster:

<master ip address>
<worker2 ip address>
<worker3 ip address>
<worker4 ip address>
<worker5 ip address>
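For example, with placeholder private IPs (yours will differ), the finished file might look like this:

10.0.0.11
10.0.0.12
10.0.0.13
10.0.0.14
10.0.0.15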

We have a utility script called 'cluster-cmd.sh'.  It executes commands on remote nodes.

Get the code and execute it:

$ wget https://raw.githubusercontent.com/elephantscale/tachyon-exploration/master/cluster-cmd.sh
$ chmod 755 cluster-cmd.sh
$ ./cluster-cmd.sh -h hosts ls

This will execute the `ls` command on all nodes listed in the hosts file.  The following is a sample output…

==== worker1 ====
tmp
==== worker2 ====
tmp
 .... and so on

(You may have to add the host names and IP addresses to /etc/hosts.)
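For example, the /etc/hosts entries might look like this (the IP addresses below are placeholders):

10.0.0.11   master worker1
10.0.0.12   worker2
10.0.0.13   worker3
10.0.0.14   worker4
10.0.0.15   worker5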

4 – Getting, Installing and Configuring Tachyon

On the master node:

Step 1 : Get Tachyon

$   wget https://github.com/amplab/tachyon/releases/download/v0.6.4/tachyon-0.6.4-bin.tar.gz 
$   tar xvf tachyon-0.6.4-bin.tar.gz 
$   mv tachyon-0.6.4/ tachyon 

So our Tachyon installation directory is $HOME/tachyon.

Step 2 : Configure Tachyon

We will configure Tachyon on the master and push the changes out to all worker nodes.

Step 2A : Generate the config file

Use the bootstrap script to generate the config file $HOME/tachyon/conf/tachyon-env.sh:

$ ~/tachyon/bin/tachyon bootstrap-conf <tachyon_master_hostname> 

For this, use the master's public IP address (not the internal IP address).

Three things need to be set in the config file:

  1. JAVA_HOME  : make sure to set this on top of the script!
  2. TACHYON_MASTER_ADDRESS : set by bootstrap-conf
  3. TACHYON_WORKER_MEMORY_SIZE : set by bootstrap-conf

Look for the XXX tag in the config file.
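For reference, the finished lines in tachyon-env.sh look roughly like this (the path, address, and size below are examples only; use your own values):

export JAVA_HOME=/usr/java/default
export TACHYON_MASTER_ADDRESS=54.12.34.56
export TACHYON_WORKER_MEMORY_SIZE=5GB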

Note: if you don't have Java installed (as on fresh EC2 instances), now is a good time to install it on all hosts.
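One way to do that on CentOS, assuming you have sudo access on every node (the OpenJDK package below is just one option):

$ ./cluster-cmd.sh -h hosts "sudo yum install -y java-1.7.0-openjdk"

If sudo over SSH complains about requiring a tty, run the install on each node directly.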

Step 2B : Edit  $HOME/tachyon/conf/workers

This file lists the IP addresses of all worker nodes, one per line.  In our case, it has the same contents as $HOME/hosts.

$ cp    ~/hosts   ~/tachyon/conf/workers
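A quick sanity check that the file looks right (one IP per line):

$ cat ~/tachyon/conf/workers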

Step  3 : Distribute Config files to all nodes

For this, we are going to use another handy script, copy-files.sh (available on GitHub).

Execute it like this

$ wget https://raw.githubusercontent.com/elephantscale/tachyon-exploration/master/copy-files.sh
$ chmod 755 copy-files.sh
$  ./copy-files.sh

This will push the tachyon directory out to all nodes.
After this, all nodes will have Tachyon installed at $HOME/tachyon.
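If you are curious, the effect is roughly the following one-liner (a sketch only; it assumes rsync is installed on all nodes, and the actual copy-files.sh script may do it differently):

$ for h in $(cat ~/hosts); do rsync -az ~/tachyon/ "$h":~/tachyon/ ; done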

Step 4 : Let's verify the Tachyon files are installed on all nodes

$ ./cluster-cmd.sh   -h hosts   ls

and you should see the tachyon directory listed on all nodes.

Now we are all set to run Tachyon!

5 – Running Tachyon

Step 1 : Format the Tachyon file system

This is a destructive command: you will lose all files stored in the Tachyon File System.  Beware!

$ ~/tachyon/bin/tachyon format

Your output may look like this….

Formatting Tachyon Worker @ worker1
Formatting Tachyon Worker @ worker2
....

Step 2 : Manual fix for a format bug

The format command does not create the data directory in the storage directory.  We will create it manually:

$ ~/cluster-cmd.sh  -h hosts "mkdir -p $HOME/tachyon/underfs/tmp/tachyon/data"
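And verify that the directory now exists on every node:

$ ~/cluster-cmd.sh -h hosts "ls -d ~/tachyon/underfs/tmp/tachyon/data"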

Step 3 : Let’s start Tachyon!

$  ~/tachyon/bin/tachyon-start.sh all SudoMount

If everything went OK, we will see RAMDISK mounts on all nodes.

$ ~/cluster-cmd.sh -h hosts  "(mount | grep ramdisk)"

We should see output like this…

====== worker1 ======
ramfs on /mnt/ramdisk type ramfs (rw,size=5009266kb)
====== worker2 ======
ramfs on /mnt/ramdisk type ramfs (rw,size=5009266kb)
......

Step 4 : Check UI

The Tachyon Master UI is available on port 19999 of the master.

So go to http://master-host-ip-address:19999

in your browser.  You will see something like the screenshots below.

[Screenshots: Tachyon master web UI]
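If you would rather check from the shell (assuming curl is installed), an HTTP 200 response means the UI is up:

$ curl -s -o /dev/null -w "%{http_code}\n" http://master-host-ip-address:19999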

Yay! We have Tachyon up and running.

 

6 – Testing / Kicking The Tires

Step 1: Let’s copy some files into Tachyon.

For this, we will use the $HOME/tachyon/bin/tachyon command-line client.

$ ~/tachyon/bin/tachyon tfs copyFromLocal  ~/hosts   /hosts
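You can also list the Tachyon namespace from the command line:

$ ~/tachyon/bin/tachyon tfs ls /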

Inspect the file using the File Browser in the Tachyon UI.

[Screenshot: Tachyon file browser showing the /hosts file]

Step 2: Creating some test data

We will create some reasonably large data files to copy into Tachyon.

# the following will create a 1G file
$  dd if=/dev/zero of=1G bs=1M count=1000
# 2G file
$  dd if=/dev/zero of=2G bs=1M count=2000

Copy them to Tachyon:

$ ~/tachyon/bin/tachyon tfs copyFromLocal  1G    /1a
# copy again
$ ~/tachyon/bin/tachyon tfs copyFromLocal  1G    /1b

Check File Browser in Tachyon UI.

[Screenshot: Tachyon file browser showing /1a and /1b]

As you can see, the file is only copied to ONE Tachyon node.  The 'copyFromLocal' command only copies to the local node.

Let's test this by logging in to another worker node and copying a file:

$  ssh worker2
# create a 1G file
$ dd if=/dev/zero of=1G bs=1M count=1000
$ ~/tachyon/bin/tachyon tfs copyFromLocal  1G    /2a

And check the UI again. As you can see, the files live only on their local nodes.

[Screenshots: Tachyon UI showing the files stored only on their local nodes]

 

Conclusion

In this post, we have shown you how to install Tachyon as a cluster and use it.
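When you are done experimenting, the cluster can be stopped from the master with the companion stop script that ships in the same bin directory (arguments may vary slightly between Tachyon versions):

$ ~/tachyon/bin/tachyon-stop.sh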

 

 

Written by: Tim Fox

Tim Fox is an AI and Data Engineering consultant focused on engineering solutions in Artificial Intelligence, Machine Learning, Big Data Architecture, Data Science, and Analytics.
