# Building the world’s smallest Cassandra cluster

A few weeks ago I received my first Intel Edison in the mail. After a bit of tinkering I was able to bootstrap a small Cassandra cluster on this little SoC. I tweeted a picture showing it up and running with OpsCenter connected in the background.

I will outline how to bootstrap the Intel Edison and install a functioning Cassandra cluster. Note that even though you can run Cassandra on the Edison, that doesn’t necessarily mean you should.

## What is an Intel Edison?

Image via SparkFun

The Intel Edison is a development platform for embedded applications boasting impressive specs:

• Dual core Intel Atom SoC (500MHz)
• 1GB LPDDR3 RAM
• 802.11 a/b/g/n Wifi
• Bluetooth 4.0 LE

## Components

In this project we need to provide power, serial console access, and some additional storage capacity. These features can be stacked onto the Edison through blocks. If you’re familiar with Arduino shields, the concept is the same.

At the time of publication a single node in our cluster costs approximately \$119 USD.

## Node Assembly

Assembled Node

After all of the components have arrived it is time to start assembling the node. Each node consists of 3 layers (Edison => Console Block => MicroSD Block) which are connected via 70-pin connectors on the top and bottom of the blocks. The order of the Console and MicroSD blocks is not important, but the Edison must be placed on the top. In our assembly we have placed the MicroSD block on the bottom as fewer components are exposed in this configuration.

1. Begin assembly by attaching two standoffs from the hardware pack onto the top of the console board with the small Phillips head bolts. This will expose two posts that will support the Edison.
2. Attach 4 standoffs to the bottom of the board with nuts.
3. Connect the Edison to the Console block and affix it to the standoffs with two nuts.
4. Next, add the MicroSD block, securing it with 4 additional standoffs.

## Bootstrap the Edison

With the node assembled it is time to start setting up the software side. Intel provides a step-by-step getting started guide. For our setup we should configure the password and connect the device to our wifi network.

1. Connect to the Edison via the USB serial device. Note that on your device the value A402YSYU may be different.

  screen /dev/tty.usbserial-A402YSYU 115200 -L
2. Press enter twice and a login prompt should appear.
3. Type root and press enter.
4. Run the configure_edison program. This will prompt you for a name for the device and a new root password, and walk you through the wifi setup process.
5. Prepare the MicroSD card. We first unmount the card (which is auto-mounted at /media/sdcard). Next we run fdisk and repartition the card. Finally we create the filesystem.

        umount /media/sdcard
        fdisk /dev/mmcblk1
        d                         # Delete the current partition (There is usually only 1 if any)
        n                         # Create a new partition
        p                         # Mark the partition as primary
        1                         # Assign it the number 1
                                  # Select the first block on the device
                                  # Select the last block on the device
        w                         # Write the change to the disk and exit
        mkfs.ext4 /dev/mmcblk1p1
6. Set up the Cassandra data directory.

        mount /dev/mmcblk1p1 /media/sdcard
        mkdir /media/sdcard/cassandra
7. Type exit and press enter. Future setup will be performed over SSH.
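
If you are setting up several nodes, the interactive fdisk session in step 5 can be scripted. The sketch below is one way to do it, assuming the card has at most one existing partition and that fdisk on the Edison reads its commands from standard input. `fdisk_keystrokes` is a hypothetical helper name, not part of any tool.

```shell
# Print the same answers typed into fdisk in step 5, one per line.
# The two empty strings stand for pressing enter to accept the default
# first and last block. Assumes at most one existing partition; with more
# than one, the 'd' command asks for a partition number and the sequence
# no longer lines up.
fdisk_keystrokes() {
    printf '%s\n' d n p 1 '' '' w
}

# Hypothetical usage on the Edison, after unmounting the card (destructive!):
#   fdisk_keystrokes | fdisk /dev/mmcblk1
#   mkfs.ext4 /dev/mmcblk1p1
```

The function only prints the keystrokes, so it can be inspected safely before being piped into fdisk.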

## Install Java

Cassandra is written in Java and requires the JRE to run. It is recommended that we use Oracle’s JRE version 7.

1. Download the jre-7u75-linux-i586.tar.gz from the Oracle download page. This requires accepting a license agreement before downloading the file.
2. Copy the file to the Edison via SFTP
3. Extract the file

  tar xvzpf jre-7u75-linux-i586.tar.gz
4. Set the JAVA_HOME environment variable to point at the extracted directory

  export JAVA_HOME=/home/root/jre1.7.0_75

It may be worthwhile to place this in your .bash_profile so that you don’t have to run it every time you start Cassandra.
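
One way to do that without ending up with duplicate lines is to append the export only when it is missing. This is a minimal sketch: `persist_java_home` is a hypothetical helper, and the path in the usage comment assumes the JRE was extracted into /home/root as above.

```shell
# Append "export JAVA_HOME=..." to a profile file unless that exact line
# is already present. $1 = profile file, $2 = JRE directory.
persist_java_home() {
    profile="$1"
    java_home="$2"
    line="export JAVA_HOME=$java_home"
    touch "$profile"
    # -x matches whole lines only, -F treats the line as a fixed string
    if ! grep -qxF "$line" "$profile"; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Hypothetical usage on the Edison:
#   persist_java_home "$HOME/.bash_profile" /home/root/jre1.7.0_75
```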

## Install getopt

Cassandra’s start script utilizes the getopt command. The Yocto Linux distribution running on the Edison does not ship with this utility. This package may be found through Yocto’s Recipe reporting system. From here there is a link to the source code which we will pull down and compile on the Edison.

1. Download the util-linux package from kernel.org

 wget http://kernel.org/pub/linux/utils/util-linux/v2.25/util-linux-2.25.2.tar.xz
2. Extract the package

 tar xvf util-linux-2.25.2.tar.xz
3. Build the source

        cd util-linux-2.25.2
        ./configure
        make getopt
4. Install the built binary

        mkdir -p /usr/local/bin
        cp getopt /usr/local/bin
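
Before moving on, it can be worth confirming that the shell can actually find the new binary, since /usr/local/bin may not be on the PATH of every session. A minimal sketch, where `require_cmd` is a hypothetical helper name:

```shell
# Return 0 if the named command is found on PATH, otherwise print a
# message to stderr and return 1.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required command: $1" >&2
    return 1
}

# Hypothetical usage on the Edison:
#   require_cmd getopt || export PATH="$PATH:/usr/local/bin"
```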

## Install & Configure Apache Cassandra

1. Download the Cassandra package

 wget http://mirrors.gigenet.com/apache/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz
2. Extract the package

 tar xvzpf apache-cassandra-2.1.3-bin.tar.gz
3. Configure Cassandra – be sure to edit the values listed below instead of replacing the file

cassandra.yaml

    cluster_name: "Edison Cluster"
    listen_address:
    rpc_address:
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: " ip here>"
    data_file_directories:
        - /media/sdcard/cassandra/data
    commitlog_directory: /media/sdcard/cassandra/commitlog
    saved_caches_directory: /media/sdcard/cassandra/saved_caches

cassandra-env.sh

    MAX_HEAP_SIZE="512M"
    HEAP_NEWSIZE="200M"

logback.xml

  

/etc/hosts – The hostname must be in /etc/hosts or DNS

  
4. Start Cassandra

 /home/root/apache-cassandra-2.1.3/bin/cassandra
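
Since step 3 asks for individual values to be edited rather than the file replaced, a small helper can make those edits repeatable across nodes. This is a minimal sketch, not a YAML parser: `set_yaml_key` is a hypothetical function that rewrites only an existing, uncommented top-level `key: value` line, and it assumes the value contains no `|`, `&`, or backslash characters. The address in the usage comment is the node address that appears in the benchmark output in this post.

```shell
# Replace the value of an existing top-level "key: value" line in place.
# $1 = file, $2 = key, $3 = new value. Commented-out keys and nested
# (indented) keys are left untouched.
set_yaml_key() {
    file="$1"
    key="$2"
    value="$3"
    tmp="$file.tmp"
    # Write to a temporary file first so a failed sed never truncates the original
    sed "s|^$key:.*|$key: $value|" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Hypothetical usage on the Edison:
#   set_yaml_key /home/root/apache-cassandra-2.1.3/conf/cassandra.yaml listen_address 192.168.1.50
```

List-valued settings such as seed_provider and data_file_directories span several lines and still need to be edited by hand.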

## Intel Edison Micro-Cluster Benchmarks

Cluster Status

    nodetool -h 192.168.1.50 status
    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
    UN  192.168.1.50  70.54 KB   256     100.0%            2395b793-2d8b-4c31-931c-64fa94097cc5  rack1
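
When more nodes join, a quick way to confirm that they are all healthy is to count the rows that start with UN (Up/Normal) in this output. A minimal sketch, assuming the output format shown above; `count_up_normal` is a hypothetical helper name.

```shell
# Read "nodetool status" output on stdin and print the number of nodes
# whose first column is UN (Up and Normal).
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Hypothetical usage:
#   nodetool -h 192.168.1.50 status | count_up_normal
```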

Cassandra Stress Writes

    ./cassandra-stress write n=100000 -node 192.168.1.50
    Created keyspaces. Sleeping 1s for propagation.
    Warming up WRITE with 50000 iterations...
    INFO  18:36:16 Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
    Connected to cluster: Edison Cluster
    Datatacenter: datacenter1; Host: /192.168.1.50; Rack: rack1
    INFO  18:36:16 New Cassandra host /192.168.1.50:9042 added
    Sleeping 2s...
    Running WRITE with 200 threads for 100000 iteration...
    Results:
    op rate                   : 1922
    partition rate            : 1922
    row rate                  : 1922
    latency mean              : 104.0
    latency median            : 87.9
    latency 95th percentile   : 176.9
    latency 99th percentile   : 381.7
    latency 99.9th percentile : 696.5
    latency max               : 1855.3
    total gc count            : 4
    total gc mb               : 569
    total gc time (s)         : 2
    avg gc time(ms)           : 474
    stdev gc time(ms)         : 19
    Total operation time      : 00:00:52
    END
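
As a quick sanity check on the write results, the reported op rate should be close to the operation count divided by the total operation time. The sketch below does that arithmetic with awk. `ops_per_sec` is a hypothetical helper, and because the reported time is rounded to whole seconds the result only approximates the figure printed by cassandra-stress.

```shell
# Print whole operations per second.
# $1 = operation count, $2 = elapsed time as HH:MM:SS
ops_per_sec() {
    echo "$2" | awk -F: -v ops="$1" '{
        secs = $1 * 3600 + $2 * 60 + $3
        if (secs > 0) print int(ops / secs)
    }'
}

ops_per_sec 100000 00:00:52   # prints 1923
```

That is within one operation per second of the 1922 op rate in the write results.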

Cassandra Stress Reads

    ./cassandra-stress read n=100000 -node 192.168.1.50
    Warming up READ with 50000 iterations...
    Failed to connect over JMX; not collecting these stats
    INFO  18:41:53 Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
    Connected to cluster: Edison Cluster
    INFO  18:41:53 New Cassandra host /192.168.1.50:9042 added
    Datatacenter: datacenter1; Host: /192.168.1.50; Rack: rack1
    Sleeping 2s...
    ...
             id, total ops , adj row/s,    op/s,    pk/s,   row/s
      4 threads, 100000    ,       238,     238,     238,     238
      8 threads, 100000    ,        -0,     399,     399,     399
     16 threads, 100000    ,       588,     588,     588,     588
     24 threads, 100000    ,       725,     722,     722,     722
     36 threads, 100000    ,        -0,     873,     873,     873
     54 threads, 100000    ,        -0,    1096,    1096,    1096
     81 threads, 100000    ,        -0,    1247,    1247,    1247
    121 threads, 100000    ,        -0,    1376,    1376,    1376
    181 threads, 100000    ,        -0,    1444,    1444,    1444
    271 threads, 99688     ,        -0,    1532,    1532,    1532
    406 threads, 100000    ,        -0,    1515,    1515,    1515
    609 threads, 100000    ,        -0,    1500,    1500,    1500
    913 threads, 100000    ,      1509,    1501,    1501,    1501
    END
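
To see where read throughput levels off, the summary rows of the read run can be scanned for the highest op/s value. This is a minimal sketch that assumes the comma-separated summary format shown in the read output; `peak_read_row` is a hypothetical helper name.

```shell
# Read cassandra-stress summary rows on stdin and print the thread count
# with the highest op/s (fourth comma-separated column).
peak_read_row() {
    awk -F',' '
        BEGIN { best = -1 }
        $1 ~ /threads/ {
            rate = $4 + 0
            if (rate > best) { best = rate; id = $1 }
        }
        END {
            gsub(/^ +| +$/, "", id)   # trim the column padding
            print id ": " best " op/s"
        }
    '
}

# Hypothetical usage:
#   ./cassandra-stress read n=100000 -node 192.168.1.50 | peak_read_row
```

For the rows in this post it reports 271 threads at 1532 op/s, after which throughput flattens out at around 1500.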