This is a fully stocked sandbox virtual machine image that is set up to run Big Data and Data Science applications right on your laptop.

This sandbox is a replica of the cloud VM that we use for training. We are making it available so students can practice on their own.

GitHub home : https://github.com/elephantscale/sandbox

Why?

Running Big Data applications (Spark / Cassandra / Hadoop) can be a little convoluted because of all the dependencies. This can be even more of a hassle on Windows. We hope this VM Sandbox will make things easier.

Where to get it?

Currently, an OVA-based virtual machine image is available.
Docker images are coming ‘soon’.
Note : These are LARGE downloads (10 GB+ in size). Download when you have good bandwidth.

How to run it?

  • You need a virtual machine ‘player’. Any of these would work:
  • Download the latest sandbox image
  • Double-click the ‘OVA’ file to open it.

See the GitHub page for all instructions and tutorials.

Check out our Sandbox channel for more videos.

Access

Login : student
password : bigdata123

See the intro lab for a screencast.

Connectivity:

  • Use VM GUI : when you open the OVA file in a VM environment, you will be logged into the Ubuntu desktop
  • SSH : the VM's SSH server listens on port 22
  • From the host machine, connect through port 2222 on localhost:
     $ ssh -l student -p 2222 localhost
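
If the ssh command from the host hangs or is refused, it can help to first check whether port 2222 on localhost accepts connections at all. The snippet below is a minimal sketch of such a check, not part of the sandbox itself. The function name ssh_port_open is made up for illustration, and the sketch assumes the VM is running and that host port 2222 reaches the VM's SSH server, as the ssh command above implies.

```python
import socket


def ssh_port_open(host="localhost", port=2222, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: the defaults mirror the ssh command shown above
    (assumption: host port 2222 leads to the sandbox VM's SSH server).
    """
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if ssh_port_open():
        print("Port 2222 is reachable; try: ssh -l student -p 2222 localhost")
    else:
        print("Port 2222 is not reachable; check that the VM is running.")
```

A successful check only shows that something is listening on that port. It does not verify the login, so the ssh command is still the real test.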

What can I run?

This VM is tested with the following Big Data stack:

  • Spark v1.6 and Spark v2.x
  • BigDL
  • Cassandra v3.x
  • Kafka v0.10
  • Storm v1.x
  • Zookeeper v3.4.8

 

Software Installed

Changelog

See the version history in the changelog.

Feedback

We welcome your feedback about the sandbox.

  • send an email to info@elephantscale.com
  • or open an issue on the GitHub page