Practice our BigData stack on your Laptop
This is a fully stocked sandbox virtual machine image that is set up to run Big Data and Data Science applications right on your laptop. The sandbox is a replica of the cloud VM that we use for training. We are making it available so students can practice on their own.
Why you need the sandbox
Running Big Data applications (Spark / Cassandra / Hadoop) can be a little convoluted because of all the dependencies, and it can be even more of a hassle on Windows. We hope this VM sandbox will make things easier.
Where to get it?
Currently, an OVA-based virtual machine image is available.
Docker images are coming ‘soon’.
Note : These are LARGE downloads (10G+ in size). Download when you have good bandwidth.
- Latest version : V5
- Release date : 2017-11-10
- Download link
- For older versions see changelog
How to run it?
- You need a virtual machine ‘player’. Any of these would work:
- Download the latest sandbox image
- Double-click the ‘OVA’ file to open it.
See the GitHub page for all instructions and tutorials.
Check out our Sandbox channel for more videos.
Login : student
password : bigdata123
See intro lab for a screencast.
- Use VM GUI : when you open this OVA file in a VM environment you will be logged into the Ubuntu desktop
- SSH via port 22
- from host machine
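As a rough illustration of the SSH option above, the sketch below assembles an `ssh` command from the login name and port given on this page. The address `192.168.56.101` is only a placeholder assumption; use whatever address your VM player reports for the sandbox. The helper name `build_ssh_command` is likewise made up for this example.

```python
def build_ssh_command(host, user="student", port=22):
    """Return the argument list for an SSH login to the sandbox VM.

    user and port default to the values listed on this page;
    host has no default because it depends on your VM network setup.
    """
    return ["ssh", "-p", str(port), f"{user}@{host}"]


if __name__ == "__main__":
    # Placeholder address (assumption): replace with your VM's actual address.
    cmd = build_ssh_command("192.168.56.101")
    # Print the command so it can be copied into a terminal on the host.
    print(" ".join(cmd))
```

Running the printed command on the host should prompt for the password listed above, provided the VM's network settings make port 22 reachable from the host.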
What can I run?
This VM is tested with the following Big Data stack.
- Spark v1.6 and Spark v2.x
- Cassandra v3.x
- Kafka v0.10
- Storm v1.x
- Zookeeper v3.4.8
- Based on Ubuntu 16.04 LTS
- Most software is in /usr/local/apps (also ~/apps)
- Dev environment : Java / Scala
- Dev environment : Python
- Python 3.6
- Anaconda v4.3.1
- Editors :
- Eclipse Neon – ~/apps/eclipse/java-neon/eclipse/eclipse
- IntelliJ Community Edition – ~/apps/idea/bin/idea.sh
- Big Data applications supported
- Spark 2.x
- BigDL 0.3
- Cassandra 3.x
- Kafka 0.11
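One quick way to see which pieces of the stack listed above are reachable from a terminal is to look the commands up on the PATH. The sketch below does that with Python's standard library. The command names (`java`, `python3`, `spark-submit`, `cqlsh`) are the usual launcher names for these tools, but whether the sandbox puts them on the PATH is an assumption; since most software lives in /usr/local/apps, some may need to be started from there instead.

```python
import shutil

# Assumed launcher names; the sandbox may not expose all of them on PATH.
DEFAULT_TOOLS = ("java", "python3", "spark-submit", "cqlsh")


def find_tools(names=DEFAULT_TOOLS):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def report(found):
    """Format the lookup result as one aligned line per command."""
    lines = []
    for name, path in found.items():
        lines.append(f"{name:<14}{path or 'not on PATH'}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(find_tools()))
```

A command reported as "not on PATH" is not necessarily missing; it may simply live under /usr/local/apps or ~/apps without a PATH entry.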
We welcome your feedback about the sandbox.
- send an email to email@example.com
- or open an issue at the GitHub page