Code

Here is a collection of our open source code libraries.

HI-labs

Open source labs to learn Hadoop
github.com/elephantscale/HI-labs

HBase Labs

Labs bundle for our ‘HBase Design Patterns’ book
github.com/elephantscale/hbase-book

Learning Scala

A few tutorials to get you started with Scala.
github.com/elephantscale/learning-scala

‘Hadoop Illuminated’ Book

Code and content for our open source ‘Hadoop Illuminated’ book.
github.com/elephantscale/hadoop-book

Blog Posts on Code

07 Aug 2017
Abstract: In this paper, we show how to apply machine learning to pricing and discounts. The goal is to create the optimal discounting strategy, which…
21 Jul 2017
Earlier today I saw an image from this tweet: I thought I saw a few problems with it. So I decided to give the readers…
02 Jul 2017
At our Houston Hadoop Meetup, Austin Sun showed how to prepare for the Cloudera Data Scientist Certification exam. Austin has prepared for this presentation for…
27 Jun 2017
At ElephantScale, we practice what we preach. Here is one example. Our FreeEed project is open source eDiscovery, and it is popular with lawyers, legal…
22 Jun 2017
This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2. This post explains how to process unstructured,…
22 Jun 2017
Motivation: Spark is an amazing computing framework. Spark version 2 has lots of exciting stuff. And Hadoop vendors Cloudera and Hortonworks are now supporting Spark…
08 Jun 2017
Machine Learning and Artificial Intelligence (AI) are certainly all the rage today. AI will touch everyone; it will change lives and careers in a matter…
30 May 2017
This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2. Code repository Learning Spark @ Github Screencast…
04 May 2017
Intro: This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2. In this post, we are going…
20 Mar 2017
Scala vs. Python: Spark’s native language is Scala, a fine language, but in many ways Spark seems more popular than Scala. I’m often asked…
12 May 2016
When my friend and co-founder Sujee Maniyam presented his “Launching Your Career in Big Data” at SNIA, it became an immediate hit. I mention it…
11 Oct 2015
Please note that I am skipping chapter 3. Chapter 3 is a “running ahead of myself” type of chapter. It shows examples of what is…
09 Oct 2015
How do you hack Hadoop? Here is what we did. We took our team hacker (whom we will call Mr. Hacker, because he prefers it…
19 Aug 2015
This post shows how to set up and configure Tachyon as a cluster. Quick Pointers: Tachyon Home Page Version 0.64 documentation Single Node Install Guide We…
30 Jul 2015
In the previous, first installment, I explained why it is worth your while to learn Scala. Now I want to introduce you to chapter 2…
20 Jul 2015
This is an inaugural post in the series of “Learning Scala by Example,” which is a play on words. I mean learning the book titled…
11 Jul 2015
(Disclaimer: This is not an official post from Databricks) Spark Summit 2015 in San Francisco was well attended. Kudos to the Databricks team for organizing this fantastic…
02 May 2015
Spark is your friend? – We’ll see about that. Spark likes to pretend that it is your friend. For example, it is a friend of…
02 May 2015
In my last installment I described how Microsoft missed its chance to be the leader in Big Data. Why? Why was Dryad killed, but…
02 May 2015
Microsoft’s relationship with Hadoop was for a long time ambiguous: from a rumor about “Hadoop on Azure” (back in 2008) to “never!” to “We will…
22 Jan 2015
Spark is great for cached data. Take a look here to understand the various caching options for Spark.
12 May 2014
Just look at all the wonderful students. We got to teach one day of the Global Big Data Conference. By now, with our experience of…
09 Apr 2014
Elephant Scale, a provider of Big Data training, implementations, and vertical Hadoop product applications, is pleased to announce that it has successfully completed the first…
31 Mar 2014
What was so amazing about our March 28-30 bootcamp? A number of things: We collected more than twenty students altogether (with some remotes and some rescheduling).…
07 Mar 2014
Why is Houston special? There is very little Big Data activity going on in Houston now, and many tried but failed to have a course here.…
02 Feb 2014
Abstract In this paper we discuss best practices and real world testing strategies for Big Data, Hadoop, and NoSQL. The subjects of testing and software…
22 Jan 2014
This step-by-step guide walks through installing Hadoop 2 on a single node. We use TAR files. This is ideal for setting up a development…
11 Dec 2013
If you are building a Java project that has a Hadoop or HBase dependency (for example, a Java MapReduce application), here is a simple POM.xml to…
09 Dec 2013
Hadoop has hundreds of configurable parameters. And Hadoop admins and developers spend a lot of time tweaking these settings. I wanted a quick way…