How to prepare for the Cloudera Data Scientist Certification Exam

At our Houston Hadoop Meetup, Austin Sun showed how to prepare for the Cloudera Data Scientist Certification exam.

Continuous builds for open source projects

At ElephantScale, we practice what we preach. Here is one example. Our FreeEed project is open source eDiscovery,

IBM Strategy for Spark

Last month, Garrett Young of IBM presented at our Houston Hadoop & Spark Meetup. The topic was an

Processing unstructured text data with Spark 2 APIs – Dataset & Dataframe

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

MOOC or in-class training?

In his article, “How the pioneers of MOOC got it all wrong,” Robert Ubell quotes surprising statistics. Today,

Lawyers and Machine Learning – How a Little Learning Goes a Long Way

Machine Learning and Artificial Intelligence (AI) are certainly all the rage today. AI will touch everyone, it

From Spark MLlib 1.0 to Spark ML 2.1

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

Processing CSV files with Spark 2 – Part 1

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and

On learning languages, machine translation, and neural networks

If you are into learning languages, take a look at the Glossika startup. I believe it offers the

From Scala to Python in Spark

Scala vs. Python: Spark’s native language is Scala, a fine language, but in many ways Spark seems more