Optimizing Retail Discounts with Machine Learning

Abstract: In this paper, we show how to apply machine learning to pricing and discounts. The goal is

How to prepare for the Cloudera Data Scientist Certification Exam

At our Houston Hadoop Meetup, Austin Sun showed how to prepare for the Cloudera Data Scientist Certification exam.

Processing unstructured text data with Spark 2 APIs – Dataset & Dataframe

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

From Spark MLLib 1.0 to Spark ML 2.1

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

Processing CSV files with Spark 2 – Part 1

Intro: This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and

Learning Scala by Example, chapter 4

Please note that I am skipping chapter 3. Chapter 3 is a “running ahead of myself” type of

Installing Tachyon (In-Memory-File-System) As A Cluster

This post shows how to set up and configure Tachyon as a cluster. Quick Pointers: Tachyon Home Page Version