Sandbox Running Spark

This video illustrates how to run Spark using the Sandbox.

Introducing BigDL: Distributed Deep Learning on Spark

Distributed deep learning is not easy. While both TensorFlow and Caffe have distributed modes, they tend to

IBM Strategy for Spark

Last month, Garrett Young of IBM presented at our Houston Hadoop & Spark Meetup. The topic was an

Processing unstructured text data with Spark 2 APIs – Dataset & Dataframe

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

From Spark MLLib 1.0 to Spark ML 2.1

This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and Spark2.

Processing CSV files with Spark 2 – Part 1

Intro: This is part of our migrating/updating to Spark 2 series. See all our posts on Spark and

From Scala to Python in Spark

Scala vs. Python: Spark’s native language is Scala, a fine language, but in many ways Spark seems more

Review of “Spark in Action” by Petar Zecevic and Marko Bonaci (Manning)

“Spark in Action” has the standard Manning structure: it has four parts, “First Steps,” “Meet the Spark Family,”

Spark

Introduction to Big Data. You will learn how to: integrate Big Data components to create an appropriate

Review of “Learning Spark” by Karau, Konwinski, Wendell & Zaharia

“Learning Spark” was the first published book on the subject. Six months later, there appeared a plethora