Author Archives: Tim Fox

From Spark MLLib 1.0 to Spark ML 2.1

This is part of our series on migrating and updating to Spark 2. See all our posts on Spark and Spark2. Code repository: Learning Spark @ GitHub. Spark’s Machine Learning (ML) components have changed significantly. Just like the rest of Spark, the older RDD-based API persists alongside the newer DataFrame-based API. Yet, I find that the […]

From Scala to Python in Spark

Scala vs. Python: Spark’s native language is Scala, a fine language, but in many ways Spark seems more popular than Scala. I’m often asked why Spark’s creators chose Scala. Because the Spark framework runs on the JVM, the choice of language was really limited to venerable Java or new-kid-on-the-block Scala. As Spark’s […]