We had a great webinar!
Thanks for all the interest and all your questions!
And here are the slides:
Q & A
Q : Does Spark only work with small data sets?
A : No, Spark works with large data sets just as well. For small data sets (in the GB range), Spark works better than Hadoop.
Q : Do I need to load all data in memory to use Spark?
A : No. Spark can access data on disk or in memory. Not all data has to fit in memory.
However, if the data fits in memory, Spark can process it entirely in memory (super fast!).
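A minimal PySpark sketch of this trade-off, assuming PySpark is installed. The helper names `pick_storage_level` and `persist_for_reuse` are hypothetical and used only for illustration; `StorageLevel.MEMORY_ONLY` and `StorageLevel.MEMORY_AND_DISK` are existing Spark storage levels.

```python
def pick_storage_level(fits_in_memory):
    """Return the name of a Spark storage level (hypothetical helper).

    MEMORY_ONLY keeps cached partitions in RAM only.
    MEMORY_AND_DISK keeps what fits in RAM and spills the rest to disk.
    """
    return "MEMORY_ONLY" if fits_in_memory else "MEMORY_AND_DISK"


def persist_for_reuse(df, fits_in_memory):
    """Mark a DataFrame (or RDD) for caching at the chosen storage level."""
    # Imported here so the helper above stays usable without PySpark installed.
    from pyspark import StorageLevel

    level = getattr(StorageLevel, pick_storage_level(fits_in_memory))
    return df.persist(level)
```

For example, `persist_for_reuse(df, fits_in_memory=False)` asks Spark to keep as much of `df` in memory as it can and put the rest on disk, where `df` is a placeholder for any DataFrame you plan to reuse.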
Q : I am currently using Hadoop. Can I try Spark side by side?
A : Yes. The easiest way is to use your distribution’s installer to install Spark. The latest versions of the Cloudera and Hortonworks distributions make installing Spark easy.
Q : I already have data in Hive warehouse. Do I need to ‘re-import’ data into Spark to use Spark SQL?
A : No. Spark SQL can query data in the Hive warehouse directly.
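As a rough sketch of what that looks like in PySpark, assuming Spark 2.0 or later built with Hive support and a `hive-site.xml` that Spark can see. The helper names `preview_query` and `hive_session` and the table name `sales.orders` are hypothetical; `SparkSession.builder.enableHiveSupport()` is the existing Spark call that connects the session to the Hive metastore.

```python
import re

# Accepts "table" or "database.table" made of ASCII letters, digits and underscores.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def preview_query(table, limit=10):
    """Build a SQL statement that previews rows of an existing Hive table."""
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError("unexpected table name: {!r}".format(table))
    return "SELECT * FROM {} LIMIT {}".format(table, int(limit))


def hive_session(app_name="hive-preview"):
    """Create (or reuse) a SparkSession that reads tables from the Hive metastore."""
    # Imported here so preview_query stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # no re-import: tables are read where they already live
        .getOrCreate()
    )
```

With a working Hive setup, `hive_session().sql(preview_query("sales.orders")).show()` would print the first ten rows of the placeholder table.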
Original Webinar Announcement
Apache Spark has become a very popular Big Data technology. Is Spark replacing Hadoop? If you are running Hadoop, should you think about moving to Spark?
When does using Spark or Hadoop make sense?
In this webinar, we will discuss the features of both Spark and Hadoop. We will also highlight some use cases where each technology is appropriate.