Hacking Hadoop – 3

Another idea, which came about in a conversation with a security guardian at a company where I was teaching: create millions of entries in HDFS, and this will take it out of service. A few ways to do it: write many small files with a script; partition a Hive table on a key that […]

This entry was posted in Hacking.

Hacking Hadoop – 2

A while back, we described hacking Hadoop through Cloudera Manager (CM) or through Ambari. But there is so much more to hack! Here is what I would do if I had a chance (this is just a first approximation of the list; comments are welcome). Hacking through CM or Ambari: try the default passwords (admin/admin); try […]

This entry was posted in Hacking.

A Unique Proposal

Today, many companies want to create their own custom training content, in their own format. We at Elephant Scale are experts at this. We created all of our own content, which we regularly use for training and which receives very good feedback, and we can help you create yours! For many demanding clients, we […]

How to prepare for the Cloudera Data Scientist Certification Exam

At our Houston Hadoop Meetup, Austin Sun showed how to prepare for the Cloudera Data Scientist Certification exam. Austin prepared for this presentation for quite a while, passed the certification himself, and shared his experience with others. The certification is definitely recommended by Sujee Maniyam in his “Launching Your Career in Big Data” […]

IBM Strategy for Spark

Last month, Garrett Young of IBM presented at our Houston Hadoop & Spark Meetup. The topic was an interesting one: how IBM plans to make money on an open source project, in this case Spark. First, Garrett briefly introduced Spark and spelled out the reasons for IBM’s interest in Spark: it is performant, productive, […]