Category Archives: Uncategorized

IBM Strategy for Spark

Last month, Garrett Young of IBM presented at our Houston Hadoop & Spark Meetup. The topic was an interesting one: how does IBM plan to make money on an open source project, in this case Spark? First, Garrett briefly introduced Spark and spelled out the reasons for IBM’s interest in Spark: it is performant, productive, […]

Teaching Big Data in Bentonville

Every time I go to work or teach at a company whose name nobody knows (this beginning can be understood in the key of “En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha mucho tiempo que vivía un hidalgo”), I visit the local museum of American art. But those […]


Introduction to Big Data. You will learn how to: integrate Big Data components to create an appropriate Data Lake; select the correct Big Data stores for disparate data sets; process large data sets using Hadoop to extract value; query large data sets in near real time with Pig and Hive; plan and implement a Big […]

The Power of Text Analytics at DARPA/Memex

Elephant Scale is proud to be part of the DARPA Memex team. One of our focus areas in the DARPA Memex program is text analytics, and one outcome of that work is an open source project called MemexGATE. GATE itself stands for General Architecture for Text Engineering, and it is a mature and […]

Attending the Hadoop Summit? Meet with Sujee Maniyam, author of Hadoop Illuminated

The 8th Annual Hadoop Summit, the leading conference for the Apache Hadoop community, is quickly approaching. This great event features numerous Apache Hadoop thought leaders who will showcase successful Hadoop use cases, share development and administration tips and tricks, and educate organizations about how best to leverage Apache Hadoop as a key component in their […]

Hadoop in Cloud — Plenty of Choices

Want to run Hadoop in the cloud? The good news is that you now have some pretty good choices from major cloud providers. 1) Amazon Cloud: Amazon has had an on-demand Hadoop offering, called Amazon Elastic MapReduce (EMR), for a while. It is the oldest Hadoop offering in the cloud. Amazon Elastic MapReduce Amazon […]