Hadoop for Business Analysts



Apache Hadoop is one of the most popular frameworks for processing Big Data. Hadoop provides rich and deep analytics capabilities, and it is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and its analytics.

What You Will Learn:

  • Understanding the Hadoop ecosystem
  • Data storage using HDFS
  • ETL using Pig
  • Data warehousing and querying using Hive


Audience

Business analysts


Duration

Three days


Format

Lectures and hands-on labs.


Prerequisites

  • programming background with databases / SQL
  • basic knowledge of Linux (able to navigate the Linux command line and edit files with vi or nano)

Lab environment

Zero install: there is no need to install Hadoop software on students’ machines. A working Hadoop cluster will be provided for students.

Students will need the following


Detailed outline

  • Section 1: Quick primer on Hadoop / HDFS / MapReduce
    • Hadoop ecosystem
      • distributions
      • high-level architecture
      • hardware / software
      • Lab: first look at Hadoop
    • HDFS Overview
      • concepts (horizontal scaling, replication, data locality)
      • architecture (NameNode, DataNode)
      • Demo: interacting with HDFS
    • MapReduce Overview
      • MapReduce concepts
      • YARN operating system
      • Demo: running a MapReduce program
  • Section 2: Hive
    • Hive concepts and architecture
    • SQL support in Hive
    • Data warehousing in Hive
    • data types
    • table creation and queries
    • partitions
    • joins
    • text analytics
    • labs (multiple): creating Hive tables and running queries, joins, using partitions, using text analytics functions
  • Section 3: Pig
    • Pig concepts and architecture
    • Pig Latin language
    • understanding Pig job flow
    • basic data analysis with Pig
    • data cleanup
    • ETL workloads with Pig
    • joins across multiple datasets with Pig
    • user-defined functions
    • debugging Pig scripts
    • lab: writing Pig scripts to analyze / transform data
  • Section 4: BI Tools for Hadoop
    • BI tools and Hadoop
    • Overview of the current BI tools landscape
    • Choosing the best tool for the job
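
To give a feel for the MapReduce concepts listed in Section 1, here is a minimal sketch in plain Python of a word count split into map, shuffle, and reduce steps. It runs on a single machine and does not use Hadoop at all. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are hypothetical and are not taken from the course material.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["hive pig hive", "pig hdfs"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many nodes and the framework performs the shuffle; the sketch only shows how the data flows between the steps.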