Big Data Analytics With Hadoop

Introduce analysts to the Hadoop ecosystem and its analytics capabilities.

Audience: Business Analysts, Developers

Duration: 2 days

Format: Lectures and hands-on labs.

Overview

Apache Hadoop is a popular framework for processing Big Data. It provides rich, deep analytics capabilities and is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and its analytics capabilities.

What You Will Learn

  • Understanding the Hadoop ecosystem
  • Data storage using HDFS
  • Data warehousing and querying using Hive

Course Details

Prerequisites: Programming background with databases / SQL; basic knowledge of Linux

Setup: Zero Install cloud cluster • SSH client • Browser

Detailed Outline

  • Hadoop overview
  • Distributions
  • High level architecture
  • Hardware / software
  • Labs: First look at Hadoop
  • HDFS concepts (horizontal scaling, replication, data locality)
  • HDFS architecture (NameNode, DataNode)
  • Demo: Interacting with HDFS
  • YARN as the cluster operating system
  • Demo: Running applications on YARN
  • Hive concepts & architecture
  • SQL support in Hive
  • Data warehousing in Hive
  • Data types
  • Table creation and queries
  • Partitions
  • Joins
  • Modern data formats
  • Text analytics
  • Hive performance
  • Labs (multiple)
