Big Data Analytics With Hadoop
Introduce analysts to the Hadoop ecosystem and its analytics capabilities.
Audience: Business Analysts, Developers
Duration: 2 days
Format: Lectures and hands-on labs.
Overview
Apache Hadoop is a popular framework for processing Big Data. It provides rich analytics capabilities and is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and their use in analytics.
Objective
Introduce analysts to the Hadoop ecosystem and its analytics capabilities.
What You Will Learn
- Understanding the Hadoop ecosystem
- Data storage using HDFS
- Data warehousing and querying using Hive
Course Details
Prerequisites: Programming background with databases / SQL; basic knowledge of Linux
Setup: Zero Install cloud cluster • SSH client • Browser
Detailed Outline
- Hadoop overview
- Distributions
- High level architecture
- Hardware / software
- Labs: first look at Hadoop
- HDFS concepts (horizontal scaling, replication, data locality)
- HDFS architecture (NameNode, DataNode)
- Demo: Interacting with HDFS
- YARN as the cluster "operating system" (resource management and scheduling)
- Demo: Running applications on YARN
- Hive concepts & architecture
- SQL support in Hive
- Data warehousing in Hive
- Data types
- Table creation and queries
- Partitions
- Joins
- Modern data formats
- Text analytics
- Hive performance
- Labs (multiple)
Ready to Get Started?
Contact us to learn more about this course and schedule your training.