Designing Concurrent Systems with Apache ZooKeeper

Overview

This course is unique in that it teaches the best practices and caveats of designing modern concurrent Big Data systems. ZooKeeper is an ideal tool to understand and practice the theory, and to reason about system performance, fault tolerance, and stability.

ZooKeeper is the de facto standard for coordinating multiple components in distributed systems. In this class, we will learn ZooKeeper's architecture, design, and implementation. Then we will go through the standard ZooKeeper design patterns and their implementation.

In recent years, most of the design work with ZooKeeper has been done through Curator. Curator makes the implementation of the design patterns, called recipes, much easier and more robust. We will work with Elections (such as Leader Latch and Leader Election), Locks, Barriers, and more.

The course includes a balance of theory and lab work.

Duration

3 days

Audience

Developers, administrators, architects.

Prerequisites

Experience and background in software development and administration

Lab environment

Amazon EC2 servers will be provided to students for installation, administration, and lab work. Students will need an SSH client and a browser to access the cluster.

Zero Install: There is no need to install ZooKeeper software on students’ machines, although it is possible.

Course Contents

ZooKeeper fundamentals

  • Distributed coordination systems
  • Design goals and results
  • Common coordination tasks

ZooKeeper Java and C API

  • Goals and capabilities
  • Differences, pros, and cons
  • Labs

ZooKeeper environment

  • Track and react to ZooKeeper changes
  • Handling failures (network, apps)
  • Concurrency issues

Curator and Exhibitor

  • Goals and design
  • Installation and configuration
  • Advantages and current trends

Curator recipes and use cases

  • Elections
  • Locks
  • Barriers
  • Counters
  • Caches
  • Nodes
  • Queues
  • Centralized initialization

ZooKeeper internals

  • Internals
  • Administration