Machine Learning with Python (Live Bootcamp)

March 2, 2020


By popular demand, we are offering a live virtual bootcamp:

Machine Learning with Python Bootcamp


Machine Learning (ML) is changing the world. To use ML effectively, one needs to understand the algorithms and how to apply them. This course provides an introduction to the most popular machine learning algorithms.

This course teaches Machine Learning using the popular Scikit-Learn package for Python.

This course teaches Machine Learning from a practical perspective. In-depth coverage of Math / Stats is beyond the scope of this course.

**Live online only. Duration: 4 days, 4 hours per day, evening hours (6pm to 10pm PST)**

What you will learn:

  • Python and Scikit-Learn
  • ML Concepts
  • Regression

– Linear Regression

– Logistic Regression

  • Classification

– Naive Bayes

  • Clustering algorithms (K-Means)


Audience:

Data analysts, software engineers, data scientists


Two Days

Skill Level:

Beginner to Intermediate

Industry Use Cases Covered

We will study and solve some of the most common industry use cases, listed below:

  • Finance

– Predicting house prices

– Predicting loan defaults at Prosper

  • Health care

– Predicting diabetes outcome

  • Customer service

– Predicting customer turnover

  • Text analytics

– Spam classification

  • Travel

– Predicting Uber demand

  • Other

– Predicting college admissions


Prerequisites:

  • Good programming background
  • Familiarity with Python is a plus, but not required
  • No machine learning knowledge is assumed

Lab environment

Students will need a laptop with a Python development environment set up.

Students will need the following

  • A reasonably modern laptop with unrestricted connection to the Internet. Laptops with overly restrictive VPNs or firewalls may not work properly
  • Chrome browser

Detailed Course Outline

Python Basics

  • Introduction to Python programming environment
  • Introduction to NumPy and Pandas
  • Labs:

– Working with Jupyter notebooks

– NumPy and Pandas

Machine Learning (ML) Overview

  • Machine Learning landscape
  • Understanding Deep Learning use cases
  • Understanding AI / Machine Learning / Deep Learning
  • Data and AI
  • AI vocabulary
  • Hardware and software ecosystem
  • Understanding types of Machine Learning (Supervised / Unsupervised / Reinforcement)

Python Scikit-Learn Library

  • Scikit-Learn library overview
  • Lab:

– Scikit-Learn utilities

Feature Engineering and Exploratory Data Analysis (EDA)

  • Preparing data for ML
  • Statistics Primer
  • Data cleanup
  • Extracting features, enhancing data
  • Visualizing Data
  • Labs:

– Data cleanup

– Exploring data

– Visualizing data

Machine Learning Concepts

  • Training and Testing
  • Gradient Descent
  • Overfitting / Underfitting
  • Cross validation, bootstrapping
  • Confusion Matrix
  • ROC curve, Area Under Curve (AUC)

Linear Regression

  • Linear Regression
  • Errors, Residuals
  • Multiple Linear Regression
  • Evaluating model performance
  • Labs:

– Use case: House price estimates
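To give a flavor of the workflow behind the linear regression topics (training and testing, residuals, evaluating model performance), here is a minimal sketch. It is not taken from the course labs: the data is synthetic, and the feature names (`square_feet`, `bedrooms`) and all coefficients are made-up assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a house-price dataset (hypothetical numbers, not course data).
rng = np.random.default_rng(seed=42)
square_feet = rng.uniform(500, 3500, size=200)
bedrooms = rng.integers(1, 6, size=200)
noise = rng.normal(0, 10_000, size=200)
price = 50_000 + 150 * square_feet + 10_000 * bedrooms + noise

# Two features make this a multiple linear regression.
X = np.column_stack([square_feet, bedrooms])

# Hold out 25% of the rows so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)
residuals = y_test - predictions  # errors on the held-out rows

r2 = r2_score(y_test, predictions)
rmse = mean_squared_error(y_test, predictions) ** 0.5

print("Coefficients:", model.coef_)
print("R^2 on test set:", r2)
print("RMSE on test set:", rmse)
```

Scoring on the held-out split rather than on the training rows is what makes the reported error a fair estimate of how the model would behave on new houses.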

Logistic Regression

  • Understanding Logistic Regression
  • Calculating Logistic Regression
  • Evaluating model performance
  • Labs:

– Credit card application

– College admissions
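As an illustration of how logistic regression ties into the evaluation topics listed earlier (confusion matrix, ROC / AUC), here is a small sketch. It uses a synthetic binary dataset from `make_classification` as a hypothetical stand-in for an admit / reject label. It is an assumption-laden example, not the course's lab code.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-outcome data (hypothetical stand-in for a yes / no label).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Stratify so both classes appear in the training and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

predicted_labels = clf.predict(X_test)             # hard 0 / 1 predictions
predicted_probs = clf.predict_proba(X_test)[:, 1]  # probability of class 1

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, predicted_labels)

# AUC is computed from probabilities, not from hard labels.
auc = roc_auc_score(y_test, predicted_probs)

print("Confusion matrix:")
print(cm)
print("AUC:", auc)
```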

Classification: SVM (Support Vector Machines)

  • SVM concepts and theory
  • SVM with kernel
  • Labs:

– Customer churn data

Classification: Naive Bayes

  • Naive Bayes theory
  • Running Naive Bayes algorithm
  • Evaluating model performance
  • Lab:

– Spam filtering

Unsupervised Algorithms

  • Overview of unsupervised algorithms
  • Supervised vs. unsupervised
  • Understanding unsupervised algorithms

Unsupervised: Clustering: K-Means

  • Theory behind K-Means
  • Running K-Means algorithm
  • Estimating the performance
  • Labs:

– Predicting Uber demand

– Clustering shopping trips
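For a sense of what running K-Means and estimating its performance can look like in Scikit-Learn, here is a brief sketch on synthetic 2-D points generated with `make_blobs`. The data and the choice of three clusters are assumptions for illustration only. The silhouette score is one common way to judge cluster quality and is not necessarily the measure used in the course.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D points around three centers (hypothetical data, not a course dataset).
points, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

# There are no labels to learn from: K-Means only sees the points themselves.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

# Inertia: sum of squared distances from each point to its assigned center.
inertia = kmeans.inertia_

# Silhouette score ranges from -1 to 1; higher means better-separated clusters.
score = silhouette_score(points, labels)

print("Cluster centers:")
print(kmeans.cluster_centers_)
print("Inertia:", inertia)
print("Silhouette score:", score)
```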

Final workshop (time permitting)

  • This is a group workshop
  • Each group will analyze a couple of real world datasets and run ML algorithms
  • Each group will present their findings to the class

About the Instructor:

Sujee Maniyam is a seasoned practitioner and founder of Elephant Scale. He teaches and consults in AI (machine learning and deep learning) and Big Data technologies (Hadoop, Spark, NoSQL, and Cloud). He is an open source contributor and author of ‘Hadoop Illuminated’ (an open-source book on Hadoop) and ‘HBase Design Patterns’. Sujee is a frequent speaker at conferences and meetups. He also advises and mentors various firms.