Machine Learning with SageMaker (AWS)
Overview
Machine Learning (ML) is the killer app for Big Data. Amazon Machine Learning brings the power of ML to the everyday programmer by providing ML as a service. However, to use ML effectively, one needs to understand the underlying models and how to apply them on Amazon.
This course is intended for data scientists and software engineers, and it balances theory with practice. For each machine learning concept, we first discuss its foundations, applicability, and limitations. Then we explain its implementation and use, along with specific use cases. The course is about 50% lecture and 50% lab work.
Amazon SageMaker is a fully managed machine learning service. The course combines an overview of machine learning concepts with their specific implementation in SageMaker. Where required, it also brings in tools outside of SageMaker.
Audience
Data Scientists and Software Engineers
Duration
3 days
Prerequisites
- familiarity with programming in at least one language
- ability to navigate the Linux command line
- basic knowledge of command-line Linux editors (vi / nano)
- basic familiarity with AWS (an introduction can optionally be provided on the first day of the course)
Lab environment
A training Amazon account will be provided. Students need only an SSH client and a browser. Zero install: there is no need to install software on students’ machines.
Objectives
- attain a thorough understanding of popular machine learning algorithms, their applicability, and limitations
- practice the application of these methods in the Amazon machine learning environment
- achieve clarity in the real-world use of machine learning by illustrating each method with practical use cases
Course Outline
Introductions and overviews
- Data ETL
  - Go into one example in detail, implemented on AWS Redshift
  - Provide a pointer to other examples for self-study
- Machine learning
  - Goals, results, supervised/unsupervised
  - Which parts of ML are implemented in Amazon Machine Learning
- SageMaker (AWS) Overview
- Data ETL
Supervised Learning
- Linear regression
- Logistic regression and multinomial logistic regression
- SVM, decision trees, random forests, neural networks
- Labs for every section above
Unsupervised learning
- K-Means
- Other types of unsupervised learning
  - Hierarchical clustering
  - Mixture models
  - DBSCAN
Data visualization
- Visualization examples for the models above
- Links to other visualizations for self-study
SageMaker
- Intro
- SageMaker Details
- Using Built-in Algorithms
- Using Your Own Algorithms
- Using TensorFlow
- Using Apache MXNet
- Using Apache Spark
- Amazon SageMaker Libraries
- Authentication and Access Control
- Monitoring