Google Cloud for Data Scientists
Enable data scientists to ingest, analyse, visualise, and model data at scale on GCP using BigQuery, Data Studio, Dataproc, and integrated Python tooling.
Audience: Data Analysts, Data Scientists
Duration: Three to four days depending on the agenda
Format: Lectures and hands-on labs (50% each)
Overview
Data Science is in high demand, and Google is one of its major promoters. Google Cloud Platform (GCP) is one of the leading platforms for Data Science. In this course, students learn to do Data Science with Python and to use the Google Cloud capabilities that are specific to Data Science.
Objective
Enable data scientists to ingest, analyse, visualise, and model data at scale on GCP using BigQuery, Data Studio, Dataproc, and integrated Python tooling.
What You Will Learn
- Google Cloud's features for Data Science
- The Data Science process
- Using Google Compute Engine
- Using Google Cloud Storage
- Visualising data using Google Data Studio
- Running SQL queries using BigQuery
- Data analytics with Python
- Running Python code on Google Cloud
- Large-scale data analytics with Apache Spark
- Running Spark using Google Dataproc
- Machine Learning fundamentals
- Spark ML library
- Doing ML with Spark ML on Google Cloud
Course Details
Prerequisites: Interest in Data Science (overview included) • Some basic Python recommended • Some programming experience highly recommended
Setup: Zero-Install cloud lab • Modern laptop • Unrestricted Internet • Google Cloud account highly recommended
Detailed Outline
- Benefits of Cloud computing
- Google Cloud ecosystem overview
- Lab: Getting up and running in Google Cloud
- Compute Engine Intro
- Types of computing resources
- Customising a cloud VM
- Lab
- Bringing data into the cloud
- Data-storage options
- Ingesting & scheduling
- Lab
- Overview & visualising data
- Labs
- Introduction
- Running queries
- Labs
- Exploring datasets
- Cleaning, feature selection, visualisation
- Labs
- Colab, Datalab, Jupyter
- Installing packages
- Labs
- Spark Intro, DataFrames, SQL
- Labs
- Running Hadoop & Spark clusters
- Labs
- ML overview & algorithms
- Feature engineering, regression, classification, clustering
- Spark ML library
- CPU vs GPU benchmarking
- Labs
- Team project on a real-world DS problem using Google Cloud
Ready to Get Started?
Contact us to learn more about this course and schedule your training.