Anomaly Detection Session 5: Multivariate Autoregressive Data Anomaly Detection on Spark/Scala

December 4, 2020
Start Time: 9:00 AM PST
End Time: 10:00 AM PST


This is part of Machine Learning-Driven Anomaly Detection

What you will learn

Multivariate autoregressive data can also be described as multivariate time series data. This kind of data has both temporal and spatial correlation between the different variables. It is the most complex of the four categories of data, and most of the good solutions are based on deep learning.

We will review a set of deep learning papers, highlighting their salient points for anomaly detection solutions. Most of the solutions are based on a combination of a recurrent neural network (RNN) and an autoencoder network (AE): the RNN handles temporal correlation and the AE handles spatial correlation.

  1. Paper: “Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network” by Su, Zhao
  2. Paper: “Multivariate Industrial Time Series with Cyber-Attack Simulation: Fault Detection Using an LSTM-based Predictive Data Model” by Filonov, Lavrentyev
  3. Paper: “LSTM-based Encoder-Decoder for Multi-sensor Anomaly Detection” by Malhotra, Ramakrishnan
  4. Paper: “A Multimodal Anomaly Detector for Robot-Assisted Feeding Using an LSTM-based Variational Autoencoder” by Park, Hoshi
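
Reconstruction-based detectors, such as the encoder-decoder approach in the list above, typically score a window of readings by how poorly the model reconstructs it. The sketch below illustrates only that scoring step, not any paper's implementation. The neural network is left out: `fit_mean_reconstructor` is a hypothetical stand-in that returns the average training window, and every function name, the toy two-sensor data, and the mean + 3 standard deviations threshold are assumptions made for illustration.

```python
from statistics import mean, stdev


def make_windows(series, size):
    """Split a multivariate series (one vector per time step) into overlapping windows."""
    return [series[i:i + size] for i in range(len(series) - size + 1)]


def reconstruction_error(window, reconstructed):
    """Mean squared error over every time step and variable in one window."""
    squared = [(a - b) ** 2
               for row, rec_row in zip(window, reconstructed)
               for a, b in zip(row, rec_row)]
    return mean(squared)


def fit_mean_reconstructor(train_windows):
    """Hypothetical stand-in for a trained RNN autoencoder (assumption):
    it ignores its input and returns the average training window."""
    size = len(train_windows[0])
    n_vars = len(train_windows[0][0])
    template = [[mean(w[t][v] for w in train_windows) for v in range(n_vars)]
                for t in range(size)]
    return lambda window: template


def fit_threshold(train_windows, reconstruct, k=3.0):
    """Threshold = mean + k * std of errors on normal data (k=3 is an assumption)."""
    errors = [reconstruction_error(w, reconstruct(w)) for w in train_windows]
    return mean(errors) + k * stdev(errors)


def detect(series, size, reconstruct, threshold):
    """Return the start index of every window whose error exceeds the threshold."""
    return [i for i, w in enumerate(make_windows(series, size))
            if reconstruction_error(w, reconstruct(w)) > threshold]


# Toy data: two correlated sensors following a repeating pattern.
WINDOW = 4
train_series = [[float(t % 4), 2.0 * (t % 4)] for t in range(40)]
test_series = [[float(t % 4), 2.0 * (t % 4)] for t in range(24)]
test_series[10] = [50.0, 4.0]  # inject a spike on the first sensor at t = 10

train_windows = make_windows(train_series, WINDOW)
reconstruct = fit_mean_reconstructor(train_windows)
threshold = fit_threshold(train_windows, reconstruct)

flagged = detect(test_series, WINDOW, reconstruct, threshold)
print(flagged)  # [7, 8, 9, 10] : the four windows that contain t = 10
```

In a real solution, `reconstruct` would be replaced by the trained model's encode-decode pass over the window; the windowing and thresholding code would stay the same.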

This is a FREE class!


Can’t make it to the live session? No worries. Go ahead and register; we will send you the session recording.
See below for past session recordings and notes.

Intended Audience

COO, CIO, DevOps, Software Engineers


  • Must have: Development experience
  • Nice to have: Python knowledge

What to Bring

  • Please bring a reasonably modern laptop (corporate laptops with overly restrictive firewalls may not work well; personal laptops are recommended)
  • [Nice to have] Download our Docker image elephantscale/es-training
  • Class Notes here

Session Recording

Class Notes

Git Repo

The Python implementation is available in the open-source project avenir on GitHub.


Pranab Ghosh

Pranab Ghosh is a Data Science Consultant. He owns several open-source Big Data and Data Science projects using Hadoop, Spark, Storm, Kafka, NoSQL databases, and the related ecosystem.