10th IFAC Symposium

on Advanced Control of Chemical Processes

July 25 - 27, 2018 Shenyang, China


Advanced Tools for Process Data Analytics


Bhushan Gopaluni

Yiting Tsai

Aditya Tulsyan

Lee Rippon

Guest Speakers

Prof. Sirish L. Shah (University of Alberta)

Prof. Richard Braatz (Massachusetts Institute of Technology)

Workshop Introduction

We are at the cusp of what is considered the fourth industrial revolution. This revolution is driven by ubiquitous cyber-physical systems, algorithmic developments in artificial intelligence, enormous computing power, inexpensive memory and the vast volumes of data being collected. The process industries possess treasure troves of heterogeneous data that are gravely underutilized. The competitive global environment and the ever-increasing demands on energy, environment and quality are subjecting these industries to a high level of economic pressure. The data they already possess are poised to provide a level of automation and efficiency never seen before, and thus to alleviate these economic and competitive pressures.

Process industries have been using data analytics in various forms for more than three decades. In particular, statistical techniques such as principal component analysis (PCA), partial least squares (PLS) and canonical variate analysis (CVA), together with time series modeling methods such as maximum likelihood estimation and prediction error methods, have been extensively applied to industrial data. The recent developments in machine learning and artificial intelligence provide a new opening for using process data on large-scale problems. However, to apply machine learning methods to process data successfully, researchers require not only a high-level understanding of the algorithms but also strong programming knowledge of tools such as Python, TensorFlow, Keras and Jupyter.

This workshop will introduce the essential machine learning algorithms and software tools to graduate students, experienced researchers and engineers working in industry. Elementary knowledge of probability and statistics is required to attend. The workshop will also feature at least two confirmed guest speakers with decades of experience in data analytics: Prof. Sirish L. Shah of the University of Alberta and Prof. Richard D. Braatz of the Massachusetts Institute of Technology have both agreed to speak. We have also invited an industrial guest speaker.

Course Plan and Contents

This full-day workshop (approximately seven and a half hours) will be held at the ADCHEM conference. Starting with an elementary introduction to statistics and probability, we will develop various regression, classification, dimensionality reduction and advanced learning algorithms that are of interest to engineers. In addition, several widely used machine learning software packages will be introduced. Registrants will solve exercises and receive take-away software code to implement these algorithms.

1. Basics of probability and statistics, underfitting, overfitting and bias-variance tradeoff

2. Classification Algorithms
• k-nearest neighbours algorithm
• k-means algorithm
• Support Vector Machines
• Naive Bayes Classifier and Decision Trees

3. Regression Algorithms
• Linear Least Squares
• Non-linear Least Squares
• Kernel Regression

4. Dimensionality Reduction Algorithms
• Principal Component Analysis (PCA)
• Partial Least Squares (PLS)
• Isometric Mapping (ISOMAP)
• Locally Linear Embedding (LLE)
• Canonical Correlation Analysis (CCA)
• Multidimensional Scaling (MDS)

5. Advanced Learning Algorithms
• Artificial Neural Networks
• Deep Learning
• Gaussian Processes
• Bayesian Networks
• Deep Reinforcement Learning

6. Applications in the Process Industry
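
To give a flavour of the kind of algorithm listed under item 2, here is a minimal sketch of a k-nearest neighbours classifier in plain Python. It is not taken from the workshop material: the function name `knn_predict`, the choice of Euclidean distance with majority voting, and the toy "normal"/"faulty" data are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours (sorting on distance only).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote; on a tie, the label seen first among the nearest wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two measurements per sample, two operating regimes.
train_points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0),
                (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_labels = ["normal", "normal", "normal",
                "faulty", "faulty", "faulty"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))  # normal
print(knn_predict(train_points, train_labels, (3.0, 3.0)))  # faulty
```

In practice a library implementation would usually be preferred, and the value of k would be chosen by cross-validation, which ties back to the underfitting and overfitting discussion in item 1.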

Learning Outcomes

By the end of this workshop, registrants will be able to

  • identify and solve classification, regression and dimensionality reduction problems.
  • work with software tools such as TensorFlow, Keras and Python.