An expert system for RGF operation






AWWA WQTC Conference 2015

Andrew Upton

Dr Peter Jarvis

Professor Bruce Jefferson

Overview

  • Definitions
  • Key messages
  • Introduction
  • Rationale
  • Example of prototype system
  • Further work

A few definitions

  • Expert system: A software system which combines a knowledge base and a reasoning mechanism to solve a problem in a specific domain
  • Machine learning: The application of generalisable algorithms which can develop data driven models to assist prediction or decision making
  • Hybrid system: A system which uses more than one branch of artificial intelligence to perform a function

Key messages

  • Many SCADA systems are good at collecting data but bad at helping us understand it.
  • Effective use of process data can inform decisions and reduce risk.
  • Process investigations can be focused on key areas, reducing cost and disruption.
  • Combining expert process knowledge and machine learning methods can provide valuable insights for treatment operation, maintenance and enhancement.

Filtration

Challenges

Turbidity

  • Filtration performance is typically monitored by turbidity because the measurement is robust and cheap.
  • Turbidity measurement is subject to interference, particularly at the levels of interest.
  • The relationships between turbidity, particles, micro-organisms and pathogen risk are weak and inconsistent.
  • Comparison of turbidity values at different times or from different systems is not the same as comparing risk.

Filter performance

  • We want to maintain multiple barriers to pathogens and particles.
  • Filtrate turbidity less than 0.1 NTU is considered to indicate that a barrier is likely to be effective and that performance is acceptable or good.
  • Filtrate turbidity greater than 0.1 NTU is considered to indicate that a barrier is likely to be less effective and that performance should be improved or is poor.
  • It is more useful and informative to understand the occurrence and context of good and poor performance than to compare turbidity values obtained under different conditions and at different times.

Previous work

  • Expert systems for alarm management (Dandy & Simpson, 1991)
  • Improved tools for monitoring control signals (Liukkonen et al., 2013)
  • Filter maintenance and operation guidance manual (Logsdon et al., 2002)
  • Partnership for Safe Water filtration turbidity performance scheme
  • Application of SQC for Cryptosporidium control (Hall et al., 2001)
  • Many applications of machine learning for coagulant dose optimization

Aims for system

Develop an interactive software tool combining process data and expert knowledge to facilitate:

  • rapid assessment, comparison and communication of filter performance over the medium term
  • identification of the most likely dominant causes of poor filtration performance over the medium term

Tools

https://www.r-project.org/

https://www.rstudio.com/

Case study:

  • SCADA data from February 2013 to September 2014 were used, covering 11,949 filter runs, of which 644 have turbidity indicative of poor performance.
  • A filter run is considered poor if the 99th percentile turbidity of data collected at 15-minute intervals exceeds 0.1 NTU.
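
The run classification rule above can be sketched in a few lines. This is a minimal illustration in Python rather than the R tooling used for the prototype, and it assumes a nearest-rank percentile; the percentile definition actually used in the case study is not stated, and the readings below are hypothetical.

```python
import math

THRESHOLD_NTU = 0.1  # threshold stated in the case study


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method (an assumption)."""
    if not values:
        raise ValueError("need at least one reading")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def classify_run(turbidity_ntu):
    """Label a run FAIL when its 99th percentile turbidity exceeds 0.1 NTU."""
    t99 = percentile_nearest_rank(turbidity_ntu, 99)
    return "FAIL" if t99 > THRESHOLD_NTU else "PASS"


# Hypothetical 15-minute readings for 48-hour runs (192 readings each).
steady_run = [0.05] * 192
spiky_run = [0.05] * 188 + [0.30] * 4    # about 2% of readings elevated
single_spike = [0.05] * 191 + [0.30]     # about 0.5% of readings elevated

for name, run in [("steady", steady_run), ("spiky", spiky_run), ("single spike", single_spike)]:
    print(name, classify_run(run))
```

Using the 99th percentile rather than the maximum means that a single isolated spike in a long run does not, on its own, mark the run as poor.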

Data processing

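| FILTER | RUN | T99<0.1NTU? | PREDICTOR1 | ... | PREDICTORn |
|:--|--:|:--|--:|:-:|--:|
| 1 | 1 | PASS | 0.3 | ... | 5.1 |
| 1 | 2 | PASS | 0.4 | ... | 5.2 |
| 1 | 3 | PASS | 0.3 | ... | 5.3 |
| 2 | 1 | PASS | 0.5 | ... | 5.2 |
| 2 | 2 | FAIL | 0.9 | ... | 5.1 |
| 2 | 3 | PASS | 0.3 | ... | 5.2 |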
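
One way to arrive at a table with one row per filter run is to group the time-series readings by filter and run, then summarise each signal. The sketch below is in Python rather than R, and the signal names (`turbidity`, `flow`) and summary statistics are hypothetical stand-ins for the unnamed predictors; the PASS/FAIL column would be attached separately.

```python
from collections import defaultdict
from statistics import mean


def build_run_table(readings):
    """Collapse time-series readings into one row per (filter, run)."""
    grouped = defaultdict(list)
    for r in readings:
        grouped[(r["filter"], r["run"])].append(r)

    table = []
    for (filt, run), rows in sorted(grouped.items()):
        turb = [r["turbidity"] for r in rows]
        flow = [r["flow"] for r in rows]
        table.append({
            "filter": filt,
            "run": run,
            "peak_turbidity": max(turb),         # hypothetical predictor
            "mean_flow": mean(flow),             # hypothetical predictor
            "flow_range": max(flow) - min(flow)  # hypothetical predictor
        })
    return table


# Hypothetical readings; a real feed would have many readings per run.
readings = [
    {"filter": 1, "run": 1, "turbidity": 0.04, "flow": 100},
    {"filter": 1, "run": 1, "turbidity": 0.06, "flow": 110},
    {"filter": 1, "run": 2, "turbidity": 0.05, "flow": 105},
    {"filter": 2, "run": 1, "turbidity": 0.12, "flow": 90},
    {"filter": 2, "run": 1, "turbidity": 0.07, "flow": 130},
]
table = build_run_table(readings)
for row in table:
    print(row)
```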

Graphical user interface

A demo of the system can be found at:

www.andrewupton.net

Comparative performance of filters

Comparative performance of filters 2

Performance timeline

Breakdown of failure types

Cusum charts
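
As a rough illustration of what a cusum chart tracks, the sketch below computes a one-sided upper tabular CUSUM in Python. The target, slack and decision limit are hypothetical values chosen for the example, and this is not necessarily the formulation used to produce the charts in the prototype.

```python
def upper_cusum(values, target, slack, decision_limit):
    """One-sided upper tabular CUSUM.

    Accumulates deviations above (target + slack), resetting at zero,
    and records the indices where the sum exceeds decision_limit.
    """
    cusum = 0.0
    path, alarms = [], []
    for i, x in enumerate(values):
        cusum = max(0.0, cusum + (x - target - slack))
        path.append(cusum)
        if cusum > decision_limit:
            alarms.append(i)
    return path, alarms


# Hypothetical filtrate turbidity (NTU): stable, then a small sustained shift.
turbidity = [0.05] * 5 + [0.08] * 5
path, alarms = upper_cusum(turbidity, target=0.05, slack=0.01, decision_limit=0.05)
print("first alarm at reading index:", alarms[0] if alarms else None)
```

The point of the accumulation is that a small but sustained shift triggers an alarm even though no individual reading exceeds 0.1 NTU.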

Expert diagnosis

  • Identify behaviours in process signals that are indicative of specific types of fault.
  • Aggregate the signals to best describe these behaviours over the filter run.
  • Identify the signal behaviours which best predict poor filter performance.
  • Link the strongest predictors of poor performance to the most likely dominant causes.
  • Confirm with a directed investigation.

Fault tree

Diagnostic process

Classification trees
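
A classification tree is grown by repeatedly choosing the predictor threshold that best separates PASS runs from FAIL runs. The sketch below, in Python rather than the R packages used for the prototype, finds a single split for one predictor using Gini impurity, which is a common splitting criterion but an assumption here. The data reuse the illustrative PREDICTOR1 values from the data processing table.

```python
def gini(labels):
    """Gini impurity of a set of PASS/FAIL labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_fail = sum(1 for y in labels if y == "FAIL") / n
    return 2 * p_fail * (1 - p_fail)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))  # start from the unsplit node
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


predictor1 = [0.3, 0.4, 0.3, 0.5, 0.9, 0.3]
status = ["PASS", "PASS", "PASS", "PASS", "FAIL", "PASS"]
threshold, impurity = best_split(predictor1, status)
print("split at", threshold, "with weighted Gini", impurity)
```

A full tree applies the same search to every predictor and then recurses into each side of the chosen split.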

Random forest algorithm

Model performance and variable importance

|   |    0|  1|
|:--|----:|--:|
|0  | 1668| 15|
|1  |   35| 57|
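
The counts in this table can be turned into summary metrics with a few lines of arithmetic. The Python sketch below assumes that rows are the observed class, columns are the predicted class, and that class 1 denotes a poor run; the slide does not state the orientation, so only the accuracy figure is independent of that assumption.

```python
# Counts copied from the table; keys are (row, column).
# ASSUMPTION: row = observed class, column = predicted class, 1 = poor run.
confusion = {(0, 0): 1668, (0, 1): 15, (1, 0): 35, (1, 1): 57}


def summarise(cm):
    """Derive standard classification metrics from a 2x2 confusion matrix."""
    tn, fp, fn, tp = cm[(0, 0)], cm[(0, 1)], cm[(1, 0)], cm[(1, 1)]
    total = tn + fp + fn + tp
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of poor runs detected
        "specificity": tn / (tn + fp),   # share of good runs recognised
        "precision": tp / (tp + fp),     # share of flagged runs that were poor
    }


metrics = summarise(confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}" if name != "total" else f"{name}: {value}")
```

Under the stated assumption, overall accuracy is about 97%, while about 62% of poor runs are detected. If the table is oriented the other way, the sensitivity and precision figures swap.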

Expert decision matrix

|       | Cause 1 | Cause 2 | Cause 3 | ... | Cause n |
|:--|--:|--:|--:|:-:|--:|
| Var 1 | 1 | 0 | 0 | ... | 0 |
| Var 2 | 0.3 | 0.3 | 0.3 | ... | 0 |
| ... | ... | ... | ... | ... | ... |
| Var n | 0 | 0 | 0.5 | ... | 0.5 |
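
One plausible way to use such a matrix is to weight each variable's link to a cause by that variable's importance in the model and sum the results per cause. The Python sketch below illustrates this; the scoring rule, the importance values and the matrix entries are all assumptions made for illustration, not the method or figures used in the prototype.

```python
def rank_causes(importance, decision_matrix):
    """Score each cause as the sum of (variable importance x expert weight)."""
    scores = {}
    for var, weight in importance.items():
        for cause, link in decision_matrix.get(var, {}).items():
            scores[cause] = scores.get(cause, 0.0) + weight * link
    # Highest score first; ties keep their original order.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expert weights linking variables to causes.
decision_matrix = {
    "Var 1": {"Cause 1": 1.0, "Cause 2": 0.0, "Cause 3": 0.0},
    "Var 2": {"Cause 1": 0.3, "Cause 2": 0.3, "Cause 3": 0.3},
}
# Hypothetical normalised variable importance from a fitted model.
importance = {"Var 1": 0.6, "Var 2": 0.4}

ranked = rank_causes(importance, decision_matrix)
for cause, score in ranked:
    print(f"{cause}: {score:.2f}")
```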

Suggested causes

Next steps

  • Shift from using whole runs to periods within runs
  • Use of higher frequency data
  • Full scale online trial system
  • Adaptation to multiple sites with different treatment processes
  • Greater specificity in fault diagnosis
  • Automatic suggestion of follow up actions

Summary

  • The value of process data is often not realised.
  • Providing improved data analysis tools to the operators and engineers who are most familiar with the treatment assets could facilitate improved compliance and reduced risk.

References

  • Dandy, G. C., & Simpson, A. R. (1991). Development of expert systems for a water filtration plant. Civil Engineering Systems, 8, 63–70.
  • Logsdon, G., Hess, A., Chipps, M. J., & Rachwal, A. (2002). Filter Maintenance and Operations Guidance Manual. American Water Works Association.
  • Liukkonen, M., Juntunen, P., Laakso, I., & Hiltunen, Y. (2013). A software platform for process monitoring: Applications to water treatment. Expert Systems with Applications, 40(7), 2631–2639.
  • Hall, T., Realey, G., & Watts, M. (2001). Application of statistical process control in water treatment for managing Cryptosporidium risk. Report Ref. No. 00/DW/06/15.

R Packages

  • Ramnath Vaidyanathan (2012). slidify: Generate reproducible html5 slides from R markdown. R package version 0.5. http://ramnathv.github.com/slidify/
  • M Dowle, T Short, S Lianoglou, A Srinivasan with contributions from R Saporta and E Antonyan (2014). data.table: Extension of data.frame. R package version 1.9.4.
  • Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25.
  • Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29.
  • Hadley Wickham and Romain Francois (2015). dplyr: A Grammar of Data Manipulation. R package version 0.4.3.
  • Hadley Wickham (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20.
  • Achim Zeileis and Gabor Grothendieck (2005). zoo: S3 Infrastructure for Regular and Irregular Time Series. Journal of Statistical Software, 14(6), 1-27.
  • Winston Chang, Joe Cheng, JJ Allaire, Yihui Xie and Jonathan McPherson (2015). shiny: Web Application Framework for R. R package version 0.12.2.
  • Winston Chang (2015). shinydashboard: Create Dashboards with 'Shiny'. R package version 0.5.1.
  • Ramnath Vaidyanathan (2013). rCharts: Interactive Charts using Javascript Visualization Libraries. R package version 0.4.5.

R Packages (continued)

  • Scrucca, L. (2004). qcc: an R package for quality control charting and statistical process control. R News 4/1, 11-17.
  • Markus Gesmann and Diego de Castillo (2011). Using the Google Visualisation API with R. The R Journal, 3(2), 40-44.
  • Winston Chang and Hadley Wickham (2015). ggvis: Interactive Grammar of Graphics. R package version 0.4.2.
  • Lawrence M, Huber W, Pages H, Aboyoun P, Carlson M, et al. (2013). Software for Computing and Annotating Genomic Ranges. PLoS Comput Biol 9(8).
  • Dan Vanderkam and JJ Allaire (2015). dygraphs: Interface to Dygraphs Interactive Time Series Charting Library. R package version 0.4.5.
  • Stephen Milborrow (2015). rpart.plot: Plot rpart Models. An Enhanced Version of plot.rpart. R package version 1.5.2.
  • Brian Ripley (2015). tree: Classification and Regression Trees. R package version 1.0-36.
  • A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2(3), 18-22.

Many thanks to..