Dynamic Data Introduction

Main.DynamicData History

February 13, 2023, at 04:15 PM by 10.35.117.248 -
Deleted lines 11-22:

'''Data Preparation for Dynamic Optimization'''

Data preparation is important for dynamic optimization in order to set up simulations that utilize time-varying information. Several aspects of dynamic optimization involve the import, validation, filtering, manipulation, and display of large data sets. Select one of the following tutorials below on using MATLAB or Python to import, manipulate, and export data sets.

(:html:)
<iframe width="560" height="315" src="https://www.youtube.com/embed/E56egH10RJA" frameborder="0" allowfullscreen></iframe>
(:htmlend:)

(:html:)
<iframe width="560" height="315" src="https://www.youtube.com/embed/Tq6rCWPdXoQ" frameborder="0" allowfullscreen></iframe>
(:htmlend:)
February 13, 2023, at 04:14 PM by 10.35.117.248 -
Added lines 5-6:
'''Machine Learning Overview'''
Added lines 9-10:
'''Data Engineering Overview'''
Changed lines 13-15 from:
Data manipulation is important for dynamic optimization in order to set up simulations that utilize time-varying information. Several aspects of dynamic optimization involve the import, validation, filtering, manipulation, and display of large data sets. Select one of the following tutorials below on using MATLAB or Python to import, manipulate, and export data sets.
to:
'''Data Preparation for Dynamic Optimization'''

Data preparation is important for dynamic optimization in order to set up simulations that utilize time-varying information. Several aspects of dynamic optimization involve the import, validation, filtering, manipulation, and display of large data sets. Select one of the following tutorials below on using MATLAB or Python to import, manipulate, and export data sets.
February 13, 2023, at 04:13 PM by 10.35.117.248 -
Added lines 4-7:

Machine learning (ML) focuses on the development of algorithms and statistical models that enable computers to learn and improve from experience without being explicitly programmed. ML enables computers to automatically identify patterns and relationships in data, make predictions and take actions based on that data. See the [[https://apmonitor.com/pds|Machine Learning for Engineers]] course for more information about ML methods including classification and regression.

Data engineering refers to the process of preparing and structuring the data so that it can be used as input for ML training and predictions. This includes tasks such as data extraction, data cleaning, data transformation, data integration, and data storage. Effective data engineering is crucial for the success of machine learning projects because the quality of the data determines the accuracy and performance of the models. The role of data engineers in machine learning projects is to work closely with data scientists and domain experts to ensure that the data is properly collected, stored, and processed to support the development and deployment of machine learning models. They design and implement data pipelines to automate the flow of data from various sources to the machine learning models, and they also manage and scale the infrastructure required to store and process large amounts of data. Machine learning and data engineering are closely interconnected fields that require a combination of technical skills and domain knowledge. Data engineering provides the foundation for machine learning, enabling it to make accurate predictions and provide valuable insights based on the data. See the [[https://apmonitor.com/dde|Data-Driven Engineering]] course for more information about data collection and preparation.
November 17, 2021, at 12:49 AM by 10.35.117.248 -
Changed lines 170-174 from:
plt.plot(time,z,'kx',linewidth=2)
plt.plot(time,xb,'g--',linewidth=3)
plt.plot(time,x2mhe,'k-',linewidth=3)
plt.plot(time,x1mhe,'r.-',linewidth=3)
plt.plot(time,xtrue,'k:',linewidth=2)
to:
plt.plot(time,z,'kx',lw=2)
plt.plot(time,xb,'g--',lw=3)
plt.plot(time,x2mhe,'k-',lw=3)
plt.plot(time,x1mhe,'r.-',lw=3)
plt.plot(time,xtrue,'k:',lw=2)
Added lines 186-187:

The solution shows the results of five different estimators including filtered bias update, Kalman filter, Implicit Dynamic Feedback, and Moving Horizon Estimation with a squared error or l1-norm objective.
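
As a rough illustration of why a simple filter is sensitive to outliers, the sketch below applies the filtered bias update from the script (same @@alpha@@ and true value) to a constant signal with a single outlier. The signal length and the outlier location are assumptions chosen for the illustration, not part of the posted solution.

```python
import numpy as np

alpha = 0.0951      # filter factor used in the script
x_true = 37.727     # true value used in the script

# constant measurement with one outlier (index 50 is an arbitrary choice)
z = np.full(101, x_true)
z[50] = 100.0

# filtered bias update: blend the new measurement with the prior estimate
xb = np.empty_like(z)
xb[0] = x_true
for k in range(1, z.size):
    xb[k] = alpha * z[k] + (1.0 - alpha) * xb[k - 1]

# the outlier shifts the estimate by alpha * (outlier - true value),
# and the error then shrinks by a factor (1 - alpha) on every cycle
jump = xb[50] - x_true
print(f"jump at outlier: {jump:.3f}")
print(f"remaining error 20 cycles later: {xb[70] - x_true:.3f}")
```

The filter cannot distinguish an outlier from a real change, so it passes a fraction @@alpha@@ of every bad point into the estimate.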
January 28, 2018, at 06:15 AM by 174.148.61.237 -
Added lines 42-181:

(:toggle hide gekko button show="Show GEKKO (Python) Code":)
(:div id=gekko:)
(:source lang=python:)
from __future__ import division
from gekko import GEKKO
import numpy as np
import random

# initial parameters
n_iter = 150 # number of cycles
x = 37.727 # true value
# filtered bias update
alpha = 0.0951
# mhe tuning
horizon = 30

#%% Model

#Initialize model
m = GEKKO()

# Solve options
rmt = True # Remote: True or False
# For rmt=True, specify server
m.server = 'https://byu.apmonitor.com'

#time array
m.time = np.arange(50)

#Parameters
u = m.Param(value=42)
d = m.FV(value=0)
Cv = m.Param(value=1)
tau = m.Param(value=0.1)

#Variable
flow = m.CV(value=42)

#Equation
m.Equation(tau * flow.dt() == -flow + Cv * u + d)

# Options
m.options.imode = 5
m.options.ev_type = 1 #start with l1 norm
m.options.coldstart = 1
m.options.solver = 1  # APOPT solver

d.status = 1
flow.fstatus = 1
flow.wmeas = 100
flow.wmodel = 0
#flow.dcost = 0

# Initialize L1 application
m.solve(remote=rmt)

#%% Other Setup
# Create storage for results
xtrue = x * np.ones(n_iter+1)
z = x * np.ones(n_iter+1)
time = np.zeros(n_iter+1)
xb = np.empty(n_iter+1)
x1mhe = np.empty(n_iter+1)
x2mhe = np.empty(n_iter+1)

# initial estimator values
x0 = 40
xb[0] = x0
x1mhe[0] = x0
x2mhe[0] = x0

# outliers
for i in range(n_iter+1):
    z[i] = x + (random.random()-0.5)*2.0
z[50] = 100
z[100] = 0

#%% L1 Application

## Cycle through measurement sequentially
for k in range(1, n_iter+1):
    print( 'Cycle ' + str(k) + ' of ' + str(n_iter))
    time[k] = k
   
    # L1-norm MHE
    flow.meas = z[k]
    m.solve(remote=rmt)
    x1mhe[k] = flow.model

print("Finished L1")
#%% L2 application

# clear L1 data
m.clear_data()
# Options for L2
m.options.ev_type = 2 # switch to squared error (l2-norm) objective
m.options.coldstart = 1 #reinitialize

flow.wmodel = 10

# Initialize L2 application
m.solve(remote=rmt)

## Cycle through measurement sequentially
for k in range(1, n_iter+1):
    print ('Cycle ' + str(k) + ' of ' + str(n_iter))
    time[k] = k
   
    # L2-norm MHE
    flow.meas = z[k]
    m.solve(remote=rmt)
    x2mhe[k] = flow.model
       
#%% Filtered bias update

## Cycle through measurement sequentially
for k in range(1, n_iter+1):
    print ('Cycle ' + str(k) + ' of ' + str(n_iter))
    time[k] = k

    # filtered bias update
    xb[k] = alpha * z[k] + (1.0-alpha) * xb[k-1]
   
   
#%% plot results
import matplotlib.pyplot as plt
plt.figure(1)
plt.plot(time,z,'kx',linewidth=2)
plt.plot(time,xb,'g--',linewidth=3)
plt.plot(time,x2mhe,'k-',linewidth=3)
plt.plot(time,x1mhe,'r.-',linewidth=3)
plt.plot(time,xtrue,'k:',linewidth=2)
plt.legend(['Measurement','Filtered Bias Update','Sq Error MHE','l_1-Norm MHE','Actual Value'])
plt.xlabel('Time (sec)')
plt.ylabel('Flow Rate (T/hr)')
plt.axis([0, time[n_iter], 32, 45])
plt.show()
(:sourceend:)
(:divend:)
Changed line 49 from:
# Hedengren, J. D., Advanced Process Monitoring, Chapter accepted to Optimization and Analytics in the Oil and Gas Industry, Eds. Kevin C. Furman, Jin-Hwa Song, Amr El-Bakry, Springer’s International Series in Operations Research and Management Science, 2014. [[https://apm.byu.edu/prism/uploads/Members/hedengren_apm2012.pdf|Article]]
to:
# Hedengren, J. D., Eaton, A. N., Overview of Estimation Methods for Industrial Dynamic Systems, Special Issue on Optimization in the Oil and Gas Industry, Optimization and Engineering, Springer, 2015, DOI: 10.1007/s11081-015-9295-9.  [[Attach:eaton_hedengren_OPTE_springer.pdf|Preprint]], [[https://link.springer.com/article/10.1007/s11081-015-9295-9|Article]]
May 06, 2015, at 03:43 AM by 45.56.3.184 -
Changed line 39 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data'^1^'. Only an outlier has been added to these codes. The codes should be modified to include other common phenomena such as measurement drift (gradual ramp away from the true value) and an increase in noise (random fluctuations).
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data'^1^'. Only an outlier has been added to the code. The code should be modified to include other common phenomena such as measurement drift (gradual ramp away from the true value) and an increase in noise (random fluctuations). Comment on the effect of corrupted data on real-time estimation and why some methods are more effective at rejecting bad data.
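
One possible way to generate the corrupted measurements is sketched below. The helper @@corrupt@@, its arguments, and the drift and noise settings are hypothetical choices for illustration. Only the true value and the two outliers at cycles 50 and 100 come from the estimator script.

```python
import numpy as np

def corrupt(z_clean, outliers=None, drift_start=None, drift_rate=0.0,
            noise_start=None, noise_std=0.0, seed=0):
    """Return a copy of z_clean with drift, extra noise, and outliers added."""
    rng = np.random.default_rng(seed)
    z = np.array(z_clean, dtype=float)
    k = np.arange(z.size)
    if drift_start is not None:
        # gradual ramp away from the true value after drift_start
        z += drift_rate * np.clip(k - drift_start, 0, None)
    if noise_start is not None:
        # increased random fluctuations after noise_start
        z[noise_start:] += rng.normal(0.0, noise_std, z.size - noise_start)
    for idx, val in (outliers or {}).items():
        # outliers overwrite individual samples
        z[idx] = val
    return z

x = 37.727       # true value from the estimator script
n_iter = 150     # number of cycles from the estimator script
z = corrupt(x * np.ones(n_iter + 1),
            outliers={50: 100.0, 100: 0.0},     # same outliers as the script
            drift_start=60, drift_rate=0.05,    # assumed drift settings
            noise_start=120, noise_std=1.0)     # assumed noise settings
```

The resulting array @@z@@ could replace the measurement vector in the estimator loops so that each estimator sees the same corrupted sequence.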
May 06, 2015, at 03:34 AM by 45.56.3.184 -
Changed line 39 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data'^1^'.
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data'^1^'. Only an outlier has been added to these codes. The codes should be modified to include other common phenomena such as measurement drift (gradual ramp away from the true value) and an increase in noise (random fluctuations).
May 06, 2015, at 03:28 AM by 45.56.3.184 -
Added lines 41-42:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Outliers in MATLAB and Python]]
Changed line 45 from:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Outliers in MATLAB and Python]]
to:
Attach:bad_data_estimation.png
May 05, 2015, at 11:17 PM by 10.5.113.199 -
Changed lines 39-40 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data.
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data'^1^'.
Added lines 44-47:

!!!! References

# Hedengren, J. D., Advanced Process Monitoring, Chapter accepted to Optimization and Analytics in the Oil and Gas Industry, Eds. Kevin C. Furman, Jin-Hwa Song, Amr El-Bakry, Springer’s International Series in Operations Research and Management Science, 2014. [[https://apm.byu.edu/prism/uploads/Members/hedengren_apm2012.pdf|Article]]
May 05, 2015, at 11:09 PM by 10.5.113.199 -
Changed line 39 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation.
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation. There is no need to design the estimators for this problem. The estimator scripts are below with sections that can be added to simulate the effect of bad data.
May 05, 2015, at 06:25 PM by 10.5.113.199 -
Changed lines 39-40 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as bias updating, Kalman filter, and moving horizon estimation. Of particular interest is the difference between the inflow (''F'_1_''') and outflow (''F'_2_''').
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as moving horizon estimation.
Changed line 43 from:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Bad Data in MATLAB]]
to:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Outliers in MATLAB and Python]]
May 05, 2015, at 06:09 AM by 45.56.3.184 -
Changed lines 33-38 from:
The flowrate of mud and cuttings is especially important with managed pressure drilling (MPD) in order to detect gas influx or fluid losses. There are a range of measurement instruments for flow such as a [[https://en.wikipedia.org/wiki/Mass_flow_meter|mass flow meter or Coriolis  flow meter (most accurate)]] and a [[https://en.wikipedia.org/wiki/Flow_measurement#Paddle_wheel_meter|paddle wheel (least accurate)]]. This particular valve has dynamics that are described by the following equation

 0.1 d''F''/dt = -''F'' + C'_v_' ''u'' + ''d''

with C'_v_'=1, ''u'' is the valve opening, and ''d'' is a disturbance.
to:
The flowrate of mud and cuttings is especially important with managed pressure drilling (MPD) in order to detect gas influx or fluid losses. There is a range of flow measurement instruments, such as a [[https://en.wikipedia.org/wiki/Mass_flow_meter|mass flow meter or Coriolis flow meter (most accurate)]] and a [[https://en.wikipedia.org/wiki/Flow_measurement#Paddle_wheel_meter|paddle wheel (least accurate)]]. This particular system has dynamics that are described by the following equation, where C'_v_'=1, ''u'' is the valve opening, and ''d'' is a disturbance.

 0.1 d''F'_1_'''/dt = -''F'_1_''' + C'_v_' ''u'' + ''d''
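
The first-order flow equation above can be simulated with a short script. This is a minimal sketch that assumes ''u'' and ''d'' are held constant over each time step, so the exact step response can be used instead of a numerical integrator. The function name @@simulate_flow@@, the step size, and the disturbance profile are illustrative assumptions.

```python
import numpy as np

tau = 0.1   # time constant from the equation
Cv = 1.0    # valve coefficient from the equation

def simulate_flow(u, d, F0, dt=0.05):
    """Step tau*dF/dt = -F + Cv*u + d forward with piecewise-constant inputs."""
    u = np.asarray(u, dtype=float)
    d = np.broadcast_to(np.asarray(d, dtype=float), u.shape)
    F = np.empty(u.size + 1)
    F[0] = F0
    decay = np.exp(-dt / tau)
    for k in range(u.size):
        F_ss = Cv * u[k] + d[k]                    # steady state for this step
        F[k + 1] = F_ss + (F[k] - F_ss) * decay    # exact first-order response
    return F

# valve opening of 42 with a disturbance of 2 that appears halfway through
u = 42.0 * np.ones(40)
d = np.concatenate([np.zeros(20), 2.0 * np.ones(20)])
F = simulate_flow(u, d, F0=40.0)
print(F[20], F[-1])
```

With these inputs the flow settles near C'_v_' ''u'' = 42 before the disturbance and near 44 afterward, which is the steady state C'_v_' ''u'' + ''d''.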
Changed line 39 from:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as bias updating, Kalman filter, and moving horizon estimation.
to:
Determine the effect of bad data (outliers, drift, and noise) on estimators such as bias updating, Kalman filter, and moving horizon estimation. Of particular interest is the difference between the inflow (''F'_1_''') and outflow (''F'_2_''').
May 05, 2015, at 12:14 AM by 10.5.113.199 -
Changed lines 33-34 from:
The flowrate of mud and cuttings is especially important with managed pressure drilling (MPD) in order to detect gas influx or fluid losses. There are a range of measurement instruments for flow such as a [[https://en.wikipedia.org/wiki/Mass_flow_meter|mass flow meter or Coriolis  flow meter (most accurate)]] and a [[https://en.wikipedia.org/wiki/Flow_measurement#Paddle_wheel_meter|paddle wheel (least accurate)]].
to:
The flowrate of mud and cuttings is especially important with managed pressure drilling (MPD) in order to detect gas influx or fluid losses. There are a range of measurement instruments for flow such as a [[https://en.wikipedia.org/wiki/Mass_flow_meter|mass flow meter or Coriolis  flow meter (most accurate)]] and a [[https://en.wikipedia.org/wiki/Flow_measurement#Paddle_wheel_meter|paddle wheel (least accurate)]]. This particular valve has dynamics that are described by the following equation

 0.1 d''F''/dt = -''F'' + C'_v_' ''u'' + ''d''

with C'_v_'=1, ''u'' is the valve opening, and ''d'' is a disturbance.
Changed line 45 from:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Bad Data in MATLAB and Python]]
to:
Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Bad Data in MATLAB]]
May 04, 2015, at 06:37 PM by 45.56.3.184 -
Changed lines 27-41 from:
Some estimators and controllers are designed with ideal measurements in simulation but then fail to perform in practice due to the issues with real measurements. It is important to use methods that perform well in a variety of situations and can either reject or minimize the effect of bad data.
to:
Some estimators and controllers are designed with ideal measurements in simulation but then fail to perform in practice due to the issues with real measurements. It is important to use methods that perform well in a variety of situations and can either reject or minimize the effect of bad data.

!!!! Exercise

'''Objective:''' Understand the effect of bad data on dynamic optimization algorithms including estimator and control performance. Create a MATLAB or Python script to simulate and display the results. ''Estimated Time: 2 hours''

The flowrate of mud and cuttings is especially important with managed pressure drilling (MPD) in order to detect gas influx or fluid losses. There are a range of measurement instruments for flow such as a [[https://en.wikipedia.org/wiki/Mass_flow_meter|mass flow meter or Coriolis  flow meter (most accurate)]] and a [[https://en.wikipedia.org/wiki/Flow_measurement#Paddle_wheel_meter|paddle wheel (least accurate)]].

Attach:drilling_flowrate.png

Determine the effect of bad data (outliers, drift, and noise) on estimators such as bias updating, Kalman filter, and moving horizon estimation.

!!!! Solution

Attach:download.png [[Attach:bad_data_exercise.zip|Estimation with Bad Data in MATLAB and Python]]
May 04, 2015, at 04:36 PM by 45.56.3.184 -
Added lines 14-15:

!!!!Real Data Challenges
May 04, 2015, at 04:33 PM by 45.56.3.184 -
Added lines 20-21:

'''Figure 1.''' Example of (1) outlier, (2) drift, and (3) noise'^[[https://dx.doi.org/10.1016/j.compchemeng.2014.04.013|1]]^'.
May 04, 2015, at 04:21 PM by 45.56.3.184 -
Added lines 18-19:

Attach:bad_data.png
May 02, 2015, at 08:32 PM by 45.56.3.184 -
Added lines 15-21:
Real-data sources have a number of issues that can make simulation challenging. Measurements are used as inputs to a model, for parameter estimation, or in empirical model regression. Bad measurements can greatly affect the resulting model predictions, especially if strategies are not employed to minimize the effect of bad data.

A first step in data validation is gross error detection, which flags data that is clearly bad based on statistics or heuristics. Methods to automatically detect bad data include upper and lower validity limits and change validity limits. An example of a lower validity limit is a requirement for positive (>0) values from a flow meter. Flow meters may also be unable to detect flows above a certain limit, which leads to an upper validity limit. A change validity limit detects sudden jumps in a measurement that are not realistic. For example, a gas chromatograph may suddenly report a jump in a gas concentration. If the gas chromatograph is measuring the concentration in a large gas phase polyethylene reactor, it is unrealistic for that concentration to change faster than a certain rate. A change validity limit catches these sudden shifts as part of gross error detection.
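
The validity limits described above can be sketched in a few lines of Python. The function name @@gross_error_mask@@, the limit values, and the sample data are assumptions for illustration. The change check compares each sample with the last accepted value, which is one possible design and would need a reset strategy if the process has genuine step changes.

```python
import numpy as np

def gross_error_mask(y, lower, upper, max_change):
    """Return True for samples that pass the range and change validity limits."""
    y = np.asarray(y, dtype=float)
    ok = np.zeros(y.size, dtype=bool)
    last_good = None
    for i, yi in enumerate(y):
        # upper and lower validity limits
        if not (lower <= yi <= upper):
            continue
        # change validity limit relative to the last accepted sample
        if last_good is not None and abs(yi - last_good) > max_change:
            continue
        ok[i] = True
        last_good = yi
    return ok

# hypothetical flow meter readings with a spike, an over-range value,
# and a negative value
flow = [37.5, 38.0, 50.0, 37.8, 100.0, -1.0, 38.1]
mask = gross_error_mask(flow, lower=0.0, upper=60.0, max_change=5.0)
print(mask)
```

Samples that fail the mask can be dropped, held at the last good value, or given zero weight in an estimator, depending on the application.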

Other examples of real-data issues include outliers (infrequent data points that are temporarily outside of an otherwise consistent trend in the data), noise (random variations in the data due to resolution or variations in the measurement or transmission of the data), and drift (inaccurate and gradual increase or decrease of the measurement value that does not represent the true values). Data may also be infrequent (such as measurements that occur every few hours or not at regular intervals), intermittent (such as from unreliable measurements that report good values for only certain periods of time), or time delayed (measurements that are reported after a waiting period). Synchronization of real data to process models can be challenging for all of these reasons.

Some estimators and controllers are designed with ideal measurements in simulation but then fail to perform in practice due to the issues with real measurements. It is important to use methods that perform well in a variety of situations and can either reject or minimize the effect of bad data.
April 29, 2015, at 12:49 PM by 45.56.3.184 -
Deleted lines 14-16:
!!! Visualize time-varying data and predictions

(coming soon)
April 04, 2015, at 02:06 PM by 45.56.12.124 -
Added lines 4-17:

Data manipulation is important for dynamic optimization in order to set up simulations that utilize time-varying information. Several aspects of dynamic optimization involve the import, validation, filtering, manipulation, and display of large data sets. Select one of the following tutorials below on using MATLAB or Python to import, manipulate, and export data sets.

(:html:)
<iframe width="560" height="315" src="https://www.youtube.com/embed/E56egH10RJA" frameborder="0" allowfullscreen></iframe>
(:htmlend:)

(:html:)
<iframe width="560" height="315" src="https://www.youtube.com/embed/Tq6rCWPdXoQ" frameborder="0" allowfullscreen></iframe>
(:htmlend:)
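
As a small text companion to the tutorials, the sketch below walks through the import, validation, filtering, and export steps with NumPy. The inline CSV data, the column names, and the 3-point moving average are assumptions chosen to keep the example self-contained. A real application would read from and write to files.

```python
import io
import numpy as np

# import: a small CSV with one missing flow value (stands in for a data file)
raw = io.StringIO("time,flow\n0,37.5\n1,38.1\n2,\n3,37.9\n4,38.3\n")
data = np.genfromtxt(raw, delimiter=",", names=True)

# validate: missing entries are read as NaN and are removed here
valid = ~np.isnan(data["flow"])
t = data["time"][valid]
f = data["flow"][valid]

# filter: 3-point moving average over the valid samples
smooth = np.convolve(f, np.ones(3) / 3.0, mode="valid")

# export: write the filtered values with the time of the center sample
out = io.StringIO()
np.savetxt(out, np.column_stack((t[1:-1], smooth)), delimiter=",",
           header="time,flow_filtered", comments="")
print(out.getvalue())
```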

!!! Visualize time-varying data and predictions

(coming soon)
Added lines 1-3:
(:title Dynamic Data Introduction:)
(:keywords dynamic data, validation, simulation, modeling language, differential, algebraic, tutorial:)
(:description Dynamic data for Differential Algebraic Equation (DAE) systems for use in dynamic simulation, estimation, and control:)