Nonlinear and Multivariate Regression

Main.NonlinearRegression History

Hide minor edits - Show changes to output

Changed line 66 from:
m.Obj(((y-z)/z)**2)
to:
m.Minimize(((y-z)/z)**2)
August 13, 2020, at 01:01 PM by 136.36.211.159 -
Changed line 1 from:
(:title Nonlinear Regression with Energy Prices:)
to:
(:title Nonlinear and Multivariate Regression:)
Changed lines 5-6 from:
Predict the price of oil (OIL) from indicators such as the West Texas Intermediate (WTI) price, Henry Hub gas price (HH), and the Mont Belvieu (MB) propane spot price. Data is available for OIL, WTI, HH, and MB from the years 2000 to 2016 at the following link.
to:
'''Objective:''' Perform nonlinear and multivariate regression on energy data to predict oil price.

Predictors are data features that are inputs to calculate a predicted output. In machine learning the data inputs are called features and the measured outputs are called labels. Regression is the method of adjusting parameters in a model to minimize the difference between the predicted output and the measured output. The objective of this problem is to predict the price of oil (OIL) from indicator features that include
West Texas Intermediate (WTI) price, Henry Hub gas price (HH), and the Mont Belvieu (MB) propane spot price. Data is available for OIL, WTI, HH, and MB from the years 2000 to 2016 at the following link.
Added lines 214-215:

There is additional information about regression in the [[https://apmonitor.github.io/data_science|Data Science Online Course]].
June 21, 2020, at 04:44 AM by 136.36.211.159 -
Deleted lines 211-229:

----

(:html:)
 <div id="disqus_thread"></div>
    <script type="text/javascript">
        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
        var disqus_shortname = 'apmonitor'; // required: replace example with your forum shortname

        /* * * DON'T EDIT BELOW THIS LINE * * */
        (function() {
            var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
            dsq.src = 'https://' + disqus_shortname + '.disqus.com/embed.js';
            (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
        })();
    </script>
    <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
    <a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
(:htmlend:)
March 06, 2018, at 12:20 AM by 10.5.113.199 -
Changed lines 13-17 from:
This particular nonlinear equation can be transformed to a linear equation with a log transformation as {`\log(OIL)=\log(A)+B\log(WTI)+C\log(HH)+D\log(MB)`} or kept in the original nonlinear form. Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
to:
This particular nonlinear equation can be transformed to a linear equation with a log transformation as

{$\log(OIL)=\log(A)+B\log(WTI)+C\log(HH)+D\log(MB)$}

or kept in the original nonlinear form. Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
March 06, 2018, at 12:19 AM by 10.5.113.199 -
Changed line 13 from:
This particular nonlinear equation can be transformed to a linear equation with a log transformation as {`\log(OIL)=\log(A)+B\,\log(WTI)+C\,log(HH)+D\,log(MB)`} or kept in the original nonlinear form. Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
to:
This particular nonlinear equation can be transformed to a linear equation with a log transformation as {`\log(OIL)=\log(A)+B\log(WTI)+C\log(HH)+D\log(MB)`} or kept in the original nonlinear form. Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
March 06, 2018, at 12:19 AM by 10.5.113.199 -
Changed line 13 from:
Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
to:
This particular nonlinear equation can be transformed to a linear equation with a log transformation as {`\log(OIL)=\log(A)+B\,\log(WTI)+C\,log(HH)+D\,log(MB)`} or kept in the original nonlinear form. Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.
Added lines 18-21:

(:html:)
<iframe width="560" height="315" src="https://www.youtube.com/embed/BSwm2ZSstEY" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
(:htmlend:)
March 05, 2018, at 04:27 PM by 45.56.3.173 -
Changed line 1 from:
(:title Nonlinear Regression with Energy Price Example:)
to:
(:title Nonlinear Regression with Energy Prices:)
March 05, 2018, at 04:24 PM by 45.56.3.173 -
Changed lines 19-20 from:
(:toggle hide gekko button show="Show Python (GEKKO) Code":)
(:div id=gekko:)
to:
!!!! Python (GEKKO) Solution

%width
=550px%Attach:nonlinear_regression.png
Added line 29:
# use 'pip install gekko' to get package
Changed lines 99-101 from:
(:divend:)

(:toggle hide scipy button show="Show Python (SciPy) Code":)
to:

!!!! Python (SciPy) Solution
Changed lines 104-200 from:
to:
# Energy price non-linear regression
# solve for oil sales price (outcome)
# using 3 predictors of WTI Oil Price,
#  Henry Hub Price and MB Propane Spot Price
import numpy as np
from scipy.optimize import minimize
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# data file from URL address
data = 'https://apmonitor.com/me575/uploads/Main/oil_data.txt'
df = pd.read_csv(data)

xm1 = np.array(df["WTI_PRICE"])  # WTI Oil Price
xm2 = np.array(df["HH_PRICE"])  # Henry Hub Gas Price
xm3 = np.array(df["NGL_PRICE"])  # MB Propane Spot Price
ym = np.array(df["BEST_PRICE"])  # oil sales price received (outcome)

# calculate y
def calc_y(x):
    a = x[0]
    b = x[1]
    c = x[2]
    d = x[3]
    #y = a * xm1 + b  # linear regression
    y = a * ( xm1 ** b ) * ( xm2 ** c ) * ( xm3 ** d )
    return y

# define objective
def objective(x):
    # calculate y
    y = calc_y(x)
    # calculate objective
    obj = 0.0
    for i in range(len(ym)):
        obj = obj + ((y[i]-ym[i])/ym[i])**2   
    # return result
    return obj

# initial guesses
x0 = np.zeros(4)
x0[0] = 0.0 # a
x0[1] = 0.0 # b
x0[2] = 0.0 # c
x0[3] = 0.0 # d

# show initial objective
print('Initial Objective: ' + str(objective(x0)))

# optimize
# bounds on variables
my_bnds = (-100.0, 100.0)
bnds = (my_bnds, my_bnds, my_bnds, my_bnds)
solution = minimize(objective, x0, method='SLSQP', bounds=bnds)
x = solution.x
y = calc_y(x)

# show final objective
cObjective = 'Final Objective: ' + str(objective(x))
print(cObjective)

# print solution
print('Solution')

cA = 'A = ' + str(x[0])
print(cA)
cB = 'B = ' + str(x[1])
print(cB)
cC = 'C = ' + str(x[2])
print(cC)
cD = 'D = ' + str(x[3])
print(cD)

cFormula = "Formula is : " + "\n" \
          + "A * WTI^B * HH^C * PROPANE^D"
cLegend = cFormula + "\n" + cA + "\n" + cB + "\n" \
          + cC + "\n" + cD + "\n" + cObjective

#ym measured outcome
#y  predicted outcome
     
from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(ym,y)
r2 = r_value**2
cR2 = "R^2 correlation = " + str(r_value**2)
print(cR2)

# plot solution
plt.figure(1)
plt.title('Actual (YM) versus Predicted (Y) Outcomes For Non-Linear Regression')
plt.plot(ym,y,'o')
plt.xlabel('Measured Outcome (YM)')
plt.ylabel('Predicted Outcome (Y)')
plt.legend([cLegend])
plt.grid(True)
plt.show()
Deleted line 201:
(:divend:)
March 05, 2018, at 04:16 PM by 45.56.3.173 -
Changed lines 98-99 from:
(:toggle hide gekko button show="Show Python (SciPy) Code":)
(:div id=gekko:)
to:
(:toggle hide scipy button show="Show Python (SciPy) Code":)
(:div id=scipy:)
March 05, 2018, at 04:13 PM by 45.56.3.173 -
Changed lines 13-14 from:
Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared. Report the parameter values, the R'^2^' value of fit, and display a plot of the results.
to:
Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared.

{$\min_{A,B,C,D} \sum_{i=1}^n \left( \frac{OIL_{pred,i}-OIL_{meas,i}}{OIL_{meas,i}} \right)^2$}

where ''n'' is the number of data points, ''i'' is an index for the current measured value, ''pred'' is the predicted value, and ''meas'' is the measured value
. Report the parameter values, the R'^2^' value of fit, and display a plot of the results.
Changed lines 22-94 from:
to:
# Energy price non-linear regression
# solve for oil sales price (outcome)
# using 3 predictors of WTI Oil Price,
#  Henry Hub Price and MB Propane Spot Price
import numpy as np
from gekko import GEKKO
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# data file from URL address
data = 'https://apmonitor.com/me575/uploads/Main/oil_data.txt'
df = pd.read_csv(data)

xm1 = np.array(df["WTI_PRICE"]) # WTI Oil Price
xm2 = np.array(df["HH_PRICE"])  # Henry Hub Gas Price
xm3 = np.array(df["NGL_PRICE"]) # MB Propane Spot Price
ym = np.array(df["BEST_PRICE"]) # oil sales price

# GEKKO model
m = GEKKO()
a = m.FV(lb=-100.0,ub=100.0)
b = m.FV(lb=-100.0,ub=100.0)
c = m.FV(lb=-100.0,ub=100.0)
d = m.FV(lb=-100.0,ub=100.0)
x1 = m.Param(value=xm1)
x2 = m.Param(value=xm2)
x3 = m.Param(value=xm3)
z = m.Param(value=ym)
y = m.Var()
m.Equation(y==a*(x1**b)*(x2**c)*(x3**d))
m.Obj(((y-z)/z)**2)
# Options
a.STATUS = 1
b.STATUS = 1
c.STATUS = 1
d.STATUS = 1
m.options.IMODE = 2
m.options.SOLVER = 1
# Solve
m.solve()

print('a: ', a.value[0])
print('b: ', b.value[0])
print('c: ', c.value[0])
print('d: ', d.value[0])

cFormula = "Formula is : " + "\n" + \
          r"$A * WTI^B * HH^C * PROPANE^D$"

from scipy import stats
slope, intercept, r_value, p_value, \
      std_err = stats.linregress(ym,y)

r2 = r_value**2
cR2 = "R^2 correlation = " + str(r_value**2)
print(cR2)

# plot solution
plt.figure(1)
plt.plot([20,140],[20,140],'k-',label='Measured')
plt.plot(ym,y,'ro',label='Predicted')
plt.xlabel('Measured Outcome (YM)')
plt.ylabel('Predicted Outcome (Y)')
plt.legend(loc='best')
plt.text(25,115,'a =' + str(a.value[0]))
plt.text(25,110,'b =' + str(b.value[0]))
plt.text(25,105,'c =' + str(c.value[0]))
plt.text(25,100,'d =' + str(d.value[0]))
plt.text(25,90,r'$R^2$ =' + str(r_value**2))
plt.text(80,40,cFormula)
plt.grid(True)
plt.show()
March 05, 2018, at 04:08 PM by 45.56.3.173 -
Changed line 11 from:
{$OIL = A \, WTI^B \, HH^C \, MB^D$}
to:
{$OIL = A \, \left(WTI^B\right) \, \left(HH^C\right) \, \left(MB^D\right)$}
March 05, 2018, at 04:04 PM by 45.56.3.173 -
Changed line 7 from:
Attach:download.png [[Attach:oil_data.csv|Oil Data (oil_data.csv)]]
to:
Attach:download.png [[Attach:oil_data.txt|Oil Data (oil_data.txt)]]
March 05, 2018, at 04:02 PM by 45.56.3.173 -
Added lines 1-48:
(:title Nonlinear Regression with Energy Price Example:)
(:keywords Nonlinear Regression, Factors, Multivariate, Optimization, Constraint, Nonlinear Programming:)
(:description Perform nonlinear regression on energy data to predict oil price.:)

Predict the price of oil (OIL) from indicators such as the West Texas Intermediate (WTI) price, Henry Hub gas price (HH), and the Mont Belvieu (MB) propane spot price. Data is available for OIL, WTI, HH, and MB from the years 2000 to 2016 at the following link.

Attach:download.png [[Attach:oil_data.csv|Oil Data (oil_data.csv)]]

Use the following nonlinear correlation with unknown parameters ''A'', ''B'', ''C'', and ''D''.

{$OIL = A \, WTI^B \, HH^C \, MB^D$}

Adjust the unknown parameters (''A'', ''B'', ''C'', ''D'') to minimize a sum of squared errors of the normalized difference between the measured and predicted value. Normalize the difference by the measured value before the it is squared. Report the parameter values, the R'^2^' value of fit, and display a plot of the results.

(:toggle hide gekko button show="Show Python (GEKKO) Code":)
(:div id=gekko:)
(:source lang=python:)

(:sourceend:)
(:divend:)

(:toggle hide gekko button show="Show Python (SciPy) Code":)
(:div id=gekko:)
(:source lang=python:)

(:sourceend:)
(:divend:)

Thanks to [[https://www.linkedin.com/in/fulton-loebel-5b753a25/|Fulton Loebel]] for submitting this example problem to the [[https://apmonitor.com/wiki/index.php/Main/UsersGroup|APMonitor Discussion Forum]].

----

(:html:)
 <div id="disqus_thread"></div>
    <script type="text/javascript">
        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
        var disqus_shortname = 'apmonitor'; // required: replace example with your forum shortname

        /* * * DON'T EDIT BELOW THIS LINE * * */
        (function() {
            var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
            dsq.src = 'https://' + disqus_shortname + '.disqus.com/embed.js';
            (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
        })();
    </script>
    <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
    <a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
(:htmlend:)