
Inverted Pendulum Optimal Control

Design a model predictive controller for an inverted pendulum system with an adjustable cart. Demonstrate that the cart can perform a sequence of moves to maneuver from position y=-1.0 to y=0.0 within 6.2 seconds. Verify that v, θ, and q are zero before and after the maneuver.

The inverted pendulum is described by the following dynamic equations:

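$$\begin{bmatrix}\dot{y}\\ \dot{v}\\ \dot{\theta}\\ \dot{q}\end{bmatrix}=\begin{bmatrix}0&1&0&0\\ 0&0&\epsilon&0\\ 0&0&0&1\\ 0&0&1&0\end{bmatrix}\begin{bmatrix}y\\ v\\ \theta\\ q\end{bmatrix}+\begin{bmatrix}0\\ 1\\ 0\\ 1\end{bmatrix}u$$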

where u is the force applied to the cart, ε is m2/(m1+m2), y is the position of the cart, v is the velocity of the cart, θ is the angle of the pendulum relative to the cart, m1=10, m2=1, and q is the rate of angle change. Tune the controller to minimize the use of force applied to the cart either in the forward or reverse direction (i.e. minimize fuel consumed to perform the maneuver). Explain the tuning and the optimal solution with appropriate plots that demonstrate that the solution is optimal. The non-inverted pendulum problem and a nonlinear double inverted pendulum are additional examples.
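The model and the fuel objective can be sketched in plain Python before moving to an optimizer. This is a minimal sketch, not the controller itself: the matrix entries and their signs are copied from the state equation as printed in this section, the helper names (`derivative`, `rollout`, `fuel`, `at_rest`) are hypothetical, the forward-Euler step is a simplification, and the input sequence is arbitrary rather than optimal. Fuel use is taken here as the time integral of |u|, which is one common way to express "minimize force in either direction".

```python
# Parameters from the problem statement
m1, m2 = 10.0, 1.0
eps = m2 / (m1 + m2)  # epsilon = 1/11

# State x = [y, v, theta, q]; x_dot = A x + B u
# Entries and signs follow the state equation as printed in this section
# (an assumption worth checking against the original derivation).
A = [[0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, eps, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
B = [0.0, 1.0, 0.0, 1.0]

def derivative(x, u):
    """Evaluate x_dot = A x + B u for one state and one input."""
    return [sum(a * xj for a, xj in zip(row, x)) + b * u
            for row, b in zip(A, B)]

def rollout(x0, u_seq, dt):
    """Forward-Euler simulation; returns the list of visited states."""
    x = list(x0)
    traj = [x]
    for u in u_seq:
        dx = derivative(x, u)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        traj.append(x)
    return traj

def fuel(u_seq, dt):
    """Approximate the integral of |u| over the horizon."""
    return sum(abs(u) for u in u_seq) * dt

def at_rest(x, tol=1e-6):
    """True when v, theta, and q are all (nearly) zero."""
    return all(abs(s) < tol for s in x[1:])

x0 = [-1.0, 0.0, 0.0, 0.0]       # start at y = -1 with v, theta, q at zero
u_seq = [0.5, 0.5, -0.5, -0.5]   # arbitrary illustrative inputs, not optimal
traj = rollout(x0, u_seq, dt=0.1)

print("start at rest:", at_rest(traj[0]))
print("end at rest:  ", at_rest(traj[-1]))
print("fuel used:    ", fuel(u_seq, 0.1))
```

An optimizer-based controller would search over `u_seq` so that the final state reaches y=0 with `at_rest` satisfied at 6.2 seconds while keeping `fuel` as small as possible.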

Python (GEKKO) Solution

APM MATLAB and APM Python Solution

Response with Different Weights

OpenAI Cart Pole

OpenAI has a similar benchmark problem for testing Reinforcement Learning. As in the problem above, a pole is attached by a frictionless joint to a cart that moves along a track. The cart is controlled by applying a force to the left (action=0) or right (action=1). The pendulum starts upright and a reward of +1 is provided for every timestep that the pole remains upright. The gym returns done=True when the pole is more than 12° from vertical or the cart is more than 2.4 units from the center of the track.

import gym

# Classic Gym API (gym < 0.26): step() returns (obs, reward, done, info)
env = gym.make('CartPole-v0')
env.reset()
for i in range(100):
    env.render()
    # Input:
    #   Force on the cart with actions: 0=left, 1=right
    # Returns:
    #   obs = cart position, cart velocity, pole angle, rotation rate
    #   reward = +1 for every timestep
    #   done = True when abs(angle)>12 deg or abs(cart pos)>2.4
    obs,reward,done,info = env.step(env.action_space.sample())
    if done:
        # start a new episode once the pole falls or the cart leaves the track
        env.reset()
env.close()
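
The random action in the loop above can be replaced by a simple hand-written rule as a baseline. The sketch below is a heuristic, not Reinforcement Learning or Model Predictive Control: it pushes the cart toward the side the pole is falling to. The function name `choose_action`, the `rate_gain` parameter, and the sample observations are hypothetical, and the rule assumes that a positive pole angle means the pole leans to the right.

```python
def choose_action(obs, rate_gain=1.0):
    """Return 0 (push left) or 1 (push right) from a CartPole observation.

    obs = (cart position, cart velocity, pole angle, rotation rate).
    Assumes a positive angle means the pole leans to the right.
    """
    _, _, angle, rate = obs
    # Push toward the side the pole is falling to so the cart moves under it
    return 1 if angle + rate_gain * rate > 0 else 0

# Hypothetical observations, for illustration only
print(choose_action((0.0, 0.0, 0.05, 0.0)))   # pole leaning right -> 1
print(choose_action((0.0, 0.0, -0.05, 0.0)))  # pole leaning left  -> 0
```

With the classic Gym API, the call `env.step(env.action_space.sample())` would become `env.step(choose_action(obs))`, using the observation from `env.reset()` or from the previous step. This rule only tries to keep the pole upright; it does not regulate the cart position.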

See additional information on Hand Tracking to control the cart position. Reinforcement Learning or Model Predictive Control can also be used to control the cart position (see Additional Reading).

Additional Reading
