當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

最优控制理论八、CasADi求解路径约束轨迹优化的多重打靶法

發(fā)布時間：2023/12/20 编程问答 24 豆豆

生活随笔收集整理的這篇文章主要介紹了最优控制理论八、CasADi求解路径约束轨迹优化的多重打靶法小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.

上一篇博客我總結(jié)了目前工程中求解最優(yōu)控制問題的常用數(shù)值方法，最優(yōu)控制理論七、關于數(shù)值求解算法，作為本節(jié)的基礎性知識。
本博客利用CasADi的MATLAB版求解一個帶有路徑約束、終端約束的開環(huán)最優(yōu)控制問題，采用最常用的多重打靶法，其中CasADi的功能在于對決策變量、約束變量進行自動微分，從而加快非線性規(guī)劃的收斂速度。

受路徑約束的最速下降曲線問題

以下使用了一個受約束的最速下降曲線問題，求解從點 $O$ 到 $P$ 的轉(zhuǎn)移曲線，使其服從無摩擦理想情況重力下降的動力學約束 $x˙=f[(x(t),θ(t),t]\dot{\mathbf x}=f[({\mathbf x}(t),\theta(t),t]$ ，以及路徑約束 $y≤1/2x+1y\leq 1/2 x+1$ . 具體描述為：

定義狀態(tài)變量為 $x=[x,y,v]T\mathbf x=[x,y,v]^\mathrm T$ ，已知始末端點 $x(t0)=[0,0,0]T,x(tf)=[2,2,free]T\mathbf x(t_0)=[0,0,0]^\mathrm T,\ \mathbf x(t_f)=[2,2,free]^\mathrm T$ , 轉(zhuǎn)移路徑服從動力學約束 ${x˙=vcos?θy˙=vsin?θv˙=gsin?θ\left\{\begin{array}{l} \dot{x}=v\cos\theta\\ \dot{y}=v \sin \theta \\ \dot{v}=g \sin \theta \end{array}\right.$

求最優(yōu)的曲線形狀(彈道傾角) $θ(t)\theta(t)$ 使轉(zhuǎn)移時間 $J=∫t0tf1dtJ=\int_{t_{0}}^{t_{f}}1 d t$ 最小。終端約束 $ψ(x(tf))=[xf?2,yf?2]=0\psi(\mathbf x(t_f))=[x_f-2,y_f-2]=0$ .路徑約束 $S[x(t),t]=y?1/2?x?1≤0S[\mathbf x(t),t]=y-1/2\cdot x-1\leq0$

由于最短時間問題直接求解不太容易，因此需要對原問題進行變換、

對自變量時間 $t$ 乘以一個到達時間 $t_f$ 的變換后可轉(zhuǎn)化成固定終端時間的OCP，具體步驟如下：
$x˙=dxdt=dxdτdτdt\dot{x}=\frac{d x}{d t}=\frac{d x}{d \tau} \frac{d \tau}{d t}\\$

定義： $()′=d(?)dτ,dtdτ=tf()^{\prime}=\frac{d(\cdot)}{d \tau}, \frac{d t}{d \tau}=t_f$

則 $τ∈[0,1]\tau \in[0,1]$ 變化時，有 $\in[0, t_f]$ . ${x˙=vcos?θy˙=vsin?θv˙=gsin?θ?{x′=τvcos?θy′=τvsin?θv′=τgsin?θ\left\{\begin{array} { l } { \dot x = v \cos\theta } \\ { \dot { y } = v \sin \theta } \\ { \dot { v } = g \sin \theta } \end{array} \Rightarrow \left\{\begin{array}{l} x^{\prime}=\tau v \cos \theta \\ y^{\prime}=\tau v \sin \theta \\ v^{\prime}=\tau g \sin \theta \end{array}\right.\right.$

OCP求解的DMS法

最優(yōu)控制問題的求解方法很多，以下我們就利用直接多重打靶法（direct multiple shooting，DMS），離散動力學方程，然后用RK4求解每一段的ODE初值問題，最終把問題轉(zhuǎn)化為多約束非線性規(guī)劃問題。

這里決策變量為每一段的初值狀態(tài) $xi,i=0,1,?N\mathbf x_i,i=0,1,\cdots N$ 和每一段近似保持不變的彈道傾角 $θi,i=0,1,?N\theta_i,i=0,1,\cdots N$ . 算法DMS相對DSS的優(yōu)勢是，它有更多決策變量以及更多等式約束，但是由于每個決策變量對系統(tǒng)cost函數(shù)的影響是稀疏性的，因此OCP更容易收斂。

利用CasADi求解最優(yōu)控制問題

CasADi是一個開源的非線性優(yōu)化求解工具包，有MATLAB、Python、C++等主流數(shù)值優(yōu)化語言的版本，支持各種平臺，其官網(wǎng)https://web.casadi.org/。它的一部分特性包括：基于符號變量（symbolic variable）的解析計算、可以自動求差分（Automatic Differentiation），因此非常適合高效求解大規(guī)模非線性規(guī)劃問題。

這里采用自動微分進行優(yōu)化問題求解的原因在于，本問題中我沒有手動提供Jacobian矩陣，而CasADi自動求了微分，就可以避免利用前向差分法數(shù)值求解導數(shù)而產(chǎn)生的計算問題。

下載CasADi v3.5.5最新版本的MATLAB版本，然后將其解壓，并使MATLAB添加到路徑>整個文件夾，然后再包含內(nèi)部的+casadi文件夾，這樣就算安裝完成了。

然后下載官方的用戶學習案例Example package ，我們接下來要在他的基礎上進行求解。

求解NLP問題的一個案例

由于CasADi是直接拿來求解NLP問題的，它的標準形式可以參考以下這個文件，也就是example/matlab/rosenbrock.m，是一個求解非線性規(guī)劃問題的求解程序。

% % an example of Nolinear programming % This file is part of CasADi in directory: example/matlab/rosenbrock.m % % % Load CasADi import casadi.*% Create NLP: Solve the Rosenbrock problem: % minimize x^2 + 100*z^2 % subject to z + (1-x)^2 - y == 0 x = SX.sym('x'); y = SX.sym('y'); z = SX.sym('z'); v = [x;y;z]; % decision variables f = x^2 + 100*z^2; % cost function g = z + (1-x)^2 - y; % constraints nlp = struct('x', v, 'f', f', 'g', g);% Create IPOPT solver object solver = nlpsol('solver', 'ipopt', nlp);% Solve the NLP res = solver('x0' , [2.5 3.0 0.75],... % solution guess'lbx', -inf,... % lower bound on x'ubx', inf,... % upper bound on x'lbg', 0,... % lower bound on g'ubg', 0); % upper bound on g% Print the solution f_opt = full(res.f) % >> 0 x_opt = full(res.x) % >> [0; 1; 0] lam_x_opt = full(res.lam_x) % >> [0; 0; 0] lam_g_opt = full(res.lam_g) % >> 0

這個程序非常簡單，我們可以用來代替MATLAB@fmincon函數(shù)，作為以下求解OCP問題的參考。

直接轉(zhuǎn)化多重打靶法求解OCP的案例

由于其官網(wǎng)上一個例子和我的需求非常像，因此我在這里就直接拿過來改一改，位于CasADi案例文件夾example\matlab下的direct_multiple_shooting.m腳本。

我這里的狀態(tài) $x$ 初始猜測都是設定的0，相當于是沒有設定初值的；控制初始猜測設為 $π/4\pi/4$ ，也就相當于一條直線下降。雖然我這里收斂了，但是一些復雜的路徑約束就不一定，那些情況下提供初始猜測是必要的。

% optimal control problem :path constrained Brachistochrone % uses direct transcription and multiple shooting % This file is modified from \CasADi\example\matlab\direct_multiple_shooting.m % % Based on CasADi -- A symbolic framework for dynamic optimization.% An implementation of direct multiple shooting % Siyang Meng .2021/8/1 %% 原問題設定 clear,clc import casadi.*T = 1; % Time horizon 0-1 N = 20; % number of control intervals% Declare model variables x = SX.sym('x'); y = SX.sym('y'); v = SX.sym('v');X = [x; y; v]; u = SX.sym('theta'); tau = SX.sym('tau');% time scale % Model equations xdot = [tau*v*cos(u);tau*v*sin(u);tau*9.8*sin(u)];% Objective term in lagrange form % 由于是最短時間問題，因此Lagrange項和DynamicalModel都乘以時間縮放系數(shù)scale L = tau;%% Formulate discrete time dynamics if false% CVODES from the SUNDIALS suitedae = struct('x',x,'p',u,'ode',xdot,'quad',L);opts = struct('tf',T/N);F = integrator('F', 'cvodes', dae, opts); else% Fixed step Runge-Kutta 4 integratorM = 4; % RK4 steps per intervalDT = T/N/M;% Continuous time dynamicsf = Function('f', {X, u, tau}, {xdot, L});X0 = MX.sym('X0', 3);Udec = MX.sym('U');tdec = MX.sym('tf');X = X0;Q = 0;for j=1:M[k1, k1_q] = f(X, Udec, tdec);[k2, k2_q] = f(X + DT/2 * k1, Udec, tdec);[k3, k3_q] = f(X + DT/2 * k2, Udec, tdec);[k4, k4_q] = f(X + DT * k3, Udec, tdec);X = X+DT/6*(k1 +2*k2 +2*k3 +k4);Q = Q + DT/6*(k1_q + 2*k2_q + 2*k3_q + k4_q);end% 離散化后的ODE系統(tǒng)F = Function('F', {X0, Udec, tdec}, {X, Q}, {'xin','uin', 'tauin'}, {'xf', 'qf'}); end% 測試一下動力學方程代入數(shù)值會是正常輸出嗎。 % Evaluate at a test point Fk = F('xin',[0; 0; 0],'uin',pi/4, 'tauin',0.8); disp(Fk.xf) disp(Fk.qf) %% 構(gòu)造CasADi可求解的NLP問題，初始化，% Start with an empty NLP w={}; w0 = [];% initial guess lbw = [];% lower bound ubw = [];% upper bound J = 0; g={};% constraints of NLP lbg = []; ubg = [];% 動力學方程的初始條件 % "Lift" initial conditions tau = MX.sym('tau'); Xk = MX.sym('X0', 3); % w是決策變量 w = {tau, ...w{:}, Xk}; % 初始狀態(tài)的上下限約束 lbw = [0.6;lbw;0; 0; 0]; ubw = [0.9;ubw; 0; 0; 0]; % 初始狀態(tài)的猜測 w0 = [0.8;w0; 0; 0; 0];%% 下面設定的包含路徑約束、連續(xù)性等式約束、以及終端約束 % Formulate the NLP for k=0:N-1% 求解IVP到此的時間。tk = T/N*(k+1);% 控制輸入離散化的決策變量% New NLP variable for the controlUk = MX.sym(['U_' num2str(k)]);w = {w{:}, Uk};% 控制theta% 上下界lbw = [lbw; 0];ubw = [ubw;2*pi];% 猜測w0 = [w0;pi/4];%控制初值給定% Integrate till the end of the intervalFk = F('xin', Xk, 'uin', Uk, 'tauin', tau);Xk_end = Fk.xf;J=J+Fk.qf;% 狀態(tài)變量離散化的決策變量% New NLP variable for state at end of intervalXk = MX.sym(['X_' num2str(k+1)], 3);w = [w, {Xk}];% 路徑X% 上下界lbw = [lbw;-inf; -inf; -inf];ubw = [ubw;inf; inf; inf];% 猜測w0 = [w0;0; 0; 0];% 提供初始猜測時，這里可以添加插值表% Inequality constraintg = [g,...{Xk_end-Xk},...{-1/2*Xk_end(1)+Xk_end(2)-1}];% y>2* x-2% 上下界lbg = [lbg;0; 0; 0;-inf];ubg = [ubg;0; 0; 0;0]; end% Add terminal equality constraint g = [g, ...{Xk_end},...{-1/2*Xk_end(1)+Xk_end(2)-1}];% y>2* x-2 lbg = [lbg;2; 2; -inf;-inf]; ubg = [ubg;2; 2; inf;0]; %% 設定并求解DMS % Create an NLP solver % 整理一下cell array prob = struct('f', J, 'x', vertcat(w{:}), 'g', vertcat(g{:})); solver = nlpsol('solver', 'ipopt', prob);% Solve the NLP sol = solver('x0', w0,...% 決策變量的初始猜測'lbx', lbw,...'ubx', ubw,...% 決策變量的上下界約束'lbg', lbg,...'ubg', ubg);% 非線性路徑約束的上下界約束 w_opt = full(sol.x);%% Plot the solution % 決策變量的格式是前面已經(jīng)確定下來的，只需要把它提取出來，格式如下， %{[tau;X0; U_0;X_1; U_1; X_2; U_2; ...;X_N-1; U_N-1;X_N]; % 在經(jīng)過 vertcat(w{:})之后，變成 1 + 3*(N+1) + 1*N 的 casadi.MX 符號向量 %} minT = w_opt(1),% 這是求解出的最短時間x1_opt = w_opt(2:4:end); x2_opt = w_opt(3:4:end); x3_opt = w_opt(4:4:end); u_opt = w_opt(5:4:end);tgrid = minT * linspace(0, T, N+1); clf;figure(1) plot(x1_opt, x2_opt, 'k-'),hold on; plot([0,2],[0,2], 'ko'), plot([0,2],[1,2], 'b--'), axis([0 2 0 2]) xlabel('x'),ylabel('y'),title('trajectory')figure(2) stairs(tgrid, [u_opt; nan], 'k-.') xlabel('otpimal time/ s'),ylabel('optimal control angle/ rad'); legend('\theta')

運行結(jié)果如下

EXIT: Optimal Solution Found.solver : t_proc (avg) t_wall (avg) n_evalnlp_f | 1.00ms ( 58.82us) 1.04ms ( 61.18us) 17nlp_g | 7.00ms (411.76us) 6.01ms (353.41us) 17nlp_grad_f | 9.00ms (529.41us) 9.48ms (557.59us) 17nlp_hess_l | 28.00ms ( 1.87ms) 26.42ms ( 1.76ms) 15nlp_jac_g | 18.00ms ( 1.06ms) 19.10ms ( 1.12ms) 17total | 301.00ms (301.00ms) 301.43ms (301.43ms) 1 minT =0.8250

這個運行結(jié)果比fmincon快多了。

直接配點多重打靶法

由于轉(zhuǎn)化法是的計算效率比較低，又發(fā)展出了配點法，其核心思路是在連續(xù)幾個配點節(jié)點上用插值多項式近似動力學方程，然后把動力學方程的連續(xù)性約束轉(zhuǎn)化為配點節(jié)點上的連續(xù)性、以及多項式導數(shù)的連續(xù)性。采用配點法的求解效率高于直接轉(zhuǎn)化法，當然它仍然存在它的缺點，不過對于大多數(shù)最優(yōu)控制問題來說，是高效而適宜求解的。有關這兩種方法的比較，可以參考Moritz Diehl[3]的筆記Numerical Optimal Control，在Sec.11.2與Sec11.3有相關描述。值得指出的是，目前，直接配點法（direct collacation）已經(jīng)成為求解最優(yōu)控制問題的最流行的方法，很多軟件包都是默認直接采用這種求解方法的。

以下我又把上面的問題，給出了配點法的求解程序。

這里采用了基于CasADi的開源最優(yōu)控制求解軟件Yop，這個軟件當然不是最好用的，但是確實很直觀。下面我從它的官網(wǎng)摘一些必要的信息：

ODE或DAE系統(tǒng)的建立

mySystem = YopSystem(...'states', numberOfStates, ...'algebraics', numberOfAlgebraics, ...'controls', numberOfControlInputs, ...'externals', numberOfExternalInputs, ...'parameters', numberOfParameters, ...'model', @myModelFunction ...)%% model function [dx, alg, signals] = myModelFunction(t, x, z, u, w, p) % x: states % z: algebraics % u: controls % w: externals % p: parameters end

性能指標以及最優(yōu)控制問題的建立

% An optimal control problem ocp = YopOcp();% Objective function to minimize ocp.min({ t_f( endStateCost ) '+' timeIntegral( integralCost ) })% or if the aim is to maximize the objective: ocp.max({ t_f( endStateCost ) '+' timeIntegral( integralCost ) })

路徑約束的給定

% Subject to the constraints ocp.st(...'system', mySystem, ... % a YopSystem on ODE form{ t0lb '<=' t_0(t) '<=' t0ub }, ... % independent variable initial value bounds{ tflb '<=' t_f(t) '<=' tfub }, ... % independent variable final value bounds{ x_min '<=' x '<=' x_max }, ... % state bounds{ x0_min '<=' t_0(x) '<=' x0_max }, ... % state initial value bounds{ xf_min '<=' t_f(x) '<=' xf_max }, ... % state final value bounds{ u_min '<=' u '<=' u_max }, ... % control bounds{ u0_min '<=' t_0(u) '<=' u0_max }, ... % control initial value bounds % Upper bounded { ub '>=' expression } { ub '>=' t_0( expression ) } { ub '>=' t_f( expression ) } % Equality constraint { eq '==' expression } { eq '==' t_0( expression ) } { eq '==' t_f( expression ) });

求解器設定

sol = ocp.solve( ...'initialGuess', initialGuess, ...'controlIntervals', 50, ...'collocationPoints', 'legendre', ...'polynomialDegree', 3, ...'ipopt', struct('max_iter', 5000) ...);

可以看到Y(jié)op工具箱的操作模式非常直觀，安裝它也很簡單，只需要在GitHub上下載然后用MATLAB包含它的路徑即可。以下就是我寫的一個求解程序，這里的路徑約束改成了避障約束
$(x?1)2+(y?1)2≥14(x-1)^2+(y-1)^2\geq\frac1 4$

% optimal control problem :path constrained Brachistochrone % uses direct collacation and multiple shooting % Based on Yop -- a open source optimal control toolbox % it uses multiple shooting collcation method % CasADi -- A symbolic framework for dynamic optimization. % it provides automatic differential for NLP solvers % editor: Siyang Meng, from Northwestern Polytechnical University % 2021年8月14日22:23:15 clear,clc %% % Create the Yop system sys = YopSystem('states', 3, 'controls', 1); % Symbolic variables t = sys.t; x = sys.x; u = sys.u;% Model xdot = [x(3)*cos(u);x(3)*sin(u);9.8*sin(u)];sys.set('ode',xdot)Nodes = 15;t_guess = linspace(0,1,Nodes); y_guess = 2*t_guess; x_guess = 1/2*y_guess.^2; u_guess = pi/4*ones(1,Nodes); w0 = YopInitialGuess(...'signals', [t; x; u], ... % Time-varying variables'signalValues', [t_guess; [x_guess;y_guess;t_guess]; u_guess] ... % Guess values );% An optimal control problem ocp = YopOcp();% Objective function to minimize % ocp.min({ timeIntegral( 1/2*trolley.acceleration^2 ) }); ocp.min({ t_f( t ) });% Subject to the constraints ocp.st(...'systems', sys, ...... % Initial conditions{ 0 '==' t_0( x(1) ) }, ...{ 0 '==' t_0(x(2)) }, ...{ 0 '==' t_0(x(3)) }, ...... % Constraints{0 '<=' x(3) '<=' inf }, ...{ 0 '<=' u '<=' pi/2 }, ...... % Final conditions{ 2 '==' t_f( x(1) ) }, ...{ 2 '==' t_f( x(2) ) }, ...... % Path Constraintsstrict({ 1/4 '<=' (x(2)-1)^2+(x(1)-1)^2 }) ...% -(Xk_end(1)-1)^2-(Xk_end(2)-1)^2+1/4 );% Solving the OCP Nodes = 100; sol = ocp.solve(...'controlIntervals', Nodes, ...'collocationPoints', 'radau', ...'polynomialDegree', 1, ...'initialGuess', w0);%% sol.NumericalResults.Objective figure(1) sol.plot(sys.x(1), sys.x(2), 'k-'),hold on; plot([0,2],[0,2], 'ko'), % plot([0,2],[1,2], 'b--'), plot(1/2*cos([0:0.1:6.3])+1,1/2*sin([0:0.1:6.3])+1,'r--');plot(x_guess,y_guess,'b--');axis([0 2 0 2]),xlabel('x'),ylabel('y'), legend('optimal trj','boundary points','path constraints','initial guess'); title('trajectory')figure(2) sol.stairs(sys.t, sys.u) xlabel('otpimal time/ s'),ylabel('optimal control angle/ rad'); legend('\theta')

以下是求解結(jié)果：

可以發(fā)現(xiàn)如果初值給的沒問題的話，就能正確滿足力學約束，且最優(yōu)控制是正確的。求解過程也可以發(fā)現(xiàn)求解速度快了很多。

external link

[1] CasADi官網(wǎng)https://web.casadi.org/
[2] 官方案例 Example package
[3] Numerical Optimal Control, 2011, [draft] Moritz Diehl
[4] Yop - an open-source optimal control software 導航頁面
[5] Yop - Github下載頁面：(a MATLAB toolbox)

總結(jié)

以上是生活随笔為你收集整理的最优控制理论八、CasADi求解路径约束轨迹优化的多重打靶法的全部內(nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯，歡迎將生活随笔推薦給好友。