A global search algorithm based on the Shepard operator for bang-bang optimal control problem


Alexander Gornov, Irina Veyalko
Institute for System Dynamics and Control Theory,
Siberian Branch of Russian Academy of Sciences,
Lermontov Str., 134, 664033 Irkutsk, Russia
{gornov, veyalko}@icc.ru

Abstract. An optimal control search algorithm based on the Shepard approximation is described in this paper, and computational results on several instances are presented.
Keywords: bang-bang control problems, Shepard function.

1. Introduction

    Bang-bang optimal control problems (BBOCP) arise naturally in many scientific and applied fields. The traditional problem formulations considered in automatic control theory are also bang-bang problems, and the bang-bang type of control is often considered in papers devoted to various problems of control theory (see, for example, [RIO56, LF65, A.A05]). The solutions of non-singular linear problems of dynamic systems optimization without state constraints are guaranteed to be bang-bang (see [LF65]). When the control is scalar and the number of switch points is known a priori, the optimal control search can be reduced by a trivial method to a finite-dimensional problem
of small dimension.

Studia Informatica Universalis.

Apparently, the bang-bang property of the optimal control can considerably simplify the search for the optimal functional value, because the cardinality of the set of such controls is much smaller. In most studies,
only the methods of local search are studied for optimal control problems. However, both the problem statement and its mathematical formulation require the search for a globally optimal solution. Direct discretization of the problem leads to an approximating mathematical programming problem with hundreds or thousands of variables, where even a local search can be a serious challenge for the most modern computer systems, and a global search may be impractical and may require astronomical amounts of computation. There are many publications devoted to global search algorithms based on genetic techniques (see, for example, the works of H. Seywald, R.R. Kumar, M. Shima, M.H. Lee, K.S. Chang, Z. Michalewicz, J.B. Krawczyk, S. Smith, N.V. Dakev, A.J. Chipperfield, P.J. Fleming, H. Muhlenbein, D. Schlierkamp-Voosen, B. de Andres-Toro, J.A. Lopez-Orozco, J. Bobbin, V. Subramaniam, H. Pohlheim, A. Heibner, M.M.A. Hashem, K. Watanabe, K. Izumi, and others). The dissertation of I.L. Lopez Cruz [I.L02] seems the most consistent in this regard; it contains solutions of three nonlinear test instances. The lack of published computational experiments and of collections of nonconvex optimal control problems indicates the inefficiency of this approach. Among the few classic publications devoted to nonconvex nonlinear optimal control problems, the works of N.N. Moiseev [N.N75], R. Bellman [R.60], and V.F. Krotov [V.F96] should be noted; among recent works, the publications [TA08, AT09, A.Y10, ATssa, T.Sss, ATssb] can be mentioned. The implementation of effective global search methods remains a topical problem.
   In this paper we present an algorithm based on the Shepard approximation. The Shepard approximating functions were introduced over forty years ago (see [D.68]). In our opinion, the Shepard function has some attractive properties, but it has remained unknown to most experts and is not widely discussed in the literature. The Shepard function is a ratio of two rational functions; it is a smooth, infinitely differentiable function, and in the simplest case it is an interpolant. The Shepard function can be constructed easily for a function of any dimension and for any distribution of the points where experimental information is available ("on an irregular grid"). Calculating the approximating function at any point requires only 4k(n + 1) arithmetic operations, where n is the number of variables and k is the number of grid points (interpolation nodes). The main disadvantage of the Shepard function is its low approximation accuracy: it has "zero order of accuracy". Various authors have made attempts to construct new types of approximators based on combinations of the Shepard operator with other, more accurate local interpolants. A comprehensive survey can be found in [RF07].
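As a concrete illustration, the weighting scheme just described can be sketched in a few lines of Python (our own sketch: the function name and the NumPy dependency are ours, and we fix the fourth power used later in formula (1)):

```python
import numpy as np

def shepard_estimate(x_new, nodes, values, power=4):
    """Shepard inverse-distance-weighted estimate of f at x_new.

    nodes: (k, n) array of interpolation nodes; values: (k,) data at the
    nodes. The cost is O(k * n), in line with the ~4k(n + 1) arithmetic
    operations mentioned above.
    """
    d = np.linalg.norm(np.asarray(nodes, float) - np.asarray(x_new, float),
                       axis=1)
    if np.any(d == 0.0):
        # the operator interpolates: at a node, return the node value
        return float(values[int(np.argmin(d))])
    w = d ** (-power)
    return float(np.dot(w, values) / np.sum(w))
```

The estimate is a convex combination of the node values, so it always lies between their minimum and maximum; away from all nodes the weights flatten out, which is the source of the low ("zero-order") accuracy noted above.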
   An algorithm of optimal control search based on the Shepard approximation is developed and presented in this paper.

2. Problem statement

   Consider a control process described by a system of ordinary differential equations ẋ = f(x(t), u(t), t) with the initial condition x(t0) = x0, defined on the interval T = [t0, t1], where t is an independent variable (usually time), x(t) is an n-dimensional vector of phase coordinates, and u(t) is an r-dimensional vector of controls. The n-dimensional vector function f(x(t), u(t), t) is continuously differentiable, and the initial vector x0 is given. The control takes only the boundary values, u(t) ∈ {u̲, ū}, where u̲ and ū are the lower and upper bounds of the control. The optimal control problem is to find the control u(t) that minimizes the functional I(u(t)) = ϕ(x(t1)), where the function ϕ(x(t1)) is continuously differentiable.

3. Generation of the auxiliary controls

   In optimal control problems with a terminal functional, some information about the location of the global extremum can be obtained by an almost complete scan of the reachable set. Several approaches to this problem have been developed in phase estimation theory; they are based on the idea of stochastic approximation (see [D.68, A.Y09]). Algorithms have been proposed that construct a set of test controls covering the feasible parallelepiped; they generate piecewise linear random functions, splines, and bang-bang random functions with a fixed number of switch points. Among the known approaches that performed well in computational practice, only the method based on bang-bang functions is applicable to a BBOCP. In the previously proposed algorithms the number of switch points is fixed, which is a disadvantage: since the optimal number of switch points is not known beforehand, the approximation of the reachable set can become inefficient. If the number of switch points is smaller than the optimal one, the generated controls are drawn from a class that may not contain the optimal solution. If it is larger, a "concentration effect" appears: the switch points cluster in a small neighborhood of the reachable set. This effect also degrades the quality of the approximation and can leave the area of the global extremum without any points. In our variant of the algorithm the number of switch points is random, and its mathematical expectation over the sequence of generated controls equals a predetermined value, which is adjusted by the algorithmic parameter NTP. A switching threshold is computed from the recommended number of switch points; at each step of the control discretization a random number is drawn from the interval [0, 1], and if it exceeds the threshold a switch is generated at that point. The routine URAND is used to generate quasi-random numbers (see [GMC80]). The algorithm of control generation is described below for a scalar control (r = 1); its generalization to the vector case is trivial.
   Algorithm 1. Generation of the auxiliary controls.
   1) Set the algorithmic parameters:
      - NU is the number of discretization grid nodes;
      - NTP is the expected number of switch points.
   2) Compute the discretization grid τi with uniform step h = (t1 − t0)/(NU − 1), so that τ0 = t0, τNU−1 = t1, and τi < τi+1.
   3) Generate a random number s ∈ [0, 1].
   4) If s ≤ 0.5, then u(τ0) = u̲; else u(τ0) = ū.
   5) Calculate the switching threshold P = 1 − NTP/(NU − 1).
   6) For all i = 0, ..., NU − 2:
      a) Generate a new random number s ∈ [0, 1].
      b) If s < P, then u(τi+1) = u(τi); otherwise switch: if u(τi) = u̲, then u(τi+1) = ū, and if u(τi) = ū, then u(τi+1) = u̲.
  Our computational study confirmed the ability of the algorithm to
generate good internal approximations of the reachable set.
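Algorithm 1 can be condensed into a short Python sketch (the function and parameter names are ours, and we use Python's standard generator in place of URAND):

```python
import random

def generate_bangbang_control(t0, t1, n_u, n_tp, u_low, u_high, rng=random):
    """Generate a random bang-bang control on a uniform grid (Algorithm 1).

    n_u: number of grid nodes; n_tp: expected number of switch points.
    Returns the grid and the control values at the nodes.
    """
    h = (t1 - t0) / (n_u - 1)
    grid = [t0 + i * h for i in range(n_u)]
    p = 1.0 - n_tp / (n_u - 1)              # switching threshold (step 5)
    u = [u_low if rng.random() <= 0.5 else u_high]   # step 4
    for _ in range(n_u - 1):                # steps 6a-6b
        if rng.random() < p:
            u.append(u[-1])                 # keep the current value
        else:
            u.append(u_high if u[-1] == u_low else u_low)   # switch
    return grid, u
```

Since a switch occurs with probability NTP/(NU − 1) at each of the NU − 1 steps, the expected number of switch points is exactly NTP, as required.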

4. Basic algorithm

    If we have a sufficiently large and representative set of admissible controls {uk(t), Ik}, k = 1, ..., K, where K is the number of elements and Ik is the value of the objective functional for the control uk(t), then it is possible to estimate the functional value for any admissible control up(t), even if the accuracy is low. Obviously, the more point-controls the base set contains, and the more uniformly they are distributed, the better the estimate. In accordance with the Shepard methodology, the functional value IS at the new point is estimated by the formula

        IS = ( Σk=1..K Ik/‖up(t) − uk(t)‖⁴ ) / ( Σk=1..K 1/‖up(t) − uk(t)‖⁴ ).    (1)

Here ‖u(t)‖ = √(∫ u²(t) dt), where the integral is taken over [t0, t1]; up(t) is a new control generated by the algorithm of generation of auxiliary controls, and uk(t) is one of the controls from the set of admissible controls. Suppose the estimate IS is worse than the record value of the objective functional Ir = min k=1..K Ik known at the present moment. If we have a sufficiently representative set {uk(t)}, k = 1, ..., K, there is a high probability that the considered control up(t) cannot improve the record value and is not promising. In this case there is no need to waste CPU time obtaining the exact value of the functional for this control, and we can simply ignore it. On the other hand, studying the considered control can increase the "degree of representativeness" of the base set; this degree is the distance to the nearest element of the set. εU is an a priori defined algorithmic parameter reflecting the allowed "distance" between the controls. If min k=1..K ‖up(t) − uk(t)‖ > εU, then it is clearly worth taking the time to integrate the system with this control and include it in the base set. Various combinations of the two strategies corresponding to these two situations determine the versions of the proposed method of optimal control search.
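On the grid produced by Algorithm 1, the norm above and the εU screening test can be written as follows (our sketch; the function names are ours, and the integral is approximated by the rectangle rule):

```python
import math

def control_distance(u_a, u_b, h):
    """Discrete version of ||u_a - u_b|| = sqrt(integral of (u_a - u_b)^2 dt):
    rectangle-rule approximation on a uniform grid with step h."""
    return math.sqrt(h * sum((a - b) ** 2 for a, b in zip(u_a, u_b)))

def worth_integrating(u_new, base_controls, h, eps_u):
    """True if u_new lies farther than eps_u from every control already in
    the base set, i.e. integrating it would improve the set's coverage."""
    return all(control_distance(u_new, u_k, h) > eps_u
               for u_k in base_controls)
```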
   Algorithm 2. Optimal control search.
   1) Set the algorithmic parameters:
      - NU is the number of discretization grid nodes;
      - NTP is the expected number of switch points;
      - KS is the initial number of test controls;
      - K is the total number of test controls;
      - εϕ is the accuracy of integration;
      - εU is the allowed "distance" between the controls.
   2) The initial base set of controls BU0 is empty; the number of controls in the base MB0 = 0.
   3) The record minimal functional value is infinitely large (IR = ∞); the record maximal value IM = −∞.
   4) For all j = 1, ..., KS:
      a) Generate the test control uj(t) with the algorithmic parameters NU and NTP.
      b) Integrate the system of differential equations for the control uj(t); save the final phase vector xj(t1).
      c) Calculate the value of the objective functional Ij = ϕ(xj(t1)) for the test control.
      d) If Ij < IR − εϕ, then set the record minimal value IR = Ij.
      e) If Ij > IM, then set the record maximal value IM = Ij.
      f) Include the test control in the control base: BUj = BUj−1 ∪ {uj(t)}, MBj = MBj−1 + 1.
   5) For all j = KS + 1, ..., K:
      a) Generate the test control uj(t) with the algorithmic parameters NU and NTP.

      b) Calculate the estimate of the "distance" between the test control and the controls from the base set: Δj = min k=1..MBj ‖uj(t) − uk(t)‖.
      c) If Δj > εU, then go to step 5g.
      d) Calculate the "degree of representativeness" of the control uj(t) according to the Shepard formula:

         δj = 1 − ( Σk=1..MBj ((Ik − IR)/(IM − IR)) / ‖uj(t) − uk(t)‖⁴ ) / ( Σk=1..MBj 1/‖uj(t) − uk(t)‖⁴ ).

      e) Generate a random number s ∈ [0, 1].
      f) If s < δj, then set j = j + 1 and go to step 5a.
      g) Integrate the system of differential equations for the control uj(t); save the final phase vector xj(t1).
      h) Calculate the value of the objective functional Ij = ϕ(xj(t1)) for the test control.
      i) If Ij < IR − εϕ, then set the record minimal value IR = Ij.
      j) If Ij > IM, then set the record maximal value IM = Ij.
      k) Include the test control in the control base set: BUj = BUj−1 ∪ {uj(t)}, MBj = MBj−1 + 1.
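The whole procedure can be condensed into a compact Python sketch (entirely our own naming and packaging: the integrate_cost callback stands in for steps 5g-5h, the duplicate-control guard and the small regularizer in the denominator are our additions, and the control generator of Algorithm 1 is inlined):

```python
import math
import random

def shepard_search(integrate_cost, t0, t1, n_u, n_tp, u_low, u_high,
                   k_start, k_total, eps_u, seed=0):
    """Sketch of Algorithm 2: record search with Shepard-based screening.

    integrate_cost(u) must integrate the ODE system for the grid control u
    and return the objective value phi(x(t1)); it is supplied by the caller.
    """
    rng = random.Random(seed)
    h = (t1 - t0) / (n_u - 1)
    p = 1.0 - n_tp / (n_u - 1)                   # switching threshold

    def generate():                              # Algorithm 1, inlined
        u = [u_low if rng.random() <= 0.5 else u_high]
        for _ in range(n_u - 1):
            u.append(u[-1] if rng.random() < p
                     else (u_high if u[-1] == u_low else u_low))
        return u

    def dist(a, b):                              # discrete L2 norm
        return math.sqrt(h * sum((x - y) ** 2 for x, y in zip(a, b)))

    base = []                                    # pairs (u^k, I^k)
    i_rec, i_max, u_rec = math.inf, -math.inf, None
    for j in range(k_total):
        u = generate()
        if j >= k_start and base:
            d = [dist(u, uk) for uk, _ in base]
            if min(d) == 0.0:
                continue                         # duplicate of a base control
            if min(d) <= eps_u:
                # Shepard estimate of the normalized functional (step 5d)
                w = [dk ** -4 for dk in d]
                est = sum(wk * (ik - i_rec) / (i_max - i_rec + 1e-300)
                          for wk, (_, ik) in zip(w, base)) / sum(w)
                if rng.random() < 1.0 - est:     # step 5f: skip the control
                    continue
        i_val = integrate_cost(u)                # exact value (steps g-h)
        if i_val < i_rec:
            i_rec, u_rec = i_val, u              # new record (step i)
        i_max = max(i_max, i_val)                # step j
        base.append((u, i_val))                  # step k
    return i_rec, u_rec
```

With a trivial cost such as the sum of the control values, the sketch returns the best control found among the generated candidates; in the real method, integrate_cost is the numerical integration of the Cauchy problem.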

5. Computational experiments

   The effectiveness of the algorithm was checked on a collection of instances (see [ATTIss]). In this section we present the obtained solutions to some of the problem instances. Computational experiments were carried out on a PC with an Intel Core2 Duo, 2.1 GHz, and 2 GB of RAM.
   Instance 1.

                   ẋ1 = sin x1 + sin x2 − u − 0.5x2²;
                   ẋ2 = −cos x1 − cos x2 + 0.9x1²,
                   t ∈ [0, 1], x0 = (1, 0), |u| = 1,
                   I(u) = 3x1(t1)/(2x1²(t1) + x2²(t1)) − 1/(11x1(t1)x2(t1)) → min.

     This problem has one extremum.

Figure 1: The optimal trajectories and control, the reachable set and the extremum point of instance 1

              Table 1: Computational results on instance 1
      The functional value      The extremum point (x1 , x2 )
            0.11284                   (0.06473, -1.0739)
      The number of Cauchy problems: 2204        CPU time: 16 sec.
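The dynamics of instance 1 are easy to reproduce. The sketch below integrates them with a classical RK4 scheme under an arbitrary bang-bang control (the switch-time parametrization, step count, and function name are ours, for illustration only; the optimal control itself is the output of the search algorithm):

```python
import math

def simulate_instance1(switch_times, u0=1.0, n_steps=2000):
    """Integrate instance 1 by RK4 for a bang-bang control that flips sign
    at the given switch times (our illustrative parametrization)."""
    def u_of(t):
        sign = u0
        for ts in switch_times:
            if t >= ts:
                sign = -sign
        return sign

    def f(t, x):
        x1, x2 = x
        u = u_of(t)
        return (math.sin(x1) + math.sin(x2) - u - 0.5 * x2 ** 2,
                -math.cos(x1) - math.cos(x2) + 0.9 * x1 ** 2)

    t, h, x = 0.0, 1.0 / n_steps, (1.0, 0.0)    # t in [0, 1], x0 = (1, 0)
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)))
        k3 = f(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)))
        k4 = f(t + h, tuple(xi + h * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
        t += h
    return x
```

Each call to this routine corresponds to solving one Cauchy problem, the unit in which the tables above report the computational cost.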

     Instance 2.

                   ẋ1 = −2x1 + 1.5x2 − u + 2x2²;
                   ẋ2 = 0.5x2 + x1 + 1.7u + 0.3x1²;
                   t ∈ [0, 1], x0 = (0, −1), u ∈ {0, 2},
                   I(u) = 1/( 2 + (x1²(t1) + x2²(t1))/200 − cos x1(t1) · cos(x2(t1)/√2) ) → min.

    The modified Griewank function is used in this instance (see [D.C03]). The instance has two extrema: one global and one local. Only the global extremum has been found by our algorithm; it is presented in Figure 2 and Table 2.
     Instance 3.

                   ẋ1 = x2 + x1 sin x1 + u;
                   ẋ2 = √2.1 − u · cos x2,

Figure 2: The optimal trajectories and control, the reachable set and the extremum point of instance 2

            Table 2: Computational results on instance 2
    The functional value      The extremum point (x1 , x2 )
          0.48372                  (1.27126, 2.37082)
    The number of Cauchy problems: 1433        CPU time: 12 sec.

                  t ∈ [0, 4], x0 = (3, 0), |u| = 1,
           I(u) = −(x1 (t1 ) − 5)2 − (x2 (t1 ) − 6)2 → min .

   This instance has two extrema: one global and one local. The global extremum was found by the Shepard algorithm.

Figure 3: The optimal trajectories and control, the reachable set and the extremum point of instance 3

   Instance 4.

                   ẋ1 = x2 + 5.32 sin(x2²) − 5.32u;

              Table 3: Computational results on instance 3
      The functional value      The extremum point (x1 , x2 )
           -25.34166                (10.03323, 6.09100)
      The number of Cauchy problems: 1112        CPU time: 3 sec.

                   ẋ2 = 2x2 + 5.44 cos(x1x2) + 0.88u,
                   t ∈ [0, 1], x0 = (1, 0), |u| = 1,
                   I(u) = 1/(x1(t1) + x2(t1)) − x1²(t1) → min.

   The next instance has three extrema, one of them is global. It has
been found by the Shepard algorithm.

Figure 4: The optimal trajectories and control, the reachable set and the extremum point of instance 4

        Table 4: Computational results on instance 4
      The functional value      The extremum point (x1 , x2 )
          -112.58524                 (10.65128, 9.15598)
      The number of Cauchy problems: 2524        CPU time: 6 sec.

   Instance 5.

                   ẋ1 = 0.5x2⁶ + x1 + 0.8x2² − 0.12u;
                   ẋ2 = 2.01x1 sin x2 + 1.15x1u,
                   t ∈ [0, 2], x0 = (0, 0), |u| = 1,
                            I(u) = x2 (t1 ) → min .

   The instance has two extrema. The global extremum has been found
by the proposed algorithm.

Figure 5: The optimal trajectories and control, the reachable set and the extremum point of instance 5

            Table 5: Computational results on instance 5
     The functional value     The extremum point (x1 , x2 )
          -0.40062                (0.30296, -0.40062)
     The number of Cauchy problems: 1223       CPU time: 3 sec.

6. Conclusion

    Our experience with global search algorithms is not yet extensive; however, it allows us to conclude that the proposed method can reduce the computational effort several-fold, saving time by not investigating unpromising controls. An even greater acceleration can be expected for problems with "difficult" Cauchy problems, such as problems with stiff systems of differential equations. Of course, the proposed algorithm is heuristic and cannot guarantee finding the global extremum for an arbitrary BBOCP. The accuracy of the algorithm's "record" solution is not very high; apparently, this is connected with the uniform discretization grid and with the fact that the approximations of the reachable set near the optimum are not very good. These disadvantages of the method are not fatal, since high precision is not its main task. Its purpose is to promptly obtain approximations that fall into the domain of attraction of the global extremum. To obtain high-precision solutions, the proposed algorithm must be combined with local optimization algorithms that are able not only to refine the extremum but also to verify the quality of the solution using procedures based on necessary optimality conditions.

Acknowledgements

   This work was supported by the Russian Foundation for Basic Research (grants No. 09-07-00267 and No. 10-01-00595).

References
[A.A05]     Tolstonogov A.A. The principle of bang-bang for subdifferential controlled systems. Dynamical systems and control problems, Proc. IMM, 11(1):189–200, 2005.
[AT09]      Gornov A.Yu. and Zarodnuk T.S. The curved-line search
            method of global extremum for optimal control problem.
            Modern technologies. System analysis. Modeling., pages 19–
            26, 2009.
[ATssa]     Gornov A.Yu. and Zarodnuk T.S. Method of random sur-
            faces for optimal control problem. Computational technolo-
            gies, in press.
[ATssb]     Gornov A.Yu. and Zarodnyuk T.S. The tunnel-type al-
            gorithm for solving non-convex optimal control problems.
            Journal of Global Optimization, in press.
[ATTIss] Gornov A.Yu., Zarodnyuk T.S., Madjara T.I., and Veialko
         I.A. Test collection of nonlinear non-convex optimal control
         problems. Journal of Global Optimization, in press.
[A.Y98]     Gornov A.Yu. On a class of algorithms for constructing in-
            ternal estimates of reachable set. In DIC-98. Proceedings of
            the Int. Workshop, pages 10–12, 1998.
[A.Y09]     Gornov A.Yu. Computational technology for solving of op-
            timal control problems. Novosibirsk: Nauka, 2009.

[A.Y10] Gornov A.Yu. Optimal control problem: computing tech-
        nologies for finding a global extremum. In Abstracts of the
        International Conference on Optimization, Simulation and
        Control, page 75, 2010.
[D.68]  Shepard D. A two-dimensional interpolation function for
        irregularly-spaced data. In Proceedings of the 1968 ACM
        National Conference, pages 517–524. ACM Press, New
        York, 1968.
[D.C03] Clemente D.C. Hierarchical optimization of virtual rigid
        body spacecraft formations. Thesis. Maryland: Aerospace
        Engineering Department, University of Maryland, 2003.
[GMC80] Forsythe G.E., Malcolm M.A., and Moler C.B. Computer
        methods for mathematical computations. Moscow: Mir,
        1980.
[I.L02] Lopez Cruz I.L. Efficient evolutionary algorithms for opti-
        mal control. PhD-Thesis, June 2002.
[LF65]  Sonneborn L. and Van Vleck F. The bang-bang principle for
        linear control systems. SIAM J. Control, (2):151–159, 1965.
[N.N75] Moiseev N.N. Elements of optimal systems theory. Moscow:
        Nauka, 1975.
[R.60]  Bellman R. Dynamic programming. Moscow: Foreign Lit-
        erature, 1960.
[RF07]  Caira R. and Dell'Accio F. Shepard–Bernoulli operators. Mathematics of Computation, 76(257):299–321, 2007.
[RIO56] Bellman R., Glicksberg I., and Gross O. On the bang-bang control problem. Quarterly of Applied Mathematics, (14):11–18, 1956.
[TA08]  Zarodnuk T.S. and Gornov A.Yu. The search technology of
        global extremum for optimal control problem. Modern tech-
        nologies. System analysis. Modeling., pages 70–76, 2008.
[T.Sss] Zarodnuk T.S. Algorithm of the numerical solution of mul-
        tiextremal optimal control problems with parallelepiped re-
        strictions. Computational technologies, in press.
[V.F96] Krotov V.F. Global methods in optimal control theory. N.Y.:
        Marcel Dekker Inc., 1996.