EXPLANATION FILE FOR PROGRAM STEEPDA
====================================
Steepest Descent with Approximate Derivatives
---------------------------------------------
The method of steepest descent is based on obtaining directional information
from the gradient. In the subroutine presented in file Steepds, the gradient
was supplied in the form of partial derivatives by the function subroutine.
In many cases, however, such derivatives are not available in analytical
form, and numerical approximations must be used. In this section, a
steepest-descent subroutine that requires only the function itself will be
given; the partial derivatives are estimated using finite-difference
approximations.
We will take as an example the two-dimensional problem of finding the
maximum of z(x,y). The iteration equations are
    x(i+1) = x(i) + k * (dz/dx) / [(dz/dx)² + (dz/dy)²]^(1/2)

    y(i+1) = y(i) + k * (dz/dy) / [(dz/dx)² + (dz/dy)²]^(1/2)
The length of each step in the procedure is k. If we use the partial
derivatives that took the iteration from step i-1 to step i to project ahead
by another half step, then a new set of derivatives, evaluated roughly in
the middle of the next step, can be calculated. These derivatives are then
employed to move from position (x(i), y(i)) to (x(i+1), y(i+1)). This
procedure is outlined mathematically below.
We will define the finite difference derivatives calculated at step i to be
    Dx(i) = [z(x(i) + k Dx(i-1)/2, y(i)) - z(x(i), y(i))] / [k Dx(i-1)/2]   (8.3.1a)

    Dy(i) = [z(x(i), y(i) + k Dy(i-1)/2) - z(x(i), y(i))] / [k Dy(i-1)/2]   (8.3.1b)
The new (x,y) position is then
    x(i+1) = x(i) + k * Dx(i) / {[Dx(i)]² + [Dy(i)]²}^(1/2)   (8.3.2a)

    y(i+1) = y(i) + k * Dy(i) / {[Dx(i)]² + [Dy(i)]²}^(1/2)   (8.3.2b)
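For a two-dimensional z(x,y), equations (8.3.1) and (8.3.2) can be sketched
as follows in Python. The step-halving when a move fails to improve z is an
assumption added here so the sketch terminates; it is not necessarily
STEEPDA's own step-size control.

```python
import math

def steepest_ascent_fd(f, x0, y0, k=0.5, k_min=1e-6, max_iter=1000):
    """Steepest ascent on z = f(x, y) using the finite-difference
    derivatives of eqs. (8.3.1) and the update of eqs. (8.3.2)."""
    x, y = x0, y0
    ux, uy = 1.0, 1.0                    # previous direction, Dx/|D| and Dy/|D|
    for _ in range(max_iter):
        # Eq. (8.3.1): probe half a step ahead along the previous direction.
        hx = k * ux / 2.0 or k * 1e-6    # fall back to a tiny offset if zero
        hy = k * uy / 2.0 or k * 1e-6
        dx = (f(x + hx, y) - f(x, y)) / hx
        dy = (f(x, y + hy) - f(x, y)) / hy
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            break                        # flat spot: no direction to follow
        # Eq. (8.3.2): take a step of length k along the normalized estimate.
        xn = x + k * dx / norm
        yn = y + k * dy / norm
        if f(xn, yn) > f(x, y):
            x, y, ux, uy = xn, yn, dx / norm, dy / norm
        else:
            k /= 2.0                     # no improvement: shrink the step
            if k < k_min:
                break
    return x, y
```

Applied to, say, z = 4 - (x-1)² - (y-2)², the iteration settles near the
maximum at (1, 2).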
A generalized form for these equations is incorporated into the steepest-
descent program STEEPDA. The function examined as an example is the same as
that in the program STEEPDS. STEEPDA operates in much the same manner as
STEEPDS; the input and output variable formats are identical. Because the
numerical approximation to the derivatives is initially crude (the step size
is large), the iteration bounces around at first. However, it soon settles
down and homes in on a solution. A comparison of the calculated position of
the local maximum and the corresponding true values is given below:
Parameter Calculated Value True Value Error
--------- ---------------- ---------- -----
x1         1.5707419       1.5707963   -0.0000544
x2        -0.00048635      0.0         -0.00048635
x3         4.7121862       4.7123890   -0.0002028
Note that although the maximum value (4) was closely approximated (3.9999998),
the calculated location is in considerable error. This was also observed in
the previous section, and can be attributed to the partial derivatives being
small near the peak.
This same problem occurs with the least-squares curve fitting of experimen-
tal data. There is a range of values for the calculated coefficients that
gives nearly the same standard deviation, and it is difficult for any
numerical procedure to find the set of coefficients exactly. Therefore, you
should be warned against placing too much faith in the precision of the
calculated coefficients. You must also be aware that the location of a local
maximum, and not the position of the global optimum, may have been found. We
will discuss this problem in more detail later.
The steepest-descent algorithm is very flexible in terms of the types of
optimizations that can be performed. One particularly important category is
the minimization of a positive objective function. Two examples--the standard
deviation and the min-max fit--are briefly discussed below.
The formulation for the standard deviation problem is simple. The function
subroutine is used to calculate the variance between the data (or function)
to be fitted and the parametric equation used for the fitting. Let this
variance be y. Before returning to the main program from the function
subroutine, invert y; return 1/y. The steepest-descent routine will then
attempt to maximize 1/y, and thereby minimize the variance (or standard
deviation).
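As a sketch of this inversion trick in Python (the argument layout and the
`model` callback are illustrative, not STEEPDA's actual subroutine
interface):

```python
def inverse_variance_objective(params, xdata, ydata, model):
    """Return 1/variance of the fit residuals, so that a routine which
    *maximizes* the objective thereby minimizes the variance."""
    residuals = [yi - model(xi, params) for xi, yi in zip(xdata, ydata)]
    variance = sum(r * r for r in residuals) / len(residuals)
    return 1.0 / variance if variance > 0.0 else float('inf')
```

A perfect fit gives zero variance; returning infinity in that case is one
way to signal that no further improvement is possible.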
It is the author's experience that up to three least-squares coefficients
can usually be reliably determined in this manner. Sometimes the procedure
can be extended beyond that. In particularly difficult situations, it is
sometimes necessary to build towards the final set of parameters one step at
a time. For example, in polynomial least-squares fittings, the following
procedure is recommended:
1) Start by fitting the most important coefficient, say x1, by itself. Use
an initial guess somewhere near the expected value, with a step size of
k = 0.1.
2) Proceed on to estimating x1 and x2 by starting with the value calculated
above for x1, and k = 0.01.
3) Use x1 and x2 from the previous step as initial values, with k = 0.001,
   and find (x1, x2, x3).
4) And so on.
This sequence may require larger or smaller values for k than indicated above.
It is usually better to creep up on the (local) optimum using small values for
k than to bounce around by starting with a large k.
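The build-up strategy above might be driven by a loop of the following
shape (Python; `fit` stands for any call into the steepest-descent
optimizer, and its signature here is hypothetical):

```python
def incremental_fit(fit, n_params, first_guess, k0=0.1):
    """Fit one coefficient at a time: refine the current set, append the
    next coefficient (starting at zero), and shrink the step size k."""
    params = [first_guess]
    k = k0
    for _ in range(n_params):
        params = fit(params, k)      # refine the current coefficients
        params = params + [0.0]      # introduce the next coefficient
        k /= 10.0                    # 0.1 -> 0.01 -> 0.001, as in steps 1-3
    return params[:-1]               # drop the unused trailing coefficient
```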
Between each addition of a new parameter to be fitted, a check should be
made as to whether any significant progress was made in reducing the
variance by the last addition of a parameter (e.g., x3). If not, then either
the limit of the procedure has been reached, or the iteration homed in on
the wrong point. The advantage of using STEEPDA in this mode is that the
parameters in the fitting function need not appear in linear combinations as
they do in polynomial expressions. Therefore, the technique is capable of
dealing with complicated functions.
The optimization can also be performed with respect to the min-max
criterion. In this case, instead of calculating the variance in the function
subroutine, the maximum positive and negative differences are recorded.
Then y is set equal to the magnitude of the larger of these two quantities,
inverted, and returned to the steepest-descent subroutine. This procedure is
identical to least-squares optimization except for the change in the
criterion. As an example, consider the min-max polynomial fitting of
sin((pi/2)x) over the interval -1 <= x <= 1. Using STEEPDA, the
single-parameter min-max fit is
    sin((pi/2)x) == 1.22 x   for -1 <= x <= 1
(Equivalently, sin x == 0.78 x for -pi/2 <= x <= pi/2.) The maximum error
observed in this case is 0.22 (at x = ±1). Using x1 = pi/2 and
x2 = 1 - pi/2 as a new starting point, a two-parameter approximation is
obtained:
    sin((pi/2)x) == 1.5641 x - 0.5725 x³   for -1 <= x <= 1
The maximum error is reduced to 0.0084. Using the above values and continuing
on to three parameters, the fit becomes
    sin((pi/2)x) == 1.5732 x - 0.6003 x³ + 0.0187 x⁵   for -1 <= x <= 1
The associated maximum error in the above fit was 0.009. However, this error
is greater than that observed for the two-parameter fit! By starting the
iteration from another position, a much lower error maximum of 0.00014 is
achieved:
    sin((pi/2)x) == 1.5706894 x - 0.6432330 x³ + 0.0725556 x⁵   for -1 <= x <= 1
The coefficients in this latter approximation are similar to the min-max
values provided by Hastings:
    sin((pi/2)x) == 1.5706268 x - 0.6432292 x³ + 0.0727102 x⁵   for -1 <= x <= 1
The min-max error for Hastings' approximation is 0.00011; the two fits are
nearly equivalent.
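The min-max criterion described above amounts to the following objective
function (Python; as before, the argument layout and `model` callback are
illustrative, not STEEPDA's actual interface):

```python
def inverse_minmax_objective(params, xdata, ydata, model):
    """Record the maximum positive and negative residuals, take the
    bigger magnitude, and return its reciprocal for maximization."""
    residuals = [yi - model(xi, params) for xi, yi in zip(xdata, ydata)]
    worst = max(max(residuals), -min(residuals))   # magnitude of worst error
    return 1.0 / worst if worst > 0.0 else float('inf')
```

Swapping this in for the variance objective is the only change needed to
move from least-squares to min-max fitting.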
This example illustrates the basic difficulty associated with using the
steepest-descent method for min-max curve fitting: the iteration may home in
on what is clearly the wrong position. This problem is related to the
existence of several solutions to the optimization goal as stated. These
solutions can be true analytical maxima, artifacts of the procedure, or
artificially created by round-off error.
The min-max approximation example given above illustrates the main problem
associated with the steepest-descent procedure: it homes in on local maxima.
In the case of fitting functions and data by the min-max procedure, it is
easy to check the results graphically; we expect to see an equal-ripple
error pattern. If the objective function is instead the variance or simply
the maximum, then verification of the results is much more difficult.
From [BIBLI 01].
---------------------------------------------------
End of file Steepda.txt