next up previous
Next: About this document ...

Steepest Descent Method

In the linear system Ax = b suppose the coefficient matrix A is symmetric and positive definite. Solving Ax = b is equivalent to minimizing the quadratic function $Q(x) = {1\over 2} x^{T}Ax - x^{T}b$.


Example: The following linear system of the form Ax = b arises in connection with the Laplace Finite Difference equation on a 3x3 square grid.

\begin{displaymath}
\left[
\begin{array}
{*{9}{r}}
 4_{\ }&-1& 0&-1& 0& 0& 0& 0&...
 ...5 \\  b_6 \\  b_6 \\  b_7 \\  b_8 \\  b_9 \\ \end{array}\right]\end{displaymath}

For this example, the function Q(x) is given by

\begin{displaymath}
Q(x) = {1\over 2} \sum^{9}_{i,j=1}x_{i} a_{ij} x_{j} - \sum^{9}_{i=1}x_{i}b_{i}
.\end{displaymath}


From calculus, we know that if Q(x) has a minimum at some point x, then all the partial derivatives of Q must be zero at that point. Calculating the partial derivatives of Q with respect to each xi, and setting each partial derivative to zero results in the linear system Ax - b = 0 or Ax = b. From this we see that it is correct to view Ax - b as the gradient of Q and write $\nabla Q(x) = Ax - b$.

In the method of steepest descent, we make an initial estimate of the solution x to Ax = b and then proceed down gradients of Q to arrive at the minimum of Q and thus arrive at the solution to Ax = b. In this connection, define the residual ri associated with the approximate x i to be

\begin{displaymath}
r^{i} = b - Ax^{i} = -\nabla Q(x^{i})\end{displaymath}

We define our next estimate to the solution of Ax = b to be down gradient of xi by setting

\begin{displaymath}
x^{i+1} = x^{i} + \alpha _{i}r^{i}.\end{displaymath}

Now it remains to pick a value of $\alpha_{i}$ that determines how far we go in the down gradient direction defined by ri. We choose $\alpha_{i}$ so that we minimize

\begin{displaymath}
F(\alpha ) = Q(x^{i} + \alpha r^{i}).\end{displaymath}

The rest of the steepest descent method follows from calculus: find $ F'(\alpha )$ and set $F'(\alpha ) = 0.$ By the chain rule,

\begin{displaymath}
F'(\alpha ) = \nabla Q(x^{i} + \alpha r^{i})\cdot r^{i}\end{displaymath}

\begin{displaymath}
= \left[\matrix{A(x^{i}&+&\alpha r^{i})&-&b}\right]\cdot r^{i}\end{displaymath}

\begin{displaymath}
= \left[\matrix{ Ax^{i}&+&\alpha Ar^{i}&-&b}\right]\cdot r^{i}\end{displaymath}

\begin{displaymath}
= \left[\matrix{-r^{i}&+&\alpha Ar^{i}}\right]\cdot r^{i}\end{displaymath}

By setting $F'(\alpha ) = 0$,we can see that $\alpha_{i}$ for computing xi+1 is given by

\begin{displaymath}
\alpha_{i} = {r^{i}\cdot r^{i}\over r^{i}\cdot Ar^{i}}\end{displaymath}

The method of steepest descent is easy to program, but it often converges slowly. The main reason for its slow convergence is that the method may spend time minimizing Q along parallel or nearly parallel search directions.



 
next up previous
Next: About this document ...
Dave Zachmann
4/8/1999