
Conjugate Gradient Method

The conjugate gradient method is based on the idea that convergence to the solution can be accelerated if we minimize Q over the hyperplane containing all previous search directions, instead of minimizing Q over just the line that points down the gradient. To determine $x^{i+1}$ we minimize Q over

\begin{displaymath}
x^{0} + {\rm span}(p^{0},p^{1},p^{2},\ldots,p^{i}) \end{displaymath}

where the $p^{k}$ represent previous search directions. An added advantage to this approach is that, if we can select the $p^{k}$ to be linearly independent, then the dimension of the hyperplane

\begin{displaymath}
x^{0} + {\rm span}(p^{0},p^{1},p^{2},\ldots,p^{i}) \end{displaymath}

will grow by one with each iteration of the conjugate gradient method. This would imply that (assuming infinite precision arithmetic) the solution of the linear system Ax=b would be obtained in no more than N steps, where N is the number of unknowns in the system.
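The functional Q itself is not restated in this section; from the residual relation $r = -\nabla Q = b - Ax$ used below, it is presumably the quadratic form from the steepest descent discussion:

\begin{displaymath}
Q(x) = {1 \over 2}\, x \cdot Ax - b \cdot x , \qquad \nabla Q(x) = Ax - b = -r \end{displaymath}

For A symmetric and positive definite, the unique minimizer of Q is the solution of Ax=b.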

Let $x^{0}$ be an initial estimate of the solution of Ax=b. For our first search direction we proceed down the Q-gradient and choose

\begin{displaymath}
x^{1} = x^{0} + \alpha_{0} p^{0} \end{displaymath}

where

\begin{displaymath}
p^{0} = r^{0} = - \nabla Q(x^{0}) = b - Ax^{0}\end{displaymath}

From the discussion of the method of steepest descent, we have

\begin{displaymath}
\alpha_{0} = {{r^{0} \cdot r^{0}} \over {r^{0} \cdot Ar^{0}}} = {{r^{0} \cdot p^{0} } \over {p^{0} \cdot Ap^{0}}} \end{displaymath}

To understand what follows in the description of the conjugate gradient method, it is important to note that

\begin{displaymath}
r^{1} \cdot r^{0} = r^{1} \cdot p^{0} = 0\end{displaymath}

Rather than trying to establish the above orthogonality relationships with a calculation, we use the following calculus argument. By definition, $r^{1}$ is the negative gradient of Q at $x^{1}$, where $x^{1}$ is the conjugate gradient estimate that follows the initial guess $x^{0}$. Also, $r^{0} = p^{0}$ is the search direction along the line $x^{0} + \alpha_{0} p^{0}$. Calculus tells us that the gradient of Q at $x^{1}$ must be orthogonal to the search direction.

Not as a proof of the last statement about calculus and orthogonality, but to motivate it, consider the layers of an onion to be surfaces of Q = constant. Imagine piercing the onion with a skewer. In general, the skewer will pass through several outer layers of the onion, tangentially touch one of the inner layers, again pass through the other layers and then exit. The innermost layer that the skewer touches tangentially is the level surface $Q = Q(x^{1})$ passing through $x^{1}$, and the direction of the skewer is $r^{0} = p^{0}$.

What does your mental image of this skewered onion tell you about the orientation of $r^{1}$ and $r^{0} = p^{0}$?
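The picture can also be checked numerically. The sketch below takes one steepest descent step on a small symmetric positive definite system (the matrix A and vector b are invented for the demonstration) and verifies that $r^{1} \cdot p^{0} = 0$:

```python
import numpy as np

# Illustrative 2x2 SPD system (values made up for this demo).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x0 = np.zeros(2)
r0 = b - A @ x0          # r0 = -grad Q(x0)
p0 = r0                  # first search direction
alpha0 = (r0 @ p0) / (p0 @ (A @ p0))
x1 = x0 + alpha0 * p0
r1 = b - A @ x1

print(r1 @ p0)           # ~0: new residual is orthogonal to the search direction
```

The skewered-onion picture predicts exactly this: the gradient at the tangent point is perpendicular to the skewer.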

The conjugate gradient method calls for defining successive approximations by

\begin{displaymath}
x^{i+1} = x^{i} + \alpha_{i} p^{i} \end{displaymath}

\begin{displaymath}
p^{i+1} = r^{i+1} + \beta_{i} p^{i} \end{displaymath}

with the scheme for choosing $\alpha_{i}$ and $\beta_{i}$ to be discussed next. The things to keep in mind when choosing $\alpha_{i}$ and $\beta_{i}$ are:

1. We want the span of the search directions to fill the space we are searching as the number of iterations increases;


2. Searching down Q-gradients was basically a good idea. But, to guarantee linearly independent successive search directions, we generally need to choose conjugate gradient search directions to be perturbations of steepest descent search directions.

We have already seen how to define $p^{0}$ and $\alpha_{0}$, so $x^{1}$ and $r^{1} = b - Ax^{1}$ can be considered known. To take the next step using the conjugate gradient method, we must determine values for $\alpha_{1}$ and $\beta_{0}$ so that we can calculate $p^{1}$ and $x^{2}$. Then we will see a pattern emerge.

In taking this next step in the conjugate gradient method we are seeking to minimize Q over the plane

\begin{displaymath}
x^{0} + {\rm span}(p^{0},p^{1}) \end{displaymath}

This means that the residual $r^{2}$ will have to be orthogonal to both $p^{0}$ and $p^{1}$. (To support this last claim, imagine slicing through a lobe of an onion with a knife, forming a flat area spanned by the vectors $p^{0}$ and $p^{1}$.)

The orthogonality condition $ p^{0} \cdot r^{2} = 0$ will help us set the search direction p1.

\begin{displaymath}
p^{0} \cdot r^{2} = p^{0} \cdot \left[ b - A \left( x^{1} + \alpha_{1} p^{1} \right) \right]
= p^{0} \cdot r^{1} - \alpha_{1} p^{0} \cdot Ap^{1}\end{displaymath}

Since $p^{0} \cdot r^{1} = 0$, this is zero provided

\begin{displaymath}
p^{0} \cdot Ap^{1} = 0\end{displaymath}

Definition: Two vectors u and v are said to be A-conjugate if $u \cdot Av = 0$.

From the requirement that $ p^{0} \cdot r^{2} = 0$, it follows that the search direction $p^{1}$ must be A-conjugate to the search direction $p^{0}$. We can now set $\beta_{0}$ as follows:

\begin{displaymath}
p^{1} = r^{1} + \beta_{0} p^{0} \end{displaymath}

implies

\begin{displaymath}
Ap^{1} = Ar^{1} + \beta_{0} Ap^{0}\end{displaymath}

Now

\begin{displaymath}
0 = p^{0} \cdot Ap^{1} = p^{0} \cdot Ar^{1} + \beta_{0} p^{0} \cdot Ap^{0}\end{displaymath}

implies

\begin{displaymath}
\beta_{0} = - {p^{0} \cdot Ar^{1} \over p^{0} \cdot Ap^{0}} = - {r^{1} \cdot Ap^{0} \over p^{0} \cdot Ap^{0}}\end{displaymath}

Having decided to proceed from $x^{1}$ to $x^{2}$ along the search direction defined by $p^{1} = r^{1} + \beta_{0} p^{0}$, the same calculus argument used to determine $\alpha_{0}$ gives

\begin{displaymath}
\alpha_{1} = {{r^{1} \cdot p^{1}} \over {p^{1} \cdot Ap^{1}}}\end{displaymath}

so a step of the conjugate gradient method is complete. Having developed the CG method for one step, it is easy to see that successive iterates are defined as follows.
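Putting the pieces together, one full conjugate gradient step beyond steepest descent can be sketched as follows (the 2x2 SPD system is invented for illustration; with N = 2 unknowns, the second step already reaches the exact solution):

```python
import numpy as np

# Illustrative 2x2 SPD system (values made up for this demo).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.zeros(2)

# Steepest descent step, as derived above.
r0 = b - A @ x0
p0 = r0
alpha0 = (r0 @ p0) / (p0 @ (A @ p0))
x1 = x0 + alpha0 * p0
r1 = b - A @ x1

# Conjugate gradient step: beta0 makes p1 A-conjugate to p0.
beta0 = -(p0 @ (A @ r1)) / (p0 @ (A @ p0))
p1 = r1 + beta0 * p0
alpha1 = (r1 @ p1) / (p1 @ (A @ p1))
x2 = x1 + alpha1 * p1
r2 = b - A @ x2

print(p0 @ (A @ p1))      # ~0: p1 is A-conjugate to p0
print(r2 @ p0, r2 @ p1)   # ~0: r2 is orthogonal to both search directions
```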

Conjugate Gradient Algorithm (for A symmetric and positive definite)

Step 1: (Initialize)
Choose
$x^{0}$

Set
i = 0 and imax = maximum number of iterations to be performed
$r^{0} = b - Ax^{0}$
$p^{0} = r^{0}$


Step 2: (Begin CG iteration)
If i < imax, perform Steps 3 - 5
If i = imax, exit and print the message "i = imax"


Step 3: (Perform CG update)
Set
$\alpha_{i} = {{r^{i} \cdot p^{i}} \over {p^{i} \cdot Ap^{i}}}$
$x^{i+1} = x^{i} + \alpha_{i} p^{i} $
$r^{i+1} = r^{i} - \alpha_{i} Ap^{i} $


Step 4: (Check for convergence)
If $ \Vert r^{i+1} \Vert < \mbox{tolerance} $, go to Step 6.


Step 5: (Prepare for next CG update)
Set
$x^{i} = x^{i+1}$
$r^{i} = r^{i+1}$
$\beta_i = - {{r^{i} \cdot Ap^{i}} \over {p^{i} \cdot Ap^{i}}}$
$p^{i} = r^{i} + \beta_{i} p^{i}$
i = i + 1
Go to Step 3


Step 6:
print solution and residual norm


Dave Zachmann
2/5/1999