x0 + span(p0,p1,p2,...,pi)
where the pk represent previous search directions. An added advantage to this approach is that, if we can select the pk to be linearly independent, then the dimension of the hyperplane
x0 + span(p0,p1,p2,...,pi)
will grow by one with each iteration of the conjugate gradient method. This implies that (assuming infinite-precision arithmetic) the solution of the linear system Ax=b is obtained in no more than N steps, where N is the number of unknowns in the system.
Let x0 be an initial estimate of the solution of Ax=b. For our first search direction we proceed down a Q-gradient and choose
p0 = r0
where
r0 = b - Ax0
From the discussion of the method of steepest descent, we have
x1 = x0 + α0 p0,  where  α0 = (p0^T r0)/(p0^T A p0)
To understand what follows in the description of the conjugate gradient method, it is important to note that
r1^T p0 = r1^T r0 = 0
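This orthogonality of the new residual to the search direction just used can be checked numerically. The sketch below uses a hypothetical 2x2 SPD system (the matrix and right-hand side are illustration data, not from the text) and NumPy:

```python
import numpy as np

# Hypothetical SPD system for illustration
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x0 = np.zeros(2)          # initial estimate
r0 = b - A @ x0           # initial residual
p0 = r0                   # first search direction: down the Q-gradient

# Steepest-descent step length along p0
alpha0 = (p0 @ r0) / (p0 @ (A @ p0))
x1 = x0 + alpha0 * p0
r1 = b - A @ x1

# The new residual is orthogonal to the search direction just used
print(r1 @ p0)            # ~ 0, up to rounding
```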
Rather than try to establish the above orthogonality relationships with a
calculation, use the following calculus argument. By definition, r1 is the
gradient of Q at x1, where x1 is the conjugate gradient estimate that follows
the initial guess x0. Also, r0 = p0 is the search direction along the
line x = x0 + α p0. Calculus tells us that
the gradient of Q at x1 must be orthogonal to the search direction.
Not as a proof of the last statement about calculus and orthogonality, but to motivate the statement, consider
the layers of an onion to be surfaces of Q = constant. Imagine piercing the onion with a skewer. In general,
the skewer will pass through several outer layers of the onion, tangentially touch one of the inner layers, again
pass through the other layers and then exit. The innermost layer that the skewer touches tangentially is given
by Q = Q(x1),
and the direction of the skewer is r0 = p0.
What does your mental image of this skewered onion tell you about the orientation of r1 and r0 = p0?
The conjugate gradient method calls for defining successive approximates by
xi+1 = xi + αi pi
pi+1 = ri+1 + βi pi
with the scheme for choosing αi and βi to be discussed next.
The things to keep in mind when choosing αi and βi are:
1. We want the span of the search directions to fill the space we are searching as the number of iterations increases;
2. Searching down Q-gradients was basically a good idea. But, to guarantee linearly independent successive search directions, we generally need to choose conjugate gradient search directions to be perturbations of steepest descent search directions.
We have already seen how to define p0 and α0, so x1 and
r1 = b - Ax1 can be considered known. To take the next step using
the conjugate gradient method, we must determine values for
β0 and α1 so that we can calculate p1 and x2.
Then we will see a pattern emerge.
In taking this next step in the conjugate gradient method we are seeking to minimize Q over the plane
x0 + span(p0,p1)
This means that the residual r2 will have to be orthogonal to both p0 and p1. (To support this last claim, imagine slicing through a lobe of an onion with a knife, forming a flat area spanned by vectors p0 and p1.)
The orthogonality condition
p0^T r2 = 0
will help us set the search direction p1. Since x2 = x1 + α1 p1, we have r2 = r1 - α1 A p1, and so, using r1^T p0 = 0,
p0^T r2 = p0^T r1 - α1 p0^T A p1 = -α1 p0^T A p1
which is zero provided
p0^T A p1 = 0
Definition: Two vectors u and v are said to be A-conjugate if u^T A v = 0.
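As a quick numerical illustration of this definition, a vector can be made A-conjugate to another by subtracting off its A-weighted component. The 2x2 SPD matrix below is hypothetical illustration data:

```python
import numpy as np

# Hypothetical SPD matrix for illustrating A-conjugacy
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

u = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])

# Make v A-conjugate to u: subtract the A-weighted component of w along u
v = w - ((u @ (A @ w)) / (u @ (A @ u))) * u

print(u @ (A @ v))   # u^T A v ~ 0: u and v are A-conjugate
print(u @ v)         # generally nonzero: A-conjugacy is not plain orthogonality
```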
From the requirement that p0^T r2 = 0, it follows that the search
direction p1 must be A-conjugate to the
search direction p0. We can now set β0 as follows:
p0^T A p1 = p0^T A (r1 + β0 p0) = p0^T A r1 + β0 p0^T A p0 = 0
implies
β0 = -(p0^T A r1)/(p0^T A p0)
Now
x2 = x1 + α1 p1
implies
r2 = b - Ax2 = r1 - α1 A p1
Having decided to proceed from x1 to x2 along the search direction p1 = r1 + β0 p0, the same calculus argument used to determine x1 gives p1^T r2 = p1^T r1 - α1 p1^T A p1 = 0, and therefore
α1 = (p1^T r1)/(p1^T A p1)
so a step of the conjugate gradient method is complete. Having developed the CG method for one step, it is easy to see that successive iterates are defined as follows.
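The step just derived can be traced numerically. The sketch below (hypothetical SPD system, NumPy assumed) carries out one full CG step from x0 and checks the A-conjugacy and orthogonality conditions:

```python
import numpy as np

# Hypothetical SPD system used only to trace the formulas above
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.zeros(2)

# First step: down the gradient direction p0 = r0
r0 = b - A @ x0
p0 = r0.copy()
alpha0 = (p0 @ r0) / (p0 @ (A @ p0))
x1 = x0 + alpha0 * p0
r1 = b - A @ x1

# beta0 makes p1 A-conjugate to p0
beta0 = -(p0 @ (A @ r1)) / (p0 @ (A @ p0))
p1 = r1 + beta0 * p0

# alpha1 from the same calculus argument
alpha1 = (p1 @ r1) / (p1 @ (A @ p1))
x2 = x1 + alpha1 * p1
r2 = b - A @ x2

print(p0 @ (A @ p1))      # ~ 0: p1 is A-conjugate to p0
print(r2 @ p0, r2 @ p1)   # ~ 0: r2 orthogonal to both search directions
print(np.linalg.norm(A @ x2 - b))  # ~ 0: N = 2 unknowns, solved in 2 steps
```

Note that the residual after two steps is already zero (up to rounding), illustrating the finite-termination property discussed earlier.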
Conjugate Gradient Algorithm (for A symmetric and positive definite)
Step 1. (initialize)
Choose
x0
Set
i = 0 and imax = maximum number of iterations to be performed
r0 = b - Ax0
p0 = r0
Step 2: (Begin CG iteration)
If i < imax, Perform Steps 3 - 5
If i = imax, Exit and print message "i = imax"
Step 3: (Perform CG update)
Set
αi = (pi^T ri)/(pi^T A pi)
xi+1 = xi + αi pi
ri+1 = ri - αi A pi
βi = -(pi^T A ri+1)/(pi^T A pi)
pi+1 = ri+1 + βi pi
Step 4: (check for convergence)
If ||ri+1|| < ε (a prescribed tolerance),
go to Step 6.
Step 5: (Prepare for next CG update)
Set
xi = xi+1
ri = ri+1
pi = pi+1
i = i + 1
Go to Step 2
Step 6:
print solution and residual norm
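The six steps above can be sketched in code. This is a minimal illustration assuming NumPy and a small dense SPD matrix; the function name and the test system are hypothetical, not from the text:

```python
import numpy as np

def conjugate_gradient(A, b, x0, imax=100, tol=1e-10):
    """Conjugate gradient for symmetric positive definite A,
    following the steps of the algorithm above."""
    x = x0.astype(float)
    r = b - A @ x                    # Step 1: r0 = b - A x0
    p = r.copy()                     # Step 1: p0 = r0
    for i in range(imax):            # Step 2: iterate while i < imax
        Ap = A @ p
        alpha = (p @ r) / (p @ Ap)   # Step 3: step length down p
        x = x + alpha * p            # x_{i+1} = x_i + alpha_i p_i
        r_new = r - alpha * Ap       # r_{i+1} = r_i - alpha_i A p_i
        if np.linalg.norm(r_new) < tol:        # Step 4: convergence check
            return x, np.linalg.norm(r_new)
        beta = -(p @ (A @ r_new)) / (p @ Ap)   # keeps p_{i+1} A-conjugate to p_i
        p = r_new + beta * p         # p_{i+1} = r_{i+1} + beta_i p_i
        r = r_new                    # Step 5: prepare for next update
    return x, np.linalg.norm(r)      # imax reached

# Usage on a hypothetical 2x2 SPD system: exact in at most N steps
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, res = conjugate_gradient(A, b, np.zeros(2))
print(x, res)
```

For large sparse systems one would store A in a sparse format and supply only a matrix-vector product, but the iteration itself is unchanged.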