The Gradient Method


 

Contents:

 

Optimization Procedures

The goal of the Analyst is to help the Investor "do what is best". Their joint objective should be to make a set of investment decisions that will provide the maximum possible utility for the Investor. In some cases this can be formalized as a problem involving the maximization of an objective function (such as the utility of a portfolio for the Investor) subject to one or more constraints (such as those imposed by the Investor's level of wealth). In the investment arena the process of solving such a problem is often termed optimization. Procedures for efficiently determining optimal strategies are frequently called optimization algorithms. Not surprisingly, computer programs that use such procedures are generally described as optimizers.

The sections that follow deal with classes of optimization problems frequently encountered by Analysts and methods that can be employed to solve such problems.

 

The Standard Asset Allocation Problem

We focus on the allocation of an Investor's assets among several asset classes to as to maximize the utility of the resulting portfolio of assets for the Investor, taking into account the Investor's risk tolerance and relevant constraints on asset holdings. Many Analysts approach such problems using one-period estimates of asset risks, correlations and expected returns, assuming that the Investor's utility is a function of the expected return and standard deviation of return of the selected portfolio. More precisely, the Investor's utility is represented as a linear function of the mean and variance of the portfolio of assets:

u = ep - vp/rt

where:

u = the utility of the portfolio for the Investor
ep = the expected return of the portfolio
vp = the variance of the portfolio return
rt = the Investor's risk tolerance

Here, rt represents the investor's marginal rate of substitution of variance for expected value. Moreover, u, the measure of portfolio utility, can be interpreted as a risk-adjusted expected return, since it is computed by subtracting a risk penalty (vp/rt) from the expected return (ep).

The expected return of a portfolio will, of course, depend on its composition and on the expected returns of its components. A portfolio is represented by a vector of holdings, expressed as proportionate values. Let x be an {n*1} vector of such proportions and e be an {n*1} vector of asset expected returns. Then the expected return on portfolio x will be:

ep = x'*e

The variance of a portfolio's return will depend on its composition and on the covariances among the various asset classes. Let C be an {n*n} matrix of such covariances. Then the variance of return for portfolio x will be:

vp = x'*C*x

The goal is to find the best portfolio -- here, the one with the maximum possibility utility. The decision variables are the asset holdings -- that is, the elements of vector x. As these are varied, the utility of the associated portfolio will change. We wish to vary them until the maximum possible utility is attained. However, there are typically constraints on the allowable combinations. In the standard problem the x values represent proportionate holdings of assets. In such a case only values of x that sum to 1.0 may be considered. We thus must obey a full-investment constraint:

sum(x) = 1

To generalize slightly in anticipation of more complex problems, we will require that:

sum(x) = k

where k is a constant.

Often there will be further constraints on asset holdings. In many situations short sales are precluded, hence only non-negative values of the x's are allowed. Upper limits may also apply. The standard problem includes both types of bounds. Let lb be a {n*1} vector of lower bounds and ub an {n*1} vector of upper bounds. Then we require that each value of x(i) must be below or at its upper bound ub(i) and above or equal to its lower bound lb(i). In vector terms:

x <= ub
x >= lb

More succinctly:

lb <= x <= ub

Cases involving only lower bounds can be treated by assigning values of plus infinity to upper bounds, while those involving only upper bounds can be treated by assigning values of minus infinity to lower bounds. If there are no bounds, both procedures can be invoked. This makes the standard problem formulation more general than one might initially assume.

We will refer to this as the "standard asset allocation problem". To summarize, the goal is to:

Select:
x
to maximize:
u = ep - vp/rt
 
where:
ep = x'*e
vp = x'*C*x
 
subject to:
sum(x) = k
lb <= x <= ub

Note that this involves the maximization of a quadratic function of the decision variables, subject to a set of linear constraints, some of which are inequalities. A problem with such characteristics is termed a quadratic programming (QP) problem. It may be solved with a general quadratic programming algorithm or with a procedure designed to deal only with problems that have similar structures. Here we introduce an algorithm that can solve the standard asset allocation problem in a simple and intuitive way. While somewhat limited in its range of application, it is easy to program and illustrates key economic principles that apply to a very broad range of optimization problems in Macro-investment Analysis.

 

A Three-Asset Example

To illustrate the steps in optimization procedures, we use a simple example involving three assets -- cash, bonds and stocks. The standard deviations and correlations among their returns are similar to those of real returns on diversified index mutual funds during the period from 1980 through 1995. Expected returns are similar to mean real returns over that period. All monthly values are annualized. It is important to note that the period in question involved very high average returns. Unbiased estimates of future expected returns would in all likelihood be considerably lower.

It is convenient to include all the key asset-related information in one block. Here are the inputs for our example formatted for use in the optimization worksheet provided for solving such problems:

              MIN     INIT      MAX   ExpRet  StdDev   c:cash  c:bonds c:stocks
cash         0.00     1.00     1.00     2.80    1.00     1.00     0.40     0.15
bonds        0.00     0.00     1.00     6.30    7.40     0.40     1.00     0.35
stocks       0.00     0.00     1.00    10.80   15.40     0.15     0.35     1.00

The first column shows the lower bounds (all zero in this case) and the third column the upper bounds (all 1.00). The second column indicates the initial portfolio. In this case all of its assets are invested in cash. The fourth and fifth columns show the expected returns and standard deviations of the assets, respectively, stated in terms of percent return per year (for example, stock is expected to return 10.80% per year). The final columns provide estimates of the correlations among the asset classes.

In matrix terms, the problem inputs are:

e =
    2.80
    6.30
   10.80

sd =
    1.00
    7.40
   15.40

cc =
    1.00    0.40    0.15
    0.40    1.00    0.35
    0.15    0.35    1.00

lbd =
     0.00
     0.00
     0.00

ubd =
     1.00
     1.00
     1.00
x0 =
     1.00
     0.00
     0.00

For computational purposes we need the covariance matrix C:

C = (sd*sd').*cc
C =
    1.000    2.960    2.310
    2.960   54.760   39.886
    2.310   39.886  237.160

Two other inputs are needed. The first is k, the required sum of the values in the x vector. Here we compute it from the initial portfolio, so that the sum of the x values is required to be the same as currently :

k = sum(x0)

k = 1

The other input is the Investor's risk tolerance. In this case we assume that it is 50 -- a value representing a moderate attitude towards risk-taking.

rt = 50

 

The Utility Hill

Although our example involves three decision variables (cash, bonds and stocks), the full-investment constraint restricts portfolios to combinations that sum to one. Thus we can characterize the problem as one of choosing, say, proportions to be invested in bonds and stocks, with any remaining amount invested in cash. This makes it possible to graph the relationship between two of the decision variables and the measure of merit. The resulting surface will have some of the attributes of a hill. However, only a portion of this "utility hill"is feasible. We must restrict our search to coordinates in which the sum of the amounts invested in bonds and stocks is 1.0 or less.

The diagram below shows the feasible portion of the utility hill over which we can conduct our exploration.

 

 

Note that the highest feasible point involves somewhat more investment in stocks than in bonds, and no investment in cash. Note also that it provides the Investor with considerably more utility (over 6.0%) than the initial all-cash portfolio shown at the bottom of the hill, which provides less than 3.0% in utility.

We have called this surface a utility hill for a reason. It resembles at least a portion of a hill or mountain. In a sense, our job is to climb to the highest feasible point on the hill. We will do this in stages. We start with a feasible portfolio. Then we find the feasible direction in which we can move upward at the greatest rate. More specifically, we select the direction that will result in the greatest increase in altitude (utility) per step (change in portfolio holdings) -- that is, the steepest gradient. Having selecting a direction, we climb until we either reach a peak or a boundary that we cannot cross. Then we determine the feasible direction of steepest ascent again and repeat the process. When no feasible direction leads upward, we stop. Given the nature of the terrain in a standard problem, this procedure will place us on the highest allowable point -- that is, provide the portfolio with the greatest possible utility.

 

Asset Marginal Utility

Consider the effect of a small change in the holding of one asset on the portfolio's utility. Recall that:

u = ep - vp / rt

Let mu(i) be the marginal utility of asset i -- the derivative of u with respect to x(i):

mu(i) = d u / d (x(i)

This will be related to the marginal expected return of asset i:

dep/dx(i)

and its marginal risk:

dvp/dx(i)

as follows:

mu(i) = dep/dx(i) - (1/rt)*dvp/dx(i)

But we know that:

dep/dx(i) = e(i)

and

dvp/dx(i) = the i'th row of 2*C*x

where C is the asset covariance matrix.

Given this, we can compute mu, the vector containing the marginal utilities of all the assets directly:

mu = e - (1/rt)*2*C*x

where x is the current portfolio.

For the example we begin with the current portfolio (x) equal to x0. The resulting marginal utilities are:

mu =
    2.7600
    6.1816
   10.7076

Thus utility will change at a rate of 10.7076% per unit change in the amount invested in asset 3, as long as the change in the latter is small. The rate of change will be 6.1816% for asset 2 and 2.7600% for asset 1.

 

The Optimal Feasible Swap

Marginal utilities provide important information about a portfolio -- information that can be used to improve it --to alter its composition so as to increase its utility for the Investor in question.

In this case all the marginal utilities are positive, indicating that more of any asset would increase utility. It would be lovely if we could increase the proportions invested in all three assets or, better, yet, only the best of the three. But such is not feasible. The only changes that can be considered are those that meet the constraint that the sum of the proportions equal k. Since this is already the case for the current portfolio, we must restrict our choices to changes in which the sum of the amounts by which we increase one or more assets equals the sum of the amounts by which we decrease one or more of the others.

In this case the most attractive asset to increase is the third (stocks). Happily, it can be increased, since the current amount (0.00) is below the upper bound (1.00). The least attractive asset to increase is the first (cash). But it is also the most attractive asset to decrease. This suggests a two-asset swap in which asset 3 is increased and asset 1 decreased. Such a swap would increase utility at the rate of 10.7076-2.7600 or 7.9476% per unit amount of the swap, as long as the change in the latter is small. Fortunately, this particular swap is feasible, since (1) the asset to be increased (stock) is currently below its upper bound and (2) the current value of the asset to be decreased (cash) is 1.00, well above its lower bound of 0.00.

We conclude that if a small two-asset swap is to be undertaken, the best possibility involves a decrease in the amount of cash, with an equivalent increase in the amount invested in stocks. If the current portfolio is actually held, this requires the sale of cash securities, with the proceeds used to buy stocks. In the more likely event that the calculations are being made to determine an optimal portfolio to be held, the "buys" and "sells" would be hypothetical. For simplicity, however, we use the terms "sell", "buy" and "swap" to describe both cases.

In a problem of this type, a swap is only feasible if the asset to be decreased is above its lower bound and the asset to be increased is below its upper bound. Moreover, if the current portfolio satisfies the constraint that sum of the holdings equals k, so will any portfolio resulting from such a swap, as long as the magnitude of the swap is not too large. Since (by definition) there are no additional constraints in a standard problem, these conditions are both necessary and sufficient for a swap to be feasible. These observations lead to the rules for finding the optimal feasible two-security swap:

For all securities i for which x(i)>lb(i), find the one with the smallest value of mu(i).

Let this asset be isell and its marginal utility mu(isell)

For all securities i for which x(i)<ub(i), find the one with the largest value of mu(i).

Let this asset be ibuy and its marginal utility mu(ibuy)

The optimal two-security swap is then to sell isell and buy ibuy

This will increase utility at the rate: mu(ibuy)-mu(isell)

Clearly, if mu(ibuy)-mu(isell) is positive, the portfolio's utility can be increased. Moreover, no other small change can increase it by a larger amount. Why? Because (1) any other set of purchases will prove inferior to the purchase of ibuy, since ibuy's marginal utility is greater than that of any other potential feasible purchase or set of purchases and (2) any other set of sales will prove inferior to the sale since isell's marginal utility is smaller than that of any other potential feasible sale or set of sales. The optimal two-security swap is thus the optimal swap in a standard problem.

Special cases require slightly more general rules. If two or more assets have the same marginal utility, the choice of isell or ibuy may not be unique and an arbitrary choice may be made. If all securities are at their lower bounds or at their upper bounds, no improvement is possible. This is also the case if the mu(ibuy) equals mu(isell) -- a condition that proves central for both optimization and an understanding of equilibrium in financial markets.

 

The Optimal Feasible Amount to Swap

The optimal feasible swap will increase utility at the greatest possible rate per unit swapped. But the rate of increase will change as the size of the swap increases. At some point, utility will reach its peak, then decline. Moreover, the feasible amount of a swap will be limited by the upper bound on the asset being purchased and by the lower bound on the asset being sold. All these factors need to be taken into account in order to find the optimal feasible size for any desired swap

For generality, we represent a swap by a vector s of changes in asset holdings, where the sum of the elements is zero. In our example, the optimal feasible swap is:

s =
    -1
     0
     1

Now, let a represent the amount swapped, so that the net effect is to change the portfolio by an amount equal to s*a.

For example, if a = 0.10:

s*a =
    -0.10
     0.00
     0.10

Let cx denote this set of changes:

cx = s*a

Then if such changes are made to portfolio x, the result will be a new portfolio xx, given by:

xx = x + cx

or

xx = x + s*a

Now, consider the utilities of portfolios x and xx:

u(x) = x'*e - (1/rt)* x'*C*x

u(xx) = (x' + cx')*e - (1/rt)*(x'+cx')*C*(x+cx)

We are interested in the change in utility cu=u(xx)-u(x). Expanding the second formula and subtracting the first gives:

cu = cx'*e - (1/rt) ( 2*x'*C*cx + cx'*C*cx);

Substituting s*a for cx and simplifying gives:

cu = [s'*e]*a - (1/rt) * ( [ 2*x'*C*s]*a + [s'*C*s]*(a^2))

Rearranging terms to express cu as a function of the amount of the swap, we obtain:

cu = [ s'*e - (1/rt)*2*s'*C*x]*a - [(s'*C*s)/rt]*(a^2)

or

cu = k0*a - k1*(a^2)

where:

k0 = s'*(e - (1/rt)*2*C*x)

k1 = (s'*C*s)/rt

In our example:

k0 = 7.9476

k1 = 4.6708

The figure below plots cu as a function of a for this case.

 

Note that the change in utility increases at a decreasing rate. This is not surprising, since the function is quadratic with a negative quadratic term (-k1). This is a characteristic of all changes in portfolio composition. Given the nature of covariance matrices the s'*C*s will be positive for any set of changes represented by vector s. As long as the investor's risk tolerance is positive, k1 will be also, so that -k1 will be negative. We thus conclude that:

Portfolio revision is subject to decreasing returns to scale. The greater the magnitude of a revision, the smaller will be the rate of further increase in portfolio utility -- that is, utility will increase at a decreasing rate.

Note that k0, the first term in this expression, is equal to the net marginal utility of the swap. This is not surprising, since the marginal utility measures the effect on utility of an infinitesimal change in holdings.

 In this case, the maximum possible change in utility is obtained by swapping a large amount of the portfolio. The actual amount may be calculated directly. We seek the value of a at which cu is maximized. Since cu must have a positive slope at the origin and the slope must decrease with a, we need only set the derivative equal to zero:

dcu/da = k0 - 2*k1*a = 0

or

a = k0 / ( 2*k1 )

In this case:

a = 7.9476 / (2 * 4.6708) = 0.8508

Thus the optimal amount to swap is 0.8508.

Of course, this calculation does not take into account the upper and lower bounds on the holdings. To remain feasible, we cannot buy an amount of ibuy that will make x(ibuy) exceed ub(ibuy). Nor can we sell an amount of isell that will make x(isell) fall below lb(isell). The optimal feasible amount to swap (aopt) is thus:

a = min( [ k0 / (2*k1), ub(ibuy) - x(ibuy), x(isell) - lb(isell) ] )

In our case:

a = 0.8508

cu = k0*a - k1*(a^2) = 3.3808

So the optimal amount of the swap will add 3.3808 to the utility of the portfolio.

 

An Algorithm for the Standard Problem

We now have all the ingredients needed to solve a standard problem. Starting with any feasible portfolio, we can find the optimal feasible swap and the optimal amount of that swap. After making a swap in the appropriate amount we have a new (and improved) feasible portfolio. But this may also be improved, using the same technique. By repeating the process until no further improvement is possible, we reach the goal of maximizing utility without violating the stated constraints.

The central part of the algorithm uses a loop that continues until a terminal condition is fulfilled. This can be written somewhat inelegantly in pseudo-MATLAB as:

while 1==1
   [ do computations ] 
   if [ finished ]
      return
   end; 
end;

To begin, of course, an initial feasible portfolio is required. Assuming that x0 meets this requirement, we precede the loop with:

    % set initial mix and number of assets
      x = x0;
      n = length(x);

This also sets n as the number of assets for later use.

Inside the loop, the first computations are those that compute the marginal utilities and find the best asset to buy and the best to sell. For expository purposes we do this in a somewhat inefficient manner, as follows:

 

    % compute marginal utilities
        mu =  e - (1/rt)*2*C*x;
    % find best assets to buy and sell
        ibuy  =  0;
        mubuy = -1E200;
        isell =  0;
        musell = 1E200;
        for i = 1:n
           if x(i) < ub(i)  % possible buy
              if mu(i) > mubuy
                 mubuy = mu(i);
                 ibuy = i;
              end;
           end;
           if x(i) > lb(i)  % possible sell
              if mu(i) < musell
                 musell = mu(i);
                 isell = i;
              end;
           end;
        end;
 

This results in a large negative value of mubuy if no assets may be bought and a large positive value of musell if no assets may be sold. Otherwise, mubuy and musell are the marginal utilities of ibuy and isell, respectively, the best assets to buy and sell.

At this point it is time to check to see if the procedure should be terminated. If the net rate of change in marginal utility associated with the optimal feasible swap is zero or negative, no more improvement is possible. Given the nature of computed values, it is desirable to stop when this amount is less than some minimum threshold. For example:

    % terminate if change in mu is less than threshold value
        if (mubuy - musell) <= 0.0001
           return
        end

In the optimization worksheet this constant is termed the marginal utility cutoff. If speed is more important than accuracy, it should be set to a relatively large value (for example, .001). If accuracy is more important than speed, it should be set to a relatively small value (e.g. 0.00001).

Note the procedure for termination also covers cases in which no assets remain to be purchased and/or sold, since we have set the values of musell and mubuy to be very large and very small, respectively, in such cases. A bit devious, but effective.

If the termination condition has not been met, it is time to make a swap. First, we set up the vector describing the optimal swap:

    % set up swap vector
        s = zeros(n,1);
        s(ibuy) = 1;
        s(isell) = -1;

  Then we compute the optimal amount to swap without considering the effects of asset bounds:

    % compute optimal amount of swap without regard to asset bounds
       k0 = s'*(e - (1/rt)*2*C*x);
       k1 = (s'*C*s)/rt;
       a = k0/(2*k1);

This may or may not be feasible. If necessary, the amount to be swapped is reduced to avoid violating the upper bound of the asset to be bought or the lower bound of the asset to be sold:

    % reduce amount if required to keep ibuy from exceeding its upper bound
       if a > (ub(ibuy) - x(ibuy))
         a = ub(ibuy) - x(ibuy);
       end;
    % reduce amount if required to keep isell from falling below its lower bound
       if a > (x(isell) - lb(isell))
         a = x(isell) - lb(isell);
       end;

To avoid the possibility of an infinite loop, it is wise to check for a zero amount and terminate if such is encountered:

    % terminate if amount is zero
       if a == 0
          return;  
       end

Finally (!) it is time to revise the portfolio to improve it as much as possible and be in a position to repeat the process all over again:

    % change mix
       x = x + ( s*a) ;

Here is the entire algorithm set up as a MATLAB function that takes rt, e, C, lb, ub and x0 as inputs and returns the optimal portfolio x as an output.

function x = gmqp(rt,e,C,lb,ub,x0);
%  determines solution to a standard optimization problem
%  usage:
%    x = gmqp(rt,e,C,lb,ub,x0);
%      rt =  Investor risk tolerance
%      e  = {n*1} vector of asset expected returns
%      C  = {n*n} return covariance matrix
%      lb = {n*1} vector of asset lower bounds
%      ub = {n*1} vector of asset upper bounds
%      x0 = {n*1} vector of initial feasible asset mix 

  % set initial mix and number of assets
     x = x0;
     n = length(x);
  while 1==1;
    % compute marginal utilities
        mu =  e - (1/rt)*2*C*x;
    % find best assets to buy and sell
        ibuy  =  0;
        mubuy = -1E200;
        isell =  0;
        musell = 1E200;
        for i = 1:n
           if x(i) < ub(i)  % possible buy
              if mu(i) > mubuy
                 mubuy = mu(i);
                 ibuy = i;
              end;
           end;
           if x(i) > lb(i)  % possible sell
              if mu(i) < musell
                 musell = mu(i);
                 isell = i;
              end;
           end;
        end;
    % terminate if change in mu is less than threshold value
        if (mubuy - musell) <= 0.0001
           return
        end
    % set up swap vector
        s = zeros(n,1);
        s(ibuy) = 1;
        s(isell) = -1;
    % compute optimal amount of swap without regard to asset bounds
       k0 = s'*(e - (1/rt)*2*C*x);
       k1 = (s'*C*s)/rt;
       a = k0 / (2*k1);
    % reduce amount if required to keep ibuy from exceeding its upper bound
       if a > (ub(ibuy) - x(ibuy))
         a = ub(ibuy) - x(ibuy);
       end;
    % reduce amount if required to keep isell from falling below its lower bound
       if a > (x(isell) - lb(isell))
         a = x(isell) - lb(isell);
       end;
    % terminate if amount is zero
       if a == 0
          return;  
       end
    % change mix
       x = x + ( s*a) ;
  end;

With this function in a directory in the MATLAB path under the name gmqp.m, you could obtain the solution to a standard problem by simply giving the following command at the command line or in a program:

x = gmqp(rt,e,C,lb,ub,x0)

For our problem:

x =
         0
    0.3996
    0.6004

Thus the optimal asset mix (portfolio) contains no cash and has roughly 40% invested in bonds and 60% in stocks.

 

Function GQP

While function gmqp suffices for most standard problems, it cannot handle cases in which rt=0, which would result in attempts to divided portfolio variance by zero. However, economically meaningful problems arise in which the goal is to minimize variance, as would an Investor with zero tolerance for risk. Such cases can be handled by changing the units in which utility is measured. Instead of:

u = ep - vp/rt

We can use:

uv = rt*ep - vp

The more commonly-used version (u) divides portfolio variance by rt, the marginal rate of substitution of variance for expected return, to convert vp to an expected-return equivalent. The latter is then subtracted from portfolio expected return to give u, a measure of portfolio utility in expected return equivalent terms.

The second measure (uv) multiplies portfolio expected return by rt, the marginal rate of substitution of variance for expected return, to convert ep to a variance equivalent. The portfolio variance is then subtracted to obtain a measure of portfolio utility stated in variance-equivalent terms. When rt=0, maximizing uv is equivalent to minimizing portfolio variance, as desired. When rt is greater than zero, maximizing uv will give the same mix (x) as will maximizing u.

This alteration is incorporated in MATLAB function gqp, that can be used instead of gmqp. Function gqp has a number of additional features. It uses more efficient vector methods to find the optimal swap and the appropriate amount of that swap and is thus faster than gmqp. It also computes the expected return and variance of the portfolio and returns them as additional outputs if requested to do so.

With function gqp in a directory in the MATLAB path under the name gqp.m, you could obtain the solution to a standard problem by simply giving the following command at the command line or in a program:

[x,ep,vp]  = qqp(rt,e,C,lb,ub,x0)

 

The Optimization Worksheet

For those without access to MATLAB, all is not lost. The optimization worksheet is a javascript implementation of the gradient algorithm.The format for inputs follows that given in the section above. In addition, the Investor's risk tolerance and the marginal utility cutoff must be specified. The outputs obtained from the worksheet using the inputs shown earlier for an Investor with a risk tolerance of 50 are:

PORTFOLIOS:
             Initial   Optimal    Change
cash           1.000     0.000    -1.000
bonds          0.000     0.400     0.400
stocks         0.000     0.600     0.600

CHARACTERISTICS:
             Initial   Optimal    Change
ExpRet         2.800     9.002     6.202
StdDev         1.000    10.648     9.648
Utility        2.780     6.734     3.954

Not  surprisingly, the optimal portfolio composition is that obtained earlier (to three decimal places). Also shown are the expected returns, standard deviations and utilities of the intial and optimal portfolios as well as the changes in all the variables.