DVO1-style metrics and hedges, as well as multifactor metrics and hedges, rest on assumptions about how rates of different terms change relative to one another. We begin with single-variable hedging based on regression analysis. Next, we move to two-variable hedging based on multiple regression and, finally, to principal component analysis (PCA), an empirical description of how rates move together across the curve. A recurring theme of this chapter is that empirical relationships are not static, so a hedge estimated over one period may prove inaccurate over subsequent periods.

# Single-Variable Regression-Based Hedging

Treasury Inflation Protected Securities (\(TIPS\)) make real, or inflation-adjusted, payments by regularly indexing their outstanding principal amount for inflation. As a result, \(TIPS\) investors require a relatively low rate of return.

In contrast, investors in nominal US Treasury bonds require a real rate of return, compensation for expected inflation, and an inflation risk premium. The implication is that the spread between the rates on nominal bonds and \(TIPS\) reflects market views about inflation.

Consider a trader who takes a position in a nominal bond against a position in \(TIPS\), each with a given representative yield and DVO1. Because its cash flows are protected from inflation, a \(TIPS\) sells at a relatively low yield, or high price, and its low yield makes its DVO1 relatively high. The question is what face amount of \(TIPS\) should be bought so that the trade is hedged against the level of interest rates. One approach is to make the trade DVO1-neutral.

This hedge ensures that the trade neither makes nor loses money if the yields of the \(TIPS\) and the nominal bond both rise or fall by the same number of basis points. In practice, however, a given basis-point change in the \(TIPS\) yield is not associated, with high confidence, with a unique change in the nominal yield, or even with an average change of the same number of basis points. The two yields do not move one for one, which calls the DVO1 hedge into question. A trader can do very little about the dispersion of the change in the nominal yield for a given change in the real yield, but the DVO1 hedge can be improved by accounting for the average relationship between the two yield changes.
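The DVO1-neutral sizing described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the face amounts, and the DVO1 values are hypothetical, and DVO1 is assumed to be quoted per 100 of face with yield changes measured in basis points.

```python
def dvo1_neutral_face(face_nominal, dvo1_nominal, dvo1_real):
    """Face amount of TIPS that offsets the DVO1 of a nominal position.

    A negative face amount denotes a short position, so the result has
    the opposite sign to face_nominal.
    """
    return -face_nominal * dvo1_nominal / dvo1_real


def hedged_pnl(face_real, face_nominal, dvo1_real, dvo1_nominal,
               dy_real_bp, dy_nominal_bp):
    """P&L of the combined position for yield changes given in basis points."""
    return (-face_real * dvo1_real / 100 * dy_real_bp
            - face_nominal * dvo1_nominal / 100 * dy_nominal_bp)


# Hypothetical inputs: short 100 million nominal bonds.
face_n = -100e6
dvo1_n, dvo1_r = 0.055, 0.080
face_r = dvo1_neutral_face(face_n, dvo1_n, dvo1_r)

# Equal moves in both yields leave the P&L at (numerically) zero ...
pnl_parallel = hedged_pnl(face_r, face_n, dvo1_r, dvo1_n, 5.0, 5.0)
# ... but unequal moves do not, which is the weakness discussed above.
pnl_unequal = hedged_pnl(face_r, face_n, dvo1_r, dvo1_n, 5.0, 5.5)
print(face_r, pnl_parallel, pnl_unequal)
```

The second P&L figure shows why a one-for-one assumption matters: the hedge is only as good as the assumed relationship between the two yield changes.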

# Least-Squares Regression Analysis

Let:

$$ \Delta { y }_{ t }^{ N }=\alpha +\beta \Delta { y }_{ t }^{ R }+{ \varepsilon }_{ t } $$

where \(\Delta { y }_{ t }^{ N }\) is the change in the yield of the nominal bond and \(\Delta { y }_{ t }^{ R }\) is the change in the yield of the real bond. The intercept \(\alpha \) and slope \(\beta \) are estimated from the data, while \({ \varepsilon }_{ t }\) is the error term, that is, the deviation of the nominal yield change in a given period from the change predicted by the model.

Least-squares estimation requires that the model accurately describe the dynamics in question and that the errors be independent of one another, uncorrelated with the independent variable, and identically distributed.

The realized error in a particular period (for example, one in which the nominal yield changes by 5.5 basis points) is defined as:

$$ { \hat { \varepsilon } }_{ t }=\Delta { y }_{ t }^{ N }-\hat { \alpha } -\hat { \beta } \Delta { y }_{ t }^{ R } $$

$$ =\Delta { y }_{ t }^{ N }-\Delta { \hat { y } }_{ t }^{ N } $$

Least-squares estimation of \(\alpha \) and \(\beta \) finds the estimates \(\hat { \alpha } \) and \(\hat { \beta } \) that minimize the sum of the squares of the realized error terms:

$$ { \Sigma }_{ t }{ \hat { \varepsilon } }_{ t }^{ 2 }={ \Sigma }_{ t }{ \left( \Delta { y }_{ t }^{ N }-\hat { \alpha } -\hat { \beta } \Delta { y }_{ t }^{ R } \right) }^{ 2 } $$

The errors are squared so that offsetting positive and negative errors are not treated as being as acceptable as zero errors. Squaring also penalizes errors that are large in absolute value substantially more than small errors.
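The estimation step can be sketched with NumPy as follows. The data here are simulated purely for illustration (the "true" intercept, slope, and volatilities are assumptions, not market estimates), and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated daily yield changes in basis points (illustrative assumption).
dy_real = rng.normal(0.0, 5.0, n)
dy_nominal = 0.1 + 1.05 * dy_real + rng.normal(0.0, 2.0, n)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), dy_real])
coef, *_ = np.linalg.lstsq(X, dy_nominal, rcond=None)
alpha_hat, beta_hat = coef

# Realized errors and the standard error of the regression.
resid = dy_nominal - X @ coef
sigma_hat = np.sqrt(resid @ resid / (n - 2))

# The least-squares slope equals Cov(x, y) / Var(x).
beta_from_moments = np.cov(dy_real, dy_nominal)[0, 1] / np.var(dy_real, ddof=1)
print(alpha_hat, beta_hat, sigma_hat)
```

By construction, the realized errors sum to zero and are uncorrelated with the real yield changes in the sample, which is a quick sanity check on any fitted regression.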

# The Regression Hedge

Let the face amounts of the real and nominal bonds be denoted by \({ F }^{ R }\) and \({ F }^{ N }\), and their \(DVO1s\) by \({ DVO1 }^{ R }\) and \({ DVO1 }^{ N }\). The regression-based hedge can then be written as follows:

$$ { F }^{ R }=-{ F }^{ N }\times \frac { { DVO1 }^{ N } }{ { DVO1 }^{ R } } \times \hat { \beta } \quad \quad \quad \quad \quad \left( I \right) $$

This regression hedge also has a short justification. The profit and loss (P&L) of the hedged position is:

$$ { -F }^{ R }\times \frac { { DVO1 }^{ R } }{ 100 } \Delta { y }_{ t }^{ R }-{ F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \Delta { y }_{ t }^{ N }\quad \quad \quad \quad \left( II \right) $$

Over the data set used to estimate the regression parameters, the hedge in equation (\(I\)) minimizes the variance of the P&L in equation (\(II\)).

The regression framework for hedging also automatically provides an estimate of the volatility of the hedged portfolio. Substituting \({ F }^{ R }\) from equation (\(I\)) into the P&L expression in equation (\(II\)) and rearranging terms gives the following expression for the P&L of the hedged position:

$$ -{ F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \left( \Delta { y }_{ t }^{ N }-\hat { \beta } \Delta { y }_{ t }^{ R } \right) $$

It is observed that:

$$ \left( \Delta { y }_{ t }^{ N }-\hat { \beta } \Delta { y }_{ t }^{ R } \right) ={ \hat { \varepsilon } }_{ t }+\hat { \alpha } $$

Since \(\hat { \alpha } \) is a constant, the standard deviation of \( \left( \Delta { y }_{ t }^{ N }-\hat { \beta } \Delta { y }_{ t }^{ R } \right) \) can be estimated by the standard error of the regression, \(\hat { \sigma } \). Therefore:

\(\Rightarrow { F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \times \hat { \sigma } \approx \) P&L standard deviation
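As a rough sketch, equation (\(I\)) and the P&L volatility approximation translate into two one-line functions. All inputs below (face amount, DVO1s, \(\hat { \beta } \), and \(\hat { \sigma } \)) are hypothetical values chosen for illustration, with \(\hat { \sigma } \) measured in basis points per day.

```python
def regression_hedge_face(face_nominal, dvo1_nominal, dvo1_real, beta_hat):
    """Equation (I): face amount of the real bond in the regression hedge."""
    return -face_nominal * dvo1_nominal / dvo1_real * beta_hat


def hedged_pnl_std(face_nominal, dvo1_nominal, sigma_hat_bp):
    """Approximate P&L standard deviation of the hedged position.

    The absolute value is taken so the result is positive for a short
    nominal position as well.
    """
    return abs(face_nominal) * dvo1_nominal / 100 * sigma_hat_bp


# Hypothetical inputs: short 100 million nominal bonds.
face_n = -100e6
face_r = regression_hedge_face(face_n, dvo1_nominal=0.055,
                               dvo1_real=0.080, beta_hat=1.02)
daily_std = hedged_pnl_std(face_n, dvo1_nominal=0.055, sigma_hat_bp=3.8)
print(face_r, daily_std)
```

Setting `beta_hat=1.0` recovers the DVO1-neutral hedge, so the regression hedge can be read as a DVO1 hedge scaled by the estimated slope.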

# The Stability of Regression Coefficients over Time

As described by the equation \(\Delta { y }_{ t }^{ N }=\alpha +\beta \Delta { y }_{ t }^{ R }+{ \varepsilon }_{ t }\), the errors around the regression line may be random outcomes of a stable relationship or manifestations of a changing relationship. The latter possibility poses a challenge for regression-based hedging.

A useful first step in assessing stability is to check whether the estimated regression coefficients are stable across different time periods. If they are, a hedger can continue to use a previously estimated \(\hat { \beta } \). If they are not, the hedge coefficient should be re-estimated using more recent data, if available, or using data from a past period that is more relevant.
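One simple way to look at coefficient stability is to re-estimate the slope over rolling windows. The sketch below uses simulated data in which the true slope is assumed to shift from 1.0 to 1.3 halfway through the sample; the window length and all parameter values are illustrative choices, not recommendations.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares slope of y on x (regression with an intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def rolling_betas(x, y, window):
    """Slope re-estimated on every consecutive window of the given length."""
    return np.array([
        ols_slope(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ])


rng = np.random.default_rng(2)
n = 600
dy_real = rng.normal(0.0, 5.0, n)

# Assumed structural break: the slope moves from 1.0 to 1.3 at mid-sample.
true_beta = np.where(np.arange(n) < n // 2, 1.0, 1.3)
dy_nominal = true_beta * dy_real + rng.normal(0.0, 1.0, n)

betas = rolling_betas(dy_real, dy_nominal, window=200)
print(betas[0], betas[-1])
```

A hedge sized with the first window's estimate would be noticeably off by the end of the sample, which is the practical cost of ignoring instability.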

# Two-Variable Regression-Based Hedging

Consider a market maker in EUR interest rate swaps who has bought, or received fixed in, relatively illiquid 20-year swaps from a client and needs to hedge the resulting interest rate exposure. Immediately selling, or paying fixed in, 20-year swaps would give up too much of the spread paid by the client, so the market maker instead sells a combination of 10- and 30-year swaps.

To describe the relationship between changes in the 20-year swap rate and changes in the 10- and 30-year swap rates, the market maker relies on a two-variable regression model:

$$ \Delta { y }_{ t }^{ 20 }=\alpha +{ \beta }^{ 10 }\Delta { y }_{ t }^{ 10 }+{ \beta }^{ 30 }\Delta { y }_{ t }^{ 30 }+{ \varepsilon }_{ t } $$

As in the single-variable case, this equation is estimated by minimizing the following sum of squared errors with respect to the parameters \(\hat { \alpha } \), \({ \hat { \beta } }^{ 10 }\), and \({ \hat { \beta } }^{ 30 }\):

$$ \sum _{ t }^{ }{ { \left( \Delta { y }_{ t }^{ 20 }-\hat { \alpha } -{ \hat { \beta } }^{ 10 }\Delta { y }_{ t }^{ 10 }-{ \hat { \beta } }^{ 30 }\Delta { y }_{ t }^{ 30 } \right) }^{ 2 } } $$

The estimated parameters provide a predicted change in the 20-year swap rate. Thus:

$$ \Delta { \hat { y } }_{ t }^{ 20 }=\hat { \alpha } +{ \hat { \beta } }^{ 10 }\Delta { y }_{ t }^{ 10 }+{ \hat { \beta } }^{ 30 }\Delta { y }_{ t }^{ 30 }\quad \quad \quad \quad \quad \left( a \right) $$

The P&L of the hedged position is given as:

$$ -{ F }^{ 20 }\frac { { DVO1 }^{ 20 } }{ 100 } \Delta { y }_{ t }^{ 20 }-{ F }^{ 10 }\frac { { DVO1 }^{ 10 } }{ 100 } \Delta { y }_{ t }^{ 10 }-{ F }^{ 30 }\frac { { DVO1 }^{ 30 } }{ 100 } \Delta { y }_{ t }^{ 30 }\quad \quad \quad \quad \left( b \right) $$

This expression is used to derive the notional face amounts of the 10- and 30-year swaps (\({ F }^{ 10 }\) and \({ F }^{ 30 }\)) required to hedge a face amount \({ F }^{ 20 }\) of the 20-year swaps.

Substituting the predicted change in the 20-year rate from equation (\(a\)) into (\(b\)) and retaining only the terms that depend on \(\Delta { y }_{ t }^{ 10 }\) and \(\Delta { y }_{ t }^{ 30 }\), we obtain:

$$ \left[ -{ F }^{ 20 }\frac { { DVO1 }^{ 20 } }{ 100 } { \hat { \beta } }^{ 10 }-{ F }^{ 10 }\frac { { DVO1 }^{ 10 } }{ 100 } \right] \Delta { y }_{ t }^{ 10 }+\left[ -{ F }^{ 20 }\frac { { DVO1 }^{ 20 } }{ 100 } { \hat { \beta } }^{ 30 }-{ F }^{ 30 }\frac { { DVO1 }^{ 30 } }{ 100 } \right] \Delta { y }_{ t }^{ 30 } $$

Choosing \({ F }^{ 10 }\) and \({ F }^{ 30 }\) so that both bracketed terms equal zero eliminates the dependence of the predicted P&L on the 10- and 30-year rates. Thus:

$$ { F }^{ 10 }=-{ F }^{ 20 }\frac { { DVO1 }^{ 20 } }{ { DVO1 }^{ 10 } } { \hat { \beta } }^{ 10 }\quad \quad \quad \quad \left( c \right) $$

$$ { F }^{ 30 }=-{ F }^{ 20 }\frac { { DVO1 }^{ 20 } }{ { DVO1 }^{ 30 } } { \hat { \beta } }^{ 30 }\quad \quad \quad \quad \left( d \right) $$

The \(DVO1\) risk in the 10- and 30-year legs of the hedge can each be expressed as a fraction of the \(DVO1\) risk of the 20-year swap. Rearranging (\(c\)) and (\(d\)) gives these risk weights:

$$ \frac { -{ F }^{ 10 }\times { DVO1 }^{ 10 } }{ { F }^{ 20 }\times { DVO1 }^{ 20 } } ={ \hat { \beta } }^{ 10 } $$

$$ \frac { -{ F }^{ 30 }\times { DVO1 }^{ 30 } }{ { F }^{ 20 }\times { DVO1 }^{ 20 } } ={ \hat { \beta } }^{ 30 } $$
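The two-variable hedge can be sketched end to end as follows: estimate the regression, size the 10- and 30-year legs with (\(c\)) and (\(d\)), and confirm that the risk weights equal the estimated slopes. The rate changes are simulated and the DVO1 values are hypothetical, so the numbers themselves carry no market meaning.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Simulated swap rate changes in basis points (illustrative assumption).
dy10 = rng.normal(0.0, 5.0, n)
dy30 = 0.8 * dy10 + rng.normal(0.0, 2.0, n)
dy20 = 0.25 * dy10 + 0.75 * dy30 + rng.normal(0.0, 0.5, n)

# Two-variable least squares with an intercept.
X = np.column_stack([np.ones(n), dy10, dy30])
alpha_hat, beta10, beta30 = np.linalg.lstsq(X, dy20, rcond=None)[0]

# Hypothetical position and DVO1s (per 100 face).
face20 = 100e6
dvo1_10, dvo1_20, dvo1_30 = 0.09, 0.15, 0.20

# Equations (c) and (d).
face10 = -face20 * dvo1_20 / dvo1_10 * beta10
face30 = -face20 * dvo1_20 / dvo1_30 * beta30

# Risk weights: DVO1 of each leg as a fraction of the 20-year DVO1.
risk_weight_10 = -face10 * dvo1_10 / (face20 * dvo1_20)
risk_weight_30 = -face30 * dvo1_30 / (face20 * dvo1_20)

# Bracketed coefficient on the 10-year rate change in the predicted P&L.
bracket_10 = -face20 * dvo1_20 / 100 * beta10 - face10 * dvo1_10 / 100
print(beta10, beta30, face10, face30, bracket_10)
```

Note that the two risk weights need not sum to one; their sum simply reflects how much 10- and 30-year risk the regression says is needed per unit of 20-year risk.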

# Level versus Change Regressions

In the single-variable case, the level-on-level regression with variables \(x\) and \(y\) can be written as:

$$ { y }_{ t }=\alpha +\beta { x }_{ t }+\varepsilon _{ t }\quad \quad \quad \quad \quad \left( e \right) $$

And the change-on-change regression is:

$$ { y }_{ t }-{ y }_{ t-1 }=\Delta { y }_{ t }=\beta \Delta { x }_{ t }+\Delta { \varepsilon }_{ t }\quad \quad \quad \quad \quad \left( f \right) $$

The estimated coefficients of (\(e\)) and (\(f\)) are likely to be inefficient, since the error terms of both regressions are likely to be correlated over time. Moreover, a more sensible way to model the relationship between two bond yields than either (\(e\)) or (\(f\)) is to allow the yield of the y-bond to move gradually closer to the level implied by the regression, say from 1% toward 5%, rather than jumping there immediately. This behavior can be captured by assuming (\(e\)) with the error dynamics:

$$ { \varepsilon }_{ t }=\rho { \varepsilon }_{ t-1 }+{ v }_{ t },\quad \rho <1 $$
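A small simulation can illustrate why both specifications tend to have serially correlated errors. The sketch below generates errors from the dynamics above with independent standard normal shocks \({ v }_{ t }\) (an assumption made for illustration) and measures the lag-1 autocorrelation of the level errors \({ \varepsilon }_{ t }\) and of the change errors \(\Delta { \varepsilon }_{ t }\).

```python
import numpy as np


def simulate_errors(rho, n, rng):
    """Simulate eps_t = rho * eps_{t-1} + v_t with standard normal v_t."""
    shocks = rng.normal(0.0, 1.0, n)
    eps = np.zeros(n)
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + shocks[t]
    return eps


def lag1_autocorr(z):
    """Sample lag-1 autocorrelation of a series."""
    z = z - z.mean()
    return (z[1:] @ z[:-1]) / (z @ z)


rng = np.random.default_rng(4)
n = 20000

# rho = 0: level errors are independent, but differencing induces correlation.
eps_iid = simulate_errors(0.0, n, rng)
ac_level_iid = lag1_autocorr(eps_iid)
ac_change_iid = lag1_autocorr(np.diff(eps_iid))

# rho = 0.5: both the level errors and the change errors are correlated.
eps_ar = simulate_errors(0.5, n, rng)
ac_level_ar = lag1_autocorr(eps_ar)
ac_change_ar = lag1_autocorr(np.diff(eps_ar))

print(ac_level_iid, ac_change_iid, ac_level_ar, ac_change_ar)
```

With \(\rho =0\) only the change regression has correlated errors, while for intermediate values of \(\rho \) both do, which is the case the text describes.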

# Principal Components Analysis (PCA)

## Overview

Consider a set of swap rates at annual maturities from 1 to 30 years. One way to describe the time series fluctuations of these rates is through their variances and their pairwise correlations or covariances. Another way is to create 30 interest rate factors, or components, where each factor describes a change in each of the 30 rates. PCA sets up the 30 factors so that they have the following properties:

- The sum of the variances of the PCs equals the sum of the variances of the individual rates.
- The PCs are uncorrelated with one another.
- Subject to the two properties above, each PC is chosen to have the maximum possible variance given all earlier PCs.

# PCAs for USD Swap Rates

Changes in short-term rates are determined by current economic conditions, so short-term rates tend to be more volatile than longer-term rates, which are determined by expectations of future economic conditions. However, the volatility of very short-term rates is significantly dampened by the actions of the Board of Governors of the Federal Reserve System. The level factor, which explains the vast majority of term structure movements, reflects this behavior on the part of central banks. Longer-term rates move less than intermediate- and shorter-term rates, with the original effect prevailing at longer maturities.

# Hedging with PCA and Application to Butterfly Weights

Butterfly trades use three securities: the trader is either long a security of intermediate maturity and short the wings, or short the intermediate security and long the wings. PCA is particularly useful for constructing empirically based hedges for large portfolios. In this section, we show its usefulness in hedging butterfly trades.

As an example of a relatively common butterfly, consider a trader who believes that the 5-year swap rate is high relative to the 2- and 10-year swap rates and who therefore plans to receive in the 5-year and pay in the 2- and 10-year. Suppose the trader receives on a notional amount of 100 of 5-year swaps and trades notional amounts \({ F }^{ 2 }\) and \({ F }^{ 10 }\) of 2- and 10-year swaps. \({ F }^{ 2 }\) and \({ F }^{ 10 }\) can then be solved for and expressed as risk weights relative to the \(DVO1\) of the 5-year swap.
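One common way to pin down \({ F }^{ 2 }\) and \({ F }^{ 10 }\) is to require that the butterfly have no exposure to the first two PCs, which gives two linear equations in two unknowns. The sketch below takes that approach as an assumption; the \(DVO1\) values and PC loadings are hypothetical numbers chosen for illustration, not estimates from data.

```python
import numpy as np

# Hypothetical DVO1s (per 100 notional) of the 2-, 5- and 10-year swaps.
dvo1 = {"2y": 0.0197, "5y": 0.0468, "10y": 0.0851}

# Hypothetical PC loadings in basis points for the level and slope PCs.
pc1 = {"2y": 5.0, "5y": 6.5, "10y": 6.0}
pc2 = {"2y": -3.0, "5y": -1.0, "10y": 1.5}

face5 = 100.0  # receive fixed on 100 notional of the 5-year swap

# Exposure to PC k is sum_i F_i * DVO1_i / 100 * PC_k[i]; set both to zero.
A = np.array([
    [dvo1["2y"] / 100 * pc1["2y"], dvo1["10y"] / 100 * pc1["10y"]],
    [dvo1["2y"] / 100 * pc2["2y"], dvo1["10y"] / 100 * pc2["10y"]],
])
b = -face5 * dvo1["5y"] / 100 * np.array([pc1["5y"], pc2["5y"]])
face2, face10 = np.linalg.solve(A, b)

# Risk weights relative to the DVO1 of the 5-year swap.
weight2 = -face2 * dvo1["2y"] / (face5 * dvo1["5y"])
weight10 = -face10 * dvo1["10y"] / (face5 * dvo1["5y"])

# Residual exposures to the two PCs, which should be numerically zero.
exposures = A @ np.array([face2, face10]) - b
print(face2, face10, weight2, weight10)
```

Negative values of `face2` and `face10` correspond to paying fixed in the wings, consistent with receiving in the 5-year.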

# Principal Component Analysis of EUR, GBP and JPY Swap Rates

If the PCs for EUR, GBP, and JPY swap rates are plotted, with basis points on the y-axis and term on the x-axis, alongside the USD PCs estimated over the same time period, the shapes of the PCs are broadly the same across USD, EUR, and GBP. However, they differ significantly in magnitude, as the USD level component entails much larger moves than the EUR and GBP components. The PCs of the JPY curve are similar to those of the other currencies, but the JPY level component lacks the same hump.

# The Shape of PCs over Time

Deciding on the relevant time period over which to estimate parameters is important for PCs. The qualitative shapes of PCs remained remarkably stable until very recently. However, differences in PCs estimated over different periods should not be ignored, because their effect on the quality of hedges can be important.

If the first three USD PCs are computed and plotted over different periods, say from 2001 to 2008, their qualitative shapes will be similar. However, the volatility of rates changes over time and, in the process, changes the magnitude of the PC curves.

# The Least-Squares Hedge Minimizes the Variance of the P&L of the Hedged Position

Recall that for a hedged position, the P&L is:

$$ -{ F }^{ R }\times \frac { { DVO1 }^{ R } }{ 100 } \Delta { y }_{ t }^{ R }-{ F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \Delta { y }_{ t }^{ N }\quad \quad \quad \left( 1 \right) $$

If the variance and covariance functions are denoted by \(V\left( \cdot \right) \) and \(Cov\left( \cdot ,\cdot \right) \) respectively, then the variance of the P&L of the above equation is:

$$ { \left( { F }^{ R }\times \frac { { DVO1 }^{ R } }{ 100 } \right) }^{ 2 }V\left( \Delta { y }_{ t }^{ R } \right) +{ \left( { F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \right) }^{ 2 }V\left( \Delta { y }_{ t }^{ N } \right) +2 { \left( { F }^{ R }\times \frac { { DVO1 }^{ R } }{ 100 } \right) }{ \left( { F }^{ N }\times \frac { { DVO1 }^{ N } }{ 100 } \right) }Cov\left( \Delta { y }_{ t }^{ R },\Delta { y }_{ t }^{ N } \right) \quad \quad \left( 2 \right) $$

To find the face amount \({ F }^{ R }\) that minimizes the variance, differentiate equation (\(2\)) with respect to \({ F }^{ R }\) and set the result to zero. After dividing through by \(\frac { { DVO1 }^{ R } }{ 100 } \), this gives:

$$ 2{ F }^{ R }\frac { { DVO1 }^{ R } }{ 100 } V\left( \Delta { y }_{ t }^{ R } \right) +2{ F }^{ N }\frac { { DVO1 }^{ N } }{ 100 } Cov\left( \Delta { y }_{ t }^{ R },\Delta { y }_{ t }^{ N } \right) =0 $$

Rearranging the terms, we get:

$$ { F }^{ N }\times { DVO1 }^{ N }\times \frac { Cov\left( \Delta { y }_{ t }^{ R },\Delta { y }_{ t }^{ N } \right) }{ V\left( \Delta { y }_{ t }^{ R } \right) } =-{ F }^{ R }\times { DVO1 }^{ R } $$

We can substitute: \(\frac { Cov\left( \Delta { y }_{ t }^{ R },\Delta { y }_{ t }^{ N } \right) }{ V\left( \Delta { y }_{ t }^{ R } \right) } =\hat { \beta } \) into the above equation to get the regression hedging rule

$$ { F }^{ R }=-{ F }^{ N }\times \frac { { DVO1 }^{ N } }{ { DVO1 }^{ R } } \times \hat { \beta } $$
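The result can be checked numerically: on any sample, the P&L variance evaluated at the regression hedge should be no larger than at nearby face amounts or at the DVO1-neutral hedge. The sketch below uses simulated yield changes and hypothetical DVO1s, so it is a sanity check of the algebra rather than an empirical result.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Simulated yield changes in basis points (illustrative assumption).
dy_r = rng.normal(0.0, 5.0, n)
dy_n = 1.03 * dy_r + rng.normal(0.0, 2.0, n)

# Hypothetical position and DVO1s (per 100 face).
face_n = -100e6
dvo1_n, dvo1_r = 0.055, 0.080


def pnl_variance(face_r):
    """Sample variance of the P&L in equation (1) for a given F^R."""
    pnl = -face_r * dvo1_r / 100 * dy_r - face_n * dvo1_n / 100 * dy_n
    return pnl.var(ddof=1)


# Regression hedge using beta_hat = Cov / V estimated on the same sample.
beta_hat = np.cov(dy_r, dy_n)[0, 1] / np.var(dy_r, ddof=1)
face_r_star = -face_n * dvo1_n / dvo1_r * beta_hat

# DVO1-neutral hedge for comparison (beta assumed equal to one).
face_r_dvo1 = -face_n * dvo1_n / dvo1_r

print(pnl_variance(face_r_star), pnl_variance(face_r_dvo1))
```

The minimization holds only in the sample used to estimate \(\hat { \beta } \); out of sample, the regression hedge is optimal only to the extent that the estimated relationship persists.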

# Constructing Principal Components from Three Rates

A variance-covariance matrix \(V\) usefully combines information on correlations and volatilities. The element in the \({ i }^{ th }\) row and \({ j }^{ th }\) column gives the covariance of the rate of term \(i\) with the rate of term \(j\), which equals the correlation of \(i\) and \(j\) multiplied by the standard deviation of \(i\) and by the standard deviation of \(j\).

One use of the variance-covariance matrix is to express succinctly the variance of a particular portfolio of the relevant securities. The basic idea of principal components is to create three factors that capture the same information as the variance-covariance matrix.

To do so, denote the first principal component by the vector \(g=\left( { g }_{ 1 },{ g }_{ 2 },{ g }_{ 3 } \right) '\). Its elements are determined by maximizing \(g'Vg\) subject to \(g'g=1\). The constraint \(g'g=1\), together with the other constraints, ensures that the total variance of the PCs equals the total variance of the underlying data.

The second principal component \(h=\left( { h }_{ 1 },{ h }_{ 2 },{ h }_{ 3 } \right) '\) is determined by maximizing \(h'Vh\) subject to \(h'h=1\) and \(h'g=0\). The latter condition ensures that the PC \(h\) is uncorrelated with the PC \(g\).

The third PC \(k=\left( { k }_{ 1 },{ k }_{ 2 },{ k }_{ 3 } \right) '\) is determined by solving the three equations \(k'k=1\), \(k'g=0\), and \(k'h=0\).

The following implications are crucial for this scaling:

- The PCs are uncorrelated by construction.
- The variance of each PC is the sum of the squares of its elements, and its volatility is the square root of that sum.
- The sum of the variances of the PCs equals the sum of the variances of the rates.
- The volatility of any portfolio can be found by computing its volatility under each PC, summing the resulting variances, and taking the square root of that sum.
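The construction above is equivalent to an eigen-decomposition of \(V\), which can be sketched with NumPy. The volatilities and correlations below are hypothetical, and scaling each unit-length eigenvector by the square root of its eigenvalue is the assumed convention under which the sum of squared elements of a PC equals its variance.

```python
import numpy as np

# Hypothetical daily volatilities (bp) and correlations of three rates.
vols = np.array([4.0, 5.0, 4.5])
corr = np.array([
    [1.00, 0.90, 0.80],
    [0.90, 1.00, 0.95],
    [0.80, 0.95, 1.00],
])
V = np.outer(vols, vols) * corr  # variance-covariance matrix

# eigh returns eigenvalues in ascending order; reorder to descending.
eigvals, eigvecs = np.linalg.eigh(V)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Columns g, h, k: unit-length vectors satisfying g'g = 1, h'g = 0, etc.
g, h, k = eigvecs.T

# Scaled PCs: column j has sum of squares equal to the variance of PC j.
pcs = eigvecs * np.sqrt(eigvals)
pc_variances = (pcs ** 2).sum(axis=0)

# Portfolio variance two ways: directly from V, and PC by PC.
w = np.array([1.0, -2.0, 1.0])  # hypothetical exposures to the three rates
var_direct = w @ V @ w
var_by_pc = ((w @ pcs) ** 2).sum()

print(pc_variances, np.trace(V), var_direct, var_by_pc)
```

The first column maximizes \(g'Vg\) among unit-length vectors, and the printed totals illustrate that the PC variances add up to the sum of the rate variances on the diagonal of \(V\).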

# Practice Questions

1) A trader at a large bank plans to short $100 million of the nominal \(3\frac { 2 }{ 7 } \)s of \({ 20 }^{ th }\) February 2019 and, against that position, purchase some amount of the \(TIPS\) \(2\frac { 3 }{ 7 } \)s of \({ 20 }^{ th }\) January 2019. Assume that, in the data, the nominal yield changes by 1.044 basis points per basis-point change in the real yield. The \(DVO1\)s of the \(TIPS\) and the nominal US Treasury as of \({ 30 }^{ th }\) April 2015 are as follows:

$$ \begin{array}{|c|c|} \hline \text{Bond} & \text{DVO1} \\ \hline 2\frac { 3 }{ 7 } \text{s of } { 20 }^{ th } \text{ January 2019 (TIPS)} & 0.078 \\ \hline 3\frac { 2 }{ 7 } \text{s of } { 20 }^{ th } \text{ February 2019 (nominal)} & 0.052 \\ \hline \end{array} $$

Compute the \(TIPS\) face amount that should be purchased for the trade to be hedged against the interest rate levels.

- $69.6 million
- $156.6 million
- $150 million
- $66.7 million

The correct answer is **A**.

This can be achieved with the regression-based hedge, that is, a \(DVO1\)-neutral hedge adjusted for the estimated sensitivity of 1.044.

The trader has to buy the \({ F }^{ R }\) face amount of the \(TIPS\) such that:

$$ { F }^{ R }\times \frac { 0.078 }{ 100 } =\$100\quad million\quad \times \frac { 0.052 }{ 100 } \times 1.044 $$

$$ \Rightarrow { F }^{ R }=\$100\quad million\quad \times \frac { 0.052 }{ 0.078 } \times 1.044 $$

$$ =\$69.6\quad million $$
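The arithmetic in this solution can be reproduced in a few lines. The variable names are illustrative, and the short nominal position is entered as a negative face amount so that the resulting \(TIPS\) position comes out positive (a purchase).

```python
# Inputs taken from the question (face amounts in $ millions).
face_nominal = -100.0   # short position in the nominal bond
dvo1_nominal = 0.052
dvo1_tips = 0.078
beta_hat = 1.044        # bp change in nominal yield per bp change in real yield

# Regression hedge: F^R = -F^N * (DVO1^N / DVO1^R) * beta_hat
face_tips = -face_nominal * dvo1_nominal / dvo1_tips * beta_hat

# For comparison, the plain DVO1-neutral amount ignores beta_hat.
face_tips_dvo1_neutral = -face_nominal * dvo1_nominal / dvo1_tips

print(round(face_tips, 1))               # 69.6
print(round(face_tips_dvo1_neutral, 1))  # 66.7
```

The comparison shows where the last answer choice comes from: it is the hedge a trader would get by ignoring the 1.044 sensitivity.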