### Statistical Correlation Models – Application to Finance

This chapter discusses three popular statistical correlation models:

• Spearman rank correlation.
• Pearson correlation measure.
• Kendall $$\tau$$.

Correlation models measure the degree of association between two or more variables. We begin by assessing the role of models in finance.

# Financial Models

Financial systems are so complex that it is rarely possible to find a financial model that replicates them. Over the past decades, models meticulously developed by econometricians to replicate this complexity have failed to produce convincing results.

Nevertheless, useful financial models have been developed to aid understanding of the financial system. Good examples include $$VaR$$, copulas, Black-Scholes-Merton (BSM), and many more. However, each financial model has its own limitations, and we look into three main aspects of these limitations.

## I. The Financial Model Itself

Financial models such as $$VaR$$, $$CVaR$$, and $$BSM$$ depend on market prices as inputs, and because these prices are determined by humans, they behave randomly and unexpectedly. Financial models are therefore only an approximation of reality and cannot be trusted unconditionally.

$$VaR$$ models assume that asset returns are normally distributed, an assumption challenged by the fat tails of asset returns, which call for models with higher kurtosis. The BSM option pricing model assumes that volatility is constant across all strikes, but in reality traders apply a volatility smile in currency markets and a volatility skew in equity markets.
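One rough way to check whether a return series has fatter tails than a normal distribution is its sample excess kurtosis, which is zero for a normal distribution and positive for fat-tailed data. The sketch below is only an illustration: the function name `excess_kurtosis` and the return figures are hypothetical and not taken from any market data set.

```python
def excess_kurtosis(returns):
    """Sample excess kurtosis m4 / m2**2 - 3 (zero for a normal distribution)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Hypothetical daily returns with one crash day (illustrative numbers only).
daily_returns = [0.01, -0.01, 0.005, -0.004, 0.008, -0.007, 0.002, -0.12]
kurt = excess_kurtosis(daily_returns)
print(f"excess kurtosis: {kurt:.2f}")
```

A clearly positive value suggests that a normal-distribution $$VaR$$ would understate the probability of extreme losses for such a series.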

Although seldom, financial models have exhibited mathematical inconsistencies. In particular, the BSM model commonly shows inconsistencies when pricing up-and-out calls and puts as well as down-and-out calls and puts. If the knock-out strike $$KO$$ equals the strike $$K$$ and the interest rate $$r$$ equals the underlying asset return $$q$$ (as when $$KO = K$$ and $$r = q = 0$$), the model is insensitive to changes in implied volatility and option maturity.

Furthermore, lookback options cannot be valued with a standard extension of the BSM model when the interest rate equals the underlying asset return, so a new algorithm has to be developed.

## II. The Calibration of the Model

For a model to produce prices found in the market, values for parameters of the model have to be found through calibration of the model. Products with few or no available market prices can then be valued as soon as those parameter values are established.

However, the time frame of the data used for calibration is a critical issue, and a model fed with unrealistic input values can still produce realistic-looking outputs that are misleading. It is therefore also important to simulate extreme scenarios, such as economic recessions, through stress testing.

It is important to be mindful of the limitations of financial models, since they cannot accurately replicate the complexity of financial systems. During the financial crisis of 2007-2009, traders and risk managers trusted the new copula correlation model blindly, ignored its limitations, and consequently suffered losses that the copula model had not foreseen, for the following two reasons:

1. The systemic crash violated the copula model’s assumptions.
2. Benign data from low-risk periods were used to calibrate the copula models.

Therefore, human judgment is important when assessing a model's outputs, and all outputs should be viewed in light of the model's limitations.

# Statistical Correlation Measures

The Pearson correlation model is the most widely used statistical correlation measure. Despite its popularity, however, it shows severe limitations when applied to financial analysis. In this section, we study this concept.

## The Pearson Correlation Approach and Its Limitations for Finance

This approach measures the strength of the linear association between two variables.

The Pearson correlation coefficient $$\rho$$ is defined as:

$$\rho \left( X,Y \right) =\frac { Cov\left( X,Y \right) }{ \sigma \left( X \right) \sigma \left( Y \right) } \quad \quad \quad \quad \quad \left( a \right)$$

where $$X$$ and $$Y$$ are the sets $$\left\{ { x }_{ 1 },\dots ,{ x }_{ n } \right\}$$ and $$\left\{ { y }_{ 1 },\dots ,{ y }_{ n } \right\}$$ respectively, with $${ x }_{ i },{ y }_{ i }\in \mathbb{R}$$ for all $$i$$, and $$\sigma \left( X \right)$$ and $$\sigma \left( Y \right)$$ are the standard deviations of $$X$$ and $$Y$$ respectively.

The sample covariance is:

$$Cov\left( X,Y \right) =\frac { 1 }{ n-1 } \sum _{ t=1 }^{ n }{ \left( { x }_{ t }-{ \mu }_{ X } \right) \left( { y }_{ t }-{ \mu }_{ Y } \right) }$$

where $${ \mu }_{ X }$$ and $${ \mu }_{ Y }$$ are the means of $$X$$ and $$Y$$.

For random variables, the covariance is expressed in terms of expectation values.

Therefore:

$$Cov\left( X,Y \right) =E\left[ \left( X-E\left( X \right) \right) \left( Y-E\left( Y \right) \right) \right] =E\left( XY \right) -E\left( X \right) E\left( Y \right)$$

The variances of $$X$$ and $$Y$$ are:

$${ \sigma }_{ X }^{ 2 }=E\left( { X }^{ 2 } \right) -{ \left( E\left( X \right) \right) }^{ 2 }\quad \text{and}\quad { \sigma }_{ Y }^{ 2 }=E\left( { Y }^{ 2 } \right) -{ \left( E\left( Y \right) \right) }^{ 2 }$$

Thus, Equation ($$a$$) becomes:

$${ \rho }_{ 1 }\left( X,Y \right) =\frac { E\left( XY \right) -E\left( X \right) E\left( Y \right) }{ \sqrt { E\left( { X }^{ 2 } \right) -{ \left( E\left( X \right) \right) }^{ 2 } } } \times \frac { 1 }{ \sqrt { E\left( Y^{ 2 } \right) -{ \left( E\left( Y \right) \right) }^{ 2 } } }$$
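As an illustration of Equation ($$a$$), the following minimal sketch computes the sample Pearson coefficient from the sample covariance and standard deviations, using the $$n-1$$ divisor. The function name `pearson` and the return figures are hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation: Cov(X, Y) / (sigma(X) * sigma(Y))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical yearly returns of two assets (illustrative numbers only).
returns_x = [0.10, -0.05, 0.20, 0.03, -0.12]
returns_y = [0.08, -0.02, 0.15, 0.01, -0.10]
rho = pearson(returns_x, returns_y)
print(round(rho, 4))
```

Because the $$n-1$$ divisor appears in both the covariance and the variances, it cancels in the ratio, so the same coefficient results if all three are divided by $$n$$ instead.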

The Pearson correlation approach has the following limitations for finance:

1. Linear dependencies do not appear often in finance.
2. Zero correlation does not necessarily mean independence.
3. Linear correlation measures are natural dependence measures only when the joint distribution of the variables is elliptical, and only a few distributions are elliptical.
4. The variances of the sets $$X$$ and $$Y$$ must be finite, which does not hold for some heavy-tailed distributions.
5. Unlike the copula approach, the Pearson correlation approach is not invariant to transformations of the variables.

Therefore, the linear Pearson correlation coefficient can at best approximate the typically nonlinear relationship between financial variables.
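Two of these limitations can be seen in a small numerical sketch. In the first case $$y$$ is fully determined by $$x$$ (here $$y = x^2$$), yet the Pearson coefficient is zero. In the second, applying an increasing transformation (here the exponential) to a perfectly linearly related variable lowers the coefficient below 1. The data and the helper `pearson` are hypothetical and serve only as an illustration.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation: Cov(X, Y) / (sigma(X) * sigma(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)  # the 1/(n-1) factors cancel


# Case 1: y depends entirely on x, but the dependence is not linear.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]
rho_dependent = pearson(xs, ys)

# Case 2: an increasing transformation changes the coefficient.
us = [1.0, 2.0, 3.0, 4.0, 5.0]
vs = [2.0 * u for u in us]          # exactly linear in us
ws = [math.exp(v) for v in vs]      # same ordering as vs, but nonlinear
rho_linear = pearson(us, vs)
rho_transformed = pearson(us, ws)

print(rho_dependent, rho_linear, rho_transformed)
```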

## Spearman’s Rank Correlation

This is a nonparametric, ordinal correlation measure: only the order of the elements in a set, not their numerical values, is relevant for deriving the correlation. It is sometimes called the Pearson correlation coefficient for ranked variables. If $${ y }_{ i }$$ increases whenever $${ x }_{ i }$$ increases, regardless of the size of the increase, the correlation coefficient is a perfect 1. Conversely, if $${ y }_{ i }$$ decreases whenever $${ x }_{ i }$$ increases, the coefficient is -1.

To derive the Spearman’s correlation coefficient, we use the following steps:

1. The pairs of the return sets $$X$$ and $$Y$$ are first ordered with respect to set $$X$$.
2. The ranks of $${ X }_{ i }$$ and $${ Y }_{ i }$$ are then derived.
3. Finally, the difference of the ranks, $${ d }_{ i }$$, and the square of the difference, $${ d }_{ i }^{ 2 }$$, are derived for each pair.

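Let $${ \rho }_{ s }$$ be the Spearman rank correlation coefficient, $$n$$ the number of pairs, and $${ d }_{ i }$$ the difference between the ranks of $${ X }_{ i }$$ and $${ Y }_{ i }$$. Then: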

$${ \rho }_{ s }=1-\frac { 6{ \Sigma }_{ i=1 }^{ n }{ d }_{ i }^{ 2 } }{ n\left( { n }^{ 2 }-1 \right) }$$

Like the Pearson coefficient, the Spearman rank correlation coefficient is defined between -1 and +1, so a value close to -1 indicates that the returns of assets $$X$$ and $$Y$$ are highly negatively correlated.
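The ranking steps and the formula for $${ \rho }_{ s }$$ can be sketched as follows. This is a minimal illustration that assumes there are no tied values; the helper names `ranks` and `spearman` and the return figures are hypothetical.

```python
def ranks(values):
    """Rank of each value, with rank 1 for the smallest. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman coefficient: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    d_squared_sum = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared_sum / (n * (n ** 2 - 1))


# Hypothetical returns (illustrative numbers only).
returns_x = [0.10, -0.05, 0.20, 0.03, -0.12]
increasing = [x ** 3 for x in returns_x]   # same ordering as returns_x
decreasing = [-x for x in returns_x]       # reversed ordering

print(spearman(returns_x, increasing))  # 1.0
print(spearman(returns_x, decreasing))  # -1.0
```

Only the ordering matters: the cubed returns differ greatly in size from the originals, yet the coefficient is still exactly 1.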

## Kendall’s $$\tau$$

The Kendall $$\tau$$ is nonparametric and is defined as:

$$\tau =\frac { { n }_{ c }-{ n }_{ d } }{ { n\left( n-1 \right) }/{ 2 } }$$

where $${ n }_{ c }$$ and $${ n }_{ d }$$ are the number of concordant and discordant data pairs respectively, and $$n$$ is the number of observations.

A concordant pair is any pair of observations $$\left( { x }_{ t },{ y }_{ t } \right)$$ and $$\left( { x }_{ { t }^{ \ast } },{ y }_{ { t }^{ \ast } } \right)$$, with $$t\neq { t }^{ \ast }$$, for which $${ x }_{ t }>{ x }_{ { t }^{ \ast } }$$ and $${ y }_{ t }>{ y }_{ { t }^{ \ast } }$$, or $${ x }_{ t }<{ x }_{ { t }^{ \ast } }$$ and $${ y }_{ t }<{ y }_{ { t }^{ \ast } }$$.

A discordant pair is any such pair for which $${ x }_{ t }>{ x }_{ { t }^{ \ast } }$$ and $${ y }_{ t }<{ y }_{ { t }^{ \ast } }$$, or $${ x }_{ t }<{ x }_{ { t }^{ \ast } }$$ and $${ y }_{ t }>{ y }_{ { t }^{ \ast } }$$.

If $${ x }_{ t }={ x }_{ { t }^{ \ast } }$$ or $${ y }_{ t }={ y }_{ { t }^{ \ast } }$$, the pair is neither concordant nor discordant.

If $$y$$ increases whenever $$x$$ increases, regardless of the size of the increase, the correlation coefficient is a perfect 1. Conversely, if $$y$$ decreases whenever $$x$$ increases, the coefficient is -1.
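The pair counting behind Kendall's $$\tau$$ can be sketched as below. The sketch uses the usual convention that two observations form a concordant pair when they are ordered the same way in $$x$$ and in $$y$$, a discordant pair when they are ordered oppositely, and neither when they tie in $$x$$ or $$y$$. The function name `kendall_tau` and the data are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau: (n_c - n_d) / (n * (n - 1) / 2)."""
    n = len(xs)
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # same ordering in x and y
        elif direction < 0:
            discordant += 1      # opposite ordering in x and y
        # direction == 0 is a tie: neither concordant nor discordant
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical data: five concordant pairs and one discordant pair.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
print(round(kendall_tau(xs, ys), 4))  # 0.6667
```

Tied pairs are left out of the numerator but still count in the denominator $$n(n-1)/2$$, so many ties pull the coefficient toward zero.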

# Should We Apply Spearman’s Rank Correlation and Kendall’s $$\tau$$ in Finance?

Because they are ordinal, rank correlation measures are popular for analyzing rating categories. In a copula setting, Spearman's rank correlation and Kendall's $$\tau$$ have also been used to analyze the dependence of market prices and to measure counterparty risk.

A comparison of Kendall's $$\tau$$ with various copulas for inferring CDO transaction spreads has established significant differences between the correlation approaches. The challenge in applying ordinal rank correlations to cardinal observations is that ordinal correlations are less sensitive to outliers.

In the Kendall's $$\tau$$ approach, the main challenge is that pairs that are neither concordant nor discordant are omitted from the calculation. When such pairs are numerous, few concordant and discordant pairs remain, which distorts the Kendall's $$\tau$$ coefficient.

In conclusion, statistical correlation measures have limited application in assessing financial correlations, which has led quants to develop specific financial correlation measures.

# Practice Questions

1) You are given a data set with 12 observation pairs. There are five concordant pairs and seven discordant pairs. Which of the following is closest to the Kendall's $$\tau$$?

1. -0.0303
2. -0.0278
3. -0.0152
4. -0.0139

From the equation:

$$\tau =\frac { { n }_{ c }-{ n }_{ d } }{ { n\left( n-1 \right) }/{ 2 } }$$

$${ n }_{ c }=5$$, $${ n }_{ d }=7$$, $$n=12$$

Therefore:

$$\tau =\frac { 5-7 }{ { 12\left( 12-1 \right) }/{ 2 } } =-0.0303$$

2) The standard deviation of data set Ƒ is 35% and the standard deviation of data set $$\beta$$ is 42%. Determine the Pearson correlation coefficient $$\rho \left( Ƒ,\beta \right)$$ if the covariance of Ƒ and $$\beta$$ is 0.0112.

1. 0.0762
2. 13.125
3. 0.0009
4. 0.5183

Recall that:

$$\rho \left( X,Y \right) =\frac { Cov\left( X,Y \right) }{ \sigma \left( X \right) \sigma \left( Y \right) }$$

This equation can be rewritten as:

$$\rho \left( Ƒ,\beta \right) =\frac { Cov\left( Ƒ,\beta \right) }{ \sigma \left( Ƒ \right) \sigma \left( \beta \right) }$$

Where:

$$\sigma \left( Ƒ \right) =0.35$$,

$$\sigma \left( \beta \right) =0.42$$ and

$$Cov\left( Ƒ,\beta \right) =0.0112$$

Therefore:

$$\rho \left( Ƒ,\beta \right) =\frac { 0.0112 }{ 0.35\times 0.42 } =0.0762$$

3) Calculate the Spearman rank correlation coefficient from the dataset provided below.

$$\begin{array}{|c|cccccc|} \hline Rank \quad of \quad { X }_{ i } & 4 & 8 & 9 & 11 & 8 & 12 \\ \hline Rank \quad of \quad { Y }_{ i } & 5 & 6 & 10 & 12 & 6 & 11 \\ \hline \end{array}$$

1. 0.3429
2. 0.6571
3. -1.4
4. 0.6286

From the definition of the Spearman rank correlation coefficient we have that:

$${ \rho }_{ s }=1-\frac { 6{ \Sigma }_{ i=1 }^{ n }{ d }_{ i }^{ 2 } }{ n\left( { n }^{ 2 }-1 \right) }$$

Therefore, we can extend the table to include $${ d }_{ i }$$ and $${ d }_{ i }^{ 2 }$$ where $${ d }_{ i }$$ is the difference in ranks, to have:

$$\begin{array}{|c|cccccc|} \hline Rank \quad of \quad { X }_{ i } & 4 & 8 & 9 & 11 & 8 & 12 \\ \hline Rank \quad of \quad { Y }_{ i } & 5 & 6 & 10 & 12 & 6 & 11 \\ \hline { d }_{ i } & -1 & 2 & -1 & -1 & 2 & 1 \\ \hline { d }_{ i }^{ 2 } & 1 & 4 & 1 & 1 & 4 & 1 \\ \hline \end{array}$$

$$\Rightarrow { \Sigma }_{ i=1 }^{ n }{ d }_{ i }^{ 2 }=12$$

$$\Rightarrow { \rho }_{ s }=1-\frac { 6\times 12 }{ 6\left( { 6 }^{ 2 }-1 \right) } =1-\frac { 72 }{ 210 } =0.6571$$
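The arithmetic for question 3 can be checked with a few lines of code. This sketch simply applies the formula for $${ \rho }_{ s }$$ to the six rank pairs 4, 8, 9, 11, 8, 12 for $${ X }_{ i }$$ and 5, 6, 10, 12, 6, 11 for $${ Y }_{ i }$$; the variable names are illustrative.

```python
# Ranks of X_i and Y_i as given in question 3.
rank_x = [4, 8, 9, 11, 8, 12]
rank_y = [5, 6, 10, 12, 6, 11]

n = len(rank_x)
differences = [a - b for a, b in zip(rank_x, rank_y)]   # d_i
d_squared_sum = sum(d ** 2 for d in differences)        # sum of d_i^2
rho_s = 1 - 6 * d_squared_sum / (n * (n ** 2 - 1))

print(differences)                    # [-1, 2, -1, -1, 2, 1]
print(d_squared_sum, round(rho_s, 4)) # 12 0.6571
```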