Measures of dispersion are used to describe the variability or spread in a sample or population. They are usually used in conjunction with measures of central tendency, such as the mean and the median. The most common measures of dispersion are the range, mean absolute deviation, variance, and standard deviation.

Measures of dispersion are essential because they give us an idea of how well the measures of central tendency represent the data. For example, if the standard deviation is large, then there are large differences between individual data points. Consequently, the mean may not be representative of the data.

The range is the difference between the highest and the lowest values in a dataset, i.e.,

$$\text{Range = Maximum value – Minimum value}$$

Consider the following scores of 10 level I candidates:

{78 56 67 51 43 89 57 67 78 50}

$$\text{Range}=89-43=46$$
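The same calculation can be sketched in a few lines of Python. The variable names (`scores`, `score_range`) are illustrative choices, not part of the original example.

```python
# Scores of the 10 Level I candidates from the example above
scores = [78, 56, 67, 51, 43, 89, 57, 67, 78, 50]

# Range = maximum value - minimum value
score_range = max(scores) - min(scores)
print(score_range)  # 46
```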

**Advantage of the Range**

- The range is easy to compute.

**Disadvantages of the Range**

- The range is not a reliable dispersion measure. It provides limited information about the distribution because it uses only two data points.
- The range is sensitive to outliers.

The mean absolute deviation (MAD) is a measure of dispersion representing the **average of the absolute values** of the deviations of individual observations from the arithmetic mean. Therefore,

$$\text{MAD}=\ \frac{\sum\left|X_i-\bar{X}\right|}{n}$$

Remember that the sum of deviations from the arithmetic mean is always zero, which is why we use **absolute values**.

**Example: Calculating Mean Absolute Deviation**

Six financial analysts have reported the following returns on six different large-cap stocks over 2021:

{6% 7% 12% 2% 3% 11%}

Calculate the mean absolute deviation and interpret it.

**Solution**

First, we have to calculate the arithmetic mean:

$$\bar{X}=\frac{\left(6\%+7\%+12\%+2\%+3\%+11\%\right)}{6}=6.83\%$$

Next, we compute the MAD:

$$ \begin{align*} \text{MAD} & = \cfrac {\left\{ |6\% - 6.83\%|+ |7\% - 6.83\%| + |12\% - 6.83\%| + |2\% - 6.83\%| + |3\% - 6.83\%| + |11\% - 6.83\%| \right\}} {6} \\ & =\cfrac {0.83\%+0.17\%+5.17\%+4.83\%+3.83\%+4.17\%}{6} \\ & = 3.17\% \\ \end{align*} $$

*Interpretation*: On average, an individual return deviates by 3.17% from the mean return of 6.83%.
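As a cross-check of the arithmetic, here is a minimal Python sketch of the MAD formula. The helper name `mean_absolute_deviation` is a hypothetical name chosen for illustration, and returns are entered in percent.

```python
def mean_absolute_deviation(values):
    """Average absolute deviation of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Returns on the six large-cap stocks, in percent
returns_pct = [6, 7, 12, 2, 3, 11]

mad = mean_absolute_deviation(returns_pct)
print(round(mad, 2))  # 3.17
```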

The sample variance, \(s^2\), is the measure of dispersion that applies when working with a sample instead of a population.

$$ { s }^{ 2 }=\frac { \sum { { \left( { X }_{ i }- \bar { X } \right) }^{ 2 } } }{ n-1 } $$

Where:

\(\bar{X}\) = Sample mean.

\(n\) = Number of observations.

Note that we divide by \(n - 1\) rather than \(n\). This is necessary to remove **bias**: dividing by \(n\) would, on average, underestimate the population variance.
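The effect of the \(n - 1\) divisor can be illustrated with a tiny exhaustive experiment. The sketch below assumes a made-up population of three values and enumerates every possible sample of size 2 drawn with replacement; the population, the helper `variance`, and its `ddof` parameter are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction
from itertools import product


def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


# Hypothetical population, chosen only to keep the enumeration small
population = [1, 2, 3]
population_variance = variance(population, ddof=0)

# Every possible sample of size 2, drawn with replacement (9 samples)
samples = list(product(population, repeat=2))

# Average each estimator over all possible samples
avg_with_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)
avg_with_n = sum(variance(s, ddof=0) for s in samples) / len(samples)

print(population_variance)  # 2/3
print(avg_with_n_minus_1)   # 2/3
print(avg_with_n)           # 1/3
```

In this small case, averaging the \(n - 1\) version over all samples recovers the population variance exactly, while the version that divides by \(n\) comes out too low.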

The sample standard deviation, \(s\), is simply the square root of the sample variance.

$$s=\sqrt{s^2}=\sqrt{\frac{\sum\left(X_i-\bar{X}\right)^2}{n-1}}$$

Assume that the returns realized in the previous example were sampled from a population comprising 100 returns. Calculate the sample mean and the corresponding sample variance.

**Solution**

The sample mean will still be 6.83%.

Hence,

$$ \begin{align*} { s }^{ 2 } & =\frac { \left\{ { \left( 6\%-6.83 \%\right) }^{ 2 }+{ \left( 7\%-6.83\% \right) }^{ 2 }+{ \left( 12\%-6.83\% \right) }^{ 2 }+{ \left( 2\%-6.83\% \right) }^{ 2 }+{ \left( 3\%-6.83\% \right) }^{ 2 }+{ \left( 11\%-6.83\% \right) }^{ 2 } \right\} }{ 5 } \\ & = 16.57(\%^2) \\ & = 0.001657 \\ \end{align*} $$

Therefore,

$$ \begin{align*} s & = 0.001657^{\frac {1}{2}} \\ & = 0.0407 \end{align*} $$
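The sample variance and standard deviation above can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions use the \(n - 1\) divisor. This is a sketch with returns entered in percent, so the variance is in \(\%^2\).

```python
import statistics

# Returns on the six large-cap stocks, in percent
returns_pct = [6, 7, 12, 2, 3, 11]

sample_var = statistics.variance(returns_pct)  # divides by n - 1
sample_sd = statistics.stdev(returns_pct)      # square root of the sample variance

print(round(sample_var, 2))  # 16.57 (in %^2)
print(round(sample_sd, 2))   # 4.07 (in %)
```

Dividing the variance by \(100^2\) and the standard deviation by 100 converts them to the decimal figures 0.001657 and 0.0407.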

When trying to estimate downside risk (i.e., returns below the mean), we can use the following measures:

- **Semi-variance**: The average squared deviation below the mean.
- **Semi-deviation** (also known as semi-standard deviation): The positive square root of semi-variance.
- **Target semi-variance**: The average squared deviation below a specific target return.
- **Target semi-deviation**: The square root of target semi-variance.

The target semi-deviation, \(s_{\text {Target }}\), is calculated as follows:

$$s_{\text {Target }}=\sqrt{ \sum_{\text {for all } X_{i} \leq B}^{n} \frac{\left(X_{i}-B\right)^{2}}{n-1}}$$

Where \(B\) is the target and \(n\) is the total number of sample observations.

Yearly returns of an equity mutual fund are provided as follows.

$$
\begin{array}{c|c}
\textbf{Year} & \textbf{Return } (\%) \\
\hline 2010 & 36\% \\
\hline 2011 & 29\% \\
\hline 2012 & 10\% \\
\hline 2013 & 52\% \\
\hline 2014 & 41\% \\
\hline 2015 & 16\% \\
\hline 2016 & 10\% \\
\hline 2017 & 23\% \\
\hline 2018 & -10\% \\
\hline 2019 & -19\% \\
\hline 2020 & 2\% \\
\end{array}
$$

What is the target semi-deviation (target downside deviation) if the target return is 20%?

**Solution**

$$
\begin{array}{c|c|c|c|c}
\textbf{Year} & \begin{array}{c} \textbf{Return} \\ (\%) \end{array} & \begin{array}{c} \textbf{Deviation} \\ \textbf{from the } 20\% \\ \textbf{target} \end{array} & \begin{array}{c} \textbf{Deviation} \\ \textbf{below the} \\ \textbf{target} \end{array} & \begin{array}{c} \textbf{Squared} \\ \textbf{deviations} \\ \textbf{below the} \\ \textbf{target} \end{array} \\
\hline 2010 & 36.00 & 16.00 & – & – \\
\hline 2011 & 29.00 & 9.00 & – & – \\
\hline 2012 & 10.00 & (10.00) & (10.00) & 100 \\
\hline 2013 & 52.00 & 32.00 & – & – \\
\hline 2014 & 41.00 & 21.00 & – & – \\
\hline 2015 & 16.00 & (4.00) & (4.00) & 16 \\
\hline 2016 & 10.00 & (10.00) & (10.00) & 100 \\
\hline 2017 & 23.00 & 3.00 & – & – \\
\hline 2018 & (10.00) & (30.00) & (30.00) & 900 \\
\hline 2019 & (19.00) & (39.00) & (39.00) & 1,521 \\
\hline 2020 & 2.00 & (18.00) & (18.00) & 324 \\
\hline \textbf{Sum} & & & & \textbf{2,961} \\
\end{array}
$$

Here \(n = 11\), so \(n - 1 = 10\) and:

$$\text{Target semi-deviation} = \left(\frac{2961 }{10}\right)^{0.5} = 17.21\%$$
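The table-based calculation can be sketched in Python as follows. The function name `target_semi_deviation` is hypothetical; it follows the formula above, keeping only observations at or below the target while still dividing by \(n - 1\), where \(n\) counts all observations.

```python
import math


def target_semi_deviation(returns, target):
    """Square root of the sum of squared shortfalls below the target, over (n - 1)."""
    n = len(returns)  # total number of observations, not just those below the target
    squared_shortfalls = [(r - target) ** 2 for r in returns if r <= target]
    return math.sqrt(sum(squared_shortfalls) / (n - 1))


# Yearly fund returns for 2010 to 2020, in percent
annual_returns_pct = [36, 29, 10, 52, 41, 16, 10, 23, -10, -19, 2]

result = target_semi_deviation(annual_returns_pct, target=20)
print(round(result, 2))  # 17.21
```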

The coefficient of variation, \(CV\), is a measure of spread that describes the amount of variability of data relative to its mean. It has **no units**, so we can use it as an alternative to the standard deviation to compare the variability of data sets that have different means. The coefficient of variation is given by:

$$ \text{CV} = \cfrac {s}{\bar{X}} $$

Where:

\(s\) = Standard deviation of a sample.

\(\bar{X}\) = Mean of the sample.

**Note**: The formula can be replaced with \(\frac{\sigma}{\mu}\) when dealing with a population.

**Procedure to Follow While Calculating the Coefficient of Variation**:

- Compute the mean of the data.
- Calculate the sample standard deviation of the data set, \(s\).
- Find the ratio of \(s\) to the mean, \(\bar{X}\).

What is the relative variability (coefficient of variation) of the sample 40, 46, 34, 35, and 45 drawn from a population?

**Solution**

**Step 1**: Calculate the mean.

$$ \text{Mean} =\cfrac {(40 + 46 + 34 + 35 + 45)}{5} =\cfrac {200}{5} = 40 $$

**Step 2**: Calculate the sample standard deviation. (Start with the variance, \(s^2\).)

$$ \begin{align*} s^2 & =\cfrac {{(40 - 40)^2 + \cdots + (45 - 40)^2 }}{4} \\ &=\cfrac {122}{4} \\ & = 30.5 \\ \end{align*} $$

*Note*: Since it is the sample standard deviation (not the population standard deviation), we use \(n - 1\) as the denominator.

Therefore,

$$ s = \sqrt{30.5} = 5.52268 $$

**Step 3**: Calculate the ratio.

$$ \text{CV} = \cfrac{s}{\text{Mean}}=\cfrac {5.52268}{40} = 0.13807 \text{ or } 13.81\% $$
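The three steps can be sketched with Python's standard `statistics` module, assuming the data are treated as a sample so that `stdev` (which divides by \(n - 1\)) is the appropriate function. The variable names are illustrative.

```python
import statistics

data = [40, 46, 34, 35, 45]

mean = statistics.mean(data)    # Step 1: mean
s = statistics.stdev(data)      # Step 2: sample standard deviation (n - 1 divisor)
cv = s / mean                   # Step 3: ratio of s to the mean

print(mean)          # 40
print(round(s, 5))   # 5.52268
print(round(cv, 4))  # 0.1381
```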

In finance, the coefficient of variation is used to measure the **risk per unit of return**. For example, imagine that the mean monthly return on a T-Bill is 0.5% with a standard deviation of 0.58%. Suppose we have another investment, say, Y, with a 1.5% mean monthly return and standard deviation of 6%; then,

$$ \text{CV}_{\text{T-Bill}} =\cfrac {0.58}{0.5} = 1.16 $$

$$ \text{CV}_\text{Y} =\cfrac {6}{1.5} = 4 $$

*Interpretation*: The dispersion per unit of monthly return of T-Bills is less than that of Y. Therefore, investment Y is riskier than an investment in T-Bills.
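When only summary statistics are available, the comparison reduces to two divisions. The following sketch uses a hypothetical helper, `coefficient_of_variation`, with the mean and standard deviation both entered in percent.

```python
def coefficient_of_variation(std_dev, mean):
    """Risk per unit of return: standard deviation divided by the mean."""
    return std_dev / mean


# Monthly figures from the example, in percent
cv_tbill = coefficient_of_variation(std_dev=0.58, mean=0.5)
cv_y = coefficient_of_variation(std_dev=6.0, mean=1.5)

print(round(cv_tbill, 2))  # 1.16
print(round(cv_y, 2))      # 4.0
print(cv_y > cv_tbill)     # True: Y carries more risk per unit of return
```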

**Question 1**

If a security has a mean expected return of 10% and a standard deviation of 5%, its coefficient of variation is *closest* to:

- A. 0.005.
- B. 0.500.
- C. 2.000.

**Solution**

The correct answer is **B**.

$$ \text{CV} = \cfrac {s}{\bar{X}} = \cfrac {0.05}{0.10} = 0.5 $$

Where:

\(s\) = The standard deviation of the sample.

\(\bar{X}\) = The mean of the sample.

**A is incorrect.** It assumes the following calculation:

$$\text{CV}=\frac{0.05}{10}=0.005$$

**C is incorrect.** It assumes the following calculation:

$$\text{CV}=\frac{10}{5}=2$$

**Question 2**

You have been given the following data:

{12 13 54 56 25}

Assuming that this is a sample from a certain population, the sample standard deviation is *closest* to:

- A. 19.34.
- B. 374.00.
- C. 1,870.00.
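**Solution**

The correct answer is **A**.

$$ \bar{X} =\cfrac {(12 + 13 + 54 + 56 + 25)}{5} =\cfrac {160}{5} = 32 $$

Hence, using \(n - 1 = 4\) as the denominator because the data are a sample,

$$ \begin{align*} {s}^{ 2 } & =\frac { \left\{ { \left( 12-32 \right) }^{ 2 }+{ \left( 13-32 \right) }^{ 2 }+{ \left( 54-32 \right) }^{ 2 }+{ \left( 56-32 \right) }^{ 2 }+{ \left( 25-32 \right) }^{ 2 } \right\} }{ 4 } \\ & =\cfrac {1870}{4} = 467.5 \\ \end{align*} $$

Therefore,

$$ s =\sqrt{467.5} \approx 21.62 $$

Of the three choices, 19.34 is the closest to this value. The figure 19.34 itself results from dividing by \(n = 5\), i.e., \(\sqrt{1870/5} = \sqrt{374} = 19.34\), which is the population standard deviation rather than the sample standard deviation.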