
Analysis of Variance

26 Feb 2021

Sometimes a simple linear regression model does not adequately describe the relationship between two variables. To use regression analysis effectively, we must be able to distinguish a model that fits the data well from one that does not.

Breaking Down the Sum of Squares Total into Its Components

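The sum of squares total (SST) is the sum of the sum of squares regression (SSR) and the sum of squares error (SSE): SST = SSR + SSE. SST measures the total variation of the dependent variable around its mean. SSR is the sum of the squared differences between the value of the dependent variable predicted by the estimated regression line and the mean of the dependent variable; it is the explained variation. SSE is the sum of the squared differences between the observed and the predicted values of the dependent variable; it is the unexplained variation. Let us use an example to explain this.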

$${\text{Sum of Squares Total (SST)}}= \sum_{i=1}^n(Y_i-\bar{Y})^2 $$

$${\text{Sum of Squares Regression (SSR) }}= \sum_{i=1}^n(\hat{Y_i}-\bar{Y})^2 $$

$${\text{Sum of Squares Error (SSE) }}= \sum_{i=1}^n(Y_i-\hat{Y_i})^2 $$

Exhibit 1: Breakdown of Sum of Squares Total for ROA Model.

$$\small{\begin{array}{l|l|l|l|l|l|l}\textbf{Company}&{\textbf{ROA}\\ (\textbf{Y}_{\textbf{i}})}&{\textbf{CAPEX}\\ (\textbf{X}_{\textbf{i}})}&{\textbf{Predicted}\\ \textbf{ROA} (\widehat{\textbf{Y}_{\textbf{i}}})}&{\textbf{Variation}\\ \textbf{to be}\\ \textbf{Explained}\\(\textbf{Y}_{\textbf{i}}-\bar{\textbf{Y}})^{2}}&{\textbf{Variation}\\ \textbf{Unexplained}\\ (\textbf{Y}_{\textbf{i}}-\widehat{\textbf{Y}_{\textbf{i}}})^{2}}&{\textbf{Variation}\\ \textbf{Explained}\\ (\widehat{\textbf{Y}_{\textbf{i}}}-\bar{\textbf{Y}})^{2}}\\ \hline\text{A} & 15 & 5 & 10.132 & 39.0625 & 23.698 & 1.909 \\ \hline \text{B} & 6 & 0.7 & 6.103 & 7.5625 & 0.0107 & 7.005 \\ \hline \text{C} & 10 & 8 & 12.942 & 1.5625 & 8.658 & 17.58\\ \hline\text{D} & 4 & 0.4 & 5.822 & 22.5625 & 3.321 & 8.57\\ \hline \textbf{Total} & & & & 70.75 & 35.687 & 35.064\\ \hline\text{Mean} & 8.75 & & & & & \\ \end{array}}$$

From Exhibit 1 above, we see that:

Sum of squares error = 35.687

Sum of squares regression = 35.064

Sum of squares total = 35.687 + 35.064 ≈ 70.75

These sums of squares will be important inputs when we measure the fit of the regression line.
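The decomposition can be reproduced from the ROA and CAPEX columns of Exhibit 1. The sketch below assumes that the fitted line in the exhibit is an ordinary least squares fit with an intercept; the variable names are illustrative and do not come from the article.

```python
# ROA (Y) and CAPEX (X) for companies A-D, taken from Exhibit 1.
roa = [15.0, 6.0, 10.0, 4.0]
capex = [5.0, 0.7, 8.0, 0.4]

n = len(roa)
mean_x = sum(capex) / n
mean_y = sum(roa) / n

# Ordinary least squares estimates for one independent variable
# (assumption: this is how the exhibit's regression line was fitted).
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(capex, roa))
s_xx = sum((x - mean_x) ** 2 for x in capex)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in capex]

# Total, explained, and unexplained variation.
sst = sum((y - mean_y) ** 2 for y in roa)
ssr = sum((y_hat - mean_y) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(roa, predicted))

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

The results should agree with the exhibit totals up to rounding, since the exhibit rounds the predicted values before squaring.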

Measures of Goodness of Fit

The standard error of the regression, the F-statistic, and the coefficient of determination are all measures used to evaluate how well the regression model fits the data (goodness of fit). The coefficient of determination, or R², measures the proportion of the total variability of the dependent variable that is explained by the independent variable. R² is calculated using the formula:

$${\text{Coefficient of Determination}}=\frac{\text{Sum of Squares Regression}}{\text{Sum of Squares Total}}$$

$${\text{Coefficient of Determination}} =\frac{{\sum_{i=1}^n(\hat{Y_i}-\bar{Y})^2}}{{\sum_{i=1}^n(Y_i-\bar{Y})^2}}$$

The coefficient of determination ranges from 0% to 100%. From Exhibit 1 on the ROA regression model, R² = 35.064 ÷ 70.75 = 0.4956, or 49.56%, which means that CAPEX explains 49.56% of the variation in ROA. The coefficient of determination is descriptive; it is not a statistical test. To assess the statistical significance of a regression model, we use the F-distributed test statistic, which compares two variances. In simple linear regression, the F-test evaluates the null hypothesis that the slope coefficient is equal to zero against the alternative hypothesis that it is not equal to zero. (In a regression with several independent variables, the null hypothesis is that all slopes are equal to zero, and the alternative is that at least one slope is not.)
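As a quick check of the arithmetic, the following sketch computes R² from the two sums of squares reported in Exhibit 1. The helper name `r_squared` is hypothetical and chosen only for illustration.

```python
def r_squared(ssr: float, sse: float) -> float:
    """Return SSR / SST, where SST = SSR + SSE."""
    sst = ssr + sse
    if sst == 0:
        raise ValueError("total variation is zero, so R^2 is undefined")
    return ssr / sst


# Sums of squares from Exhibit 1 (ROA regressed on CAPEX).
r2_roa = r_squared(ssr=35.064, sse=35.687)
print(f"R^2 = {r2_roa:.4f} ({r2_roa:.2%} of the variation in ROA)")
```

Passing SSR and SSE rather than SST keeps the identity SST = SSR + SSE explicit, so the ratio cannot fall outside the 0 to 1 range for non-negative inputs.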

The F-distributed test statistic is formed from the sum of squares regression and the sum of squares error, each adjusted for its degrees of freedom. The sum of squares regression is divided by the number of independent variables, denoted k, to arrive at the mean square regression (MSR). In simple linear regression, there is one independent variable, so k = 1.

$${\text{MSR}}=\frac{\text{Sum of Squares Regression}}{\text{k}}$$

$${\text{MSR}}=\frac{{\sum_{i=1}^n(\hat{Y_i}-\bar{Y})^2}}{{1}}$$

Next, we divide the sum of squares error by its degrees of freedom to calculate the mean square error (MSE). In simple linear regression, the degrees of freedom \(n-k-1\) become \(n-2\).

$${\text{MSE}}=\frac{\text{Sum of Squares Error}}{n-k-1}$$

$${\text{MSE}}=\frac{{\sum_{i=1}^n(Y_i-\hat{Y_i})^2}}{{n-2}}$$

Therefore, the F-distributed test statistic is:

$${\text{F}}=\frac{\text{MSR}}{\text{MSE}}$$

The F-test in regression analysis is one-sided. The rejection region lies in the right tail because we want to determine whether the variation in the numerator (the explained variation in Y) is larger than the variation in the denominator (the unexplained variation in Y).
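To illustrate the steps, the sketch below computes MSR, MSE, and F for the ROA model in Exhibit 1, where n = 4 and k = 1. The function name is hypothetical, and the 5% critical value of about 18.51 for 1 and 2 degrees of freedom is taken from a standard F table rather than from the article.

```python
def f_statistic(ssr: float, sse: float, n: int, k: int = 1) -> float:
    """Return F = MSR / MSE for a regression with k independent variables."""
    df_error = n - k - 1
    if k < 1 or df_error < 1:
        raise ValueError("need k >= 1 and n - k - 1 >= 1")
    msr = ssr / k          # mean square regression
    mse = sse / df_error   # mean square error
    return msr / mse


# Sums of squares from Exhibit 1: four companies, one independent variable.
f_roa = f_statistic(ssr=35.064, sse=35.687, n=4, k=1)

# Assumption: right-tail 5% critical value for F(1, 2) from a standard table.
F_CRIT_5PCT = 18.51
reject_null = f_roa > F_CRIT_5PCT

print(f"F = {f_roa:.3f}, reject H0 at 5%: {reject_null}")
```

With only four observations, the calculated F of roughly 1.97 falls well short of the critical value, so under these assumptions the null hypothesis of a zero slope would not be rejected even though R² is close to 50%.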

Question

James, an analyst at QPC LTD, has estimated a model that regresses a company's return on equity (ROE) against its growth opportunities (GO), defined as the company's three-year compounded annual growth rate in sales, over the past 15 years. He estimated the sum of squares error and the sum of squares regression as follows:

Sum of Squares Error= 48.99

Sum of Squares Regression= 192.3

The Coefficient of Determination is closest to:

A. 241.29

B. 0.797

C. 0.8927

Solution

The correct answer is B.

$${\text{The Coefficient of Determination}} = \frac{\text{Sum of Squares Regression}}{\text{Sum of Squares Total}}$$

First, we calculate the sum of squares total by adding sum of squares regression to sum of squares error. 192.3+48.99= 241.29

R²=192.3÷241.29= 0.797 or 79.7%

A is incorrect. 241.29 is the sum of squares total.

C is incorrect. 0.8927 is R, the square root of the coefficient of determination.
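The three figures discussed in the solution can be verified with a few lines of arithmetic. This is only a check of the numbers given in the question; the variable names are illustrative.

```python
import math

# Inputs given in the question.
sse = 48.99
ssr = 192.3

sst = ssr + sse          # sum of squares total
r2 = ssr / sst           # coefficient of determination
r = math.sqrt(r2)        # square root of the coefficient of determination

print(f"SST = {sst:.2f}, R^2 = {r2:.3f}, R = {r:.4f}")
```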

Reading 0: Introduction to Linear Regression

LOS 0 (d) Calculate and interpret the coefficient of determination and the F-statistic in a simple linear regression
