
The multiple coefficient of determination, R^{2}, can be used to gauge the overall usefulness of the full set of independent variables in explaining the dependent variable. Multiple R^{2} can be interpreted as the percentage of the dependent variable's total variability that is collectively explained by all the independent variables.

It is calculated similarly to the case of the simple regression model:

$$\begin{align*}R^{2}&=\frac{\text{Total Variation}-\text{Unexplained Variation}}{\text{Total Variation}} \\&= \frac{\text{Regression Sum of Squares (RSS)}}{\text{Total Variation (SST)}}\end{align*}$$
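The two forms of the formula can be checked numerically. The following is a minimal sketch on a made-up five-point data set with a single independent variable; the data and the helper names `simple_ols_fit` and `variation_breakdown` are illustrative assumptions, not part of the reading.

```python
def simple_ols_fit(x, y):
    """Return (intercept, slope) of a one-variable least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def variation_breakdown(y, fitted):
    """Return (total, unexplained, explained) sums of squares."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)              # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    rss = sum((fi - y_bar) ** 2 for fi in fitted)         # regression sum of squares
    return sst, sse, rss


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = simple_ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]
sst, sse, rss = variation_breakdown(y, fitted)

r2_from_unexplained = (sst - sse) / sst  # first line of the formula
r2_from_explained = rss / sst            # second line of the formula

print(round(r2_from_unexplained, 4), round(r2_from_explained, 4))
```

For this toy data both expressions evaluate to 0.6. The two forms agree because the fitted values come from a least-squares fit that includes an intercept.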

However, multiple R^{2} is less useful as a measure of the goodness of fit of a multiple regression model. This is because it never decreases when a new independent variable is added, even if the variation that variable explains is not statistically significant. An overfitted model can therefore show a deceptively high multiple R^{2} while having a reduced ability to make precise predictions.
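This behaviour can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than a method from the reading: it generates hypothetical data, fits ordinary least squares by solving the normal equations in plain Python (the helper `ols_r_squared` is made up for this purpose), and compares R^{2} before and after adding a variable that is pure noise.

```python
import random


def ols_r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    # Design matrix rows: [1, x1, x2, ...]
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    sst = sum((v - y_bar) ** 2 for v in y)
    sse = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1 - sse / sst


random.seed(42)
n = 30
x1 = [random.gauss(0, 1) for _ in range(n)]        # genuinely related to y
junk = [random.gauss(0, 1) for _ in range(n)]      # unrelated noise variable
y = [2 + 3 * a + random.gauss(0, 1) for a in x1]   # hypothetical data-generating process

r2_one = ols_r_squared([x1], y)
r2_two = ols_r_squared([x1, junk], y)
print(f"R^2 with x1 only:      {r2_one:.4f}")
print(f"R^2 with x1 and noise: {r2_two:.4f}")
```

Whatever seed is used, the second R^{2} should not fall below the first (up to floating-point rounding), even though the added variable carries no information about the dependent variable.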

Adjusted R^{2}, \(\overline{R}^{2}\), adjusts for the number of independent variables in the model. Its value increases only when the added independent variables improve the fit of the regression model by enough to offset the loss of degrees of freedom. Conversely, it decreases when the added variables do not improve the model fit by a good enough amount.

The relationship between \(R^{2}\) and \(\overline{R}^{2}\) is expressed as:

$$\overline{R}^{2}=1-\bigg(\frac{n-1}{n-k-1}\bigg)(1-R^{2})$$

Where:

- n = number of observations.
- k = number of independent variables (slope coefficients).
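The formula translates directly into code. The following is a minimal sketch; the function name `adjusted_r_squared` and the example inputs (R^{2} = 0.60, n = 30, k = 3) are hypothetical and chosen only for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 so that the degrees of freedom are positive")
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)


# Hypothetical inputs: R^2 of 0.60 from 30 observations and 3 independent variables.
example = adjusted_r_squared(0.60, n=30, k=3)
print(round(example, 4))
```

With these inputs the penalty factor is 29/26, so the adjusted value comes out at roughly 0.5538, slightly below the unadjusted 0.60.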

\(\overline{R}^{2}\) can be negative if \(R^{2}\) is low enough. However, multiple \(R^{2}\) is never negative.
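To see how a negative value can arise, consider a worked illustration with hypothetical figures (not taken from the reading): \(R^{2}=0.10\), \(n=20\), and \(k=5\).

$$\begin{align*}\overline{R}^{2}&=1-\bigg(\frac{20-1}{20-5-1}\bigg)(1-0.10) \\&=1-\frac{19}{14}\times 0.90 \\&\approx 1-1.2214=-0.2214\end{align*}$$

More generally, setting the formula for \(\overline{R}^{2}\) below zero and rearranging gives the condition:

$$\overline{R}^{2}<0 \iff R^{2}<\frac{k}{n-1}$$

In this illustration the threshold is \(5/19\approx 0.263\), and \(0.10\) falls below it.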

Consider the following table showing the number of independent variables and the corresponding R^{2} and the adjusted R^{2}.

$$\small{\begin{array}{c|c|c} {\textbf{No of Independent}\\ \textbf{Variables}}&\bf{R^{2}}&\textbf{Adjusted } \bf{R^{2}}\\ \hline1 & 75.4 & 74.3 \\ \hline2 & 89.2 & 88.1\\ \hline3 & 90.7 & 89.2\\ \hline4 & 92.4 & 85.6\\ \hline5 & 93.2 & 84.0\\ \end{array}}$$

Notice the following key points from the above table:

- Adjusted R^{2} is always less than or equal to the multiple R^{2}.
- Multiple R^{2} is greater than the adjusted R^{2} when the number of independent variables is at least one.
- As the number of independent variables increases, adjusted R^{2} increases up to a certain point (in this case 3 independent variables), beyond which it starts decreasing.
- One might want to include only three independent variables in their regression model.
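These observations can be verified mechanically from the table's figures. The sketch below simply encodes the table as a dictionary (the variable names are illustrative) and picks the number of independent variables with the highest adjusted R^{2}.

```python
# Table values in percent: k -> (R^2, adjusted R^2)
table = {
    1: (75.4, 74.3),
    2: (89.2, 88.1),
    3: (90.7, 89.2),
    4: (92.4, 85.6),
    5: (93.2, 84.0),
}

# Number of independent variables that maximizes adjusted R^2
best_k = max(table, key=lambda k: table[k][1])

# Multiple R^2 never falls as variables are added
r2_never_falls = all(table[k + 1][0] >= table[k][0] for k in range(1, 5))

# Adjusted R^2 never exceeds multiple R^2
adjusted_never_exceeds = all(adj <= r2 for r2, adj in table.values())

print(best_k, r2_never_falls, adjusted_never_exceeds)
```

For this table the maximum adjusted R^{2} occurs at three independent variables, while multiple R^{2} keeps rising all the way to five.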

The fact that the regression model has a high adjusted R^{2} does not mean that it is based on only the correct variables. Several other factors need to be considered before concluding that the model is well specified.

## Question

Which of the following is *most appropriate* about adjusted R^{2}?

A. It is nondecreasing in the number of independent variables.

B. It may or may not increase when one adds an independent variable.

C. It is always positive.

## Solution

**The correct answer is B.**

The value of the adjusted R^{2} increases only when the added independent variables improve the fit of the regression model. Moreover, it decreases when the added variables do not improve the model fit by a good enough amount.

**A is incorrect.** The adjusted R^{2} can decrease when the added variables do not improve the model fit by a good enough amount. It is multiple R^{2} that is nondecreasing in the number of independent variables, which makes it less reliable as a measure of goodness of fit in a regression with more than one independent variable than in a one-independent-variable regression.

**C is incorrect.** \(\overline{R}^{2}\) can be negative if \(R^{2}\) is low enough. It is multiple \(R^{2}\) that is never negative.

Reading 2: Multiple Regression

*LOS 2 (h) Contrast and interpret the \(R^{2}\) and adjusted \(R^{2}\) in multiple regression.*