
Model Misspecification


19 Dec 2022

Model specification involves selecting independent variables to include in the regression and the functional form of the regression equation. Here, comprehensive guidelines are provided for accurately defining a regression, followed by an explanation of common model misspecifications.

Exhibit 1 succinctly presents the principles for proper regression model specification:

Grounding the model in economic reasoning for variable selection.
Ensuring each included variable plays a vital role in the regression.
Assessing the model’s performance beyond the training dataset to prevent overfitting.
Selecting an appropriate functional form, especially when a nonlinear relationship between the dependent variable and the regressors is anticipated.
Ensuring adherence to regression assumptions by addressing issues like heteroskedasticity, serial correlation, or multicollinearity.

The subsequent discussion focuses on understanding model specification errors to enhance model development and foster a more informed approach to investment research.

Incorrectly specified functional forms in regression estimation can manifest in various ways. These manifestations include:

Omitted variables.
Misrepresenting relationships between variables.
Inappropriate variable scaling.
Improper data pooling.

Each of these errors may lead to issues like heteroskedasticity or serial correlation, impacting the reliability of regression results.

Omitted Variables

Omitted variable bias occurs when an important independent variable is excluded from a regression. Suppose the true model includes \(X_2\):

$$ Y_i = b_0 + b_1X_{1i} + b_2X_{2i} + \varepsilon_i $$

but we estimate the regression without it:

$$ Y_i = b_0 + b_1X_{1i} + \varepsilon_i $$

The estimated model is then misspecified.

If the omitted variable (\(X_2\)) is uncorrelated with \(X_1\), the error term of the misspecified regression becomes \(b_2X_{2i} + \varepsilon_i\). This error no longer has an expected value of zero, and it is not independent and identically distributed because it varies with \(X_2\). The result is a biased intercept estimate, although the coefficient on \(X_1\) may still be estimated correctly.

However, if the omitted variable (\(X_2\)) is correlated with the included variable (\(X_1\)), the model's error term becomes correlated with \(X_1\). The estimates of the intercept and of the coefficient on \(X_1\) are then biased and inconsistent, the residuals are distorted, and the standard errors are unreliable for statistical tests.
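The two cases can be checked with a small simulation. The sketch below is illustrative only: the coefficients (1, 2, 3), the mean of 5 for \(X_2\), the sample size, and the helper names `ols_simple` and `simulate` are all assumptions made for the example, not values from the curriculum. It relies on the standard result that, when \(X_2\) is omitted, the estimated slope on \(X_1\) converges to \(b_1 + b_2\,\text{Cov}(X_1, X_2)/\text{Var}(X_1)\).

```python
import random
from statistics import fmean


def ols_simple(x, y):
    """Return (intercept, slope) from a one-regressor OLS fit."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(rho, n=20000, seed=42):
    """Generate Y = 1 + 2*X1 + 3*X2 + e, then regress Y on X1 only.

    rho is the correlation between X1 and X2 (both have unit variance);
    X2 has a mean of 5. All numbers are hypothetical.
    """
    rng = random.Random(seed)
    x1, y = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        z = rng.gauss(0, 1)
        x2 = 5 + rho * a + (1 - rho ** 2) ** 0.5 * z  # corr(X1, X2) = rho
        x1.append(a)
        y.append(1 + 2 * a + 3 * x2 + rng.gauss(0, 1))
    return ols_simple(x1, y)


# Case 1: X2 uncorrelated with X1 -> intercept absorbs 3 * E[X2], slope stays near 2.
b0_uncorr, b1_uncorr = simulate(rho=0.0)
# Case 2: X2 correlated with X1 -> slope drifts toward 2 + 3 * 0.6.
b0_corr, b1_corr = simulate(rho=0.6)

print(f"uncorrelated: intercept={b0_uncorr:.2f}, slope={b1_uncorr:.2f}")
print(f"correlated:   intercept={b0_corr:.2f}, slope={b1_corr:.2f}")
```

In the uncorrelated case the intercept estimate lands far from the true value of 1 while the slope stays close to 2; in the correlated case the slope itself is pushed away from 2.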

Inappropriate Form of Variables

A common mistake in regression is to use a variable in its raw form when a transformed version would be appropriate. For instance, assuming a linear relationship when the true relationship between the variables is nonlinear leads to misspecification. To address this, it is crucial to consider whether economic theory supports a nonlinear relationship. Plotting the data helps detect nonlinearity; if the relationship is linear in proportional (percentage) changes rather than in levels, transforming the variable, for example by taking its natural logarithm, can correct the misspecification.
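As a rough illustration of the log transformation, the sketch below fits a straight line to a series that grows by a roughly constant percentage each period, first in levels and then in natural logs. The data are simulated; the 10% growth rate, the noise level, and the helper name `fit_and_r2` are assumptions chosen for the example.

```python
import math
import random
from statistics import fmean


def fit_and_r2(x, y):
    """One-regressor OLS; returns (intercept, slope, r_squared)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)


rng = random.Random(7)
x = list(range(1, 51))
# Hypothetical series: about 10% growth per period with multiplicative noise.
y = [math.exp(0.5 + 0.10 * xi + rng.gauss(0, 0.05)) for xi in x]

# Misspecified: Y regressed on X in levels.
_, _, r2_level = fit_and_r2(x, y)
# Transformed: ln(Y) regressed on X; the slope approximates the growth rate.
_, growth, r2_log = fit_and_r2(x, [math.log(yi) for yi in y])

print(f"levels fit R^2: {r2_level:.3f}")
print(f"log fit R^2:    {r2_log:.3f}, estimated growth per period: {growth:.3f}")
```

The two R² values describe different dependent variables, so they are not strictly comparable; they are shown only to indicate how much closer to linear the relationship becomes after the transformation.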

Inappropriate Scaling of Variables

Using unscaled data in regressions instead of scaled data, when scaling would be more suitable, can result in model misspecification. Analysts frequently face the decision of whether to scale variables before comparing data among companies. For instance, analysts commonly employ common-size financial statements to compare companies. These statements streamline comparability across companies, enabling analysts to swiftly assess trends in profitability, leverage, efficiency, and other factors within a group of companies.
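One way to see why scaling matters is to simulate companies of very different sizes and compare a regression on raw amounts with a regression on amounts divided by total assets. Everything in the sketch below is hypothetical: the size distribution, the cash flow and revenue relationship, and the helper names `residuals` and `corr`. The correlation between absolute residuals and firm size is used as an informal indicator of heteroskedasticity, not as a formal test.

```python
import math
import random
from statistics import fmean


def residuals(x, y):
    """Residuals from a one-regressor OLS fit of y on x."""
    mx, my = fmean(x), fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - intercept - slope * a for a, b in zip(x, y)]


def corr(u, v):
    """Sample Pearson correlation between two equal-length lists."""
    mu, mv = fmean(u), fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


rng = random.Random(11)
n = 5000
assets = [math.exp(rng.gauss(6, 1)) for _ in range(n)]   # wide range of firm sizes
turnover = [rng.uniform(0.5, 1.5) for _ in range(n)]     # revenue / total assets
cf_ratio = [0.02 + 0.10 * t + rng.gauss(0, 0.02) for t in turnover]  # cash flow / total assets

# Unscaled amounts, as they would appear in the raw financial statements.
revenue = [t * a for t, a in zip(turnover, assets)]
cash_flow = [c * a for c, a in zip(cf_ratio, assets)]
log_size = [math.log(a) for a in assets]

res_raw = residuals(revenue, cash_flow)      # regression on unscaled data
res_scaled = residuals(turnover, cf_ratio)   # regression on data scaled by assets

size_link_raw = corr([abs(r) for r in res_raw], log_size)
size_link_scaled = corr([abs(r) for r in res_scaled], log_size)

print(f"corr(|residual|, ln assets), unscaled: {size_link_raw:.2f}")
print(f"corr(|residual|, ln assets), scaled:   {size_link_scaled:.2f}")
```

In the unscaled regression the residuals grow with company size, while in the scaled regression their magnitude is essentially unrelated to size.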

Inappropriate Pooling of Data

Improper data pooling occurs when samples that should not be combined are pooled, often because a structural break has changed the behavior of the data. Such breaks can stem from changes in regulation or from a shift from a low-volatility to a high-volatility period. In a scatterplot, such data appear as separate clusters with little correlation, because the cluster means differ. Analysts facing discernible subsamples should estimate the model using the subsample most representative of the forecasting period.
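The effect of pooling across a structural break can be sketched with two simulated regimes. In the example below, both regimes share the same slope of 1.0 but have different intercepts and different ranges of the independent variable; these settings, and the helper names `slope` and `regime`, are assumptions made purely for illustration.

```python
import random
from statistics import fmean


def slope(x, y):
    """Slope coefficient from a one-regressor OLS fit of y on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(3)


def regime(n, x_mean, intercept):
    """Simulate one regime: Y = intercept + 1.0 * X + noise (hypothetical values)."""
    xs = [rng.gauss(x_mean, 1) for _ in range(n)]
    ys = [intercept + 1.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


# Two regimes separated by a structural break: same slope, different cluster locations.
x_a, y_a = regime(2000, x_mean=2, intercept=5)
x_b, y_b = regime(2000, x_mean=8, intercept=-1)

slope_a = slope(x_a, y_a)
slope_b = slope(x_b, y_b)
slope_pooled = slope(x_a + x_b, y_a + y_b)  # list concatenation pools the samples

print(f"regime A slope: {slope_a:.2f}")
print(f"regime B slope: {slope_b:.2f}")
print(f"pooled slope:   {slope_pooled:.2f}")
```

Each regime on its own recovers a slope near 1.0, while the pooled regression reports a much weaker relationship because the two clusters sit at different locations.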

Question

Which of the following is NOT a potential consequence of misspecified functional form in regression analysis?

A. Heteroskedasticity due to omitted variables.

B. Multicollinearity caused by an inappropriate form of variables.

C. Serial correlation arising from inappropriate data pooling.

Solution

The correct answer is B.

Misspecified functional form in regression analysis can lead to several issues:

Omitted variables may cause heteroskedasticity or serial correlation in the regression results.

Inappropriate form of variables might lead to heteroskedasticity if a nonlinear relationship between variables is ignored.

Inappropriate variable scaling can cause heteroskedasticity or multicollinearity.

Inappropriate data pooling can result in heteroskedasticity or serial correlation in the model’s output.

“Multicollinearity caused by an inappropriate form of variables” does not appear among these consequences: ignoring a nonlinear relationship is associated with heteroskedasticity. Multicollinearity arises from high correlations between independent variables and, among the four misspecifications listed, is linked only to inappropriate variable scaling.

Offered by AnalystPrep
