Model Misspecification

Model specification involves selecting the independent variables to include in a regression and choosing the functional form of the regression equation. This section first sets out guidelines for specifying a regression correctly and then explains the most common model misspecifications.

Exhibit 1 succinctly presents the principles for proper regression model specification:

  1. Grounding the model in economic reasoning for variable selection.
  2. Ensuring each included variable plays a vital role in the regression.
  3. Assessing the model’s performance beyond the training dataset to prevent overfitting.
  4. Selecting an appropriate functional form, especially when anticipating nonlinear relationships among regressors.
  5. Ensuring adherence to regression assumptions by addressing issues like heteroskedasticity, serial correlation, or multicollinearity.
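Principle 3 can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data-generating process (a straight line plus noise), the sample sizes, and the polynomial degrees are all hypothetical choices, not part of the original guidelines. It fits a simple and a heavily parameterized model to the same training sample and reports the error on both the training data and fresh data.

```python
import numpy as np

rng = np.random.default_rng(3)

def make_sample(n):
    """Hypothetical data: a linear relationship plus noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)
    return x, y

def mse(coefs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

x_train, y_train = make_sample(30)    # small training sample
x_test, y_test = make_sample(1000)    # fresh data from the same process

results = {}
for degree in (1, 8):
    coefs = np.polyfit(x_train, y_train, degree)
    # Store (in-sample MSE, out-of-sample MSE) for each model.
    results[degree] = (mse(coefs, x_train, y_train), mse(coefs, x_test, y_test))
    print("degree", degree, "train/test MSE:", results[degree])
```

Adding terms can never raise the in-sample error of a least-squares fit, so a low training error alone says little about a model. The out-of-sample figure is the one to compare across specifications.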

The subsequent discussion focuses on understanding model specification errors to enhance model development and foster a more informed approach to investment research.

A regression's functional form can be misspecified in several ways, including:

  1. Omitted variables.
  2. Misrepresenting relationships between variables.
  3. Inappropriate variable scaling.
  4. Improper data pooling.

Each of these errors may lead to issues like heteroskedasticity or serial correlation, impacting the reliability of regression results.

Omitted Variables

Omitted variable bias occurs when an important independent variable is excluded from a regression. Suppose the true model includes \(X_2\), but we estimate the regression without it:

$$ Y_i = b_0 + b_1X_{1i} + \varepsilon_i $$

instead of

$$ Y_i = b_0 + b_1X_{1i} + b_2X_{2i} + \varepsilon_i $$

The estimated model is then misspecified.

If the omitted variable (\(X_2\)) is uncorrelated with \(X_1\), the error term of the misspecified regression is \(b_2X_{2i} + \varepsilon_i\). Its expected value is generally not zero, and it is not independently and identically distributed because it depends on \(X_2\). The intercept estimate is therefore biased, but the coefficient on \(X_1\) can still be estimated correctly.

However, if the omitted variable (\(X_2\)) is correlated with the included variable (\(X_1\)), the model's error term becomes correlated with \(X_1\). The estimated regression coefficients are then biased and inconsistent. This affects the slope coefficient, the intercept, and the residuals, and it makes the standard errors unreliable for statistical tests.
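The two cases can be checked with a short simulation. The following is a minimal sketch, assuming hypothetical true parameters (\(b_0 = 1\), \(b_1 = 2\), \(b_2 = 3\)), unit-variance regressors, and an omitted variable with mean 0.5; none of these values come from the text. In large samples the slope of the short regression converges to \(b_1 + b_2\,\mathrm{Cov}(X_1, X_2)/\mathrm{Var}(X_1)\), so it is unaffected only when the covariance is zero.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slopes...] via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(rho, n=20_000, seed=0):
    """Simulate Y = b0 + b1*X1 + b2*X2 + e with corr(X1, X2) = rho,
    then estimate the model with and without X2."""
    rng = np.random.default_rng(seed)
    b0, b1, b2 = 1.0, 2.0, 3.0  # hypothetical true parameters
    x1 = rng.standard_normal(n)
    # X2 has mean 0.5, unit variance, and correlation rho with X1.
    x2 = 0.5 + rho * x1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = b0 + b1 * x1 + b2 * x2 + rng.standard_normal(n)
    full = ols(y, x1, x2)   # correctly specified model
    short = ols(y, x1)      # X2 omitted
    return full, short

full_u, short_u = simulate(rho=0.0)   # omitted variable uncorrelated with X1
full_c, short_c = simulate(rho=0.6)   # omitted variable correlated with X1

print("uncorrelated: intercept, slope =", np.round(short_u, 2))
print("correlated:   intercept, slope =", np.round(short_c, 2))
print("full model (correlated case)   =", np.round(full_c, 2))
```

With these assumptions, the uncorrelated case should distort only the intercept (it absorbs \(b_2\) times the mean of \(X_2\)), while the correlated case should also push the slope on \(X_1\) away from its true value of 2.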

Inappropriate Form of Variables

A common mistake in regression is to use a variable in its raw form when a transformed version would be more appropriate. For instance, assuming a linear relationship between variables when the true relationship is nonlinear leads to misspecification. To guard against this, consider whether economic theory supports a nonlinear relationship. Plotting the data also helps detect nonlinearity. If the relationship is linear in proportional (percentage) changes rather than in levels, transforming the variables, for example by taking natural logarithms, can correct the misspecification.
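As a rough illustration of the log transformation, the sketch below generates hypothetical data in which \(Y\) changes proportionally with \(X\) (a constant elasticity of 1.5 with multiplicative noise, both assumed values) and compares the fit of a regression in levels with one in natural logarithms.

```python
import numpy as np

def r_squared(x, y):
    """R-squared of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 50.0, size=500)
# Hypothetical data: Y = 2 * X^1.5 times multiplicative noise.
y = 2.0 * x**1.5 * np.exp(rng.normal(0.0, 0.3, size=500))

r2_levels = r_squared(x, y)                  # Y on X, both in levels
r2_logs = r_squared(np.log(x), np.log(y))    # ln(Y) on ln(X)
# In the log-log regression the slope estimates the elasticity.
elasticity, _ = np.polyfit(np.log(x), np.log(y), 1)

print("R^2 in levels:", round(r2_levels, 3))
print("R^2 in logs:  ", round(r2_logs, 3))
print("estimated elasticity:", round(float(elasticity), 3))
```

In the log-log form, the slope can be read as the approximate percentage change in \(Y\) for a one percent change in \(X\), which is why this transformation suits relationships driven by proportional changes.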

Inappropriate Scaling of Variables

Using unscaled data in regressions instead of scaled data, when scaling would be more suitable, can result in model misspecification. Analysts frequently face the decision of whether to scale variables before comparing data among companies. For instance, analysts commonly employ common-size financial statements to compare companies. These statements streamline comparability across companies, enabling analysts to swiftly assess trends in profitability, leverage, efficiency, and other factors within a group of companies.
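A common-size conversion is simple to sketch. The example below uses two hypothetical companies with made-up income-statement figures (the names `LargeCo` and `SmallCo` and all amounts are illustrative assumptions) and expresses each line item as a fraction of revenue.

```python
import numpy as np

# Hypothetical income-statement figures (in millions) for two companies.
items = ["revenue", "cost_of_goods_sold", "operating_income", "net_income"]
raw = {
    "LargeCo": np.array([50_000.0, 30_000.0, 7_500.0, 5_000.0]),
    "SmallCo": np.array([800.0, 440.0, 160.0, 96.0]),
}

def common_size(values):
    """Express each line item as a fraction of revenue (the first entry)."""
    return values / values[0]

scaled = {name: common_size(vals) for name, vals in raw.items()}
for name, vals in scaled.items():
    print(name, {k: round(float(v), 3) for k, v in zip(items, vals)})
```

In raw terms the larger company dominates every line item, so a regression on unscaled figures would mostly pick up size. After scaling, the two companies' margins are directly comparable.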

Inappropriate Pooling of Data

Improper data pooling occurs when combining samples unsuitably, often due to structural breaks in data behavior. This could stem from changes in regulations or shifts from low to high volatility periods. Such data, in a scatterplot, appears as separate clusters with little correlation due to differing cluster means. Analysts facing discernible subsamples should estimate the model using the most representative data for the forecasting period.
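The effect of pooling across a structural break can be sketched as follows. The two regimes, their means, and the common within-regime slope of 1 are hypothetical choices made for illustration. In this constructed case the cluster means differ so much that the pooled regression does not merely weaken the relationship but reverses its sign.

```python
import numpy as np

def fit_line(x, y):
    """Return (slope, intercept) of a simple OLS fit."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

rng = np.random.default_rng(2)
n = 300
# Hypothetical regime 1 (e.g. before a regulatory change): low X, high Y level.
x1 = rng.normal(2.0, 0.5, n)
y1 = 10.0 + 1.0 * x1 + rng.normal(0.0, 0.3, n)
# Hypothetical regime 2 (after the change): high X, lower Y level, same slope.
x2 = rng.normal(6.0, 0.5, n)
y2 = 2.0 + 1.0 * x2 + rng.normal(0.0, 0.3, n)

slope_1, _ = fit_line(x1, y1)
slope_2, _ = fit_line(x2, y2)
slope_pooled, _ = fit_line(np.concatenate([x1, x2]), np.concatenate([y1, y2]))

print("regime 1 slope:", round(float(slope_1), 2))
print("regime 2 slope:", round(float(slope_2), 2))
print("pooled slope:  ", round(float(slope_pooled), 2))
```

Estimating each regime separately recovers the within-regime relationship, which is the reason for fitting the model on the subsample that best represents the forecasting period.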


Which of the following is NOT a potential consequence of misspecified functional form in regression analysis?

  A. Heteroskedasticity due to omitted variables.
  B. Multicollinearity caused by inappropriate variable scaling.
  C. Serial correlation arising from inappropriate data pooling.


The correct answer is B.

Misspecified functional form in regression analysis can lead to several issues:

  • Omitted variables may cause heteroskedasticity or serial correlation in the regression results.
  • Inappropriate form of variables might lead to heteroskedasticity if a nonlinear relationship between variables is ignored.
  • Inappropriate variable scaling can cause heteroskedasticity or multicollinearity.
  • Inappropriate data pooling can result in heteroskedasticity or serial correlation in the model’s output.

The statement “Multicollinearity caused by inappropriate variable scaling” does not align with the outlined consequences of a misspecified functional form. Multicollinearity generally arises from high correlations between independent variables and is not explicitly associated with variable scaling.
