When the relationship between the variables is non-linear, the data can often be transformed so that linear regression still applies. Three log-transformation functional forms are commonly used:
- Log-lin model: The dependent variable is logarithmic while the independent variable is linear. It is represented as \(\ln Y_i = b_0 + b_1 X_i\). The slope coefficient in the log-lin model is the relative change in the dependent variable for an absolute change in the independent variable. That is, a one-unit increase in \(X\) changes \(Y\) by approximately \(100b_1\) percent (the approximation is close when \(b_1\) is small).
- Lin-log model: The dependent variable is linear while the independent variable is logarithmic. It is represented as \(Y_i = b_0 + b_1 \ln X_i\). The slope coefficient in the lin-log model gives the absolute change in the dependent variable for a relative change in the independent variable. Put another way, if \(X\) increases by 1%, \(Y\) changes by approximately \(b_1/100\) units (that is, \(0.01b_1\)).
- Log-log model: Both the dependent and independent variables are logarithmic. It is represented as \(\ln Y_i = b_0 + b_1 \ln X_i\). The slope coefficient in the log-log model is the relative change in the dependent variable for a relative change in the independent variable. In other words, if \(X\) increases by 1%, \(Y\) changes by approximately \(b_1\) percent.
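The three forms above can be sketched with NumPy. This is a minimal illustration rather than a prescribed method: the synthetic data, the helper name `fit_line`, and the use of `np.polyfit` for the least-squares fit are all assumptions made for the example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    # np.polyfit returns coefficients from highest degree to lowest.
    b1, b0 = np.polyfit(x, y, 1)
    return b0, b1

# Hypothetical data generated from Y = 2 * X^1.5, which is linear in logs.
X = np.linspace(1.0, 20.0, 50)
Y = 2.0 * X ** 1.5

log_lin = fit_line(X, np.log(Y))          # ln Y = b0 + b1 * X
lin_log = fit_line(np.log(X), Y)          # Y    = b0 + b1 * ln X
log_log = fit_line(np.log(X), np.log(Y))  # ln Y = b0 + b1 * ln X

for name, (b0, b1) in [("log-lin", log_lin),
                       ("lin-log", lin_log),
                       ("log-log", log_log)]:
    print(f"{name}: b0 = {b0:.4f}, b1 = {b1:.4f}")
```

Because the data were generated from a power relationship, the log-log fit should recover a slope close to 1.5, meaning a 1% increase in \(X\) goes with roughly a 1.5% increase in \(Y\).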
Selecting the Correct Functional Form
To settle on the correct functional form, consider the following goodness of fit measures:
- Coefficient of determination (\(R^2\)). A high value of \(R^2\) is better.
- F-statistic. A high value of the F-statistic is better.
- Standard error of the estimate (\(S_e\)). A low value of \(S_e\) is better.
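The three measures listed above can be computed directly from the sums of squares of a fitted simple regression. The sketch below assumes a single independent variable (\(k = 1\)); the function name `fit_statistics` and the small data set are hypothetical and serve only as an illustration.

```python
import numpy as np

def fit_statistics(x, y):
    """Return R^2, the F-statistic and S_e for a simple linear regression."""
    n = len(y)
    k = 1  # number of independent variables (assumed: simple regression)
    b1, b0 = np.polyfit(x, y, 1)
    y_hat = b0 + b1 * x

    sse = np.sum((y - y_hat) ** 2)     # unexplained (residual) variation
    sst = np.sum((y - y.mean()) ** 2)  # total variation
    ssr = sst - sse                    # explained variation

    r2 = 1.0 - sse / sst
    se = np.sqrt(sse / (n - k - 1))    # standard error of the estimate
    f = (ssr / k) / (sse / (n - k - 1))
    return {"r2": r2, "f": f, "se": se}

# Hypothetical observations, roughly following Y = 2X.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

stats = fit_statistics(x, y)
print(stats)
```

To compare functional forms, the same function can be applied to the transformed variables, for example `fit_statistics(np.log(x), np.log(y))` for the log-log model. Note that \(S_e\) is expressed in the units of the dependent variable, so values from a model of \(Y\) and a model of \(\ln Y\) are not on the same scale.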
Aside from the measures cited above, the patterns in the residuals can also be analyzed when evaluating a model. In a good model, the residuals are random and uncorrelated, with no systematic pattern.
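One way to check residuals for patterns is to measure how strongly each residual is related to the previous one. The sketch below is one possible approach under stated assumptions, not the only diagnostic: it computes the lag-1 autocorrelation and the Durbin-Watson statistic by hand on hypothetical, synthetic data. The helper names `residual_checks` and `ols_residuals`, the seed, and the noise level are all illustrative choices.

```python
import numpy as np

def ols_residuals(x, y):
    """Residuals from a least-squares fit of y = b0 + b1 * x."""
    b1, b0 = np.polyfit(x, y, 1)
    return y - (b0 + b1 * x)

def residual_checks(residuals):
    """Return (lag-1 autocorrelation, Durbin-Watson statistic)."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    lag1 = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)
    # Durbin-Watson: near 2 suggests little first-order autocorrelation,
    # near 0 suggests strong positive autocorrelation.
    dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    return lag1, dw

# Hypothetical data: Y = 3 * X^2 with small multiplicative noise.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 20.0, 50)
Y = 3.0 * X ** 2 * np.exp(rng.normal(0.0, 0.02, size=X.size))

# Misspecified straight-line fit versus log-log fit of the same data.
lin_lag1, lin_dw = residual_checks(ols_residuals(X, Y))
log_lag1, log_dw = residual_checks(ols_residuals(np.log(X), np.log(Y)))

print(f"linear : lag-1 = {lin_lag1:.3f}, DW = {lin_dw:.3f}")
print(f"log-log: lag-1 = {log_lag1:.3f}, DW = {log_dw:.3f}")
```

In this setup, the straight-line fit leaves a curved pattern in the residuals, which shows up as a high lag-1 autocorrelation, while the log-log residuals should look much closer to uncorrelated noise.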