Qualitative (categorical) dependent variables are dummy variables used as dependent rather than independent variables. Remember that a dummy variable is a variable that takes on the value 0 or 1. The odds of an event are the probability that the event happens, \(p\), divided by the probability that it does not happen, \(1-p\). The logistic transformation takes the natural logarithm of these odds; the result, \(\ln\left(\frac{p}{1-p}\right)\), is called the logit (log odds). The logistic transformation tends to linearize the relationship between the independent and dependent variables.
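For example, suppose the probability of a company going bankrupt is 0.6. The odds of bankruptcy are then \(\frac{0.6}{1-0.6}=1.5\); that is, the company is 1.5 times as likely to go bankrupt as not. Because the outcome (bankrupt or not bankrupt) is binary, we should use the logit model or discriminant analysis when estimating the probability of bankruptcy. In a logit model with independent variables \(X_1, \dots, X_k\), the log odds are modeled as \(\ln\left(\frac{p}{1-p}\right)=b_0+b_1X_1+\dots+b_kX_k\), so the event probability can be calculated as follows: \(p=\frac{1}{1+\exp\left[-\left(b_0+b_1X_1+\dots+b_kX_k\right)\right]}\).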
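The arithmetic in the bankruptcy example can be checked with a short Python sketch. The function names `odds`, `logit`, and `inverse_logit` are illustrative choices, not part of the reading; the last one simply inverts the logit to recover a probability from log odds.

```python
import math

def odds(p):
    """Odds of an event: probability it happens over probability it does not."""
    return p / (1.0 - p)

def logit(p):
    """Logit (log odds): natural logarithm of the odds."""
    return math.log(odds(p))

def inverse_logit(z):
    """Map log odds z back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

p_bankrupt = 0.6                              # probability from the example
bankrupt_odds = odds(p_bankrupt)              # about 1.5, up to floating-point rounding
bankrupt_logit = logit(p_bankrupt)            # natural log of the odds
recovered_p = inverse_logit(bankrupt_logit)   # back to about 0.6
```

Converting the logit back with `inverse_logit` shows that probabilities and log odds carry the same information on different scales.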
Because the dependent variable is binary, we use the maximum likelihood method to estimate logistic regression coefficients instead of using least squares. The maximum likelihood method finds the coefficient values that maximize the likelihood function for the data. The Bernoulli distribution is chosen as the probability distribution because each observation is a binary outcome that occurs with probability \(p\). The maximum likelihood method is iterative. Each iteration results in a higher log-likelihood until the difference in the log-likelihood between two successive iterations is negligibly small. At this point, the iterating process stops.
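To make the iterative procedure concrete, here is a minimal Python sketch under several assumptions: a single independent variable, a small made-up dataset (`xs`, `ys`), and Newton-Raphson updates. Newton-Raphson is one common way to carry out the maximization; the reading does not say which algorithm is used. Iteration stops once the log-likelihood changes by less than an assumed tolerance `tol`.

```python
import math

def sigmoid(z):
    """Event probability implied by log odds z."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of binary outcomes ys given regressor xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, tol=1e-10, max_iter=50):
    """Fit intercept b0 and slope b1 by Newton-Raphson on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    history = [log_likelihood(b0, b1, xs, ys)]
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to b0
            g1 += (y - p) * x      # score with respect to b1
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        history.append(log_likelihood(b0, b1, xs, ys))
        if abs(history[-1] - history[-2]) < tol:
            break                  # successive log-likelihoods barely differ
    return b0, b1, history

# Made-up illustrative data: x is a single regressor, y is the 0/1 outcome.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, history = fit_logit(xs, ys)
```

The list `history` records the log-likelihood after each iteration, so printing it shows how quickly the differences between successive values shrink before the loop stops.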
In a logit model, the slope coefficient is the change in the log odds that the event happens per unit change in the independent variable. The exponential of the slope coefficient is the odds ratio: the ratio of the odds that the event happens after a unit increase in the independent variable to the odds that it happens without the increase.
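The link between the slope coefficient and the odds ratio can be verified numerically. In the sketch below, the coefficients `b0` and `b1` are hypothetical values chosen only for illustration, and `event_odds` is an illustrative helper name.

```python
import math

def event_odds(b0, b1, x):
    """Odds of the event implied by a one-variable logit model."""
    return math.exp(b0 + b1 * x)

b0, b1 = -2.0, 0.8   # hypothetical intercept and slope

# Odds after a one-unit increase in x, divided by odds before the increase.
odds_ratio = event_odds(b0, b1, 3.0) / event_odds(b0, b1, 2.0)

# The same quantity obtained directly from the slope coefficient.
odds_ratio_from_slope = math.exp(b1)
```

The two values agree regardless of the starting level of `x`, which is why the odds ratio is a convenient way to report a logit slope.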
For a logit regression, the test of the hypothesis that a regression coefficient is significantly different from zero is similar to the corresponding test in an ordinary linear regression. The overall performance of a logit regression can be evaluated by examining the likelihood ratio chi-square test statistic.
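As a rough illustration of the chi-square test, the sketch below assumes the standard likelihood ratio form, \(LR=-2\left(\ln L_{restricted}-\ln L_{unrestricted}\right)\), which the text above does not spell out. The log-likelihood values are hypothetical, and the p-value helper covers only the case of one restriction (one degree of freedom), where the chi-square tail probability can be written with the complementary error function.

```python
import math

def likelihood_ratio_statistic(ll_restricted, ll_unrestricted):
    """LR statistic: -2 times (restricted minus unrestricted log-likelihood)."""
    return -2.0 * (ll_restricted - ll_unrestricted)

def chi2_sf_one_df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical log-likelihoods: intercept-only model vs. model with one regressor.
ll_restricted = -5.545
ll_unrestricted = -4.200

lr_stat = likelihood_ratio_statistic(ll_restricted, ll_unrestricted)
p_value = chi2_sf_one_df(lr_stat)   # small values favor the unrestricted model
```

With more than one restriction, the degrees of freedom equal the number of restrictions and a general chi-square distribution function would be needed instead of `chi2_sf_one_df`.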
Since logistic regression is not fitted using a least-squares approach, it has no direct equivalent of the coefficient of determination, \(R^2\). Researchers have proposed pseudo-\(R^2\) measures to compare different specifications of the same model. However, a pseudo-\(R^2\) is unsuitable for comparing models estimated on different datasets.
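Several pseudo-\(R^2\) definitions exist, and the text does not say which one is meant. As one possibility, the sketch below computes McFadden's version, \(1-\frac{\ln L_{model}}{\ln L_{null}}\), where the null model contains only an intercept. The log-likelihood values are hypothetical.

```python
def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo-R^2: one minus the ratio of model to null log-likelihood."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods for two specifications fitted to the same data.
ll_null = -5.545        # intercept-only model
ll_spec_a = -4.200      # specification A
ll_spec_b = -3.900      # specification B

r2_a = mcfadden_pseudo_r2(ll_spec_a, ll_null)
r2_b = mcfadden_pseudo_r2(ll_spec_b, ll_null)
```

Both values share the same `ll_null` because both specifications use the same dataset, which is the only setting in which comparing them is meaningful.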
Question
Which of the following measures is least likely to be used in interpreting a logistic regression model?
- R-squared.
- Pseudo-\(R^2\).
- P-value.
Solution
The correct answer is A.
R-squared is not used because logistic regression is not fitted using a least-squares approach.
B is incorrect. The pseudo-\(R^2\) is used in logistic regression to compare different specifications of the same model, and it serves as an alternative to the R-squared.
C is incorrect. P-values are used to assess the statistical significance of the estimated coefficients and of the model as a whole.
Reading 4: Extensions of Multiple Regression
LOS 4 (c): Formulate and interpret a logistic regression model