Logistic regression

For some starting context: with multiple linear regression, we predict a continuous outcome from multiple predictors. The multiple linear regression model looks like this:

\(Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + ... + \beta_kX_k + \epsilon\)

where

\(Y\) is the outcome, \(X_1, X_2, \dots, X_k\) are the predictors, and \(\epsilon\) is the error term. The coefficients \(\beta_0\), \(\beta_1\),…,\(\beta_k\) tell us the strength and direction of the relationship between the predictors and the outcome.1
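As a quick sketch of this starting point (the data here are simulated, and the variable names are made up for illustration), fitting a multiple linear regression in R looks like:

```r
# Simulate a small dataset with two predictors (hypothetical example)
set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)  # true coefficients: 1, 2, -0.5

# Fit the multiple linear regression model
fit <- lm(y ~ x1 + x2)

# Estimated coefficients (beta_0, beta_1, beta_2)
coef(fit)
```

The estimates printed by `coef(fit)` should land close to the true values used in the simulation.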

Logistic regression can be used to predict the probability of a particular category of a binary outcome.

If we use a linear regression model to predict probabilities, we can get predicted values that are outside the [0,1] range, which doesn’t make sense for probabilities. To ensure predictions stay in the [0,1] range, logistic regression uses the logistic function:

\(P(Y=1) = \frac{e^{\beta_0 + \beta_1X_1 + ... + \beta_kX_k}}{1 + e^{\beta_0 + \beta_1X_1 + ... + \beta_kX_k}}\)

where

\(P(Y=1)\) is the probability that the outcome equals 1, and \(e\) is the base of the natural logarithm.
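To see why the logistic function keeps predictions inside [0,1], here is a small sketch (the coefficient values are made up for illustration):

```r
# Logistic function: maps any real-valued linear predictor to (0, 1)
logistic <- function(lp) exp(lp) / (1 + exp(lp))

# Hypothetical coefficients and a wide range of predictor values
b0 <- -1
b1 <- 0.8
x <- seq(-10, 10, by = 1)

# Even extreme linear predictors produce probabilities strictly between 0 and 1
p <- logistic(b0 + b1 * x)
range(p)
```

Base R's `plogis()` computes the same function, so `logistic(lp)` and `plogis(lp)` agree.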

This equation can also be written as follows:

\(\log\left(\frac{P(Y=1)}{P(Y=0)}\right) = \beta_0 + \beta_1X_1 + \beta_2X_2 + ... + \beta_kX_k\)

Here, the log odds (or the logarithm of the odds) of the outcome being 1 (versus 0) is expressed as a linear combination of predictor variables. As we plug in values for the predictors, the linear combination gives us the log odds, which can then be transformed to get the actual probabilities (\(P(Y=1)\)).2
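The two forms of the equation are equivalent; a short sketch (using a made-up log-odds value) shows the round trip between log odds and probability:

```r
# Suppose the linear combination of predictors gives log odds of 0.5
log_odds <- 0.5

# Transform log odds to probability: P(Y=1) = e^lp / (1 + e^lp)
p <- exp(log_odds) / (1 + exp(log_odds))
p

# Transform back: log(P / (1 - P)) recovers the original log odds
log(p / (1 - p))
```

A log odds of 0.5 corresponds to a probability of about 0.62, and taking `log(p / (1 - p))` returns us to exactly 0.5.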

Unlike many machine learning algorithms, logistic regression requires the data scientist to specify the functional form of the relationship between the predictors and the outcome. Specifically, with logistic regression, we are assuming that the log odds is a linear function of the predictors.

Advantages of Logistic Regression

Disadvantages of Logistic Regression

Implementation of Logistic Regression in R

In R, we implement logistic regression with the Base R function glm(), setting family = binomial(link = "logit"). The code templates will do this for you.
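For a concrete sense of what this call looks like, here is a minimal sketch on simulated data (the variable names and coefficient values are hypothetical):

```r
# Simulate a binary outcome whose log odds are linear in x (hypothetical data)
set.seed(1)
n <- 500
x <- rnorm(n)
p <- plogis(-0.5 + 1.2 * x)     # true log odds: -0.5 + 1.2 * x
y <- rbinom(n, size = 1, prob = p)

# Fit logistic regression with Base R glm()
fit <- glm(y ~ x, family = binomial(link = "logit"))

# Estimated coefficients, on the log-odds scale
coef(fit)

# Predicted probabilities, which stay in [0, 1]
pred <- predict(fit, type = "response")
range(pred)
```

`predict(fit, type = "response")` returns probabilities rather than log odds; omitting `type = "response"` gives predictions on the log-odds scale.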


Footnotes

  1. To estimate the coefficient values, we minimize the sum of squared errors across all observations.↩︎

  2. With logistic regression, we use maximum likelihood estimation to find the best-fitting coefficients.↩︎