The confusion matrix and threshold-variant metrics

Many of the most helpful and intuitive metrics of learner/model performance depend on converting predicted probabilities into predicted binary classifications (i.e., ‘0’ or ‘1’ predictions of the outcome). To do this, we first need to specify a threshold. Once we have selected a threshold, predicted probabilities at or above the threshold indicate that the predicted outcome equals ‘1’ (or ‘yes’), and predicted probabilities below the threshold indicate that the predicted outcome equals ‘0’ (or ‘no’). For example, we may specify a threshold of 0.5. Then, turning back to our example on the previous page, any student with a predicted probability greater than or equal to 0.5 is considered “predicted to drop out,” while any student with a predicted probability less than 0.5 is considered “not predicted to drop out.”
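To make this concrete, here is a minimal sketch in Python (the probability values and the 0.5 threshold are made up for illustration) of converting predicted probabilities into 0/1 classifications:

```python
# Hypothetical predicted probabilities of dropping out for five students.
predicted_probs = [0.10, 0.48, 0.50, 0.73, 0.95]
threshold = 0.5

# Probabilities at or above the threshold become a predicted '1' ("yes"),
# probabilities below the threshold become a predicted '0' ("no").
predicted_classes = [1 if p >= threshold else 0 for p in predicted_probs]

print(predicted_classes)  # [0, 0, 1, 1, 1]
```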

Later, we will discuss how to pick a threshold. For now, once we have picked a threshold, we can compare each resulting predicted classification (i.e., predicted outcome) to the actual, observed classification (e.g., whether or not a student actually dropped out). We can summarize these comparisons in a 2x2 matrix called a confusion matrix:

Within a confusion matrix, observations are categorized into one of the following cells: true positives (TP), where we predict ‘1’ and the observed outcome is ‘1’; false positives (FP), where we predict ‘1’ but the observed outcome is ‘0’; true negatives (TN), where we predict ‘0’ and the observed outcome is ‘0’; and false negatives (FN), where we predict ‘0’ but the observed outcome is ‘1’.

Note that for “true positives” and “true negatives,” the predicted and observed classifications match, while for “false positives” and “false negatives” we make a prediction that does not match the true classification.

We also calculate the following totals: observed positives (OP = TP + FN), observed negatives (ON = TN + FP), predicted positives (PP = TP + FP), and predicted negatives (PN = TN + FN).
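As a small sketch (with made-up observed and predicted classifications), the four cells and the totals can be counted directly:

```python
# Hypothetical observed (true) and predicted classifications for ten students.
observed  = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

# The four cells of the confusion matrix.
TP = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)  # true positives
TN = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)  # true negatives
FP = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)  # false positives
FN = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)  # false negatives

# The totals.
OP = TP + FN  # observed positives
ON = TN + FP  # observed negatives
PP = TP + FP  # predicted positives
PN = TN + FN  # predicted negatives

print(TP, TN, FP, FN)  # 3 4 1 2
print(OP, ON, PP, PN)  # 5 5 4 6
```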

Finally, we can also combine these counts into many different rates. Below is a list of various rates that may be helpful. On a later page, we will discuss how to pick metrics that align with the goals of the project.

Rates based on the total observed number of one classification (OP or ON): the true positive rate (TPR), true negative rate (TNR), false positive rate (FPR), and false negative rate (FNR).

\[\begin{equation} TPR = \frac{TP}{TP + FN} = \frac{TP}{OP} \end{equation}\]

\[\begin{equation} TNR = \frac{TN}{TN + FP} = \frac{TN}{ON} \end{equation}\]

\[\begin{equation} FPR = \frac{FP}{TN + FP} = \frac{FP}{ON} \end{equation}\]

\[\begin{equation} FNR = \frac{FN}{TP + FN} = \frac{FN}{OP} \end{equation}\]
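In code, these rates follow directly from the cell counts (the counts below are the hypothetical ones from the sketch above):

```python
# Hypothetical confusion-matrix cell counts.
TP, TN, FP, FN = 3, 4, 1, 2

TPR = TP / (TP + FN)  # true positive rate: TP / OP
TNR = TN / (TN + FP)  # true negative rate: TN / ON
FPR = FP / (TN + FP)  # false positive rate: FP / ON
FNR = FN / (TP + FN)  # false negative rate: FN / OP

print(TPR, TNR, FPR, FNR)  # 0.6 0.8 0.2 0.4
```

Note that TPR + FNR = 1 and FPR + TNR = 1, since each pair shares the same denominator.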

Rates based on the total predicted number of one classification (PP or PN): the positive predictive value (PPV), negative predictive value (NPV), and false discovery rate (FDR).

\[\begin{equation} PPV = \frac{TP}{TP + FP} = \frac{TP}{PP} \end{equation}\]

\[\begin{equation} NPV = \frac{TN}{TN + FN} = \frac{TN}{PN} \end{equation}\]

\[\begin{equation} FDR = \frac{FP}{FP + TP} = \frac{FP}{PP} \end{equation}\]
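Using the same hypothetical counts as above:

```python
# Hypothetical confusion-matrix cell counts.
TP, TN, FP, FN = 3, 4, 1, 2

PPV = TP / (TP + FP)  # positive predictive value: TP / PP
NPV = TN / (TN + FN)  # negative predictive value: TN / PN
FDR = FP / (FP + TP)  # false discovery rate: FP / PP

print(round(PPV, 2), round(NPV, 2), round(FDR, 2))  # 0.75 0.67 0.25
```

Note that PPV + FDR = 1, since both share the denominator PP.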

Rate based on the total sample size (the total number of predictions): accuracy.

\[\begin{equation} Accuracy = \frac{TP + TN}{TP + FP + FN + TN} \end{equation}\]
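With the same hypothetical counts as above:

```python
# Hypothetical confusion-matrix cell counts.
TP, TN, FP, FN = 3, 4, 1, 2

# Accuracy: the share of all predictions that are correct.
accuracy = (TP + TN) / (TP + FP + FN + TN)

print(accuracy)  # 0.7
```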

A note on accuracy: Accuracy is highly sensitive to the true probability of success. Imagine a scenario where 95% of people are truly successful. You create an algorithm or model that predicts every single person will succeed. Your model will be 95% accurate, even though it never correctly identifies a single person who does not succeed! Thus, you should be careful when using accuracy as a performance metric.
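A quick numeric check of this scenario (the numbers are made up to match the 95% figure):

```python
# Hypothetical population: 95 of 100 people truly succeed (observed 1),
# and the model predicts success (1) for everyone.
observed = [1] * 95 + [0] * 5
predicted = [1] * 100

correct = sum(1 for o, p in zip(observed, predicted) if o == p)
print(correct / len(observed))  # 0.95 -- high accuracy despite never flagging anyone who fails
```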
