Simple guide to Confusion Matrix

Confusion matrix

In short, a confusion matrix summarizes the performance of a classification algorithm. It is a table that compares a classifier's predictions against the known (true) labels.

The confusion matrix is widely used in Machine Learning because it shows not only how many errors the model made but also which types of error they were.

Image 1: Example of a confusion matrix

Let’s look at what each cell of the table refers to:

TP – True Positives: the model predicted Positive and the actual class is also Positive.
FP – False Positives: the model predicted Positive, but the actual class is Negative. (aka Type I error)
FN – False Negatives: the model predicted Negative, but the actual class is Positive. (aka Type II error)
TN – True Negatives: the model predicted Negative and the actual class is also Negative.
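
To make the four definitions concrete, here is a minimal sketch that counts them from two lists of labels. The function name `confusion_counts`, the `positive` parameter, and the sample labels are all made up for illustration, and the sketch assumes a binary problem where 1 marks the Positive class.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN and TN for a binary classifier (illustrative helper)."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            # Predicted Positive: correct if the actual class is Positive too.
            counts["TP" if actual == positive else "FP"] += 1
        else:
            # Predicted Negative: an error if the actual class is Positive.
            counts["FN" if actual == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical labels: 1 = Positive, 0 = Negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
```

The four counts always add up to the number of observations, which is a quick sanity check when building the table by hand.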


As you can see, the confusion matrix holds the count of each outcome type: the two kinds of correct prediction (TP and TN) and the two kinds of error (FP and FN). With these counts, we can calculate the following metrics:

Accuracy: Percentage of all observations that the model classified correctly.

\begin{equation} Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \end{equation}
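
As a quick worked example of the accuracy formula, the sketch below plugs in a set of hypothetical counts (TP = 40, TN = 30, FP = 10, FN = 20). The counts and the function name `accuracy` are assumptions chosen for illustration only.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all observations that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined without observations")
    return (tp + tn) / total


# Hypothetical counts: 70 correct predictions out of 100 observations.
acc = accuracy(tp=40, tn=30, fp=10, fn=20)
print(acc)  # 0.7
```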

Precision: Percentage of correctly predicted Positive observations among all observations predicted as Positive. In other words, it’s the percentage of positive predictions that are correct.

\begin{equation} Precision = \frac{TP}{TP + FP} \end{equation}
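
The precision formula can be sketched the same way. The counts below are hypothetical, and returning 0.0 when the model makes no Positive predictions is a convention assumed here, since the formula itself is undefined in that case.

```python
def precision(tp, fp):
    """Fraction of Positive predictions that are actually Positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        # Assumed convention: no Positive predictions gives precision 0.0.
        return 0.0
    return tp / predicted_positive


# Hypothetical counts: 50 Positive predictions, 40 of them correct.
prec = precision(tp=40, fp=10)
print(prec)  # 0.8
```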

Recall or Sensitivity: Percentage of correctly predicted Positive observations among all observations that are actually Positive.

\begin{equation} Recall = \frac{TP}{TP + FN} \end{equation}
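
Recall uses the same numerator as precision but divides by the actual Positives instead of the predicted Positives. A minimal sketch with hypothetical counts follows; returning 0.0 when there are no actual Positives is again an assumed convention rather than part of the formula.

```python
def recall(tp, fn):
    """Fraction of actual Positive observations that the model found."""
    actual_positive = tp + fn
    if actual_positive == 0:
        # Assumed convention: no actual Positives gives recall 0.0.
        return 0.0
    return tp / actual_positive


# Hypothetical counts: 60 actual Positives, 40 of them found by the model.
rec = recall(tp=40, fn=20)
print(round(rec, 3))  # 0.667
```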

F1 score: Combines precision and recall into a single number. It is the harmonic mean of precision and recall, on a scale from 0 to 1, where 1 means a perfect classification.

\begin{equation} F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \end{equation}
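
The sketch below computes the F1 score from hypothetical counts (TP = 40, FP = 10, FN = 20) and compares it with the plain average of precision and recall. The helper name `f1_from_precision_recall` and the zero-denominator convention are assumptions for illustration.

```python
def f1_from_precision_recall(precision, recall):
    """Combine precision and recall into the F1 score."""
    if precision + recall == 0:
        # Assumed convention: F1 is 0.0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only for illustration.
tp, fp, fn = 40, 10, 20
precision = tp / (tp + fp)  # 0.8
recall = tp / (tp + fn)     # about 0.667

f1 = f1_from_precision_recall(precision, recall)
arithmetic_mean = (precision + recall) / 2

print(round(f1, 3))               # 0.727
print(round(arithmetic_mean, 3))  # 0.733
```

Whenever precision and recall differ, the F1 score comes out below their plain average, so a model cannot score well by being strong on only one of the two.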

Category: Machine learning