ROC AUC explained


Evaluating classification models

It is common in predictive modeling to try out a number of different models, apply each to a holdout sample (also called a test or validation sample), and assess their performance. Fundamentally, this amounts to seeing which one produces the most accurate predictions.

Important concepts for evaluating classification models

  • Accuracy : The percent (or proportion) of cases classified correctly.
  • Confusion matrix : A tabular display (2×2 in the binary case) of the record counts by their predicted and actual classification status.
  • Sensitivity (Recall) : The percent (or proportion) of 1s correctly classified.
  • Specificity : The percent (or proportion) of 0s correctly classified.
  • Precision : The percent (or proportion) of predicted 1s that are actually 1s.
  • ROC curve : A plot of sensitivity (true positive rate) versus 1 - specificity (false positive rate).

  • What is a Confusion Matrix and why do you need it?

    A simple way to measure classification performance is to count the proportion of predictions that are correct. At the heart of classification metrics is the confusion matrix. It is a performance measurement for machine learning classification problems where the output can be two or more classes. In the binary case it is a table with 4 different combinations of predicted and actual values.

    It is extremely useful for measuring Recall, Precision, Specificity, Accuracy, and, most importantly, the AUC-ROC curve.

    Let’s understand TP, FP, FN, TN:

    True Positive: you predicted positive and the actual value is positive.

    True Negative: you predicted negative and the actual value is negative.

    False Positive (Type 1 Error): you predicted positive but the actual value is negative.

    False Negative (Type 2 Error): you predicted negative but the actual value is positive.
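
    To make the four cells concrete, here is a minimal sketch (assuming scikit-learn and a hypothetical toy set of labels) that counts TP, FP, FN and TN and derives the metrics listed above from them.

    ```python
    from sklearn.metrics import confusion_matrix

    # Hypothetical toy labels: 1 = positive, 0 = negative
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

    # For binary labels scikit-learn lays the matrix out as [[TN, FP], [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    accuracy    = (tp + tn) / (tp + tn + fp + fn)   # 0.70
    sensitivity = tp / (tp + fn)                    # recall / TPR, 0.75
    specificity = tn / (tn + fp)                    # ~0.67
    precision   = tp / (tp + fp)                    # 0.60

    print(tp, fp, fn, tn)   # 3 2 1 4
    ```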


    What is the AUC-ROC Curve?

    In machine learning, performance measurement is an essential task. So when it comes to a classification problem, we can count on the AUC-ROC curve.

    The receiver operating characteristic curve, or ROC curve for short, is used to analyze the behaviour of classifiers at different thresholds. Similar to the precision-recall curve, the ROC curve considers all possible thresholds for a given classifier, but instead of reporting precision and recall, it shows the false positive rate (FPR) against the true positive rate (TPR). Recall that the true positive rate is simply another name for recall, while the false positive rate is the fraction of false positives out of all negative samples.
    To put it briefly, ROC is a probability curve and AUC represents the degree or measure of separability. It tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s.

    The ROC curve is plotted with TPR against FPR, where TPR is on the y-axis and FPR is on the x-axis.
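
    As a concrete illustration, the sketch below (a minimal example using scikit-learn and matplotlib on synthetic data, not a prescribed recipe) lets roc_curve sweep the thresholds and then plots TPR on the y-axis against FPR on the x-axis, with the diagonal of random guessing for reference.

    ```python
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve
    from sklearn.model_selection import train_test_split

    # Synthetic data and a simple classifier; any model with predict_proba works
    X, y = make_classification(n_samples=1000, weights=[0.7, 0.3], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]   # probability of the positive class

    # roc_curve evaluates every threshold and returns one (FPR, TPR) pair per threshold
    fpr, tpr, thresholds = roc_curve(y_test, scores)

    plt.plot(fpr, tpr, label="logistic regression")
    plt.plot([0, 1], [0, 1], linestyle="--", label="random guessing (AUC = 0.5)")
    plt.xlabel("False positive rate (FPR)")
    plt.ylabel("True positive rate (TPR)")
    plt.legend()
    plt.show()
    ```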

    How to interpret the performance of the model?

    A good model has an AUC near 1, which means it has a good measure of separability. A poor model has an AUC near 0, which means it has the worst measure of separability; in fact, it means it is reciprocating the result, predicting 0s as 1s and 1s as 0s. And when AUC is 0.5, it means the model has no class separation capacity whatsoever.
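
    A quick way to see these regimes (a hedged sketch; the exact values depend on the random seed) is to score the same labels once with informative scores and once with pure noise, and compare the resulting AUCs.

    ```python
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=1000)

    # Informative scores: positives tend to receive higher scores than negatives
    good_scores = y_true + rng.normal(scale=0.5, size=1000)
    # Uninformative scores: no relation to the label at all
    random_scores = rng.normal(size=1000)

    print(roc_auc_score(y_true, good_scores))    # high, roughly 0.9 -> good separability
    print(roc_auc_score(y_true, random_scores))  # close to 0.5 -> no separability
    ```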

    Note: the red distribution curve represents the positive class (patients with the disease) and the green distribution curve represents the negative class (patients without the disease).

    This is the ideal situation: when the two distribution curves do not overlap at all, the model has a perfect measure of separability.

    When the two distributions overlap, we introduce type 1 and type 2 errors; depending upon the threshold, we can minimize or maximize them. When AUC is 0.7, it means there is a 70% chance that the model will rank a randomly chosen positive example above a randomly chosen negative example.
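
    This "70% chance" statement is the pairwise ranking interpretation of AUC: the AUC equals the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The small sketch below (with hypothetical scores) checks the pairwise fraction against roc_auc_score.

    ```python
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)
    y = rng.integers(0, 2, size=500)
    scores = 0.7 * y + rng.normal(size=500)   # hypothetical, mildly informative scores

    pos = scores[y == 1]
    neg = scores[y == 0]

    # Fraction of (positive, negative) pairs where the positive is ranked higher;
    # ties would count as half.  This fraction is exactly the AUC.
    pairwise = (pos[:, None] > neg[None, :]).mean() + 0.5 * (pos[:, None] == neg[None, :]).mean()

    print(pairwise)                  # same value as the line below
    print(roc_auc_score(y, scores))
    ```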

    This is the worst case: when AUC approaches 0.5, the model has no capacity to discriminate between the positive class and the negative class.

    When AUC is approximately 0, the model is actually reciprocating the classes: it predicts the negative class as positive and vice versa.
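
    Because AUC depends only on how the scores rank the two classes, a model whose AUC is close to 0 becomes a strong model if its scores are simply reversed, as this small illustrative example shows.

    ```python
    from sklearn.metrics import roc_auc_score

    y_true = [0, 0, 0, 1, 1, 1]
    # Hypothetical scores from a "reversed" model: the positives get the lowest scores
    bad_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]

    print(roc_auc_score(y_true, bad_scores))                 # 0.0 -> classes are swapped
    print(roc_auc_score(y_true, [-s for s in bad_scores]))   # 1.0 after flipping the ranking
    ```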