From the course: Artificial Intelligence Foundations: Machine Learning
Exploring common classification metrics
- I promised you that we'd explore metrics in detail later on in the course. So here we are. Metrics are key indicators of whether your model is performing well, or whether you'll need to tweak the hyperparameters and continue your training iterations. We'll cover several metrics today that are reserved for classification problems: accuracy, F1 score, precision, recall, and AUC. Let's talk about them now.

Accuracy, also known as classification accuracy, indicates the prediction capabilities of your model, or the fraction of the total predictions that were correct. The formula for accuracy is simple: accuracy equals the number of correct predictions divided by the total number of predictions. For binary classification, accuracy can also be calculated using true positives, true negatives, false positives, and false negatives. The formula is accuracy equals true positives plus true negatives, divided by true positives plus true negatives plus false positives plus false negatives. When considering our public safety model, a true positive is where the model predicts the stop will lead to an arrest, and the stop really does lead to an arrest. A true negative is where the model predicts the stop will not lead to an arrest, and it actually doesn't. A false positive is where the model predicts the stop will lead to an arrest, but it actually doesn't. A false negative is where the model predicts the stop will not lead to an arrest, but it actually does. I do want to highlight that when your dataset is highly imbalanced, accuracy is not a good way to measure your model's performance. Instead, consider using precision, recall, or F1 score, which we'll discuss now.

Precision is another way to measure accuracy. Precision quantifies how many of the positive predictions actually belong to the positive class. The formula is simple: precision equals true positives divided by true positives plus false positives. Use this metric when you want fewer false positives. For example, in spam filtering, you don't want spam emails in your inbox, but you also don't want to miss out on important emails. In this case, a false positive flags an email as spam when it isn't, which means you may miss it if you don't regularly check your spam folder.

Recall is another way to measure accuracy and highlights the sensitivity of your model. Recall quantifies how many of the actual positives in the dataset were predicted as positive. The formula is recall equals true positives divided by true positives plus false negatives. Consider the recall metric when false positives are acceptable and you want fewer false negatives. Let's say you're dealing with anomaly detection to stop credit card fraud. You're okay if an activity is marked as fraud when it really isn't; the worst-case scenario here would be a fraudulent transaction that's allowed through. While a false alarm may cause some inconvenience for the credit card owner, it's better that their account is protected.

F1 score is yet another way to measure accuracy, combining precision and recall into a single score. The formula is F1 equals two times precision times recall, divided by precision plus recall. The F1 score works well for imbalanced data.

The last metric is AUC, or area under the ROC curve, which measures how well the model's predictions are ranked by plotting the true positive rate against the false positive rate across all classification thresholds. Because it accounts for the false positive rate at every threshold, AUC is the metric to optimize when you care about how well your model separates the positive and negative classes overall. There are additional classification metrics that we didn't cover today.
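To make these formulas concrete, here's a minimal sketch of computing each metric with scikit-learn. The library choice, the labels, and the predicted scores are assumptions for illustration; they stand in for the public safety model's outputs, where 1 means the stop leads to an arrest and 0 means it doesn't.

```python
# A minimal sketch of computing the classification metrics discussed above.
# Assumes scikit-learn is installed; the labels are illustrative stand-ins
# for the public safety model (1 = stop leads to an arrest, 0 = it doesn't).
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
    confusion_matrix,
)

# Ground-truth outcomes and the model's predicted classes (hypothetical values).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Predicted probabilities for the positive class, needed for AUC because the
# ROC curve is traced by sweeping the decision threshold.
y_scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3, 0.2, 0.85]

# Confusion matrix counts: true negatives, false positives, false negatives,
# true positives (rows are actual classes, columns are predicted classes).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")

# Accuracy = (TP + TN) / (TP + TN + FP + FN)
print("Accuracy :", accuracy_score(y_true, y_pred))

# Precision = TP / (TP + FP) -- favor this when false positives are costly.
print("Precision:", precision_score(y_true, y_pred))

# Recall = TP / (TP + FN) -- favor this when false negatives are costly.
print("Recall   :", recall_score(y_true, y_pred))

# F1 = 2 * precision * recall / (precision + recall)
print("F1 score :", f1_score(y_true, y_pred))

# AUC summarizes ranking quality across all thresholds, so it takes scores.
print("ROC AUC  :", roc_auc_score(y_true, y_scores))
```

Note that accuracy, precision, recall, and F1 are computed from hard class predictions, while AUC is computed from the predicted probabilities, since the ROC curve is built by varying the classification threshold.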
Write down a to-do to research additional classification metrics that could be useful for your use case. Now that you can identify and interpret the classification metrics available to you, let's explore regression metrics.