The F1 score is a classification accuracy metric that combines precision and recall. It is designed to be a useful metric for classification with imbalanced classes, or in other cases where simpler metrics such as raw accuracy could be misleading.
When classifying between two classes (“positive” and “negative”), there are four possible prediction outcomes:
| | Actual Positive | Actual Negative |
| --- | --- | --- |
| Predicted Positive | True Positives | False Positives |
| Predicted Negative | False Negatives | True Negatives |
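As a sketch, these four counts can be tallied directly from paired lists of actual and predicted labels. The helper below (`confusion_counts` is a hypothetical name, not from any particular library) assumes binary labels where 1 marks the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Tally TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn
```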
Precision answers the question, “What fraction of positive predictions are actually positive?”
A cancer diagnostic test that suggested that all patients have cancer would achieve very low precision, since most of its positive predictions would be false positives.
$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$
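A minimal precision function following this formula (hypothetical name, with a guard against a zero denominator) might look like:

```python
def precision(tp, fp):
    # Fraction of positive predictions that are actually positive.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0
```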
Recall answers the question, “Out of all the actual positive cases, what fraction did we identify?”
A cancer diagnostic test that suggested that all patients have cancer would achieve perfect recall, as all patients that actually have cancer would be identified.
$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$
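The corresponding recall sketch, under the same assumptions, is:

```python
def recall(tp, fn):
    # Fraction of actual positives that were identified.
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0
```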
The F1 score combines precision and recall by taking their harmonic mean:
$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
For a classifier to have a high F1 score, it needs both high precision and high recall; if either one is low, the harmonic mean pulls the F1 score down.
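As an illustration using the hypothetical helpers sketched above, the “predict everyone has cancer” classifier from the examples achieves perfect recall but low precision, so its F1 score stays low:

```python
def f1_score(p, r):
    # Harmonic mean of precision and recall; zero if either is zero.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Example: 10 patients, 2 actually have cancer, and the test predicts
# "cancer" for everyone (an all-positive classifier).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1] * 10

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
p, r = precision(tp, fp), recall(tp, fn)
print(p, r, f1_score(p, r))  # precision 0.2, recall 1.0, F1 ≈ 0.33
```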