# The different averaging methods in sklearn.metrics.classification_report()

Posted by John on 2020-04-27
Words: 800 · Reading Time: 3 Minutes

## Preface

• Conclusion first: the `accuracy` row in the report is actually computing the micro average
• For the micro average, precision, recall, and f1-score are all identical

## F1-score

In general, we prefer classifiers with higher precision and recall scores. However, there is a trade-off between precision and recall: when tuning a classifier, improving the precision score often results in lowering the recall score and vice versa — there is no free lunch.

• no free lunch theorem

F1-score = 2 × (precision × recall)/(precision + recall)

Like the arithmetic mean, the F1-score always falls somewhere between precision and recall. But it behaves differently: the F1-score gives a larger weight to the lower number. For example, when precision is 100% and recall is 0%, the F1-score will be 0%, not 50%. Or say that Classifier A has precision = recall = 80%, and Classifier B has precision = 60%, recall = 100%. Arithmetically, the mean of precision and recall is the same for both models. But under F1's harmonic-mean formula, Classifier A scores 80% while Classifier B scores only 75%: Model B's low precision pulled down its F1-score.
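The harmonic-mean behaviour is easy to check with a few lines of plain Python (a minimal sketch using the numbers from the paragraph above):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        # Degenerate case: e.g. precision = 100%, recall = 0% gives F1 = 0
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Classifier A: precision = recall = 80%  ->  F1 is also 80%
print(round(f1(0.80, 0.80), 3))
# Classifier B: precision = 60%, recall = 100%  ->  F1 drops to 75%
print(round(f1(0.60, 1.00), 3))
# Extreme case: perfect precision but zero recall  ->  F1 = 0, not 50%
print(round(f1(1.00, 0.00), 3))
```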

## F1-score in multi-class

Take a 3-class example with the following per-class F1-scores:

• Class 1: 42.1%
• Class 2: 30.8%
• Class 3: 66.7%

### macro

Macro-F1 = (42.1% + 30.8% + 66.7%) / 3 = 46.5%

Macro-F1 is unweighted: every class gets the same weight.

• That is, under an imbalanced distribution, even a small class has the same influence on the overall score as a large one
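As a minimal sketch, the macro average is just the plain mean of the per-class scores:

```python
# Per-class F1-scores from the example above
per_class_f1 = [0.421, 0.308, 0.667]

# Macro-F1: unweighted mean -- every class counts equally,
# regardless of how many samples it has
macro_f1 = sum(per_class_f1) / len(per_class_f1)
print(round(macro_f1, 3))  # 0.465
```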

### weighted

With class supports (sample counts) of 6, 10, and 9, each class's F1 is weighted by its number of samples:

Weighted-F1 = (6 × 42.1% + 10 × 30.8% + 9 × 66.7%) / 25 = 46.4%

• That is, under an imbalanced distribution, the large classes will dominate the overall F1-score
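A sketch of the weighted average, using the per-class F1-scores and supports (6, 10, 9) from the formula above:

```python
# Per-class F1-scores and supports (sample counts) from the example
per_class_f1 = [0.421, 0.308, 0.667]
support = [6, 10, 9]

# Weighted-F1: each class's F1 is scaled by its share of the samples,
# so the larger classes dominate the result
weighted_f1 = sum(f * s for f, s in zip(per_class_f1, support)) / sum(support)
print(round(weighted_f1, 3))  # 0.464
```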

### micro

precision = TP/(TP+FP)

recall = TP/(TP+FN)

• TP is the numbers on the diagonal (the correctly classified samples)
• FP/FN are the other, off-diagonal numbers. An off-diagonal entry (A, B) of the confusion matrix (true label A, predicted label B) can be read as:
  • from class B's point of view, a sample of some other class A wrongly predicted as B is an FP
  • from class B's point of view, a sample of class B wrongly predicted as A is an FN

• Isn't accuracy just the number of correct predictions (the diagonal) out of all entries in the confusion matrix? Summed over all classes, every off-diagonal cell counts once as an FP (for its predicted class) and once as an FN (for its true class), so total FP = total FN, and micro precision = micro recall = micro F1 = accuracy
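This can be checked with a minimal sketch. The 3-class confusion matrix below is made up for illustration (it is not the 25-sample example above), but the identity holds for any matrix:

```python
# A made-up 3-class confusion matrix (rows = true label, cols = predicted label)
cm = [
    [5, 2, 0],
    [1, 6, 2],
    [0, 3, 5],
]

total = sum(sum(row) for row in cm)
tp = sum(cm[i][i] for i in range(len(cm)))  # diagonal: correct predictions
# Each off-diagonal cell is an FP (for its column's class)
# and simultaneously an FN (for its row's class)
fp = fn = total - tp

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
accuracy = tp / total

# All four coincide (up to float rounding)
print(micro_precision, micro_recall, micro_f1, accuracy)
```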
