I’ve recently run into a seemingly paradoxical situation while training a network to distinguish between two classes.
I used cross-entropy as my loss function. On the training set, the loss steadily decreased while the F1-Score improved. On the validation set, the loss decreased briefly before increasing and leveling off at around 2, which is normally a clear sign of overfitting. However, the F1-Score on the validation set kept rising and reached ~0.92, with similarly high precision and recall.
As I had never taken a closer look at the relation between the F1-Score and the cross-entropy loss, I decided to run a quick simulation and plot the results. The plots show the cross-entropy in relation to the F1-Score. In the first graph I varied the confidence range in which the misses landed, while hits were always perfect. The graph shows that misses made with high confidence create higher losses, even at high F1-Scores. In contrast, the confidence of hits has a negligible influence on the loss, as depicted in the second graph.
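A simulation along these lines can be sketched in a few lines of NumPy. This is a minimal reconstruction rather than the original script: the function name `simulate`, the miss rate of 8%, the uniform confidence ranges, and the clipping constant `eps` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(miss_conf, hit_conf, miss_rate=0.08, n_samples=10_000,
             eps=1e-12, seed=0):
    """Return (cross-entropy, F1) for synthetic binary predictions.

    miss_conf / hit_conf are (low, high) ranges, both >= 0.5, for the
    probability the model assigns to the class it predicts.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    y_true = rng.integers(0, 2, size=n_samples)
    is_miss = rng.random(n_samples) < miss_rate

    # Confidence in the predicted class, drawn separately for misses and hits.
    conf = np.where(is_miss,
                    rng.uniform(*miss_conf, size=n_samples),
                    rng.uniform(*hit_conf, size=n_samples))

    # A miss predicts the wrong class, a hit predicts the right one.
    y_pred = np.where(is_miss, 1 - y_true, y_true)

    # Convert "confidence in predicted class" into P(class = 1).
    p1 = np.where(y_pred == 1, conf, 1.0 - conf)
    p1 = np.clip(p1, eps, 1.0 - eps)  # avoid log(0)

    loss = -np.mean(y_true * np.log(p1) + (1 - y_true) * np.log(1.0 - p1))

    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    f1 = 2 * tp / (2 * tp + fp + fn)
    return loss, f1

# Same hits and misses (same seed), only the confidence of the misses changes.
print(simulate(miss_conf=(0.5, 0.6), hit_conf=(0.99, 1.0)))
print(simulate(miss_conf=(0.99, 1.0), hit_conf=(0.99, 1.0)))
# Same misses, only the confidence of the hits changes.
print(simulate(miss_conf=(0.5, 0.6), hit_conf=(0.9, 1.0)))
```

Because the seed fixes which samples are hits and which are misses, the F1-Score is identical across the three calls and only the loss moves, which isolates the effect of confidence.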
It seems likely that my network developed a high confidence in its predictions, answering almost only 0 or 1, which produced a high loss while still achieving reasonably high accuracy.
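A back-of-the-envelope calculation shows how this can happen. The numbers below are assumptions for illustration only (a miss fraction m of 8% and a probability of 10^-11 assigned to the true class on each miss), not values measured on my network. Here p_i is the probability the model assigns to the true class of sample i, and hits are assumed to contribute almost nothing to the loss.

```latex
\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_i
\;\approx\; -\,m \,\ln p_{\text{miss}}
\qquad\text{(hits: } p_i \approx 1 \Rightarrow \ln p_i \approx 0\text{)}

m = 0.08,\quad p_{\text{miss}} = 10^{-11}
\;\Rightarrow\;
\mathcal{L} \approx 0.08 \times 25.3 \approx 2.0
```

So a small share of extremely confident misses is enough to hold the loss at around 2 while most samples are still classified correctly.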