F1-Score rises while Loss keeps increasing

I’ve recently run into a paradoxical situation while training a network to distinguish between two classes.

I’ve used cross entropy as my loss of choice. On my training set the loss steadily decreased while the F1-Score improved. On the validation set the loss decreased briefly before increasing and leveling off around ~2, normally a clear sign of overfitting. However, the F1-Score on the validation set kept rising and reached ~0.92, with similarly high precision and recall.

As I had never taken a closer look at the relation between the F1-Score and the cross-entropy loss, I decided to run a quick simulation and plot the results. The plots show the cross-entropy in relation to the F1-Score. In the first graph I varied the range in which the misses landed, while hits were always perfect. The graph shows that misses made with high confidence create higher losses, even at high F1-Scores. In contrast, the confidence of hits has a negligible influence on the loss, as depicted in the second graph.

It seems likely that my network developed a high confidence in its predictions, only answering 0 or 1, which provoked a high loss while still achieving reasonably high accuracy.
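The effect can be reproduced with a few lines of Python. This is a minimal sketch, not the code behind the plots: the function names, the balanced labels, the 8% miss rate and the confidence values are all assumptions chosen for illustration. It keeps the set of misclassified samples fixed, so the F1-Score stays the same, and only changes how confident the misses are.

```python
import math
import random


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def f1_score(y_true, y_prob, threshold=0.5):
    """F1-Score of class 1 after thresholding the predicted probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def simulate(n=1000, miss_rate=0.08, miss_conf=0.99, hit_conf=0.99, seed=0):
    """Return (F1-Score, cross-entropy) for a simulated binary classifier.

    Assumptions (hypothetical setup): labels are balanced, a fixed fraction
    of samples is misclassified, miss_conf is the probability given to the
    wrong class on a miss, hit_conf the probability given to the correct
    class on a hit. Both must be above 0.5.
    """
    rng = random.Random(seed)
    y_true = [i % 2 for i in range(n)]
    miss_idx = set(rng.sample(range(n), int(round(n * miss_rate))))
    y_prob = []
    for i, y in enumerate(y_true):
        # Probability the simulated network assigns to the correct class.
        p_correct = (1.0 - miss_conf) if i in miss_idx else hit_conf
        # Convert it to the predicted probability of class 1.
        y_prob.append(p_correct if y == 1 else 1.0 - p_correct)
    return f1_score(y_true, y_prob), binary_cross_entropy(y_true, y_prob)


if __name__ == "__main__":
    for miss_conf in (0.6, 0.9, 0.99, 0.999999):
        f1, loss = simulate(miss_conf=miss_conf)
        print(f"miss_conf={miss_conf}: F1={f1:.3f}, loss={loss:.3f}")
```

Because the seed fixes which samples are missed, every run reports the same F1-Score, while the loss grows with `miss_conf`: a single miss contributes -log(1 - miss_conf), which is unbounded as the confidence approaches 1.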


[Plot: F1-Score and Cross Entropy with varying Miss Ranges]

[Plot: F1-Score and Cross Entropy with varying Confidence]

Loss of Attribution

The digitalisation of the world and the Internet changed the rules in many, if not all, fields. One major change I have only just realised is the loss of attribution. I often made attribution errors myself when confronted with opinions that I didn’t share. Yet when it came to my own views and the groups I feel associated with, I felt mistreated or misunderstood if people made general claims about such a group.

Before the internet it was relatively easy to attribute an action or statement to an organisation, such as a political party or a state. Actions were attributable because evidence was impossible, or at least very hard, to forge (without leaving a trace). Statements made by an individual could be traced back, as opinions were exchanged by direct contact.

This changed in the digital world. Data can be manipulated without leaving traces, and any traces that are left are hard to find and could also have been planted on purpose.

For this reason actions on the internet, such as hacking an organisation, can’t be attributed to someone based only on the traces left in the process. Hard, physical proof is needed to make such an attribution.

Hacks are generally attributed to the Russian government at the moment. This attribution explains (some) features of the malware and traces that were found. However, a skilled hacker could have avoided such traces or planted them on purpose.

Statements made by an individual are often labeled as left, right or liberal beliefs, but they are just the interpretations and thoughts of a single person. Even if this person associates themselves with a specific group, that doesn’t make their statements the general beliefs of that group. The association of an individual with a group has become more fluid on the internet. One only has to claim an association to be seen as the voice of such a group.

This error, where association is confused with attribution, can often be seen in today’s debates. For example, feminists are viewed as crazy women who want to see men suffer because a few individuals with extreme viewpoints identify themselves as feminists. The political right is often equated with Nazis because, again, a few individuals who call themselves “right” spread hatred against foreigners. The political left is confused with people who despise all authority or want to recreate the communism of the Cold War.

In conclusion, our old techniques of attribution don’t work any longer in the digital world. Actions can’t be attributed, and statements made by individuals only show their own views, not the general views of a group. This makes political and social debates more complex, as organisations (movements/parties) can’t be described by pointing to individual events or statements. Other techniques are required to pinpoint the general beliefs of a group.