Libove Blog

Personal Blog about anything - mostly programming, cooking and random thoughts





Thoughts on Analytics

Counting web page visitors accurately can be a challenging task, often bordering on the impossible.

There are two common methods to count visitors: extracting the number from server logs, or using analytics software that is embedded via an HTML snippet. Both approaches have limitations that result in distorted numbers.

Using analytics snippets tends to result in an underestimation of visitor counts. A significant portion of internet users run an ad blocker, which excludes them from the count. Additionally, some users disable JavaScript or refuse to load resources from external domains, which adds to the undercounting.

On the other hand, relying solely on server logs tends to overstate the number of visitors. This is because a significant portion of server logs comprises entries generated by bots and crawlers, which should ideally be filtered out to arrive at an accurate visitor count. Yet, many bots attempt to conceal their identity, making this filtering process challenging. Furthermore, some legitimate visitors might be counted multiple times, for example when they switch networks and acquire new IP addresses.
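As a rough illustration of the log-based approach, the sketch below counts distinct visitors from access log lines. It assumes the common "combined" log format and a short, hypothetical keyword list for spotting bots; both `count_visitors` and `BOT_KEYWORDS` are names made up for this example. It shows both weaknesses described above: bots that do not announce themselves are counted, and one person who changes IP address is counted twice.

```python
import re

# Combined log format (assumed):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical, incomplete keyword list. Bots that hide their identity
# will not match and are counted as visitors.
BOT_KEYWORDS = ("bot", "crawler", "spider")


def count_visitors(lines):
    """Count distinct (IP, user agent) pairs that do not look like bots."""
    visitors = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        if any(keyword in agent for keyword in BOT_KEYWORDS):
            continue  # self-identified bot or crawler
        visitors.add((match.group("ip"), agent))
    return len(visitors)
```

Defining a visitor as an (IP, user agent) pair is only one possible choice; a different definition gives a different number for the same log file.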

The true number of visitors will lie somewhere between the value produced by your analytics snippet and a number calculated from server logs. Unfortunately there is no way to retrieve the "true" number of visitors.

In conclusion: there is no single true visitor number. It always depends on how you count and on your definition of a visitor.


Podcast de facto Standard

I built a website and crawler to analyze the podcast ecosystem. The website contains various reports about the usage of feed tags and audio properties.

For the information about tag usage, I used defusedxml and wrote some basic validators to check that the included tags are also valid according to their specification.
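To give an idea of what such a validator could look like, here is a minimal sketch. It is not the crawler's actual code: the function names are made up for this example, and the only rule it checks is the RSS 2.0 requirement that an `<enclosure>` carries `url`, `length` and `type` attributes, with `length` given in bytes.

```python
from xml.etree.ElementTree import Element

REQUIRED_ENCLOSURE_ATTRIBUTES = ("url", "length", "type")


def validate_enclosure(enclosure: Element) -> list[str]:
    """Return a list of problems found in one <enclosure> element."""
    problems = []
    for name in REQUIRED_ENCLOSURE_ATTRIBUTES:
        if not enclosure.get(name):
            problems.append(f"missing attribute: {name}")
    length = enclosure.get("length")
    if length and not length.isdigit():
        problems.append(f"length is not a byte count: {length}")
    return problems


def check_feed(feed_xml: str) -> list[str]:
    """Parse a feed with defusedxml and validate every <enclosure>."""
    # Imported here so that validate_enclosure has no third-party dependency.
    from defusedxml.ElementTree import fromstring

    problems = []
    for enclosure in fromstring(feed_xml).iter("enclosure"):
        problems.extend(validate_enclosure(enclosure))
    return problems
```

Parsing with defusedxml instead of the standard library parser matters here because the feeds come from arbitrary servers and cannot be trusted.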

The audio analysis is done using ffmpeg and ffprobe. I only extract basic features provided by these tools.
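Extracting basic features with ffprobe could look roughly like the sketch below. It assumes `ffprobe` is installed and on the `PATH`; the helper names and the selection of features are assumptions for this example, not the ones used on the website.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_format", "-show_streams",
         "-of", "json", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ffprobe fails
    )
    return json.loads(result.stdout)


def basic_audio_features(report):
    """Pick a few basic features from the first audio stream of a report.

    Raises StopIteration if the report contains no audio stream.
    """
    audio = next(
        stream for stream in report.get("streams", [])
        if stream.get("codec_type") == "audio"
    )
    return {
        "codec": audio.get("codec_name"),
        # ffprobe reports the sample rate and duration as strings
        "sample_rate": int(audio["sample_rate"]),
        "channels": audio.get("channels"),
        "duration_seconds": float(report["format"]["duration"]),
    }
```

Calling `basic_audio_features(probe("episode.mp3"))` would then return the features for a local file, where `episode.mp3` is a placeholder name.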

I hope this information is helpful to the podcasting community and people building their own podcasting system.





Now running on Owl Blogs 2.0

I upgraded my blog to version 2.0 of Owl Blogs, a complete rewrite of my blog software. Some features, such as Webmention, are not ported yet, but the new implementation allows me to be more flexible with features.

My next step will most likely be POSSE (Publish on your Own Site, Syndicate Elsewhere), as I often post photos to multiple sites manually.