Just a couple of weeks ago, I described my home-grown analytics service for this site. I was expecting negativity around the post, given that people hear the word “tracking” and freak out, so I was surprised to see an overwhelmingly positive reaction.

Today, I’m here to describe a couple of updates to this service: namely, the support for comments and the complete removal of client-side fingerprinting.

Comments

Since the creation of this blog in 2004, my intent has been to support post comments.

Looking back at my email archives, I can see that the posts were reasonably lively back in the day. Unfortunately, comment activity died off after a while and I think this happened for two reasons:

  • The first is my dubious switches between commenting technologies over time. Two critical mistakes were the adoption of Google+ comments around 2013 (back when the blog was hosted on Blogger) and the adoption of Disqus around 2016 when the blog moved to a static site.

  • The second is that people don’t seem to enjoy discussing their ideas on the posts themselves any more. These days, discussions happen on sites like Hacker News or Reddit, and only if the posts make it there at all.

But I would like to have comments back and, given that I spent the time to build my service as a generic platform for static sites, I took a stab at implementing my own commenting feature.

As happened with the EndBASIC cloud service, implementing the feature itself was easy, but the follow-up steps to make it “production ready” were tedious. In particular, adding some form of abuse protection was as hard as, if not harder than, the comments feature itself. More on this below.

Go ahead; take this as an invitation to test the posting feature at the bottom of this post 😀. I’m sure there are still rough edges (the JavaScript is still shitty) and things won’t always work… so please let me know out of band if that happens.

Fingerprinting no more

The initial version of the analytics tracker implemented client-side fingerprinting via UUIDs stored in a cookie (if allowed). Later on, in an attempt to remove cookies altogether, I switched to using FingerprintJS to obtain stable IDs over time, which “worked” but was definitely a worse solution privacy-wise. This had to be rethought.

With this new release, client-side fingerprinting is now completely gone, as are cookies in all cases. However, the service still needs to identify clients with some reasonable precision over a short period of time for two reasons: to distinguish page views from the number of daily visitors and to implement abuse protection mechanisms in the voting and commenting features.

The way this works now is that the server computes a pseudo-unique identifier for a client based on its IP and user agent. The identifier is then combined with the site’s ID to make it untraceable across sites, and with a secret salt to make it unpredictable to anyone who looks at the code or at old data (which, right now, is only me). Once the service has computed this weaker ID and the country of origin, it no longer needs to keep track of the IP.
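
To make the idea concrete, here is a minimal Python sketch of how such an identifier could be derived. It is not the service's actual code: the choice of HMAC-SHA256, the `client_id` function name, and the length-prefixed field encoding are all assumptions made for illustration.

```python
import hashlib
import hmac


def client_id(ip: str, user_agent: str, site_id: str, salt: bytes) -> str:
    """Derive a pseudo-unique client identifier (hypothetical scheme).

    The secret salt is used as the HMAC key, so the result cannot be
    recomputed by someone who only knows the IP, user agent and site ID.
    """
    fields = (site_id, ip, user_agent)
    # Length-prefix every field so that different splits of the same bytes,
    # such as ("ab", "c") and ("a", "bc"), never produce the same message.
    message = b"".join(
        len(raw).to_bytes(4, "big") + raw
        for raw in (field.encode("utf-8") for field in fields)
    )
    return hmac.new(salt, message, hashlib.sha256).hexdigest()
```

With a scheme like this, the same visitor gets different identifiers on different sites because the site ID is part of the hashed message, and the raw IP can be discarded as soon as the function returns.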

Abuse protection

My main concern with all of this is spam. A commenting feature without authentication is an open invitation to spammers to post garbage.

In an attempt to mitigate this, I have implemented some rudimentary abuse protection features in the server. These work by looking at the recent behavior of a client and determining whether it is allowed to post or not. The policies are weak right now but the infrastructure is in place, so hopefully it will be easy enough to tweak it over time.
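
As an illustration of what "looking at recent behavior" could mean, the sketch below shows a sliding-window limit on posts per client. The actual policies are not described here, so the `RecentActivityPolicy` class, its parameters, and the in-memory storage are hypothetical.

```python
from collections import defaultdict, deque


class RecentActivityPolicy:
    """Allow at most `max_posts` posts per client within `window_secs` (hypothetical)."""

    def __init__(self, max_posts: int, window_secs: float) -> None:
        self.max_posts = max_posts
        self.window_secs = window_secs
        # Maps a client identifier to the timestamps of its accepted posts.
        self._history = defaultdict(deque)

    def allow_post(self, client_id: str, now: float) -> bool:
        recent = self._history[client_id]
        # Forget posts that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_secs:
            recent.popleft()
        if len(recent) >= self.max_posts:
            return False
        recent.append(now)
        return True
```

A server could call `allow_post` with the pseudo-unique client identifier and the current time before accepting a comment. Tightening the policy then only means changing the two constructor arguments.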

Note, however, that if the server misidentifies you as a bot… there is no way around it. In those cases, there should be a feature to prove that you are not a bot (a la reCAPTCHA)… but I didn’t fancy building it right now.

We’ll see how far this simple approach lets me go. Or not.

What’s next?

An immediate first thing to do would be to try to recover any old comments I still have access to (in Disqus, email archives and, if I can find it, my Google+ export) and import them into the new system.

After that, it’d be great to integrate the service with email, which would open the door to two extra features: post reply notifications and email subscription handling (to ditch Follow.it). I haven’t researched how to send email from a cloud service because it sounds “hard”, so these will have to wait.

For now, though, I want to focus on writing a few new posts and going back to hacking EndBASIC.