A scale to rule them all

It’s part and parcel of my original role as a software engineer (and an old-school RPG lover) that I’ve been around tables and scores my entire life. I’m by no means a scholar, but I’ve always prided myself on being someone who attempts to solve issues through the tools of logic and reason. From a young age, this took the form of a simple desire: I wanted to quantify everything.

Books. Years. Sentiments. Productivity scores. Sleep quality. The list goes on.

I was tracking data sets long before it got cool with the arrival of the Apple Watch, Fitbits, and all kinds of wearables. If you’re new to the world of tracking, let the ramblings of an internet nerd who’s passionate about something weird introduce you to that love: psychometrics.

My constant desire to hack the rating and metric systems available to me always left me wanting. As a relatively poor kid growing up in Brazil, the pool of options available to me, pre-internet, was fairly small. Once I got access to the internet, I discovered a field of study called psychometrics, and quickly fell deep into the rabbit hole of consuming as many academic papers as I could find.

At first, this resulted in more confusion than practical solutions. While what I found in this field made sense for specific, clinical measurements, it didn’t offer me a simple tool that I could apply to everything around me. Sure - I can understand that perhaps no single metric system will work for everything, but if you look around with open eyes, you’ll quickly see a handful of de facto scales that we’ve collectively accepted as widely used and understood. These include:

The scale of 1 to 10 (or, even more loathed, 1 to 100)

Let’s imagine you’d like to score how well you slept last night, using an arbitrary system to assign a score between 1 and 10. The problem? There are far too many gradations to choose from! To kick us off, ten must be the most PERFECT night ever experienced in the history of human sleep, while one sits at the Bruce-Wayne-after-his-parents-were-murdered end of the "scale" (decidedly without benzodiazepines).

Where does that leave us when positing a five? A night with nothing special, memorable, out of the ordinary? Is five a good score, a neutral score, a disappointing score? To make things more complicated, how about a night of rest scored at a seven? Was this an above-average cosy night of sleep, or a deeply restful, happily dream-filled night without a single toss and turn?

Don’t even get me started on my hatred of 1 to 100. A night scored at a 92?

Fuck off.

This scale, in my understanding, only works when you’re using a system that defines the maximum score (10, or 100) as a baseline and then removes points based on flaws (as in the case of Olympic sports). Beyond that, it’s useless.

5 stars

I used this rating system for a long time, while rubbing up against the same issue of quantifying things properly. It did show me clear patterns in how my scores were distributed, but the questions kept piling up. Do we assume the minimum score is one star, or zero stars? What actually separates 2 stars from 3 stars? Where’s our baseline? Is 3 stars good, average, or underwhelming? Meanwhile, five stars with half-stars is just a shitty way to get back to a 10-point scale. Don’t make me flip a table.

Binary (thumbs up, thumbs down)

The evolution of the internet and the expansion of social media platforms have seen us flooded with likes, favorites and reactions, giving us a new way to quantify the success of something. In order to make our online ‘social’ engagement experience even more palatable, behemoth Facebook even dropped the thumbs-down. No one gets easily quantified negative responses from social media, although that doesn’t let you off the hook for handling the non-quantified negative feedback you may come across in such realms.

Rating something as "good" (1) or "bad" (0) denies us the opportunity to evaluate things with more depth. If 1 to 10 is too vast, binary is too narrow. My hunt for a middle ground continued.

My solution

1, 3 and 5 with baseline polarity. There it is, folks.

An interesting finding from using the 5-star rating system for so long was the distribution of the data I amassed - most of it was 1s, 3s and 5s. 1s indicated a bad outcome (at any level), 3s represented the baseline (good), and 5s were extraordinary successes (excellent).

After realizing this, I began using only three numbers on my scale, simplifying and strengthening my monitoring capabilities in one fell swoop. I also decided to add an element that allows me to track negative things, such as headaches, stress, or rowdy neighborhood noise levels (thanks, Brazilian funk). I call this defining the baseline polarity.

If the baseline polarity is positive, one is bad, three is good, and five is excellent. If it’s negative, one is good, three is bad, and five is a disaster.
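
For the programmers in the room, here is a minimal sketch of that rule in Python. It isn't lifted from my spreadsheet or from the app; the names `Polarity`, `LABELS` and `interpret` are made up purely for illustration, and the only thing it encodes is the mapping described above.

```python
from enum import Enum


class Polarity(Enum):
    """Baseline polarity of whatever is being tracked (hypothetical name)."""
    POSITIVE = "positive"  # e.g. sleep quality, a meal
    NEGATIVE = "negative"  # e.g. headaches, stress, noise


# The meaning of each allowed score, per polarity, as described in the text.
LABELS = {
    Polarity.POSITIVE: {1: "bad", 3: "good", 5: "excellent"},
    Polarity.NEGATIVE: {1: "good", 3: "bad", 5: "disaster"},
}


def interpret(score: int, polarity: Polarity) -> str:
    """Translate a 1/3/5 score into its meaning for the given polarity."""
    try:
        return LABELS[polarity][score]
    except KeyError:
        # 2s and 4s (and anything else) are deliberately not part of the scale.
        raise ValueError(f"score must be 1, 3 or 5, got {score!r}") from None


print(interpret(3, Polarity.POSITIVE))  # good
print(interpret(3, Polarity.NEGATIVE))  # bad
```

The same number means different things depending on what you're tracking, so the polarity has to be stored alongside each metric rather than guessed from the score.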

I’ve used this system for a while now and have found it serves me well. I’ve implemented it in a few different ways: in an Excel spreadsheet that I use to track a number of different things, in my unfinished mobile app Subjective (sign up for the beta if you want to play with it whenever I finish it!), and as a custom set of Roam Research data tags.

This is an in-progress experiment. I’m not looking to revolutionize the field of psychometrics, or reach for the heights of a Nobel prize. If (or when) I find something that works better, I’ll happily trash this system and move right along. However, if this is useful to you, feel free to adapt it for your own purposes (and let me know, so I can continue to improve upon it).

Now, off to hunt down a (hopefully) 5-rated lunch.