Making better decisions with the Brier score

Whether we like it or not, we all make decisions and forecasts - large and small - every hour of every day.

In his 2015 book, Superforecasting: The Art and Science of Prediction, political scientist Philip Tetlock revealed that based on almost two decades of research, the average “expert” was roughly as accurate as a “dart-throwing chimpanzee” in forecasting global events. However, he also discovered that a small group of people turned out to have genuine skill, and they seemed to have one key trait in common: they were great at being wrong, then learning from their mistakes to course-correct, much like a captain steering his ship. So how can we apply this in our own lives?

One way is to use the Brier score. Named after meteorologist and statistician Glenn Wilson Brier, the Brier score is simply a number between 0 (best) and 1 (worst) which measures the accuracy of probabilistic forecasts for binary outcomes. To use the Brier score to learn from our mistakes and sharpen our decisions, we need to do two things:

  1. Make a probabilistic forecast for a binary outcome.
  2. Calculate the Brier score for your forecast.

Let’s start with 1. A probabilistic forecast is a number between 0 and 1 (or 0% and 100% if you prefer) that expresses how likely you think the outcome is to happen, and a binary outcome is something that will either happen (outcome = 1) or it won’t (outcome = 0). So for example, if you forecast that there is an 80% chance of rain tomorrow, this is a probabilistic forecast for a binary outcome: 80% (or 0.8) is a number between 0 and 1, and it will either rain tomorrow or it won’t.

Next, how do we actually calculate the Brier score? It turns out that the Brier score is simply the mean squared error between the true outcomes and our forecasts, where 0 and 1 are the best and worst possible scores respectively. We can express this in pseudocode as follows:

brier_score = sum((actual_outcomes - forecasts)^2) / number_of_forecasts
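To make this concrete, here is a minimal Python version of that formula (a sketch of my own - the function name and the use of plain lists are just illustrative choices):

def brier_score(actual_outcomes, forecasts):
    # Mean squared error between binary outcomes (0 or 1) and
    # probabilistic forecasts (numbers between 0 and 1)
    squared_errors = [(outcome - forecast) ** 2
                      for outcome, forecast in zip(actual_outcomes, forecasts)]
    return sum(squared_errors) / len(forecasts)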

To put this into context, let’s look at an example. If I forecast that there is a:

  • 30% chance (forecast_1 = 0.3) of rain tomorrow; and
  • 80% chance (forecast_2 = 0.8) of rain the day after

And the weather turns out to be:

  • Sunny tomorrow (actual_outcome_1 = 0); and
  • Rainy the day after (actual_outcome_2 = 1)

Then my Brier score would be:

brier_score = ((actual_outcome_1 - forecast_1)^2 + (actual_outcome_2 - forecast_2)^2) / 2
            = ((0 - 0.3)^2 + (1 - 0.8)^2) / 2
            = (0.09 + 0.04) / 2
            = 0.065

which is quite good, given that 0 is the best possible score. This may seem strange, since only 1 of my 2 rain forecasts actually came true. The score is low because I assigned a high probability (80%) to the forecast that came true, and a low probability (30%) to the one that didn’t. In other words, the level of confidence I place on each forecast matters more than the number of forecasts I get correct. To see this, if I flip the probabilities above, this time my Brier score would be:

brier_score = ((actual_outcome_1 - forecast_1)^2 + (actual_outcome_2 - forecast_2)^2) / 2
            = ((0 - 0.8)^2 + (1 - 0.3)^2) / 2
            = (0.64 + 0.49) / 2
            = 0.565

which is almost 9 times higher (worse) than my previous Brier score, even though the same number of my forecasts came true. This is because this time I assigned a high probability to an outcome that didn’t happen, which pushed up my Brier score. In other words, I got punished for not knowing what I was doing.
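Plugging both sets of forecasts into the little Python function sketched earlier reproduces these numbers (up to floating-point rounding):

# Original forecasts: low confidence in rain on the sunny day, high on the rainy day
print(brier_score([0, 1], [0.3, 0.8]))   # 0.065

# Flipped forecasts: high confidence in rain on the sunny day, low on the rainy day
print(brier_score([0, 1], [0.8, 0.3]))   # 0.565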

This is the real beauty of the Brier score - you get rewarded if you know what you’re doing (i.e. assign a high probability to a true outcome), and punished if you’re overconfident and don’t know what you’re doing (i.e. assign a high probability to a false outcome). In this way, the Brier score can help us learn from our mistakes, become better decision makers and potentially change our lives - why not give it a try? I’d love to hear how it works out for you.