Benford’s Law is one of the most fascinating things I have ever come across. To explain what it is, though, let’s play a game (in fact, the game through which I learnt about Benford’s Law). Let’s first list the numbers from 1 to 9:

1 2 3 4 5 6 7 8 9

Now we are going to split these into two groups, one from 1-3 (Group A) and the other from 5-9 (Group B). For the purposes of our game the number 4 will be a dead zone, that is, no points will be gained on it (keep reading, it will make sense).

(1 2 3) ~~4~~ (5 6 7 8 9)

The way the game works is that we each pick a different group of numbers, Group A or Group B. Each round we then multiply two unknown values together in Wolfram Alpha and look at the first digit of the result. If it falls in your set of numbers, you get a point. First to ten points wins.

So, for example, the first round might be the mass of the sun * the number of people living in America = 6.14×10^38 person kilograms (mixed units can be strange). The first digit is a 6, therefore Group B would get a point.

Again, if you searched for US debt * Planck’s constant = 15.86 trillion planck US dollars, the first digit is a 1, so Group A would get a point.

So we start our game and you, being a logical, rational human being, choose Group B because there are more numbers in that group and so you have more chances of the first digit being one of your numbers. I, however, foolishly choose Group A.

Now let me tell you why I will win almost every time. (Hint: it’s due to Benford’s Law.)

Naively, we would expect the first digit to be equally likely to be any of the nine digits, about 11.1% each.

In actual fact, the distribution is heavily skewed towards small digits: a 1 leads about 30.1% of the time, while a 9 leads only about 4.6% of the time.

So yeah, Benford’s Law essentially states that the first digits of many lists of real-world data follow a logarithmic probability distribution: the leading digit is d with probability log10(1 + 1/d). Looking back at our game, this means the actual probability of each group scoring in a given round is:

Group A: 60.2%

Group B: 30.1%

‘Dead Zone’: 9.7%
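For anyone who wants to check these figures, here is a minimal Python sketch. It assumes the standard form of Benford’s Law, P(d) = log10(1 + 1/d), and that every round of the game is independent. The function names `benford` and `first_to` are just illustrative helpers, not part of any library.

```python
from math import comb, log10

def benford(d):
    """Probability that the leading digit is d (1-9) under Benford's Law."""
    return log10(1 + 1 / d)

# Per-round probabilities for the two groups and the dead zone.
group_a = sum(benford(d) for d in (1, 2, 3))
dead_zone = benford(4)
group_b = sum(benford(d) for d in (5, 6, 7, 8, 9))

# Dead-zone rounds score nothing, so only the ratio between the groups
# decides who takes the next point.
p_point_a = group_a / (group_a + group_b)

def first_to(n, p):
    """Chance that a player who wins each point with probability p
    reaches n points before the opponent does."""
    q = 1 - p
    return sum(comb(n - 1 + k, k) * p**n * q**k for k in range(n))

p_game_a = first_to(10, p_point_a)

print(f"Group A: {group_a:.1%}")
print(f"Group B: {group_b:.1%}")
print(f"Dead zone: {dead_zone:.1%}")
print(f"Group A wins a first-to-ten game: {p_game_a:.1%}")
```

Because the dead zone only delays the game, Group A takes each scoring round about two times out of three, which is what makes the first-to-ten result so lopsided.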

I’m not going to get into why the probability is distributed this way; however, if you’re interested, check out Wikipedia or Wolfram MathWorld. (My way of saying I can’t explain why it is that way.)

Benford’s Law works for a wide range of real-life data. testingbenfordslaw.com has many examples of real-life data sets that follow Benford’s Law, for example Twitter users by follower count.

There are, however, some data sets that do not follow Benford’s Law. These include:

- Data sets where numbers are influenced by human thought (e.g. shopping centre prices, which are set around psychological thresholds)
- Data sets where numbers are assigned (bank account numbers, telephone numbers…)
- Data sets of bank accounts with a built-in minimum or maximum

In the real world, Benford’s Law is actually used as part of a set of analytical tools to detect fraud. If the figures in a set of accounts don’t follow the logarithmic distribution, that is a red flag that the numbers may have been made up and deserve a closer look. Here is an example of Benford’s Law used to detect accounting fraud.
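To give a rough idea of how such a check might look, here is a small Python sketch. It is not how any particular auditing tool works: it simply compares observed leading digits with the Benford expectation using a chi-square statistic. Both data sets are synthetic and purely illustrative, the helper names are my own, and 15.51 is the usual 5% critical value for 8 degrees of freedom.

```python
from collections import Counter
from math import log10

def leading_digit(x):
    # Scientific notation puts the first significant digit at index 0.
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values):
    """Chi-square statistic of the leading digits against Benford's Law.
    Zeros are skipped because they have no leading digit."""
    digits = [leading_digit(v) for v in values if v != 0]
    observed = Counter(digits)
    total = len(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = total * log10(1 + 1 / d)
        stat += (observed[d] - expected) ** 2 / expected
    return stat

CRITICAL_5_PERCENT = 15.51  # chi-square, 8 degrees of freedom

# Synthetic example data (assumptions, not real accounts):
benford_like = [2**n for n in range(1, 201)]  # powers of 2 follow the law closely
suspicious = list(range(500, 700))            # every value starts with 5 or 6

for name, data in (("benford_like", benford_like), ("suspicious", suspicious)):
    stat = benford_chi_square(data)
    verdict = "looks fine" if stat < CRITICAL_5_PERCENT else "worth a closer look"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A high statistic only says the digits look unusual. As the list above shows, plenty of honest data sets fail the test for perfectly innocent reasons.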

So there you have it. Benford’s Law. One of my favourite counterintuitive curiosities.

October 6, 2012 at 6:20 pm

I’ve tested several physical data sets (steam tables, enthalpies, river lengths, atomic weights) against Benford’s Law, and they generally follow the pattern described by the law. The easiest way of explaining it is that a number growing at a fixed rate spends more time with 1 as the leading digit than 2, more time with 2 as the leading digit than 3, and so on.

There’s a spreadsheet at http://investexcel.net/3420/benfords-law-excel/ to help you with your own investigations
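The growth explanation above is easy to try out. The Python sketch below grows a value by a fixed percentage each step and counts how many steps it spends on each leading digit. The starting value, the 1% growth rate and the number of steps are arbitrary assumptions chosen for illustration.

```python
from collections import Counter

def leading_digit(x):
    # Scientific notation puts the first significant digit at index 0.
    return int(f"{x:.15e}"[0])

value = 1.0     # hypothetical starting amount
growth = 1.01   # assumed fixed growth of 1% per step
steps = 10_000

counts = Counter()
for _ in range(steps):
    counts[leading_digit(value)] += 1
    value *= growth

# Share of steps spent on each leading digit.
share = {d: counts[d] / steps for d in range(1, 10)}
for d in range(1, 10):
    print(f"{d}: {share[d]:.1%}")
```

Getting from 1 to 2 takes a 100% increase, while getting from 9 to 10 takes only about 11%, so the value lingers on the small leading digits for longer.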

October 6, 2012 at 6:44 pm

Cheers, I love your excel spreadsheet by the way.
