Thursday, March 01, 2007

Statistics

How can we rigorously justify any predictions that we make about the world? There are at least four big problems: 1) the world is non-stationary (i.e. any patterns, even dynamical patterns, are constantly evolving in time), 2) we only ever sample the world (we never get the complete distribution), 3) processes that seem to be random are usually just encoding information that we don't have, and 4) we never know the structure of the system.

For example, if you know that you have 100 people, and 20 of them are smokers, then it makes sense to say "if you choose one person at random, the probability that this person is a smoker is 20%". But what if you are given a die and asked for the probability that it will land on "6" on the next roll? Now the "universe" of possibilities is all the rolls that you could ever make with that die (an infinite set), and the probability is the long-run fraction of those rolls that land on "6".
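To make the contrast concrete, here is a minimal sketch of the smoker example. It assumes the population is just a list of labels, and the function names (exact_probability, sampled_estimate) are made up for illustration. When the whole population is known, the probability is a simple count; when we can only look at a sample, all we get is an estimate that changes from sample to sample.

```python
import random
from fractions import Fraction

def exact_probability(population, label):
    """Fraction of the full, known population that carries the label."""
    matches = sum(1 for person in population if person == label)
    return Fraction(matches, len(population))

def sampled_estimate(population, label, sample_size, rng):
    """Estimate the same fraction from a random sample (without replacement)."""
    sample = rng.sample(population, sample_size)
    return sum(1 for person in sample if person == label) / sample_size

# The known universe: 100 people, 20 of them smokers.
population = ["smoker"] * 20 + ["non-smoker"] * 80

exact = exact_probability(population, "smoker")  # Fraction(1, 5), i.e. 20%

# With only 25 people observed, the answer depends on who happened to be sampled.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
estimate = sampled_estimate(population, "smoker", 25, rng)
```

The estimate only becomes exact when the sample is the entire population, which is never possible for the die, since its "population" of rolls is infinite.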

If you start with the assumption that it's a fair die (and this is what people usually do, not just with dice but also in science), then you can easily answer "the probability is 1/6", because you are assuming (not measuring) the whole distribution.

If you don't make that assumption, then the best you can do is roll the die many times, count how many times it lands on "6", and use that fraction as an estimate. But while you're doing this, the die is changing. Molecules are rubbing off the surfaces, the molecular lattice of the material is changing, etc. Also, to really answer "what is the probability?", you have to specify which information you're allowed to use. The way the die is thrown has a huge impact on its final position. If you know exactly how the die is thrown, then the probability is likely to be close to either 0 or 1.
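The counting procedure, and the way a changing die undermines it, can be sketched in a few lines. This is a toy model under loud assumptions: the "wear" on the die is represented by a hypothetical drift_per_roll parameter that slowly increases the weight of the "6" face, which is an invention for illustration and not a physical model of a real die.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]

def roll(weights, rng):
    """Roll a die whose faces come up in proportion to the given weights."""
    return rng.choices(FACES, weights=weights, k=1)[0]

def frequency_of_six(n_rolls, drift_per_roll, seed=0):
    """Roll n_rolls times while the die drifts; return (observed frequency
    of "6", probability of "6" for the die as it is after the last roll)."""
    rng = random.Random(seed)
    weights = [1.0] * 6          # start as a fair die
    sixes = 0
    for _ in range(n_rolls):
        if roll(weights, rng) == 6:
            sixes += 1
        weights[5] += drift_per_roll  # assumed wear: "6" gets slightly likelier
    final_p_six = weights[5] / sum(weights)
    return sixes / n_rolls, final_p_six

# Stationary die: the observed frequency settles near 1/6.
freq_fair, p_fair = frequency_of_six(20000, drift_per_roll=0.0)

# Drifting die: the observed frequency averages over the die's whole history,
# so it describes neither the die we started with nor the die we have now.
freq_drift, p_drift = frequency_of_six(20000, drift_per_roll=1e-4)
```

In the drifting case the number we measured is a blend of many slightly different dice, so it is not the probability for the next roll.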

The final problem, "we never know the structure of the system", is the worst problem. What if the die lands on an edge rather than a side (maybe it's being rolled on carpet)? This does more than adjust the probability distribution over the known possibilities: it pulls us out of the entire space we thought we were in. Regardless of what system you define, there are always other factors unraveling the edges. Probabilities are only meaningful within a fictional formal system.
