unpacking a statistic

September 1, 2009

A new study in the Archives of General Psychiatry reports that the percentage of Americans on antidepressants almost doubled from 5.84% in 1996 to 10.12% in 2005. The first thing that probably hits you after reading that is, “Wow, that’s a big jump in ten years.” The next thing might be something like, “I wonder what caused that big of a jump?”

To an educated (but generally ignorant in psychiatry) observer, a few initial explanations seem plausible:
1) The number of depressed people has simply increased, bringing with it the number of people on medication.
2) Cultural acceptance of depression as a legitimate physiological illness has grown, allowing more people to come out into the open and seek treatment they had previously avoided.
3) Our ability to diagnose the illness has improved, allowing us to catch and treat more cases.
4) Our acceptance of (and desire for) drugs to address our problems has increased (to, what many would say — although I’ll withhold judgment — is an unhealthy level).

The authors of the study mainly think it’s 1). In 91/92, the rate of depression in the US was 3.3% and rose to 7.1% in 01/02. This increase in depression itself is an interesting nugget. If we fundamentally think of depression as a physiological disorder, I find it hard to believe that our brain chemistry has really changed that much in ten years. Another explanation is that people are seeking out their doctors more for mental health problems, but then we start spilling into hypothesis 2) and possibly 4).

The authors also suggest an additional hypothesis:

5) Between 1996 and 2005, four new antidepressants (mirtazapine, citalopram, fluvoxamine, and escitalopram) were approved by the FDA for treating depression and anxiety. Furthermore, while the total promotional spending stayed the same, the percentage of that aimed directly at consumers (rather than physicians) increased from 3.3% to 12.0%. So, another explanation of the increase is simply that there are more, and more aggressively marketed drugs out there.

The point I’m trying to make here is that statistics like “The percentage of Americans using antidepressants doubled between 1996 and 2005” (from SciAm), and their handling in casual print or conversation, often obscure the issues really at work. Most of us, myself included, don’t put enough thought into everything going on behind a statistic when we hear or read one. We’re not critical, skeptical observers, and that’s dangerous.

It’s nice to get out the magnifying glass and do a little digging every once in a while to really understand what all these numbers actually mean.

true randomness

July 1, 2009

We’ve all rolled dice in board games and are confident that those rolls are truly random, i.e. not dependent on any measurable forces. We can’t recreate or predict a given roll. You’ve probably also flipped a coin, which might seem easier to fix, but you’ve probably met a good deal of frustration if you ever tried to do so.

We take this kind of randomness for granted, but what’s interesting is that for all of our computational prowess, we are unable to create random numbers in computers. Of course, we can create pseudorandom numbers, but not the real deal.

Computers often generate pseudorandom numbers using a starting number, or seed, and then complicated functions to get the next random number. If you supply the same seed number to the randomizing function, you’ll get the same (infinite) stream of “random” numbers out. This property of pseudorandom numbers is actually quite useful when building and debugging computer programs because it allows the programmer to recreate seemingly random scenarios for testing.
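A minimal sketch of that property using Python’s standard random module (the seed values are arbitrary):

```python
import random

def stream(seed, n=5):
    """Return the first n pseudorandom numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# The same seed always yields the same "random" stream...
assert stream(42) == stream(42)
# ...while a different seed yields a different one.
assert stream(42) != stream(43)
```

This is exactly what makes seeded generators handy for debugging: fix the seed once, and the “random” scenario replays identically on every run.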

When computers are actually trying to generate random numbers, they often use the clock’s timestamp as a seed, since it’s never the same twice. While this technique does generate a random-looking stream of numbers, it’s still deterministic: given the seed value used at a specific time, we can predict every number that follows.
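To make that determinism concrete, here’s a hypothetical sketch in Python: a generator is seeded with the current Unix timestamp, and an observer who knows roughly when it started recovers the seed by brute force (the 60-second search window and three observed outputs are my assumptions):

```python
import random
import time

# Pretend a program seeded its generator with the current Unix timestamp.
secret_seed = int(time.time())
rng = random.Random(secret_seed)
observed = [rng.random() for _ in range(3)]

def recover_seed(observed, now, window=60):
    """Try every timestamp in a small window until the outputs match."""
    for guess in range(now - window, now + 1):
        candidate = random.Random(guess)
        if [candidate.random() for _ in range(len(observed))] == observed:
            return guess
    return None

# The "unpredictable" stream falls to a trivial search over recent seeds.
assert recover_seed(observed, int(time.time())) == secret_seed
```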

In order to achieve true randomness, programmers have had to turn to the real world. Sites like random.org, which generates random numbers from atmospheric noise, and (more recently) the Dice-O-Matic hopper, which physically performs millions of dice rolls a day (follow the link for a cool video of it in action), serve up genuine random numbers.

What’s interesting is that, in theory, these numbers aren’t really random either. Sure, they’re products of chaos, but they each have measurable forces acting upon them that cause them to behave in a predictable way (remember, quantum uncertainty operates on much, much smaller scales than dice rolling). What makes them effectively random, though, is that there are so many interacting forces that computing the outcome is impossible with today’s computing power. But Laplace’s demon could figure it out, which makes me wonder whether some day in the future we will be able to predict the outcome of a dice roll, taking us one small step closer to predicting the future itself.

(I’m currently reading a very interesting book by Daniel Dennett entitled Freedom Evolves about how to have free will in a determined universe. If this kind of stuff interests you, I’d certainly recommend it.)


May 27, 2009

For many in the real world, e gets no love. e, the base of the natural logarithm, actually has much more power than its other, better-known irrational cousin, pi or π. Most who have taken middle school math can tell you that pi is the ratio between the circumference and diameter of a circle: π = C/d.

No one can doubt the coolness of this relationship, and how its result is a crazy irrational number that never changes with the size of the circle.

The other main irrational number, e, however, does far more than explain a simple geometrical relation.


[Graph of the natural logarithm: as x increases from zero (exclusive) to infinity, the function increases from negative infinity to positive infinity. (thanks Wikipedia)]

Those who have taken calculus may remember that the function y = e^x has a derivative of…wait for it…e^x. No other function (up to a constant multiple) has this property. In essence, this means exponential functions have the distinct ability to incorporate their results into the next iteration of initial conditions (Thomasina uses this idea in Tom Stoppard’s excellent play Arcadia).
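You can check this numerically; a quick Python sketch using a central-difference approximation of the derivative (the sample points are arbitrary):

```python
import math

def numerical_derivative(f, x, h=1e-7):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# At every point, the slope of e^x equals the value of e^x itself.
for x in [-1.0, 0.0, 1.0, 2.5]:
    slope = numerical_derivative(math.exp, x)
    assert math.isclose(slope, math.exp(x), rel_tol=1e-6)
```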

Since the exponential function describes continual change, it’s perfect for modelling things like population growth, continuously compounded interest, and a large array of physical processes from heat transfer to fluid flow, really anything that involves a feedback loop. Differential equations, an area of math that relies heavily on the exponential function, are vitally important for pretty much every field of engineering.
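Compound interest makes the feedback loop concrete: compounding n times per year approaches the continuous limit e^(rt) as n grows, which is exactly where e comes from. A small Python sketch (the principal, rate, and time horizon are arbitrary choices of mine):

```python
import math

def balance(principal, rate, t):
    """Continuously compounded interest: principal * e^(rate * t)."""
    return principal * math.exp(rate * t)

def compounded(principal, rate, t, n):
    """Interest compounded n times per year; approaches balance() as n grows."""
    return principal * (1 + rate / n) ** (n * t)

# With a million compounding periods per year, the discrete formula is
# essentially indistinguishable from the continuous one.
p, r, t = 100.0, 0.05, 10
assert abs(compounded(p, r, t, 1_000_000) - balance(p, r, t)) < 0.001
```

The same pattern, a quantity whose growth rate is proportional to its current value, is what makes e show up in population models and heat transfer alike.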

Huge swaths of our modern world rely upon a very simple (albeit never-ending) number. No other number has that kind of power, especially not puny pi or the golden ratio.

For those of you who wasted your time memorizing digits of pi, here are the first two million of e.