Sample sizes and pointless graphs

I had a dream last night.

I was sitting in the pub at lunchtime having a beer with that clever guy from Pointless.  I forget the name.  In this dream, though, he was a pollster and he was telling me that he’d interviewed 500 people that morning (with a profile representative of the whole country) and concluded that 31.0% of the country were going to vote for Labour in the forthcoming general election.

“But Richard,” I said.  Richard, that’s his name.  “Richard, you’ve only interviewed 500 people.  How do you know that’s a big enough sample size?”

“I’m glad you asked that, chum,” he said.  I don’t think he knew my name either.  And he pulled down a roller blind with this graph on it:


“This graph, Simon,” he said.  Do I even look like a Simon?  “This graph shows how the proportion of people in my sample that say they’re going to vote Labour has evolved as the sample size has gradually increased.  By the time we get to 500 people the graph’s virtually flat.  That proves that I’ve spoken to enough people.”

“Now look,” I said.  Simon!?  “That graph proves nothing.  Any graph plotting the average of the first n observations of a random variable against n will flatten out like that.  You could learn something from actuaries here.  If an actuary is estimating the mean of a random variable by taking the average of n observations, he will measure the standard deviation of those observations and use this to build a 95% confidence interval around his central estimate.  If he knows how narrow a confidence interval he’d be comfortable with, he can even calculate the number of observations required to reach this level of confidence.  How accurate do you need your result to be, Bob?”
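For anyone who wants to try the argument at home, here is a minimal Python sketch of both halves of it: the running average that flattens out regardless, and the confidence interval built from the standard deviation of the observations. The 31% "true" Labour share, the random seed and the function names are assumptions made purely for illustration; none of it comes from Richard's actual data.

```python
import math
import random

def running_mean(xs):
    """Average of the first n observations, for n = 1..len(xs)."""
    total = 0.0
    out = []
    for n, x in enumerate(xs, start=1):
        total += x
        out.append(total / n)
    return out

def confidence_interval_95(xs):
    """Central estimate and 95% half-width, using the sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n)  # normal approximation
    return mean, half_width

random.seed(1)      # arbitrary seed, only so the run is repeatable
TRUE_SHARE = 0.31   # assumed "true" Labour share, for illustration only

# 1 = says they will vote Labour, 0 = anything else
votes = [1 if random.random() < TRUE_SHARE else 0 for _ in range(500)]

path = running_mean(votes)
mean, half_width = confidence_interval_95(votes)

# The curve barely moves over the last 100 interviews...
print(f"movement over last 100 interviews: {abs(path[-1] - path[-101]):.2%}")
# ...but the interval around the estimate says how uncertain it still is.
print(f"estimate after 500 interviews: {mean:.1%} +/- {half_width:.1%}")
```

The flat tail of the running average and the width of the interval are separate questions, which is the whole point: the first happens for any random variable, while only the second says anything about accuracy.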

“Hmmm… I think I need to be within ±0.5%.  And don’t call me Bob.”

“Well, looking at the results you have here, Bob, you can only get to a confidence interval of 31.0% ±1.9%.  To get the level of accuracy that you want, you’d need to talk to 7,200 people.”
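The jump from 500 to roughly 7,200 follows from the width of a confidence interval shrinking in proportion to one over the square root of the sample size. A short sketch, taking the ±1.9% quoted in the dream as given; the function names are made up for illustration. The second function is the textbook formula for a proportion under simple random sampling, which is an assumption the dream never states and which gives a wider interval for 500 people than the quoted one, so treat the figures as illustrative.

```python
import math

def required_sample_size(n_now, half_width_now, half_width_target):
    """Scale the sample size, using half-width proportional to 1/sqrt(n)."""
    return math.ceil(n_now * (half_width_now / half_width_target) ** 2)

def binomial_half_width(p, n, z=1.96):
    """Textbook 95% half-width for a proportion (assumes simple random sampling)."""
    return z * math.sqrt(p * (1 - p) / n)

# From +/-1.9% at 500 people down to +/-0.5%: about 7,200 people
print(required_sample_size(500, 0.019, 0.005))

# Half-width implied by the textbook formula for 31% of 500 people
print(f"{binomial_half_width(0.31, 500):.1%}")
```

Quartering the width of the interval means sixteen times as many interviews, which is why Richard's face fell.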

“7,200?  You must be joking!  I’ll stick with 500 thanks and justify it with this graph.  Nobody understands stats enough to be able to challenge me.”

“You’d be surprised, Bobby boy.  In the actuarial world, we’re always estimating the market-consistent value of with-profits guarantees by taking the average of a large number of observations.  Justifying the number of observations by showing off confidence intervals is business as usual to us.  Even the most non-technical directors are starting to understand that there’s a direct relationship between level of accuracy and number of observations.  If we tried your stunt with the graph, we’d be laughed out of the room.”

At this point strange things happened. Richard Osmond (that’s his name!) changed into a fire engine and my alarm woke me up.  It was all a crazy dream.  Whatever was I thinking?  I’m sure polling companies must adopt a scientific approach to sample sizes in the real world.

And I just checked Wikipedia. His name is Richard Osman.