“It won’t work,” I said. “Your results are too small to be statistically significant.”
They disagreed, rolled out as planned ... and got drastically different results from the prior year.
You don’t need a degree in statistics to understand this simple rule: For a test to be valid, you need at least 50 responses (orders) in each cell. Results from any cell that gets fewer are meaningless.
And you can turn this around. When creating a test, plan to get at least 50 responses per cell. That’s the reason behind the 5,000-name minimum that most list rentals impose. A 1-percent response rate from 5,000 names will yield 50 orders, just enough to be statistically significant. If you expect a response rate below 1 percent, rent more names to bring your expected orders up to at least 50 per cell.
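The arithmetic behind that rule can be sketched in a few lines of Python. This is only an illustration of the 50-orders-per-cell rule of thumb described above, not a formal significance calculation. The function name `names_needed` and its parameters are hypothetical, chosen for this example.

```python
import math
from fractions import Fraction


def names_needed(response_rate_percent, min_orders=50):
    """Smallest number of names per cell expected to yield `min_orders` orders.

    `response_rate_percent` is the expected response rate as a percentage
    (1 means 1 percent). `min_orders` defaults to the article's rule of
    thumb of 50 orders per cell. Both names are illustrative assumptions.
    """
    # Exact fractions avoid floating-point rounding surprises in the division.
    rate = Fraction(str(response_rate_percent)) / 100
    if rate <= 0 or rate > 1:
        raise ValueError("response rate must be above 0 and at most 100 percent")
    # Round up: a partial name cannot be rented.
    return math.ceil(min_orders / rate)


print(names_needed(1))    # 5000, the usual list-rental minimum
print(names_needed(0.5))  # 10000, half the response rate needs twice the names
```

At an expected 0.8-percent response, the same calculation calls for 6,250 names per cell rather than 5,000.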
The No-test Test
A national cataloger created a beautifully simple split-test. It divided a buyer group in half; made sure each half was large enough to yield statistically significant results; gave each half a different mail code; tracked the mail codes at order-taking time; and, after all sales were in, was rewarded with a very clear winner. The test segment had significantly outperformed the control segment.
So what had it actually tested that produced this clear result? Absolutely nothing — the test and control groups were treated the same at each point in the test. They got identical catalogs at identical times.
I once performed the same test myself in cooperation with one of my catalog clients. We called it the “no-test test,” because we went through all the motions of a test, only we didn’t do a single thing differently between the control and test segments. The results from both segments should have been essentially the same, but they weren’t.