Whether you are running an A/B Test or a Multi-armed Bandit test, there are many common traps you can fall into, which will cause your results to be misleading. Traps, you say? We must avoid them! These are some of the most common traps and how to avoid them:
- Small samples. Yes, you may have 1 million customers, but how many of them use the feature you are going to test? If you only have 100 customers using that feature, you may not have a large enough sample to get reliable results from your A/B Test. Before running a test, be sure you know the required sample size and have enough customer activity to generate the observations you need.
- Vague Hypotheses. Your test is designed to test something, but what is that something? If you aren’t crystal clear on what you are testing and what results you expect, then you won’t be able to trust the results. It is not as simple as “I think Option A will generate more revenue”. You need to be specific in your test hypothesis or else you can’t rule out that other factors influenced the result. A good hypothesis might be “I think that changing our email subject line to X will increase the email open rate by Y%”.
- Competing Tests. If you run more than one test at a time with the same group of customers, your tests may be competing with each other. How will you know if the improvement you see in Option A from Test 1 is real or a result of Option B of Test 2? Running multiple tests can cause all sorts of data pollution when tests share the same customers.
That last trap is a doozy! I have no doubt that you want to run more than one test at a time, but if doing so jeopardizes all of your tests, what should you do? We’ll cover that tomorrow when we go over running simultaneous tests on the same customers.
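In the meantime, back to the first trap: how many customers is "enough"? Here is a rough sketch of one common way to estimate it, using the standard normal-approximation formula for comparing two conversion rates. The function name, the 20% and 25% rates, and the defaults of 5% significance and 80% power are all assumptions made up for illustration; plug in your own numbers, and treat the answer as a ballpark rather than a guarantee.

```python
import math
from statistics import NormalDist


def required_sample_size(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate customers needed PER VARIANT for a two-sided
    two-proportion z-test (normal approximation).

    baseline_rate: current conversion rate (hypothetical input)
    expected_rate: rate you hypothesize the new option will reach
    alpha:         acceptable false-positive rate (assumed default)
    power:         chance of detecting a real difference (assumed default)
    """
    if baseline_rate == expected_rate:
        raise ValueError("The two rates must differ to compute a sample size.")

    # Critical values from the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    pooled = (baseline_rate + expected_rate) / 2
    spread_if_no_difference = math.sqrt(2 * pooled * (1 - pooled))
    spread_if_difference = math.sqrt(
        baseline_rate * (1 - baseline_rate)
        + expected_rate * (1 - expected_rate)
    )

    numerator = (z_alpha * spread_if_no_difference
                 + z_beta * spread_if_difference) ** 2
    return math.ceil(numerator / (expected_rate - baseline_rate) ** 2)


# Hypothetical example: lifting an email open rate from 20% to 25%.
print(required_sample_size(0.20, 0.25))
```

With those made-up rates, the estimate lands at roughly a thousand customers per variant, which is why 100 users of a feature rarely cut it. Notice also that the smaller the improvement you hope to detect, the more customers you need.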
Quote of the Day: “It’s a trap!” – Admiral Ackbar, Star Wars: Return of the Jedi