
Early stopping rules for A/B testing

Stefan89 11-26-14


Ideally, you'd specify the size of your test sample before starting an A/B test. However, in some cases the experiment reaches significance early on. Although it's not statistically correct to stop a test before the planned sample size has been reached, you might sometimes want to do so to optimize business value.
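To make that concrete, here is a quick, purely illustrative simulation (the helper name, batch sizes, and use of a plain two-proportion z-test are assumptions for this sketch, not anyone's production model). It runs A/A tests, where there is no real difference, checks for "95% significance" after every batch of visitors, and stops at the first significant peek; the resulting false positive rate ends up far above the nominal 5%.

    import random
    from statistics import NormalDist

    def peeking_false_positive_rate(n_tests=1000, peeks=20, batch=500,
                                    rate=0.05, alpha=0.05):
        # Critical z value for a two-sided test at the given alpha.
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)
        stopped_early = 0
        for _ in range(n_tests):
            conv_a = conv_b = n = 0
            for _ in range(peeks):
                # Both arms draw from the same true conversion rate (A/A test).
                conv_a += sum(random.random() < rate for _ in range(batch))
                conv_b += sum(random.random() < rate for _ in range(batch))
                n += batch
                pooled = (conv_a + conv_b) / (2 * n)
                se = (2 * pooled * (1 - pooled) / n) ** 0.5
                if se > 0 and abs(conv_b - conv_a) / n / se > z_crit:
                    stopped_early += 1   # "significant" result that is pure noise
                    break
        return stopped_early / n_tests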

 

Do you have stopping rules for deciding when you're allowed to stop an experiment earlier than expected?

 

For example, you could decide to stop an experiment early only when it is at least 99.9% significant at that interim stage, while at the planned end of the experiment you would draw your conclusions at 90-95% significance or more.
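A minimal sketch of that kind of two-threshold rule might look like the following (the function names, the thresholds, and the use of a standard two-proportion z-test are illustrative assumptions, not an Optimizely feature). Note that even a strict interim threshold only reduces the extra false positive risk from repeated peeking; formal group-sequential bounds such as O'Brien-Fleming are the textbook way to control it properly.

    from statistics import NormalDist

    def p_value_two_proportions(conv_a, n_a, conv_b, n_b):
        # Two-sided p-value for a standard two-proportion z-test.
        pooled = (conv_a + conv_b) / (n_a + n_b)
        se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
        if se == 0:
            return 1.0
        z = (conv_b / n_b - conv_a / n_a) / se
        return 2 * (1 - NormalDist().cdf(abs(z)))

    def should_stop(conv_a, n_a, conv_b, n_b, planned_n_per_variation,
                    interim_alpha=0.001, final_alpha=0.05):
        # Stop early only at the stricter interim threshold (99.9% significance);
        # once the planned sample size is reached, use the usual threshold (95%).
        p = p_value_two_proportions(conv_a, n_a, conv_b, n_b)
        if min(n_a, n_b) >= planned_n_per_variation:
            return p < final_alpha
        return p < interim_alpha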

 

 


Re: Early stopping rules for A/B testing

Hey Stefan,

Great question! You are correct that Optimizely can show that a variation has reached statistical significance before there is actually enough statistical power to know whether that determination is true. For this reason, we encourage customers not to check the results regularly until the test has reached a certain amount of traffic. To determine how much traffic you need, we provide a Sample Size Calculator (look under Resources on the Optimizely homepage). If you haven't already seen it, take a look at this article:

https://help.optimizely.com/hc/en-us/articles/200133789
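The calculator's exact internals aren't spelled out in this thread, but a standard per-variation sample size approximation for comparing two conversion rates looks roughly like this (the helper name and the example numbers are mine, not Optimizely's formula):

    from statistics import NormalDist

    def sample_size_per_variation(baseline_rate, relative_mde,
                                  alpha=0.05, power=0.8):
        # Standard two-proportion sample size approximation.
        p1 = baseline_rate
        p2 = baseline_rate * (1 + relative_mde)
        z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
        z_beta = NormalDist().inv_cdf(power)
        pbar = (p1 + p2) / 2
        n = ((z_alpha * (2 * pbar * (1 - pbar)) ** 0.5
              + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
             / (p2 - p1) ** 2)
        return int(n) + 1

    # Example: 3% baseline conversion rate, aiming to detect a 20% relative lift
    # at 95% significance and 80% power -> roughly 14,000 visitors per variation.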

Currently we do not have any logic that applies a higher significance threshold early on and then lowers it toward the end of the experiment, so it is important to take your sample size and the idea of statistical power into consideration. That being said, our team has been working on a revamped statistical model that will let users make decisions based on the data they see, without having to compare it against the sample size calculator. I don't have an exact launch date, but it should be coming out in the next few months, so keep an eye out for it!
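To make the power point concrete, here is a rough sketch (again, an invented helper, not part of Optimizely) of the approximate power you actually have at a fixed per-variation sample size; it is simply the usual two-proportion calculation run in the other direction.

    from statistics import NormalDist

    def power_for_sample_size(n_per_variation, baseline_rate, relative_mde,
                              alpha=0.05):
        # Approximate power of a two-sided two-proportion z-test at the given n.
        p1 = baseline_rate
        p2 = baseline_rate * (1 + relative_mde)
        z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
        pbar = (p1 + p2) / 2
        se0 = (2 * pbar * (1 - pbar) / n_per_variation) ** 0.5        # SE under H0
        se1 = ((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variation) ** 0.5
        z = (abs(p2 - p1) - z_alpha * se0) / se1
        return NormalDist().cdf(z)

    # Example: with only 3,000 visitors per variation, a 3% baseline and a 20%
    # relative lift, power comes out around 25%, i.e. the test would usually
    # miss a real effect of that size.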

Thanks!
Cyrus

Optimizely
jshklt 12-01-14
 

Re: Early stopping rules for A/B testing

If everything leading up to the test was done correctly (and it sounds like you're on top of it, statistics-wise), I'd suggest always leaving it running until you reach your desired sample size. Like Cyrus said, the best thing you can do is set it and forget it for a bit while your results populate - easier said than done, though. :)

This can be a little harder to stick to further up the funnel, where your sample size can be pretty huge (if you're doing a homepage test with conversion goals, for example), so personally I'd just make sure you're comfortable committing to running a potentially losing variant through to completion before setting anything live.

Really curious to see what others have to say though, as this is something I've wondered as well.
kylerush 12-05-14
 

Re: Early stopping rules for A/B testing

Cyrus has a great response to this question. The only time I stop an experiment before it reaches its planned sample size is when a confirmed bug in the variation or the control is impacting the results. I once tested two donation APIs against each other, and there was a bug in one of the APIs that prevented 50% of the donations from being processed. This was brought to light when the experiment became significant very early on. Once we were able to confirm the existence of the bug, we stopped the experiment, patched the bug, and then restarted the experiment.
Optimizely