Results Swinging After a Week
I was just wondering what experiences people have had with results swinging very suddenly after a week.
We've run several experiments whose results have been very promising at first, with the variation showing consistent positive results for 6-8 days. The bosses get very excited about this conversion rate swing, and look forward to implementing it sitewide at 100%.
Then, suddenly and equally consistently, the results reverse! All at once, the original takes the lead and stays there. I'm left having to explain what happened, and I don't have an explanation.
I'll caveat this by saying that our changes haven't been massive, but we get tons of traffic and should get statistical significance within that first week.
Has anyone else had this happen to them? It would be interesting to hear other tales.
It can depend very much on your traffic type and levels of traffic.
Do you see large swings in traffic source, for example, bursts of email newsletter traffic or PPC traffic to your pages?
Does your content differ weekly?
What is the niche/goal of your website and how many visitors are going through your test on a daily basis?
With a bit more information we might be able to compare our own experiences and help you work out whether what you're seeing is real.
Use a sample size calculator to determine the required sample size, then use your site traffic data to determine how many days (or weeks) that will take, and implore your team not to check the results before that time is up.
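To make that concrete, here's a rough sketch of what a sample size calculator does for a two-variant conversion test, using only the Python standard library. The baseline rate, expected lift, traffic figure, alpha, and power below are all illustrative assumptions, not numbers from your site:

```python
# Sketch of a two-proportion sample size calculation (illustrative numbers).
from statistics import NormalDist

def required_sample_size(p1, p2, alpha=0.05, power=0.8):
    """Visitors needed *per variant* to detect a move from rate p1 to p2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # desired statistical power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p2 - p1) ** 2

# Hypothetical: 5% baseline conversion, hoping to detect a lift to 6%.
n = required_sample_size(0.05, 0.06)
print(f"{n:.0f} visitors per variant")
# Divide by your actual daily traffic per variant to get the run length,
# e.g. at 2,000 visitors/day/variant:
print(f"about {n / 2000:.1f} days")
```

The point of doing the division up front is that the run length is fixed before anyone peeks at the dashboard, which is exactly what prevents the "it was winning on day 5" conversations.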
Datawise, you need to look at several factors when analyzing experiment results:
- sample size
- experiment duration
- statistical confidence
However, it's not enough to look at the snapshot of current values for data points. You really need to look at data over time. How long has that statistical confidence level stayed above 90%? Is it trending up or down? Also, you need to look at the context of the timeframe over which you're running the experiment. Is it a heavy traffic season, or maybe a slow one? What items in the company's marketing cycle line up with this period? Did an email or catalog go out that caused a spike in new visitor traffic?
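To illustrate why a single snapshot of confidence is misleading, here is a small simulation sketch: an A/A test (both variants identical, so any "winner" is noise) where the cumulative two-proportion z-test is recomputed each day. The conversion rate, daily traffic, and seed are arbitrary assumptions; the point is that the day-by-day p-value drifts, and an early peek can easily catch it on a temporary dip:

```python
# Simulated A/A test: recompute cumulative significance daily.
# All numbers are illustrative; no real difference exists between variants.
import random
from statistics import NormalDist

random.seed(7)          # arbitrary, just for repeatability
TRUE_RATE = 0.05        # assumed identical conversion rate for both variants
DAILY_VISITORS = 2000   # assumed traffic per variant per day

conv_a = conv_b = n = 0
for day in range(1, 15):
    n += DAILY_VISITORS
    conv_a += sum(random.random() < TRUE_RATE for _ in range(DAILY_VISITORS))
    conv_b += sum(random.random() < TRUE_RATE for _ in range(DAILY_VISITORS))
    p_a, p_b = conv_a / n, conv_b / n
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5  # pooled standard error
    z_score = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
    print(f"day {day:2d}: lift {p_b - p_a:+.4f}, p = {p_value:.3f}")
```

Watching the printed p-value wander up and down over the two weeks is a decent mental model for the "promising for 6-8 days, then it reverses" pattern in the original question: checking daily and acting on the first significant reading inflates the false-positive rate well above the nominal 5%.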
Result analysis very much depends on understanding a site's demographic(s) during the experiment's runtime period. Sure, you may have gotten enough traffic for a statistically significant result (as determined by a sample size calculator), but are the actual visitors within that sample broad enough in their demographics to make a determination? For example, say you run a test in the last week of September, but your site's catalog gets delivered in the first week of each month. That experiment probably isn't capturing the traffic being driven to the site by those catalogs. Would those visitors react differently to the changes being made? You don't know, because you didn't run the experiment during a period when they'd likely be on the site.