Statistical significance is higher for the smaller improvement

paulkoch 05-02-16
Accepted Solution

Hi All,

 

Running a revenue-based multivariate test, I see that two versions are outperforming the baseline.  The version with the smaller improvement percentage has a higher statistical significance, which I wasn't expecting.

 

For the version with the higher improvement/smaller statistical significance, is a potential explanation that it has received a few big outlier orders?  Are there other explanations I'm not thinking of?

 

Thanks for your help! 

David_Orr 05-03-16
 

Re: Statistical significance is higher for the smaller improvement

Hi,

May I ask if any of the variations reached significance? I would not recommend drawing any conclusions until Stats Engine determines that the results it's calculating are significant. Here is a link to an article that may help explain why Stats Engine has not yet declared the results significant: https://help.optimizely.com/Analyze_Results/Stats_Engine%3A_How_Optimizely_calculates_results_to_ena...

May I ask if there are other goals? I generally suggest building your goals around your hypothesis. If the hypothesis is to generate more orders, then a pageview goal on the confirmation page may produce better results. Did your experiment happen to have this goal?

David
Senior Technical Support Engineer
Optimizely
paulkoch 05-03-16
 

Re: Statistical significance is higher for the smaller improvement

Hi David,

 

Thank you for the response.

 

No, the test has not reached statistical significance yet. We aren't making decisions until the tests have reached a threshold of confidence. Are you implying that the statistical significance number isn't accurately reported when it's below the threshold, or just that we shouldn't be making decisions based on a lower statistical significance?

 

Our goals and hypotheses relate to generating more revenue per visitor, so revenue is our primary goal.

For more context, here's what doesn't make sense:

  • Version 1: +9.7% improvement in revenue per visitor, 27% statistical significance.
  • Version 2: +4.3% improvement, 83% statistical significance.

The only explanation I can think of is that Version 1 had a few large outlier orders. Would this be a reasonable explanation, or can you think of any others?

 

Thanks!

David_Orr 05-03-16
 

Re: Statistical significance is higher for the smaller improvement

Hi paulkoch,

 

"Are you implying that the statistical significance number isn't accurately reported when it's below the threshold, or just that we shouldn't be making decisions based on a lower statistical significance?"

 

I apologize for not being clear. I did not mean to suggest making a decision based on the current statistical significance results. A low statistical significance percentage means that there's still a substantial chance that the result is a false positive. For example, 27% statistical significance would correspond to a 73% error rate (100% - statistical significance = error rate).
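
Written out for both versions mentioned in this thread, the arithmetic looks like the sketch below. The function name `error_rate_pct` is made up for illustration, and treating "100 minus significance" as the chance of a false positive is the simplification used in this reply, not a statement of how Stats Engine defines it internally.

```python
def error_rate_pct(statistical_significance_pct: float) -> float:
    """Rule of thumb from this reply: error rate = 100% - statistical significance."""
    return 100.0 - statistical_significance_pct


# Significance values quoted earlier in the thread.
reported = {"Version 1": 27.0, "Version 2": 83.0}

for name, significance in reported.items():
    print(f"{name}: {significance:.0f}% significance, "
          f"{error_rate_pct(significance):.0f}% error rate")
```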

 

"The only explanation I can think of is that Version 1 had a few large outlier orders. Would this be a reasonable explanation, or can you think of any others?"

 

I would also assume that outliers may be keeping this variation from reaching significance. You should be able to find spikes by changing the revenue goal's chart to display "revenue." Please see the attached image.
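
To illustrate how a single large order can produce this pattern, here is a minimal sketch on invented data. Everything in it is an assumption: the visitor counts, the $50 order value, the one $7,350 outlier, and the use of a plain fixed-horizon Welch z-test with "confidence" taken as 1 minus the two-sided p-value. Stats Engine uses a different (sequential) method, so the percentages will not match what the results page shows; only the direction of the effect is the point.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def lift_and_confidence(control, variant):
    """Return (relative lift, 1 - two-sided p-value) from a Welch z-test."""
    diff = mean(variant) - mean(control)
    se = sqrt(variance(control) / len(control) + variance(variant) / len(variant))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff / mean(control), 1 - p_value


def visitors(n, orders):
    """Per-visitor revenue: the given order values, zero for everyone else."""
    return list(orders) + [0.0] * (n - len(orders))


N = 10_000  # hypothetical visitors per variation
baseline = visitors(N, [50.0] * 1000)
# Version 1: fewer ordinary orders than baseline, plus one very large order.
version_1 = visitors(N, [50.0] * 950 + [7350.0])
# Version 2: a modest but consistent increase in ordinary orders.
version_2 = visitors(N, [50.0] * 1043)

lift_1, conf_1 = lift_and_confidence(baseline, version_1)
lift_2, conf_2 = lift_and_confidence(baseline, version_2)
print(f"Version 1: {lift_1:+.1%} lift, {conf_1:.0%} confidence")
print(f"Version 2: {lift_2:+.1%} lift, {conf_2:.0%} confidence")
```

In this made-up data set, Version 1 shows the larger lift in revenue per visitor, but almost all of it comes from one order, which inflates the variance and so lowers the confidence relative to Version 2.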

 

If the changes in the variations don't involve changing the pricing layout or pricing, I would recommend setting up a pageview goal on the confirmation page to show how many purchases each variation generated.

 

Lastly, I recommend taking a look at the Ideate and Hypothesize section of our Optiverse page for tips on how to create experiments: https://help.optimizely.com/Ideate_and_Hypothesize

 

David 

Senior Technical Support Engineer
Optimizely
Attachment: revenue goal graph.jpg
paulkoch 05-03-16
 

Re: Statistical significance is higher for the smaller improvement

Hi David,

Thanks for the reply. Why do you suggest using a pageview-based goal instead of a revenue-based goal? Will it usually reach an acceptable level of confidence more quickly?

We are making changes that we expect will influence average order value, and not just the order rate, so we're hoping we can still have revenue as a reasonable primary goal.

Thanks,
Paul
David_Orr 05-03-16
 

Re: Statistical significance is higher for the smaller improvement

If the experiment involves pricing, then having a revenue goal would be fine. The pageview goal on the confirmation page can be included in addition to the revenue goal. It will give you a broader view of the effects of the experiment. But given that you mentioned the experiment is meant to measure average order value, you may disregard my recommendation.

Senior Technical Support Engineer
Optimizely
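
On the earlier question of whether a pageview goal usually reaches confidence more quickly: a rough way to see why it often can is to compare how noisy the two metrics are. The sketch below is not Stats Engine's calculation. It assumes a simple model in which each visitor orders with probability `p` and order values are independent of that, and it uses the textbook fixed-horizon rule of thumb that the number of visitors needed to detect a given relative lift scales with the metric's squared coefficient of variation. The 10% conversion rate, $50 mean order, and $75 standard deviation are made-up inputs.

```python
def squared_cv_conversion(p: float) -> float:
    """Squared coefficient of variation of a yes/no conversion with rate p."""
    return (1 - p) / p


def squared_cv_revenue_per_visitor(p: float, order_mean: float, order_sd: float) -> float:
    """Squared CV of revenue per visitor when a fraction p of visitors order
    and order values have the given mean and standard deviation."""
    return (1 - p) / p + (order_sd / order_mean) ** 2 / p


# Hypothetical inputs, chosen only for illustration.
p, order_mean, order_sd = 0.10, 50.0, 75.0

cv2_conversion = squared_cv_conversion(p)
cv2_revenue = squared_cv_revenue_per_visitor(p, order_mean, order_sd)
ratio = cv2_revenue / cv2_conversion
print(f"Revenue per visitor needs roughly {ratio:.1f}x the visitors "
      f"of a conversion goal for the same relative lift")
```

The revenue term is never smaller than the conversion term, and it grows with the spread of order values. The comparison only holds if the change moves both metrics by a similar relative amount; a change that mainly affects average order value may not register on a pageview goal at all, which is why revenue remains the appropriate primary goal in that case.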