Best tool for organizing tests - what do you use?
Before you can prioritize which experiment to run next, you need hypotheses: predictions you make before running an experiment.
Once you have ideas, you want to see them all in one place to start prioritizing. Call this your repository, backlog, library—it’s the place where experiment ideas live.
Below I share some tools that I've seen my customers use, but I am sure there are many more out there I've never heard of.
What does your team use? Do you have any favorite templates you'd like to share?
- Trello
- Pros: Very simple, shareable, visual way to manage tests at different stages of development.
- Cons: I'm less familiar with Trello, but from what I've seen, it seems less robust and may work better for 'per-test', as opposed to 'program-level', documentation.
- Jira
- Pros: Robust, configurable system that is directly connected to the development team. Great for managing test submissions, and has some charting built in.
- Cons: Not very user-friendly and can add layers of friction for non-developers; might also leave you 'fighting the queue' if your development team takes on tests as tickets competing with other work.
- Pros: Great for task management with sub-tasks, plus social assignment features.
- Cons: Might best serve as a social task-management layer on top of your core roadmap (much like I've found with Basecamp and Trello); not intended for sophisticated formulas or charting.
- Documents (gDoc, Word, etc)
- Pros: Ultra simple, flexible. Quick to make.
- Cons: No dynamic functionality, not great for task management; good for individual test scoping.
- Spreadsheet (Google, Microsoft)
- Pros: Ultimately flexible, with dynamic functionality and charting built in; the de facto choice for most professionals.
- Cons: Might require more thought and effort to set up, will require maintenance and controls for sharing; desktop-based versions not easily shareable.
- Basecamp
- Pros: Simple, social. Boards are well suited to a high number of items. I love the ease of use.
- Cons: Like Trello, Basecamp may be good for individual tests, but it lacks the depth of charting and calculation functionality needed for more sophisticated schemes.
- …or a great project management tool out there that I may not have heard about. What does your team use?
First person to respond in the comments with evidence of a custom application for managing tests wins an Optimizely shirt!
This is amazing, Hudson! I also wanted to point to a wonderful discussion from a while back in Optiverse where many different ideas were surfaced: https://community.optimizely.com/t5/Strategy-Cultu
@ben mentioned that he used Jira and Confluence, @MJBeisch uses Salesforce, and @Jacob mentioned that his team uses Javelin Experiment Board (https://experiments.javelin.com/), which is based on the lean startup model. Perhaps you each can expand on how you use these tools here?
Not sure what you mean by "custom application"... surely there's no good reason to reinvent the wheel for the 10th time, unless you're testing at a massive scale.
Personally I use Trello to plan and organize testing for several organizations. Each board is tailored to the organization, but generally here are the lists I use:
- Ideas - Various hypotheses and ideas that may or may not be worthwhile to test.
- First Review - Before I start developing tests, I want to make sure the team is on board and that I can get additional resources as needed (development time, design time, buy-in from others, etc.).
- Development - Ideas that get approved go into development.
- Final review - Final review of experiment before it goes live.
- Live - La la la la la la...
- Completed - Review outcomes and lessons learned. Decide whether change will be implemented to the production site or not. In any case, the card gets copied to my "Completed Test" board, where I save completed tests from every organization for easy reference later (for example, if I'm writing a case study or blog article).
While my team uses Google Spreadsheets to intake and organize testing ideas, we use the PIE method to prioritize. PIE (potential, importance, and ease) is a weighted-average scoring method. It's simple but subjective, which is why I pair the PIE score with an ROI or LTV calculation for added insight. So which test wins? High PIE score with high ROI or LTV!
A link with instructions to our template can be found here (or you can download it):
We added a column called Social Proof but you don't have to!
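For anyone who wants to see the mechanics, here is a minimal sketch of the PIE scoring and ROI pairing described above. The component weights, idea names, and numbers are all made up for illustration; your spreadsheet template may weight the components differently.

```python
# Hypothetical PIE prioritization sketch: each idea gets potential,
# importance, and ease scores (1-10); the PIE score is their weighted
# average, and ROI breaks ties. All names and numbers are invented.

def pie_score(potential, importance, ease, weights=(1, 1, 1)):
    """Weighted average of the three PIE components."""
    wp, wi, we = weights
    return (potential * wp + importance * wi + ease * we) / (wp + wi + we)

ideas = [
    {"name": "Simplify checkout", "potential": 9, "importance": 8, "ease": 4, "roi": 120_000},
    {"name": "New hero image",    "potential": 5, "importance": 6, "ease": 9, "roi": 15_000},
]

for idea in ideas:
    idea["pie"] = pie_score(idea["potential"], idea["importance"], idea["ease"])

# Sort by PIE score first, ROI as tie-breaker: high PIE + high ROI wins.
ranked = sorted(ideas, key=lambda i: (i["pie"], i["roi"]), reverse=True)
print([i["name"] for i in ranked])
```

The same arithmetic is a one-line formula in a spreadsheet cell; the point is just that the ranking is deterministic once the scores are entered.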
Product Marketing @ Auction.com
Thanks for this very informative post, Hudson! There are many great nuggets to take away.
Not that my opinion matters, but we currently find Google Spreadsheets to be the most effective for several reasons. Mostly because Google Apps is frequently used within the organization, and our stakeholders don't easily adapt to new tools.
(I should mention, we've found transparency and stakeholder (executive) accessibility to be very important, and Google's sharable links come in handy.)
We also use Google Spreadsheets because it allows us to easily add sortable data points on the fly to ideas.
Here are several data points we take into consideration:
- Page Type (Landing, Home, etc.)
- Traffic Source (Search, Paid, etc.)
- Device Category
- Type of test (incremental, pricing, etc.)
- Experiment name (with hypothesis as a cell note)
- Formula Scoring Metrics (design effort, development effort, Revenue, Strategic Position, etc.)
These data points allow the team and stakeholders to easily sort and digest the backlog of prioritized ideas across our entire site.
Moreover, within the same spreadsheet we have our "archive" tab, where these data points (when moved over from the idea tab) come in handy for post-experiment stakeholder curiosity (e.g., "What pricing tests did we run within a specific user flow in Q3 of 2012?").
We also insert new data points for each experiment within our archive tab. Here are a few:
- "winner", "loser" or "inconclusive"
- experiment results link
- start and stop dates
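To make the archive idea concrete, here is a small sketch of the kind of query the archive tab answers. The rows, field names, and dates are hypothetical; in practice this is a filter or pivot over the spreadsheet rather than code.

```python
# Hypothetical archive-tab sketch: each completed experiment is a row
# carrying the data points listed above (type, flow, outcome, dates),
# and a stakeholder question becomes a simple filter. All data invented.
from datetime import date

archive = [
    {"name": "Annual plan discount", "type": "pricing", "flow": "checkout",
     "outcome": "winner", "start": date(2012, 7, 5), "stop": date(2012, 7, 26)},
    {"name": "New onboarding copy", "type": "incremental", "flow": "signup",
     "outcome": "inconclusive", "start": date(2012, 10, 2), "stop": date(2012, 10, 20)},
]

def pricing_tests_in_q3_2012(rows, flow):
    """'What pricing tests did we run within a specific user flow in Q3 of 2012?'"""
    q3_start, q3_end = date(2012, 7, 1), date(2012, 9, 30)
    return [r["name"] for r in rows
            if r["type"] == "pricing" and r["flow"] == flow
            and q3_start <= r["start"] <= q3_end]

print(pricing_tests_in_q3_2012(archive, "checkout"))  # ['Annual plan discount']
```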
But archiving methodologies are for a separate post.
All in all, each optimizer's situation is different, and this seems to be what works best for us (for now). That said, after reading your post I'm sure we have some areas where we could... optimize.
Thanks again for sharing your great ideas.