Best practices for getting Optimizely data at runtime (to combine with other data)
In the absence of a reporting API, we're looking for ways to combine Optimizely data with data from other parts of our customer activation funnel (e.g. AdWords, Google Analytics, Mixpanel). During a productive email dialogue with Optimizely support, we were encouraged to share our approach with the Optiverse. So here goes:
The Optimizely-GA integration won't work for us because of GA's 20 custom dimension limit (i.e. we need to test more than 20 distinct landing pages). So we figured out a way to hack around the limit by mimicking Optimizely's reporting data tutorial.
This would allow us to use a GA custom dimension and pass the `activeExperimentData` JSON string as the dimension's value. We'd select hit-level scope, and in reports we could segment on LP URL, which would let us re-use the same custom dimension on different LPs in the same "slice" of the activation funnel without worrying about overwriting data. (One potential issue is GA's 150-byte limit on dimension values, which we might bump into on pages running 5 or more active experiments and variations. Another is MV tests, because the experiment ID array can carry multiple values.)
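For what it's worth, here's a minimal sketch of the dimension-value idea. It assumes the experiment-to-variation map has already been read from the client-side Optimizely object; the exact shape of that map is an assumption on our part:

```javascript
// Sketch: serialize an experiment -> variation(s) map into a compact
// JSON string for a hit-scoped GA custom dimension, and flag whether
// it fits GA's 150-byte value limit. The map shape (experiment ID
// keys, arrays of variation IDs) is an assumption about what
// activeExperimentData contains.
function buildExperimentDimension(experimentMap, maxBytes) {
  maxBytes = maxBytes || 150; // GA's per-value byte limit
  var value = JSON.stringify(experimentMap);
  // GA counts bytes, not characters, so measure the UTF-8 length.
  var bytes = Buffer.byteLength(value, 'utf8');
  return { value: value, bytes: bytes, fits: bytes <= maxBytes };
}
```

In the browser the value would then be set on the tracker (e.g. `ga('set', 'dimension1', result.value)`) before the pageview hit fires; checking `fits` first tells you when the 5-experiments-on-one-page case blows past the limit.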
We could also pass the `activeExperimentData` JSON string as a Mixpanel super property (for A/B testing and funnel analysis), but we're not sure how we could (or even if we should) use MP to analyze Optimizely experiments/variations as part of MV tests across different sections of the activation funnel. (e.g. impact of bid modifiers -> a/b test ad copy -> optimizely LP variation -> email campaign -> etc. on customer signups/conversions. To note, we'd probably have similar challenges with GA, but auto-linking with AdWords helps.)
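One hedged alternative on the Mixpanel side: instead of registering one JSON blob, flatten each experiment into its own super property, so funnel reports can segment on a single experiment at a time. The `optly_exp_` prefix is just a naming convention we made up, not anything Mixpanel or Optimizely prescribes:

```javascript
// Sketch: flatten active experiment/variation IDs into individual
// Mixpanel super properties. Multiple variation IDs (e.g. MVT
// sections) are joined with '|'; the property naming is our own
// convention.
function toSuperProperties(experimentMap) {
  var props = {};
  Object.keys(experimentMap).forEach(function (expId) {
    props['optly_exp_' + expId] = [].concat(experimentMap[expId]).join('|');
  });
  return props;
  // In the browser you'd then call: mixpanel.register(props);
}
```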
And yet another idea is since we're already using GA's measurement protocol (HTTP endpoint) to fire offline (server-side) events, we could create a table for Optimizely in our database and use the GA client-ID as a foreign key. Then whenever a customer signs up/converts we could automatically update the corresponding Optimizely record. We wouldn't have to worry about a limit on dimensions or bytes per entry, and we could join/export data as we see fit.
It's starting to look like rolling our own solution could work best for our particular situation, but before we move forward we're keen to hear what others are doing (or have thought of doing), what works, and what doesn't.
Feedback, questions, ideas are all welcome. Please feel free to share. Thanks!
Kind of love these geeky ideas on how to integrate all your tools together to get the best insights into your customer data.
As we're not using Mixpanel I can't help you there, but hopefully I can give you an answer on the Google Analytics side. What we're doing is saving the experiment name + the current variation in a hit-scoped custom dimension in Google Analytics, so we know what the user is seeing on what page. What we're also trying to figure out is how to do that in a multivariate way. What do you think of combining the names of the experiments + variations in a name?
All of our users' actions are saved with event tracking, so we can see whether users are performing better or worse in a test.
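To make the combined-name idea concrete, a quick sketch; the delimiter choice is arbitrary, just pick characters that can't appear in your experiment or variation names:

```javascript
// Sketch: combine experiment + variation names into one custom
// dimension value. ':' and ';' as delimiters are an assumption.
function combineNames(pairs) {
  return pairs.map(function (p) {
    return p.experiment + ':' + p.variation;
  }).join(';');
}
```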
Curious which of the solutions from your initial question has your interest. You really have to go hardcore to get all your information into a database and make sure it's clean and usable, in my opinion.
What do you think of combining the names of the experiments + variations in a name?
That might work, but a few issues come to mind:
1. GA's 150-byte limit on dimension values
2. Human readability
3. Programmatically analyzing results of Optimizely MVTs via GA
WRT #'s 1 and 2, it feels like this type of nomenclature could easily become unwieldy.
Thus I would recommend either not going this route, or using some mechanism to automatically translate the names of dimensions to something human readable. (Perhaps this could be done in reporting views, or by creating a DB table which stores and maps the experiments+variations combos to their human readable counterparts.)
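For example, the translation mechanism could be as simple as a lookup map; the IDs and labels below are made up, and in practice the map could be generated from that DB table:

```javascript
// Sketch: translate compact experiment+variation ID combos into
// human-readable labels via a lookup map. IDs and labels here are
// invented for illustration.
var labelMap = {
  '1234567890:111': 'Homepage headline / "Free trial"',
  '1234567890:222': 'Homepage headline / "Demo"'
};

function humanize(combo) {
  // Fall back to the raw combo so unmapped values stay visible.
  return labelMap[combo] || combo;
}
```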
WRT #3, I honestly don't know whether this is an issue or not, which is itself the issue for me. I'd need to look into it more.
Curious which of the solutions from your initial question has your interest.
So first off, I don't foresee a perfect solution; there appear to be trade-offs with each choice. (Also, it should be noted that in this discussion I'm focusing on the activation funnel.) However, I'm leaning toward a homegrown solution to act as the 'glue' for data from other services.
WRT activation funnel optimization, here's a Ruby gem from Swiftype's Andrew Geweke that we're evaluating.
WebServerUid could be used to create foreign keys to join data from Google Analytics (e.g. custom dimension, or client ID), AdWords (i.e. gclid), and with a little work could also capture and store Optimizely experiment and variation IDs at runtime. (This could be particularly useful to improve the experience of pre-signups using anonymous IDs.)
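A sketch of how the join keys might be captured server-side. The `_ga` cookie layout (`GA1.2.<a>.<b>`, where `<a>.<b>` is the client ID) is Universal Analytics' documented format; everything else here, including the function itself, is an assumption:

```javascript
// Sketch: pull GA client ID and AdWords gclid out of the request's
// cookie header and landing URL, for use as foreign keys in a
// backend table.
function extractJoinKeys(cookieHeader, landingUrl) {
  var keys = { gaClientId: null, gclid: null };
  // _ga cookie: GA1.2.<random>.<timestamp>; the last two fields
  // together form the client ID.
  var gaMatch = /(?:^|;\s*)_ga=GA\d+\.\d+\.(\d+\.\d+)/.exec(cookieHeader);
  if (gaMatch) keys.gaClientId = gaMatch[1];
  var gclidMatch = /[?&]gclid=([^&#]+)/.exec(landingUrl);
  if (gclidMatch) keys.gclid = decodeURIComponent(gclidMatch[1]);
  return keys;
}
```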
With these IDs in a backend DB it should be trivial to do cool things like fire offline conversion events for signups. I mentioned previously that we're doing this with GA, but we also recently found out how to do this with Optimizely.
This requires storing experiment and variation IDs for a user/session anyway, so adding a UID and a few join tables isn't much extra work. (Nor is creating some background workers to pull in extra reporting data with those IDs.)
I think the trick is just knowing which data to collect for specific requests, and which queries to write (or questions to ask).
You really have to go hardcore to get all your information into a database and make sure it's clean and usable, in my opinion.
In terms of cleanliness I don't see a lot of difference between doing something like this -- parsing cookies and URL params and passing them to a backend DB -- and capturing form field data and passing them to a backend DB.
And in terms of usability I don't see a lot of difference between dumping these data into CSV files and creating pivot tables in Excel vs dumping them into a MySQL DB and running queries. (Though I suppose it depends on setup.)
But perhaps we have different things in mind. Could you share an example of what you mean?
Feel free to let me know if you have questions/comments.