A/B tests for developers

Adrian B.G. - Jan 4 '19 - Dev Community

This article is only the first part of the main story from my blog: A/B tests developers manual.

Best case scenario: Your [product owner, boss, producer] found out about A/B tests and you are here to learn more about how to implement them.

Worst case scenario: Your product is already a mess because of A/B tests, and you want to clean it up.

Either way, I’m writing this article so you do not have to repeat our mistakes.

In the last 5 years I have worked mostly in the gaming industry. I had to implement hundreds of A/B tests and I learned that they are a powerful 💪🏾 tool. At the same time I learned that if you do not pay enough attention, your code turns into a spaghetti 🍝 restaurant.

I wish there were a single simple 🎯 way to implement A/B tests without making a mess in your code, but I don’t know any. By definition your code needs to have multiple versions of the same behavior.

Intro ⚓

Skip this block if you are already familiar with A/B testing.

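A/B testing is also called split testing or bucket testing, and A/B/C/D testing when there are more than two versions. (Multivariate testing is often used as a synonym, although strictly it means testing combinations of several variables at once.) It is an iterative process of experimentation that helps you find out what works better for your product. More formal definitions: here and here.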

Your product (game, app, website, shop …) can grow in 2 ways:

  • a person says “feature X will improve the Y KPI by 30%”, and you implement this feature. We mortal humans cannot predict the future; we can only guess.
  • one person says “feature X is the best”, another says “feature Y is the best”. You implement the X, Y and Z variants and measure which one is actually better. It may be none, one or more of them. You keep the best versions and improve them with further split tests.

The tests run on smaller but representative samples of users. This way you can test multiple versions in parallel and limit the impact of a bad version: you do not know how a change will affect user behavior, so you want to minimize the possible damage.

You start by distributing your users into buckets. Each bucket provides a different user experience, and you collect the data and analyze the impact of each version. You choose the best bucket and roll it out to all the users. The process is more complex than this, but that is the main idea.

Basic example: to find the right price point for a new product, you can run a test. 50% of the users are left out of the test and see the $5 price. The remaining 50% are split equally into 5 A/B test versions, 10% of all users each ($5 control group, $10 version 1, $15 version 2, $20 version 3, $25 version 4).
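The split from the price example can be sketched in a few lines. This is only a minimal illustration, not a production bucketing system: the function names, the test name `price_test` and the choice of SHA-256 hashing are my assumptions. The idea is that hashing the user id gives a stable slot from 0 to 99, so the same user always lands in the same bucket.

```python
import hashlib
from typing import Optional

# Hypothetical layout taken from the price example:
# 50% of users stay outside the test, the other 50% are split
# equally across five variants (10% of all users each).
VARIANT_PRICES = {
    "control": 5,
    "v1": 10,
    "v2": 15,
    "v3": 20,
    "v4": 25,
}
DEFAULT_PRICE = 5  # price for users who are not part of the test


def bucket_for(user_id: str, test_name: str = "price_test") -> Optional[str]:
    """Return the variant name for a user, or None if the user is outside the test."""
    # Hash test name + user id: the same user always gets the same slot,
    # and different tests produce independent splits.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    slot = int(digest, 16) % 100  # stable value in 0..99

    if slot < 50:
        return None  # slots 0-49: left out of the test
    names = list(VARIANT_PRICES)
    return names[(slot - 50) // 10]  # slots 50-59 -> control, 60-69 -> v1, ...


def price_for(user_id: str) -> int:
    """Price shown to this user, in dollars."""
    variant = bucket_for(user_id)
    return DEFAULT_PRICE if variant is None else VARIANT_PRICES[variant]
```

Calling `price_for("user-42")` returns the same price every time for that user, which matters because a user who sees $10 today and $20 tomorrow would pollute both buckets.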

Professional commitment ✍

I, the developer, swear not to be biased. We cannot allow any personal or technical difference or issue to affect the A/B test result (user behavior), for example:

  • loading: make sure all the resources are loaded from the same source (CDN/hosting), so the network times are similar for all users
  • size: make sure the file sizes are comparable across all the versions; do not ship one button with a 1 MB background while the rest are 50 KB

…you get the picture. All the users must start from the same premise. If a technical irregularity appears, please let the team know and repeat the test.

We, the developers, are in charge of the implementation and technical details; we must guarantee technical impartiality and act as a firewall.

"I, the developer, swear to collect the right data from the users and not to mess with the tracking." Easy to say, hard to do. The main idea is that the A/B test result is based on the KPI events recorded while observing user behavior during the split test. Data anomalies can mean “bugs” or a clear “winner”.

Usually when something is too good to be true, it is a bug.
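To make the tracking promise concrete, here is a small sketch of what "collecting the right data" can look like: every KPI event carries the bucket the user was in, so results can be grouped per version afterwards. The `Event` and `Tracker` classes and the event names `shop_opened` and `purchase` are hypothetical, and a real setup would send events to an analytics backend instead of a list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user_id: str
    name: str     # e.g. "shop_opened", "purchase" (made-up event names)
    variant: str  # bucket the user was in when the event fired


class Tracker:
    """In-memory stand-in for an analytics client."""

    def __init__(self):
        self.events = []

    def track(self, user_id, name, variant):
        # Always attach the variant; an event without it is useless for the test.
        self.events.append(Event(user_id, name, variant))


def conversion_by_variant(events, exposure="shop_opened", goal="purchase"):
    """Share of exposed users per variant who also triggered the goal event."""
    exposed = defaultdict(set)
    converted = defaultdict(set)
    for e in events:
        if e.name == exposure:
            exposed[e.variant].add(e.user_id)
        elif e.name == goal:
            converted[e.variant].add(e.user_id)
    # Count only users who were actually exposed, so stray goal events
    # (a possible tracking bug) do not inflate the rate.
    return {v: len(converted[v] & users) / len(users) for v, users in exposed.items()}
```

If one variant suddenly reports a conversion rate far above the others, check the tracking code for that variant first before declaring a winner.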

A. Good to have 🛠

It’s easier to work with a better codebase. If your code is already a mess, the A/B tests will multiply the mess.

Modules. If the code is already split into units (modules, files, classes), most likely you will only need to modify one portion of your code. If not … then you must like spaghetti.

Parameters and configs. Everything in an A/B test is reduced to a parameter value, usually a string value. If your business logic already has the tested parameter as a variable or config value, it’s very easy to implement the test.

No magic values. If the parameter you are testing is hard-coded in multiple places, let this be a lesson: do not use magic values (magic numbers are the most common mistake).
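As a rough illustration of the parameter idea, the sketch below keeps the tested values in one config dictionary and lets a variant override them. All names and values here (`DEFAULT_CONFIG`, `VARIANT_OVERRIDES`, `starter_pack_price`, `buy_button_color`) are made up for the example; in practice the overrides would typically come from whatever config or A/B service your project uses.

```python
# Hypothetical default configuration; values are strings on purpose,
# since A/B parameters usually arrive as strings.
DEFAULT_CONFIG = {
    "starter_pack_price": "5",
    "buy_button_color": "green",
}

# Hypothetical per-variant overrides; only the tested parameters appear here.
VARIANT_OVERRIDES = {
    "v1": {"starter_pack_price": "10"},
    "v2": {"starter_pack_price": "15", "buy_button_color": "blue"},
}


def config_for(variant=None):
    """Defaults merged with the overrides of the user's variant, if any."""
    config = dict(DEFAULT_CONFIG)  # copy, so the defaults are never mutated
    config.update(VARIANT_OVERRIDES.get(variant, {}))
    return config


def starter_pack_price(variant=None) -> int:
    # The business logic reads the parameter in exactly one place
    # and converts the string value to the type it needs.
    return int(config_for(variant)["starter_pack_price"])
```

With this shape, adding a test means adding an entry to the overrides, and removing it means deleting that entry; the business logic itself stays untouched.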

...

To read the full story you can continue on my website: A/B tests developers manual.

Remember to share and subscribe if you learned something new. Thanks!
