I’m happy to finally give Sifter a name and a home. It’s a multi-armed bandit (MAB) API for testing web pages, optimizing which version (“arm”) of the test gets displayed. These arms could be article titles, ad positions, logos, colors, etc. Here’s the gist of how a MAB test works:
- A user visits a page that is under test.
- The page makes a request to Sifter asking for an arm to display.
- Sifter makes a calculated decision about which arm to display and returns the result to the page as a value between 0 and (# arms - 1).
- The page renders the selected version (specific colors, logos, layouts, etc).
- The user interacts with the page in one way or another. Maybe they sign up, maybe they buy something, maybe they don’t.
- When the interaction is complete, the page sends the reward earned (if any) back to Sifter. This could be something binary like click/no click, or some other value such as the amount of money spent.
- Sifter updates the bandit algorithm, influencing the arm that is selected the next time the page is rendered.
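To make the loop above concrete, here is a minimal sketch in Python. It is not Sifter’s implementation: the post doesn’t say which bandit algorithm Sifter uses, so this swaps in plain epsilon-greedy, and every name in it (EpsilonGreedyBandit, select_arm, update, simulate) is hypothetical. The simulated visitors click at made-up rates purely to drive the loop.

```python
import random


class EpsilonGreedyBandit:
    """Illustrative epsilon-greedy bandit (an assumption, not Sifter's algorithm)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how many rewards were reported per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        """Return an arm index between 0 and (# arms - 1)."""
        if self.rng.random() < self.epsilon:
            # Explore: pick any arm uniformly at random.
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the arm with the best mean so far (lowest index wins ties).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        """Fold a reported reward into the arm's running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_rates, visits=5000, seed=0):
    """Run the visit -> select -> render -> reward -> update loop."""
    bandit = EpsilonGreedyBandit(len(click_rates), seed=seed)
    user_rng = random.Random(seed + 1)  # stands in for real visitors
    for _ in range(visits):
        arm = bandit.select_arm()                       # page asks for an arm
        clicked = user_rng.random() < click_rates[arm]  # user interacts
        bandit.update(arm, 1.0 if clicked else 0.0)     # reward is sent back
    return bandit
```

Calling `simulate([0.02, 0.05, 0.20]).counts` shows how often each arm was displayed. The reward here is binary, but `update` accepts any number, so an amount spent would work the same way.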
There is an endless number of uses for something like this. Here are a few off the top of my head:
Run a news website?
- Run a few different title options for an article on your front page.
- Figure out how to get users to read more of a paginated article:
  - Test the number of words shown on each page (say 500 or 1000).
  - Each time the user clicks “next page”, update the default test result value to the user’s progress through the article. (See the /select_arm route in the docs for more on default values.)
Run a website with promoted content?
- Allow advertisers to choose multiple promotions to run at the same time, and optimize which one gets shown based on clicks and/or user feedback.
- Test the amount of labeling around the fact that this content is promoted vs organically created by users.
Test the audience demographic response.
- Set up a standard A/B test.
- Display the same content to every user.
- Report the results of the test back with the “arm” set to the user’s demographic, rather than to an alteration of the web page.
- Your test results will show which demographic responds best to your content.
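The demographic idea above boils down to mapping each demographic to an arm index and reporting rewards against that index. A rough Python sketch, assuming made-up age buckets and a hypothetical summarize helper (neither comes from Sifter):

```python
from collections import defaultdict

# Hypothetical demographic buckets; each one plays the role of an "arm".
DEMOGRAPHICS = ["18-24", "25-34", "35-54", "55+"]
ARM_FOR = {name: index for index, name in enumerate(DEMOGRAPHICS)}


def summarize(results):
    """Take (demographic, reward) pairs and return the mean reward per demographic."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for demographic, reward in results:
        arm = ARM_FOR[demographic]  # the arm index that would be reported
        totals[arm] += reward
        counts[arm] += 1
    return {DEMOGRAPHICS[arm]: totals[arm] / counts[arm] for arm in counts}
```

Every visitor sees the same page, so the per-arm averages compare audiences rather than page variants.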
There are a few features that I think are pretty clever. You can set a default test result value and a TTL for a test, so that the default value gets reported when the test “expires,” and you can update this default value and TTL by sending a “heartbeat.” There’s also a lot on my To Do list, such as confidence intervals, multivariate testing, bucketing/aggregating test results for high-traffic websites, and more.
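Here is one way the default value, TTL, and heartbeat could fit together, sketched in Python. This is a guess at the behavior described above, not Sifter’s code: the PendingTest class, its fields, and its method names are all hypothetical, and time is passed in explicitly as `now` (in seconds) to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingTest:
    """One in-flight test for a single page view (hypothetical structure)."""
    arm: int
    default_reward: float
    expires_at: float                 # absolute time, in seconds
    reported: Optional[float] = None  # explicit result, if one was sent

    def heartbeat(self, now, ttl, default_reward=None):
        """Extend the expiry and optionally replace the default reward.

        Ignored once the test has a reported result or has already expired.
        """
        if self.reported is None and now < self.expires_at:
            self.expires_at = now + ttl
            if default_reward is not None:
                self.default_reward = default_reward

    def report(self, reward):
        """Record an explicit result; the first report wins."""
        if self.reported is None:
            self.reported = reward

    def resolve(self, now):
        """Return the final reward, or None while the test is still open."""
        if self.reported is not None:
            return self.reported
        if now >= self.expires_at:
            return self.default_reward  # test expired: fall back to the default
        return None
```

For the paginated-article case, each “next page” click would call `heartbeat` with the reader’s progress as the new default, so a reader who simply closes the tab still gets their last known progress recorded when the TTL runs out.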
Right now the project is in a limited beta mode. I’m running it on a few pages that don’t get much traffic and I would love some help working out whatever bugs may come up. If you’re interested, please get in touch!