A/B Testing: Experiments in campaign messaging


Originally published in Inside the Influence Industry's Personal Data: Political Persuasion - How it works by Tactical Tech's Data & Politics team.

What is A/B testing?

When Barack Obama's 2008 presidential campaign team was having trouble converting web visitors into subscribers, they took a page from commercial marketing's playbook and decided to change the text on their website. They tested three different messages against the site's usual 'Sign Up' prompt: 'Learn More,' 'Join Us Now' and 'Sign Up Now.' They found that 'Learn More' outperformed the default message by a whopping 18.6%. When they tested the prompt alongside six different photo and video options, the winning combination boosted their sign-up rate by more than 3 percentage points. While this number may seem small, the campaign estimated that this single change contributed to nearly three million new email address sign-ups and netted US$60 million in new donations. Four years later, the Obama re-election campaign ran over 500 similar A/B tests across web and email in 20 months, increasing their donation conversion by 29% and their sign-up conversions by 161%.

A/B testing, sometimes called split testing, compares two or more variants of an advertisement or message to determine which one performs best. Campaigns commonly experiment on their donation pages to boost contributions. In 2016, Ben Carson's US presidential campaign ran an experiment to find out whether giving away a copy of Carson's book or a campaign hat yielded more donations. By randomly directing website visitors to either the book donation page or the hat donation page, the campaign could measure which offer was more successful. If the hat had proved more successful, the campaign could have run another experiment pitting the hat against, say, a tote bag; in this way, they could continue optimising the website.

An article published on medium.com explores how Ben Carson’s 2016 presidential campaign tested whether a book or a hat was a more effective gift in soliciting donations to his campaign. The test took place on his website, BenCarson.com, which was no longer active at the time of writing.
Source: Medium, accessed 11 March 2019.
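
The random split and comparison described above can be sketched in a few lines of Python. This is an illustrative sketch, not the Carson campaign's actual method: the visitor IDs, the salt string and the donation counts are all hypothetical, and the two-proportion z-test is one common way to compare conversion rates, assumed here for the example.

```python
import hashlib
import math

def assign_variant(visitor_id, variants=("book", "hat"), salt="donation-gift-test"):
    """Map a visitor to a variant pseudo-randomly but repeatably,
    so the same visitor always sees the same donation page.
    The salt and visitor IDs are hypothetical."""
    digest = hashlib.sha256(f"{salt}:{visitor_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between
    two conversion rates, using a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up results: 4,000 visitors per page, 120 book donors, 156 hat donors.
z, p_value = two_proportion_z_test(120, 4000, 156, 4000)
print(assign_variant("visitor-001"), round(z, 2), round(p_value, 3))
```

A small p-value suggests the gap between the two pages is unlikely to be chance alone; with fewer visitors, the same gap in rates would be much less conclusive.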

Though digital A/B testing is common in tech and campaign circles today, the method has a long analogue history going back to the 1920s, when the British statistician Ronald Fisher formalised its basic mathematics while testing crop growth by applying fertilizer to one plot of land and withholding it from another. Since then, A/B testing has been integrated into politics and has become part of standard campaign practice for websites, emails (subject lines, bodies), design elements (images, backgrounds, buttons), headlines, direct mail, TV, radio, phone and even texting to 'find the right messaging'.

A number of services have made A/B testing easy to run for political campaigns, allowing them to test multiple changes simultaneously.
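
Testing several changes at once multiplies the number of variants. The 2008 Obama test described earlier combined four button labels (the default plus three alternatives) with six photo and video options. A minimal Python sketch enumerates every pairing; the media names are placeholders, not the campaign's actual assets.

```python
from itertools import product

# Button labels as reported for the 2008 test (default first).
buttons = ["Sign Up", "Learn More", "Join Us Now", "Sign Up Now"]
# Placeholder names standing in for the six photo and video options.
media = [f"media_{i}" for i in range(1, 7)]

# Every button label paired with every media option.
combinations = list(product(buttons, media))
print(len(combinations))  # 4 labels x 6 media options = 24 variants
```

Each added element multiplies the total, which is why tests of this kind need far more traffic than a simple two-way comparison.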

How is your data used?

Campaigners rely on personal data in both the setup and evaluation of A/B tests. First, they use it to select who qualifies for a given experiment. If a campaign were interested in mobilising working mothers in a swing district or boosting rally attendance in another, for instance, it could launch experiments using address information obtained from voter files, a data exchange or another source. As long as a campaign has the relevant data and a sufficient number of individuals for a statistically valid experiment, almost anything can be tested, subject to local laws.
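
What counts as 'a sufficient number of individuals' depends on how small an effect the campaign wants to detect. The sketch below uses a standard approximation for comparing two proportions. The baseline and target rates, the 5% significance level and the 80% power are assumptions chosen for illustration and do not come from any campaign.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_base, p_variant, alpha=0.05, power=0.8):
    """Approximate number of people needed in EACH group to detect
    the difference between two conversion rates (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    effect = p_base - p_variant
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical rates: 8% sign-up baseline vs 11% for the new variant.
print(sample_size_per_variant(0.08, 0.11))
# A one-point improvement (8% vs 9%) needs a far larger audience.
print(sample_size_per_variant(0.08, 0.09))
```

The smaller the expected difference, the more people an experiment requires, which is one reason large voter files and mailing lists are valuable to campaigns.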

A/B testing also relies on personal data to track responses to experiments. If you receive an email from a campaign, for instance, the campaign is likely tracking email open and click-through rates to determine if you engage with it or not. That data can be mined even further: if you unsubscribe from the email list, perhaps you will be considered less likely to vote for the candidate in question. If you consistently open campaign emails promptly, the campaign could deem you a promising volunteer.

Some examples

In the UK: Dominic Cummings, campaign director of Vote Leave during the UK's 2016 EU membership referendum (also known as the Brexit referendum), described how the Leave campaign used personal data and experimentation to help them win. According to Cummings, by surveying voters in the UK, campaign data scientists were able to do things like 'target women between 35 and 45 who live in these particular geographical entities, who don't have a degree or do have a degree... We essentially ran a whole series of experiments... out in the digital world and filtered what worked'. The Vote Leave campaign split voters into three groups: those firmly voting remain, those voting leave, and those on the fence. Vote Leave invested 98% of its marketing budget in digital efforts focused on this third group and tested five narratives on them. The winning message was 'take back control'. Research suggested that including the word 'back' triggered voters' anger and dislike of losing things they felt they once had - in particular, control.

This screenshot from Facebook’s ad archive shows two political advertisements against Brexit that used the same image but different text. The ad on the left was shown to users fewer than 1,000 times, while the ad on the right was shown between 5,000 and 10,000 times. No metrics are available on whether one garnered more clicks than the other.
Source: Facebook Ad Library, accessed 22 February 2019.

In the United States: Some political campaigns in the US are using A/B testing at a staggering scale, even when compared to private companies. Nowhere was this clearer than in Donald Trump's 2016 presidential run. Gary Coby, the director of digital advertising and fundraising for the Trump campaign, called their use of experimentation 'A/B testing on steroids'. The campaign reportedly ran 40,000 to 50,000 variants on a given day, and these experiments proved to be lucrative. As Michael Babyak, former director of marketing technology at the Republican National Committee, claimed, 'The RNC Performance, Optimization & Experiments Team... ran over 300 tests on DonaldJTrump.com from July through November 2016, generating over US$30 million in added revenue'. The team found that pro-Trump messages always 'beat out any anti-Hillary or otherwise negative copy'. Well after the election, in May 2018, Coby declared on Twitter that the team still had over 4,000 ads active for 'testing and learning', extending the campaign's intelligence-gathering activities beyond the election.

A screenshot from the RNC Testing Booklet posted on www.scribd.com shows how Donald Trump’s campaign tested these two background images against each other on its donation page. The image of Trump performed about 80% better than the image of Clinton.
Source: RNC Testing Booklet, accessed 7 January 2019.

How do I know if it's being used on me?

You have almost certainly been part of an A/B test. As Christian Rudder, president of OkCupid, wrote in a blog post in 2014: 'Guess what, everybody: if you use the Internet, you're the subject of hundreds of experiments at any given time, on every site. That's how websites work'. Another commentator observed, 'every product, brand, politician, charity, and social movement is trying to manipulate your emotions on some level, and they're running A/B tests to find out how'. A/B testing is now standard practice among virtually any entity with an online presence. While you may be able to identify some experiments in which you are participating by inspecting hyperlinks or by analysing your third-party cookies, there is no way to know comprehensively which political campaign experiments have included you.

These ads, from Facebook’s Ad Archive, encourage Indians to organise a get-together to listen to Prime Minister Narendra Modi’s address to the nation. The sign-up messages are identical, but the images differ slightly. All three ads cost less than 15 USD and were seen between 10,000 and 60,000 times.
Source: Facebook Ad Library, accessed 11 March 2019.

Considerations

↘ A/B testing allows campaigns to test their assumptions and avoid deferring to HiPPO (the Highest Paid Person's Opinion), a derisive term describing the standard decision-making process. If a political message is tested properly, it has the potential to debunk faulty assumptions.

↘ As one expert observed, 'taken to its logical conclusion, this trend could lead to a stream of unique, personalised messages targeted at each voter constantly updated based on A/B testing'. That A/B tests can be selectively targeted and tweaked for personal appeal risks undermining public understanding of political issues and opens the door to more manipulative tactics.

↘ As A/B testing services become more automated, algorithms can create far more variants and combinations of text, media, buttons, etc. based on campaign inputs. This ostensibly means that machines - instead of people - would decide what a potential voter reads and sees, which could set a precedent of creating personalised political content free of human oversight.

A/B testing is moving towards algorithmic generation of variants. Algorithmically generated variants use data to build the most compelling ad for a given user, letting computers decide what users see by customising ad creatives for different individuals. This screenshot was taken from a promotional ‘dynamic creative’ product video by Facebook, a popular experimentation platform for political campaigns. The voiceover explains that advertisers supply images, video, text, calls to action, budget and target audiences, and the product decides which combinations work best with any given audience.
Source: Facebook Dynamic Creative, accessed 7 January 2019.

↘ If an A/B test demonstrates a desirable and sizeable impact, what of the voters exposed to the 'losing' variant who may, as a result, be marginally less inclined to join a newsletter, to volunteer, to consume political news, or to vote?

↘ Voters are generally unaware of their participation in experiments; moreover, permission is often requested through privacy policies that users tend to accept without reading. Because of this lack of awareness, participants have no practical way to opt out. Furthermore, many voters are unaware of the impacts that past experiments may have had on them.

↘ Political campaigns often run experiments on people without independent, ethical oversight.

↘ A/B testing could be exploited as a testing ground for politicians - a space to trial an idea and conceal it if it fails, or promote it if it works.

↘ A/B testing can save a politician from appearing undecided on an issue by testing different messages and trumpeting the winning variant. One writer observed, 'instead of seeking consensus or taking politically risky decisions, empirical data gained from A/B testing might provide the optimal solution: "Why debate when you can test?"'

↘ A/B testing risks 'circumventing the reasoning process altogether in the search for what works,' diverting campaigns' attention from issues to button colours.

↘ A/B testing makes campaign monitoring more difficult. Instead of keeping tabs on one website, campaign monitoring groups may have to keep track of multiple variants of the same website.

Author: Varoon Bashyakarla