Case Study: Can AI Voices Persuade Voters?

We’re happy to bring you the results of a collaboration between Trilogy Interactive and Grow Progress to explore the effectiveness of AI-generated voiceovers for ads.

Summary

Grow Progress and Trilogy partnered to put conventional wisdom to the test and explore largely uncharted territory, running eight ads across two tests: one on the issue of prescription drugs and one on the issue of abortion. Within each test, the video stayed the same and only the voiceover changed. Half of the voiceovers were recorded by human men and women, while the other half were AI-generated male and female voices. Our findings, both on the ads’ overall persuasiveness and on how gender-matched ads performed with men and women, are a promising springboard for further testing.

Artificial intelligence seems likely to change campaigning in ways both big and small, and we’re just beginning to explore this strange future. When Trilogy brought us the idea of experimenting with AI voiceovers, it seemed like a good first step into that world. Future tests could explore AI-generated messages, images, scripts, audio, video, and whatever else the next version of ChatGPT supports.

Hypotheses

Heading into this research, we’d heard the following assertions from various strategists, clients, and the political media, and we wanted to put them to the test:

Men are generally more persuasive political messengers
Men are generally more persuaded by men, and women by women
Women are more effective messengers on the issue of abortion

There’s no conventional wisdom about the effectiveness of AI-generated voices that we’re aware of, but our hypothesis was that they wouldn’t be as effective as humans. Each of the ads tested was an emotional appeal, and getting the tone right and connecting with the listener struck us as the kind of challenge that AI was unlikely to excel at.

Methodology and process

Trilogy and Grow Progress conducted two tests, each with four variations of an ad on a single theme.

Trilogy developed four versions of an ad about drug prices, and four versions of an ad about abortion. In each case, they kept the video content constant but varied the voiceover, producing versions voiced by a human male, a human female, an AI male, and an AI female.

Each ad was designed with multiple goals in mind:

Persuade people to vote for Democrats
Persuade people to trust Democrats more on the issue featured in the ad
Convey to people Democrats’ position on the issue featured in the ad

Trilogy used our Rapid Message Testing tool to measure the effectiveness of each of those eight ads against those three persuasion goals. As with all our Rapid Message Tests, this research consisted of randomized controlled experiments, where each respondent was randomly assigned to see one of the ads, or to see a non-political placebo ad, before being asked questions about their opinions and beliefs.
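To make the experimental logic concrete, here is a minimal sketch of how an ad's effect could be estimated in a randomized controlled experiment of this kind. It is not the actual Rapid Message Testing implementation: the arm names, the helper functions `assign_arm` and `lift_in_points`, and the toy responses are all hypothetical, and it assumes a simple yes/no outcome and an unweighted difference between the treatment and placebo groups.

```python
import random

# Hypothetical arm labels: four voiceover versions plus a non-political placebo.
ARMS = ["human_female", "human_male", "ai_female", "ai_male", "placebo"]


def assign_arm(rng: random.Random) -> str:
    """Randomly assign one respondent to a single arm with equal probability."""
    return rng.choice(ARMS)


def share_yes(responses: list[bool]) -> float:
    """Share of respondents giving the target answer (e.g. would vote Democratic)."""
    return sum(responses) / len(responses)


def lift_in_points(treatment: list[bool], placebo: list[bool]) -> float:
    """Treatment share minus placebo share, expressed in percentage points."""
    return 100 * (share_yes(treatment) - share_yes(placebo))


# Toy, made-up responses purely to show the arithmetic (not data from the study).
placebo_answers = [True] * 40 + [False] * 60   # 40% give the target answer
treated_answers = [True] * 49 + [False] * 51   # 49% give the target answer
print(round(lift_in_points(treated_answers, placebo_answers), 1))
```

Because respondents are assigned at random, the placebo group stands in for what the treated group would have answered without seeing the ad, so the difference between the two shares can be read as the ad's effect.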

Though creating content can sometimes be time-consuming, the research itself was executed very rapidly. The test on the issue of drug prices launched on Friday, August 11, and returned results that same day. The test on the issue of abortion launched on Monday, August 28, and also returned results by the end of the day. Both tests were conducted with a representative sample of US adults.

Results

Test 1: Drug prices

The ad in the first test showed viewers video of a person’s hands cutting a pill in half, while the narrator told viewers that Senate Republicans are blocking Democrats’ efforts to lower drug prices and described the effect of high drug prices on the narrator’s own life. In fifteen seconds, it drew a clear connection between federal action or inaction on drug prices and the struggle many face to afford life-saving drugs.

This ad was generally effective, with multiple versions persuading people to vote for Democrats, persuading people to trust Democrats more on the issue of prescription drug prices, and increasing the association people had between Democrats and reducing prescription drug prices.

But not all versions were equally effective. The version with the male AI voiceover matched or beat every other version on all three outcome questions we measured:

Voiceover	Democratic vote choice	Trust Democrats on prescription drugs	Associate Democrats with reducing drug prices
Human Female	+9	+9	+16
Human Male	+7	+7	+15
AI Female	+1	+4	+7
AI Male	+9	+13	+17

The human female voiceover was only slightly behind the AI male, while the human male was only slightly behind the human female. The AI female voice was clearly the least effective of the four in this test.

The test results also suggest that people were more persuaded by a voice that matched their own gender. The ad with the AI male voiceover had the largest effects among men, while women were most persuaded by the ad with the human female.

Test 2: Abortion bans

This ad had a much hotter tone, pointing out that Republicans are working to ban abortion nationwide, playing clips of DeSantis, Pence, and Trump promising abortion bans, and encouraging the viewer to vote against Republicans.

On average, this ad was not effective, though its effects differed among subgroups of the population, as discussed below.

Voiceover	Democratic vote choice	Trust Democrats on abortion policy and laws	Associate Democrats with women’s bodily freedom
Human Female	-3	-4	-4
Human Male	+4	+4	+1
AI Female	0	+2	+2
AI Male	-2	-3	+1

You can see that on average, the ad with the human male voiceover was slightly more effective, but most of these results are not statistically significant, which means we can’t confidently distinguish the ads’ effects from statistical noise.
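For readers less familiar with statistical significance, the sketch below shows one common way to check whether a difference between a treatment group and a placebo group could plausibly be noise: a two-proportion z-test. This is an illustration only. The write-up does not say which test was used or how many respondents saw each ad, so the function name `two_proportion_z_test` and the sample sizes in the usage example are assumptions.

```python
import math


def two_proportion_z_test(yes_t: int, n_t: int, yes_c: int, n_c: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) comparing treatment and control proportions."""
    p_t, p_c = yes_t / n_t, yes_c / n_c
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (yes_t + yes_c) / (n_t + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical groups of 500 respondents each (not the study's sample sizes).
small = two_proportion_z_test(yes_t=220, n_t=500, yes_c=200, n_c=500)  # 44% vs 40%
large = two_proportion_z_test(yes_t=275, n_t=500, yes_c=200, n_c=500)  # 55% vs 40%
print(f"4-point gap:  z={small[0]:.2f}, significant at 5%: {small[1] < 0.05}")
print(f"15-point gap: z={large[0]:.2f}, significant at 5%: {large[1] < 0.05}")
```

With these assumed group sizes, a gap of a few points does not clear the conventional 5% threshold while a much larger gap does, which illustrates why small average effects are hard to separate from noise.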

Looking at the results by gender, a more nuanced story emerges.

Among men, each ad reduced support for Democrats, including statistically significant backlash effects for the AI male and AI female voiceovers.

Among women, the ads increased support for Democrats, with the ad narrated by a human male causing the greatest increase. The ad narrated by a human female had a very small effect that was not statistically significant.

Voiceover	Democratic vote choice among men	Democratic vote choice among women
Human Female	-5	+1
Human Male	-6	+15
AI Female	-11	+11
AI Male	-14	+10

Discussion

How did the hypotheses that we started off with hold up?

Were men more persuasive messengers than women across these tests?

Not clearly — while the AI male was more persuasive than the AI female on the drug prices test, the human female was slightly more persuasive than the human male. On the abortion test, none of the effects was particularly dramatic, and the human male was more effective than the human female, but the reverse was true for the AI voices.
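As a rough cross-check, the comparison can be reproduced from the two results tables. The sketch below takes the reported effects, averages each voiceover's three outcomes, and subtracts the female-voice average from the male-voice average. The unweighted averaging and the helper name `male_minus_female` are our own simplifications for illustration; they ignore sample sizes and statistical significance.

```python
# Effects as reported in the two results tables, in the tables' column order.
drug_prices = {
    ("human", "female"): [9, 9, 16],
    ("human", "male"): [7, 7, 15],
    ("ai", "female"): [1, 4, 7],
    ("ai", "male"): [9, 13, 17],
}
abortion = {
    ("human", "female"): [-3, -4, -4],
    ("human", "male"): [4, 4, 1],
    ("ai", "female"): [0, 2, 2],
    ("ai", "male"): [-2, -3, 1],
}


def mean(values):
    return sum(values) / len(values)


def male_minus_female(test, source):
    """Average male-voice effect minus average female-voice effect for one source."""
    return mean(test[(source, "male")]) - mean(test[(source, "female")])


for name, test in [("drug prices", drug_prices), ("abortion", abortion)]:
    for source in ("human", "ai"):
        gap = male_minus_female(test, source)
        print(f"{name:11s} {source:5s} male minus female: {gap:+.1f}")
```

The gap is positive for the AI voices on drug prices and for the human voices on abortion, and negative in the other two cases, so neither gender comes out consistently ahead.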

Was matching the gender of the voiceover to that of the audience effective?

For the drug prices test, yes. For the abortion ads, there’s no clear pattern. The voiceover appears to matter — for instance, the ad with the human female voiceover was ineffective among women, while the ad with the human male voiceover was effective for women — but not in a consistent way that we could predict.
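The abortion results by gender can be summarized with a simple calculation: group the reported vote-choice effects by whether the voice matched the audience’s gender, then average each group. This is a back-of-the-envelope sketch using the figures in the table above; the equal weighting of men and women, and of human and AI voices, is an assumption made for illustration.

```python
# Democratic vote choice effects from the abortion test, by audience gender.
# Keys are (voice gender, audience); each list holds the human and AI versions.
vote_choice = {
    ("female", "men"): [-5, -11],
    ("male", "men"): [-6, -14],
    ("female", "women"): [1, 11],
    ("male", "women"): [15, 10],
}

AUDIENCE_GENDER = {"men": "male", "women": "female"}

matched, mismatched = [], []
for (voice, audience), effects in vote_choice.items():
    # A cell is "matched" when the voice gender equals the audience gender.
    bucket = matched if voice == AUDIENCE_GENDER[audience] else mismatched
    bucket.extend(effects)

matched_avg = sum(matched) / len(matched)
mismatched_avg = sum(mismatched) / len(mismatched)
print(f"matched: {matched_avg:+.2f}, mismatched: {mismatched_avg:+.2f}")
```

In this unweighted average, the gender-matched ads do no better than the mismatched ones, which is consistent with the lack of a clear matching pattern on the abortion test.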

Were women more persuasive messengers on abortion?

This wasn’t the case in this test, either overall or among women. It might hold for other messages, or men might be more persuasive messengers on abortion for some messages, but this test didn’t provide any evidence to support the hypothesis.

Further research

We share this research with the community in the hope that others will pick up the questions we explored and dive deeper. Running these tests and exploring the results raised many interesting questions for future research.

1: How does emotion affect voiceover effectiveness?

While the voiceovers varied in gender and in whether they were recorded by a human or generated by AI, they also varied in the emotion they conveyed. A given message might be more or less effective depending on the emotion carried in a narrator’s voice. Future tests should aim to isolate this by using the same narrator on several ads while varying the emotional approach of the voiceover.

2: How does the age of the narrator interact with issue content?

Often, the relevance of a political issue is connected to age. For instance, policies around education most directly affect people with school-age children, while policies around prescription drugs and medical care affect a higher proportion of older people.

It might be the case that, for a given issue, a messenger from the age group affected by that issue is more credible or convincing. Perhaps a young person, not even of voting age, could be an effective messenger on an issue that affects children.

3: How does the gender of the narrator interact with issue content?

As with age, some issues may be more important to people of a certain gender, or a voiceover may simply be more effective if delivered by a member of a particular gender.

Conclusion

In this case study, we explored the effectiveness of using AI-generated voiceovers for political ads on two issues: drug prices and abortion. We tested four versions of each ad, varying the gender and the source (human or AI) of the voiceover. We measured the effects of the ads on three outcomes: voting for Democrats, trusting Democrats on the issue, and associating Democrats with a particular issue position.

We found that the voiceover affected the persuasiveness of the ads, but not in a consistent or predictable way. The AI male voiceover was the most effective for the drug prices ad, while the human male voiceover was slightly more effective for the abortion ad, where most average effects were not statistically significant. We also found that matching the gender of the voiceover to that of the audience was effective for the drug prices ad, but not for the abortion ad. We did not find any evidence to support the hypothesis that women were more persuasive messengers on abortion.

These results challenge some of the conventional wisdom about political messaging and suggest that AI voices can be a viable option for creating persuasive ads. However, they also raise many questions for further research, such as how emotion, age, and issue content interact with voiceover effectiveness. We hope that this case study inspires more experimentation and innovation in this field.