Bots have become a great scourge of the internet. Recently, they’ve flooded government comment systems with fake activism, distorted the national discourse on guns, and launched malicious attacks against the Justice Department. And a new study suggests they’re behind the majority of links shared on Twitter, too.
A Pew Research report released Monday finds that a whole two-thirds of links to popular sites shared on Twitter come from automated accounts. But these aren’t just those malicious Russian bots posing as the uncannily angry and active boy next door. They also include legitimate accounts belonging to organizations that schedule tweets through some kind of automated service. What’s more, the study finds, the majority of these accounts are not as politically polarized as headlines might make them out to be, nor do they primarily link to hyper-partisan websites. In fact, some of the websites receiving the largest share of links from bots are mainstream business outlets. Oh, and porn.
“Material being posted by bots or automated accounts is not just the province of niche publications,” says Aaron Smith, associate director of research at Pew. “It’s much broader and more pervasive within the ecosystem as a whole.”
To conduct the study, Pew researchers analyzed a random sample of 1.2 million tweets sent between July 27 and September 11, 2017, which they collected through Twitter’s public API. The researchers then examined the top 3,000 websites those tweets linked to during that period and divided them into six categories: adult content, sports, celebrity, commercial products or services, organizations or groups, and news and current events. Some of those links had since gone dead, winnowing the total pool of websites to 2,315. Finally, the researchers ran all of the Twitter accounts that were linking to each of those websites through a tool called the Botometer to determine what percentage of those links came from automated accounts.
Developed by researchers at the Indiana University Network Science Institute and the Center for Complex Networks and Systems Research, the Botometer is a machine learning tool that analyzes 1,200 signals from an account, including its profile and the timing of its tweets, to predict whether that account is automated. The Botometer’s developers note that “organizational accounts,” like, say, @BarackObama, are classified as automated. The Botometer does, however, give researchers the ability to tinker with its thresholds and decide how broad an estimate they want.
Twitter takes issue with third-party tools, including Botometer, which a company spokesperson says are often flawed, because they only have access to Twitter’s public API. Many of the signals Twitter uses to determine whether a given account is a bot are private, and not shared through the API.
Smith agrees that not all of Botometer’s assessments can be taken as truth, but in aggregate, he says, they paint a largely accurate picture. In this case, the Pew researchers started with a small sample set of accounts that their human data scientists assessed to be bots, and ran them through the Botometer to see what score it would give them. They used that score as the minimum threshold for what would be considered a bot as part of this report. Still, Smith acknowledges, “It is an estimate with some inherent uncertainty and a level of both false positives and false negatives.”
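As a rough illustration of that calibration step, here is a minimal Python sketch. It is not Pew’s code: the function names and the scores are made up, and it assumes the cutoff is simply the lowest score the tool gave to any hand-labeled bot, which is one plausible reading of the description above.

```python
def calibrate_threshold(labeled_bot_scores):
    """Return the lowest score among accounts that humans labeled as bots.

    Assumption: the cutoff is the minimum score in the hand-labeled sample.
    The report may have chosen its cutoff differently.
    """
    if not labeled_bot_scores:
        raise ValueError("need at least one hand-labeled bot score")
    return min(labeled_bot_scores)


def is_bot(score, threshold):
    """Treat any account scoring at or above the cutoff as automated."""
    return score >= threshold


# Hypothetical scores for a handful of hand-labeled bot accounts.
hand_labeled_scores = [0.62, 0.48, 0.91, 0.55]
threshold = calibrate_threshold(hand_labeled_scores)
```

Raising the cutoff would flag fewer accounts and cut false positives at the cost of more false negatives, which is the trade-off Smith describes.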
Using this methodology, the researchers found that 66 percent of all tweets linking to the most popular websites during that time period were automated. For sites featuring adult content, an eye-popping 90 percent of links were tweeted by automated accounts. But the most surprising finding for Smith was the fact that partisan political sites weren’t particularly prone to links from bots. Automated accounts shared just 41 percent of links to political sites shared primarily by conservatives, and 44 percent of links to political sites shared primarily by liberals. By comparison, bots shared 57 percent to 66 percent of links to news sites shared primarily by ideologically mixed audiences.
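The arithmetic behind percentages like these is a simple tally per category. The sketch below is illustrative only: the function name, the input format (one category and one account score per tweeted link), and the sample data are assumptions, not the study’s actual data or code.

```python
from collections import defaultdict


def bot_share_by_category(links, threshold):
    """Compute the fraction of tweeted links per category that came from bots.

    links: iterable of (category, account_score) pairs, one per tweeted link.
    threshold: score at or above which an account counts as automated.
    """
    totals = defaultdict(int)
    bot_counts = defaultdict(int)
    for category, score in links:
        totals[category] += 1
        if score >= threshold:
            bot_counts[category] += 1
    return {category: bot_counts[category] / totals[category] for category in totals}


# Made-up sample: two links to adult sites, two links to news sites.
sample_links = [("adult", 0.9), ("adult", 0.8), ("news", 0.2), ("news", 0.7)]
shares = bot_share_by_category(sample_links, threshold=0.5)
```

Because each link is counted once, a few prolific automated accounts can account for a large share of links even if most accounts linking to a site are human.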
The Pew researchers found that business sites were particularly popular with bots because so many automated accounts exist for the purpose of sharing news that includes stock tickers. That, Smith says, may explain why, for example, more than 75 percent of links to Forbes during that time came from bots. That’s a far higher percentage of bot links than either liberal sites like Mother Jones or conservative sites like Fox News received.
“It certainly runs a little bit counter to some of the narratives about bots right now,” Smith says.
The researchers stopped short of distinguishing between good bots and bad ones, or between accounts that intentionally misrepresent themselves and accounts like, say, @netflix_bot, which automatically tweets when new content has been added to the online streaming service. As a result, it remains unclear what percentage of Twitter bots perform a useful service and what percentage are problematic.
Twitter is at least aware of its bot problem, though CEO Jack Dorsey has admitted the company was too slow in responding to it. “We have witnessed abuse, harassment, troll armies, manipulation through bots and human-coordination, misinformation campaigns, and increasingly divisive echo chambers. We aren’t proud of how people have taken advantage of our service, or our inability to address it fast enough,” Dorsey tweeted in March. The company did recently announce new restrictions, preventing third-party apps from allowing people to tweet from multiple accounts at once or perform actions like retweeting, liking, or sharing hashtags in bulk.
These new policies, which went into effect last month, may well alter the Twitter landscape Pew researchers observed in the fall. As Twitter and its users continue to rethink the value of automated accounts, Smith says, “We’re hopeful this will provide some additional context for those conversations that are happening and make it more apparent that this is not something that just happens in the context of problematic news or niche sites. It’s not a good thing or a bad thing.”