Why does Google’s ad network think people with names like Latanya and Rasheed have an arrest record?
By Andrew Leonard
Harvard University’s Latanya Sweeney discovered an odd thing after searching for her own name via Google last year. An advertisement from an outfit called Instant Checkmate appeared, featuring her name followed by the word “Arrested?”
For a subscription fee, Instant Checkmate searches public records for criminal histories. Sweeney paid the fee but found no evidence that she herself had been arrested.
A specialist in online privacy, Sweeney was intrigued and suspicious. So she created an experiment to test whether Google’s AdSense technology was delivering different advertisements depending on whether the name being searched for sounded black or white. Were searches for names like Latisha or Rasheed returning advertisements containing different language than searches for Greg and Meredith? (I first learned about Sweeney’s research from an article in Technology Review.)
Her results were conclusive: at one host of Google AdSense ads, she writes, “a black-identifying name was 25 percent more likely to get an ad suggestive of an arrest record.” According to Sweeney, “there is less than a 0.1 percent probability that these data can be explained by chance.”
In her paper, Sweeney doesn’t definitively answer the question of who exactly is to blame for the clear pattern of racism she uncovered. But the most disturbing implication of her research is the possibility that neither Instant Checkmate nor Google did anything wrong on purpose. Google’s AdSense algorithm may just be automatically reflecting society’s built-in racism.
Here’s how AdSense works, according to Sweeney.
Google understands that an advertiser may not know which ad copy will work best, so an advertiser may give multiple templates for the same search string and the “Google algorithm” learns over time which ad text gets the most clicks from viewers of the ad. It does this by assigning weights (or probabilities) based on the click history of each ad copy. At first all possible ad copies are weighted the same, they are all equally likely to produce a click. Over time, as people tend to click one version of ad text over others, the weights change, so the ad text getting the most clicks eventually displays more frequently. This appr