tim: "System Status: Degraded" (degraded)
[CW: suicide]

Elizabeth Waite was a trans woman who committed suicide last week. I did not know Elizabeth, but several of my friends did. In an article for the Daily Beast, Ben Collins described what happened after she died. (CW if you follow the link to the article: it quotes extremely transmisogynistic and violent comments and images, including some that incite suicide.)

The night the article describes, I sat in my office after work with Elizabeth's profile open in a tab, watching the stream of hateful comments pour in almost faster than I could report them to Facebook. My friends had mentioned that members of an online forum known for terrorizing autistic trans women were flooding her profile (particularly her last post, in which she stated her intention to commit suicide) with hateful comments. Since I didn't know Elizabeth and wasn't emotionally affected by reading these comments in the same way that I would have been if I had known her, I felt that bearing witness and reporting the comments as abuse was work that I could usefully do. Since many of the comments were obviously from fake accounts, and Facebook is well-known for its desire for good data (read: monetizable data), specifically accounts attached to the names people use in everyday life, I reported those accounts as fake as well.

And later that night, I watched dozens and dozens of automated responses from Facebook's abuse reporting system fill my inbox. Most of the responses said this:

Thank you for taking the time to report something that you feel may violate our Community Standards. Reports like yours are an important part of making Facebook a safe and welcoming environment. We reviewed the comment you reported for displaying hate speech and found it doesn't violate our Community Standards.
Please let us know if you see anything else that concerns you. We want to keep Facebook safe and welcoming for everyone.

[Image: screenshot of the quoted text]

Because the posts in question were eventually made private, I can't quote the comments about which a Facebook content reviewer said "it doesn't violate our Community Standards", and in fairness to the person or people reviewing the comments, some of the comments weren't obviously hate speech without the context that they were in a thread of people piling on a dead trans woman. Facebook lacks a way to report abuse that goes beyond "the text of this individual comment, in the absence of context, violates Facebook's Community Standards." That's part of the problem. If trans people were in positions of power at Facebook, you can bet that there would be a "report transmisogynist hate mob" button that would call attention to an entire thread in which an individual was being targeted by a coordinated harassment campaign.

Likewise, even though Facebook is notorious for harassing trans people for using the names we use in everyday life as our account names, when I reported an account with the name "Donny J. Trump" for impersonation, I got an email back saying that the account would not be suspended because it wasn't impersonating anybody:

[Image: screenshot of the aforementioned text]

Facebook's tools don't address this problem. Imagine you're the family member of a trans woman who just died and whose profile is receiving a flood of hateful comments. Dozens of users are posting these comments -- too many to block, and anyway, what good would blocking do if you don't have access to the deceased person's account password? The comments would still be there, defacing that person's memory. Reporting individual comments has no effect if the harassment is conducted by posting a series of memes that aren't necessarily offensive on their own, but have the effect of demeaning and belittling a person's death when posted as comments in response to a suicide note. And getting an account converted to a "memorial account" -- which allows someone else to administer it -- can take days, which doesn't help when the harassment is happening right now. Again: you can look at Facebook and know that it's a company in which the voices of people who worry about questions like, "when I die, will people on an Internet forum organize a hate mob to post harmful comments all over my public posts?" are not represented.

But Facebook doesn't even do what they promise to do, which is to delete individual comments that clearly violate their Community Standards:

Facebook removes hate speech, which includes content that directly attacks people based on their:

National origin,
Religious affiliation,
Sexual orientation,
Sex, gender, or gender identity, or
Serious disabilities or diseases.

Out of the many comments in the threads on Elizabeth Waite's profile that clearly attacked people based on their gender identity or disability, most were deemed by Facebook as "doesn't violate our Community Standards."

At this point, Facebook ought to just stop pretending to have an abuse reporting system, because what they promise to do has nothing to do with what they will actually do. Facebook's customers are advertisers -- people like you and me who produce content that helps Facebook deliver an audience for advertisers (you might think of us as "users") are the raw material, not the customers. Even so, it's strange that companies that pay for advertising on Facebook don't care that Facebook actively enables this kind of harassment.

If you read the Daily Beast article, you'll also notice that Facebook was completely unhelpful and unwilling to stop the abuse other than in a comment-by-comment way until one of the family members found a laptop that still had a login cookie for Elizabeth's account -- they wouldn't memorialize it or do anything else to stop the abuse wholesale in a timely fashion. What would have happened if the cookie had already expired?

Like anybody else, trans people die for all kinds of reasons. In an environment where hate speech is being encouraged from the highest levels of power, this is just going to keep happening more and more. Facebook will continue to refuse to do anything to stop it, because hate speech doesn't curtail their advertising revenue. In fact, as I wrote about in "The Democratization of Defamation", the economic incentives that exist encourage companies like Facebook to potentiate harassment, because more harassment means more impressions.

Although it's clearly crude economics that make Facebook unwilling to invest resources in abuse prevention, a public relations person at Facebook would probably tell you that they are reluctant to remove hate speech because of concern for free speech. Facebook is not a common carrier and has no legal (or moral) obligation to spend money to disseminate content that isn't consistent with its values as a business. Nevertheless, think about this for a moment: in your lifetime, you will probably have to see a loved one's profile get defaced like this and know that Facebook will do nothing about it. Imagine a graveyard that let people spray paint on tombstones and then stopped you from washing the paint off because of free speech.

What responsibilities do social media companies -- large ones like Facebook that operate as completely unregulated public utilities -- have to their users? If you'd like, you can call Facebook's billions of account holders "content creators"; what responsibilities do they have to those of us who create the content that Facebook uses for delivering an audience to advertisers?

Facebook would like you to think that they give us access to their site for free because they're nice people and like us, but corporations aren't nice people and don't like you. The other viewpoint you may have heard is: "If you're not paying for the product, then you are the product." Both of these stories are too simplistic. If you use Facebook, you do pay for it: with the labor you put into writing status updates and comments (without your labor, Facebook would have nothing to sell to advertisers) and with the attention you give to ads (even if you never click on an ad).

If you're using something that's being given away for free, then the person giving it away has no contractual obligations to you. Likewise, if you are raw material, then the people turning you into gold have no contractual obligations to you. But if you're paying to use Facebook -- and you are, with your attention -- that creates a buyer/seller relationship. Because this relationship is not formalized, you as the buyer assume all the risks in the transaction while the seller reaps all of the economic benefit.

Do you like this post? Support me on Patreon and help me write more like it. In December 2016, I'll be donating all of my Patreon earnings to the National Network of Abortion Funds, so if you'd like to show your support, you can also make a one-time or recurring donation to them directly.

This post is the last in a 4-part series. The first three parts were "Defame and Blame", "Phone Books and Megaphones," and "Server-Side Economics."

Harassment as Externality

In part 3, I argued that online harassment is not an accident: it's something that service providers enable because it's profitable for them to let it happen. To know how to change that, we have to follow the money. There will be no reason to stop abuse online as long as advertisers are the customers of the services we rely on. To enter into a contract with a service you use and expect that the service provider will uphold their end of it, you have to be their customer, not their product. As their product, you have no more standing to enter into such a contract than do the underground cables that transmit content.

Harassment, then, is good for business -- at least as long as advertisers are customers and end users are raw material. If we want to change that, we'll need a radical change to the business models of most Internet companies, not shallow policy changes.

Deceptive Advertising

Why is false advertising something we broadly disapprove of -- something that's, in fact, illegal -- but spreading false information in order to entice more eyeballs to view advertisements isn't? Why is it illegal to run a TV ad that says "This toy will run without electricity or batteries," but not illegal for a social media site to surface the message, "Alice is a slut, and while we've got your attention, buy this toy?" In either case, it's lying in order to sell something.

Advertising will affect decision-making by Internet companies as long as advertising continues to be their primary revenue source. If you don't believe in the Easter Bunny, you shouldn't believe it either when executives tell you that ad money is a big bag of cash that Santa Claus delivers with no strings attached. Advertising incentivizes ad-funded media to do whatever gets the most attention, regardless of truth. The choice to do what gets the most attention has ethical and political significance, because achieving that goal comes at the expense of other values.

Should spreading false information have a cost? Should dumping toxic waste have a cost? They both cost money and time to clean up. CDA 230 protects sites that profit from user-generated content from liability for any of the costs of that content, and maybe it's time to rethink that. A search engine is not like a common carrier -- one of the differences is that it allows one-to-many communication. There's a difference between building a phone system that any one person can use to call anyone else, and setting up an autodialer that lets the lucky 5th callee record a new message for it.

Accountability and Excuses

"Code is never neutral; it can inhibit and enhance certain kinds of speech over others. Where code fails, moderation has to step in."
-- Sarah Jeong, The Internet of Garbage

Have you ever gone to the DMV or called your health insurance company and been told "The computer is down" when, you suspected, the computer was working fine and it just wasn't in somebody's interest to help you right now? "It's just an algorithm" is "the computer is down," writ large. It's a great excuse for failure to do the work of making sure your tools don't reproduce the same oppressive patterns that characterize the underlying society in which those tools were built. And they will reproduce those patterns as long as you don't actively do the work of making sure they don't. Defamation and harassment disproportionately affect the most marginalized people, because those are exactly the people that you can bully with few or no consequences. Make it easier to harass people, to spread lies about them, and you are making it easier for people to perpetuate sexism and racism.

There are a number of tools that technical workers can use to help mitigate the tendency of the communities and the tools that they build to reproduce social inequality present in the world. Codes of conduct are one tool for reducing the tendency of subcultures to reproduce inequality that exists in their parent culture. For algorithms, human oversight could do the same -- people could regularly review search engine results in a way that includes verifying factual claims that are likely to have a negative impact on a person's life if the claims aren't true. It's also possible to imagine designing heuristics that address the credibility of a source rather than just its popularity. But all of this requires work, and it's not going to happen unless tech companies have an incentive to do that work.
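As a rough illustration of what a heuristic that weighs credibility alongside popularity might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the `Result` type, the `credibility` score (which someone would still have to do the work of assigning), the example URLs, and the simple linear blend. It is not a description of how any real search engine ranks results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    popularity: float   # hypothetical normalized attention score, 0..1
    credibility: float  # hypothetical source-credibility score, 0..1


def rank(results, credibility_weight=0.0):
    """Order results by a linear blend of popularity and credibility.

    A weight of 0.0 ranks purely by popularity; higher weights let
    credibility count for more.
    """
    def score(r):
        return ((1 - credibility_weight) * r.popularity
                + credibility_weight * r.credibility)

    return sorted(results, key=score, reverse=True)


results = [
    Result("rumor-blog.example/post", popularity=0.9, credibility=0.1),
    Result("fact-checked.example/article", popularity=0.6, credibility=0.9),
]

by_popularity = rank(results)                     # attention only
blended = rank(results, credibility_weight=0.5)   # credibility counts too
```

With popularity alone, the rumor ranks first; once credibility carries half the weight, the fact-checked article does. The hard part is not the arithmetic but deciding who assigns the credibility scores and how, which is exactly the work the paragraph above says needs an incentive.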

A service-level agreement (SLA) is a contract between the provider of a service and the service's users that outlines what the users are entitled to expect from the service in exchange for their payment. Because people pay for most Web services with their attention (to ads) rather than with money, we don't usually think about SLAs for information quality. For an SLA to work, we would probably have to shift from an ad-based model to a subscription-based model for more services. We can measure how much money you spend on a service -- we can't measure how much attention you provide to its advertisers. So attention is a shaky basis on which to found a contract. Assuming business models where users pay in a more direct and transparent way for the services they consume, could we have SLAs for factual accuracy? Could we have an SLA for how many death threats or rape threats it's acceptable for a service to transmit?

I want to emphasize one more time that this article isn't about public shaming. The conversation that uses the words "public shaming" is about priorities, rather than truth. Some people want to be able to say what they feel like saying and get upset when others challenge them on it rather than politely ignoring it. When I talk about victims of defamation, that's not who I'm talking about -- I'm talking about people against whom attackers have weaponized online media in order to spread outright lies about them.

People who operate search engines already have search quality metrics. Could one of them be truth -- especially when it comes to queries that impinge on actual humans' reputations? Wikipedia has learned this lesson: its policy on biographies of living persons (BLP) didn't exist from the site's inception, but arose as a result of a series of cases in which people acting in bad faith used Wikipedia to libel people they didn't like. Wikipedia learned that if you let anybody edit an article, there are legal risks; the risks were (and continue to be) especially real for Wikipedia due to how highly many search engines rank it. To some extent, content providers have been able to protect themselves from those risks using CDA 230, but sitting back while people use your site to commit libel is still a bad look... at least if the targets are famous enough for anyone to care about them.

Code is Law

Making the Internet more accountable matters because, in the words of Lawrence Lessig, code is law. Increasingly, software automates decisions that affect our lives. Imagine if you had to obey laws, but weren't allowed to read their text. That's the situation we're in with code.

We recognize that the passenger in a hypothetical self-driving car programmed to run over anything in its path has made a choice: they turned the key to start the machine, even if from then on, they delegated responsibility to an algorithm. We correctly recognize the need for legal liability in this situation: otherwise, you could circumvent laws against murder by writing a program to commit murder instead of doing it yourself. Somehow, when physical objects are involved it's easier to understand that the person who turns the key, who deploys the code, has responsibility. It stops being "just the Internet" when the algorithms you designed and deployed start to determine what someone's potential employers think of them, regardless of truth.

There are no neutral algorithms. An algorithmic blank slate will inevitably reproduce the violence of the social structures in which it is embedded. Software designers have the choice of trying to design counterbalances to structural violence into their code, or to build tools that will amplify structural violence and inequality. There is no neutral choice; all technology is political. People who say they're apolitical just mean their political interests align well with the status quo.

Recommendation engines like YouTube, or any other search engine with relevance metrics and/or a recommendation system, just recognize patterns -- right? They don't create sexism; if they recommend sexist videos to people who aren't explicitly searching for them, that's because sexist videos are popular, right? YouTube isn't to blame for sexism, right?

Well... not exactly. An algorithm that recognizes patterns will recognize oppressive patterns, like the determination that some people have to silence women, discredit them, and pollute their agencies. Not only will it recognize those patterns, it will reproduce those patterns by helping people who want to silence women spread their message, which has a self-reinforcing effect: the more the algorithm recommends the content, the more people will view it, which reinforces the original recommendation. As Sarah Jeong wrote in The Internet of Garbage, "The Internet is presently siloed off into several major public platforms" -- public platforms that are privately owned. The people who own each silo own so many computing resources that competing with them would be infeasible for all but a very few -- thus, the free market will never solve this problem.
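To make the self-reinforcing effect concrete, here is a toy simulation. It is a sketch under stated assumptions, not a model of YouTube or any real recommender: the function name, the video names, the starting view counts, and the rule "recommend the single most-viewed item, and each recommendation adds a fixed number of views" are all invented for illustration.

```python
def simulate_feedback(initial_views, rounds, slots=1, boost=100):
    """Toy popularity-only recommender.

    Each round, the `slots` most-viewed items are recommended, and every
    recommendation adds `boost` views to the recommended item.
    """
    views = dict(initial_views)
    for _ in range(rounds):
        # Rank purely by current popularity: the assumption under scrutiny.
        recommended = sorted(views, key=views.get, reverse=True)[:slots]
        for item in recommended:
            views[item] += boost
    return views


# Two hypothetical videos that start out almost equally popular.
start = {"harassment_video": 105, "rebuttal_video": 100}
result = simulate_feedback(start, rounds=10)

gap_before = start["harassment_video"] - start["rebuttal_video"]
gap_after = result["harassment_video"] - result["rebuttal_video"]
```

In this toy setup, a five-view head start becomes a gap of more than a thousand views after ten rounds, because the only thing the recommender measures is the attention it is itself generating.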

Companies like Google say they don't want to "be evil", but intending to "not be evil" is not enough. Google has an enormous amount of power, and little to no accountability -- no one who manages this public resource was elected democratically. There's no process for checking the power they have to neglect and ignore the ways in which their software participates in reproducing inequality. This happened by accident: a public good (the tools that make the Internet a useful source of knowledge) has fallen under private control. This would be a good time for breaking up a monopoly.

Persistent Identities

In the absence of anti-monopoly enforcement, is there anything we can do? I think there is. Anil Dash has written about persistent pseudonyms, a way to make it possible to communicate anonymously online while still standing to lose something of value if you abuse that privilege in order to spread false information. The Web site Metafilter charges a small amount of money to create an account, in order to discourage sockpuppeting (the practice of responding to being banned from a Web site by coming back to create a new account) -- it turns out this approach is very effective, since people who are engaging in harassment for laughs don't seem to value their own laughs very highly in terms of money.

I think advertising-based funding is also behind the reason why more sites don't implement persistent pseudonyms. The advertising-based business model encourages service providers to make it as easy as possible for people to use their service; requiring the creation of an identity would put an obstacle in the way of immediate engagement. That obstacle is good from the perspective of nurturing quality content, but bad from the perspective that it limits the number of eyeballs that will be focused on ads. And thus, we see another way in which advertising enables harassment.

Again, this isn't a treatise against anonymity. None of what I'm saying implies you can't have 16 different identities for all the communities you participate in online. I am saying that I want it to be harder for you to use one of those identities for defamation without facing consequences.

A note on diversity

Twitter, Facebook, Google, and other social media and search companies are notoriously homogeneous, at least when it comes to their engineering staff and their executives, along gendered and racial lines. But what's funny is that Twitter, Facebook, and other sites that make money by using user-generated content to attract an audience for advertisements, are happy to use the free labor that a diversity of people do for them when they create content (that is, write tweets or status updates). The leaders of these companies recognize that they couldn't possibly hire a collection of writers who would generate better content than the masses do -- and anyway, even if they could, writers usually want to be paid. So they recognize the value of diversity and are happy to reap its benefits. They're not so enthusiastic to hire a diverse range of people, since that would mean sharing profits with people who aren't like themselves.

And so here's a reason why diversity means something. People who build complex information systems based on approximations and heuristics have failed to incorporate credibility into their designs. Almost uniformly, they design algorithms that will promote whatever content gets the most attention, regardless of its accuracy. Why would they do otherwise? Telling the truth doesn't attract an audience for advertisers. On the other hand, there is a limit to how much harm an online service can do before the people whose attention they're trying to sell -- their users -- get annoyed and start to leave. We're seeing that happen with Twitter already. If Twitter's engineers and product designers had included more people in demographics that are vulnerable to attacks on their credibility (starting with women, non-binary people, and men of color), then they'd have a more sustainable business, even if it would be less profitable in the short term. Excluding people on the basis of race and gender hurts everyone: it results in technical decisions that cause demonstrable harm, as well as alienating people who might otherwise keep using a service and keep providing attention to sell to advertisers.

Internalizing the Externalities

In the same way that companies that pollute the environment profit by externalizing the costs of their actions (they get to enjoy all the profit, but the external world -- the government and taxpayers -- gets saddled with the responsibility of cleaning up the mess), Internet companies get to profit by externalizing the cost of transmitting bad-faith speech. Their profits are higher because no one expects them to spend time incorporating human oversight into pattern recognition. The people who actually generate bad-faith speech get to externalize the costs of their speech as well. It's the victims who pay.

We can't stop people from harassing or abusing others, or from lying. But we can make it harder for them to do it consequence-free. Let's not let the perfect be the enemy of the good. Analogously, codes of conduct don't prevent bad actions -- rather, they give people assurance that justice will be done and harmful actions will have consequences. Creating a link between actions and consequences is what justice is about; it's not about creating dark corners and looking the other way as bullies arrive to beat people up in those corners.

...the unique force-multiplying effects of the Internet are underestimated. There’s a difference between info buried in small font in a dense book of which only a few thousand copies exist in a relatively small geographic location versus blasting this data out online where anyone with a net connection anywhere in the world can access it.
-- Katherine Cross, "'Things Have Happened In The Past Week': On Doxing, Swatting, And 8chan"

When we protect content providers from liability for the content that they have this force-multiplying effect on, our priorities are misplaced. With power comes responsibility; currently, content providers have enormous power to boost some signals while dampening others, and the fact that these decisions are often automated and always motivated by profit rather than pure ideology doesn't reduce the need to balance that power with accountability.

"The technical architecture of online platforms... should be designed to dampen harassing behavior, while shielding targets from harassing content. It means creating technical friction in orchestrating a sustained campaign on a platform, or engaging in sustained hounding."
-- Sarah Jeong, The Internet of Garbage

That our existing platforms neither dampen nor shield isn't an accident -- dampening harassing behavior would limit the audience for the advertisements that can be attached to the products of that harassing behavior. Indeed, they don't just fail to dampen, they do the opposite: they amplify the signals of harassment. At the point where an algorithm starts to give a pattern a life of its own -- starts to strengthen a signal rather than merely repeating it -- it's time to assign more responsibility to companies that trade in user-generated content than we traditionally have. To build a recommendation system that suggests particular videos are worth watching is different from building a database that lets people upload videos and hand URLs for those videos off to their friends. Recommendation systems, automated or not, create value judgments. And the value judgments they surface have an irrevocable effect on the world. Helping content get more eyeballs is an active process, whether or not it's implemented by algorithms people see as passive.

There is no hope of addressing the problem of harassment as long as it continues to be an externality for the businesses that profit from enabling it. We can change that by supporting subscription-based services with our money and declining to give our attention to advertising-based services, by expanding legal liability for the signals that a service selectively amplifies, or by normalizing the use of persistent pseudonyms. Until we do, people will continue to have their lives limited by Internet defamation campaigns, because media companies can profit from such campaigns without paying their costs.


This post is the third in a 4-part series. The first two parts were "Defame and Blame" and "Phone Books and Megaphones." The last part is "Harassment as Externality."

Server-Side Economics

In "Phone Books and Megaphones", I talked about easy access to the megaphone. We can't just blame the people who eagerly pick up the megaphone when it's offered for the content of their speech -- we also have to look at the people who own the megaphone, and why they're so eager to lend it out.

It's not an accident that Internet companies are loath to regulate harassment and defamation. There are economic incentives for the owners of communication channels to disseminate defamation: they make money from doing it, and don't lose money or credibility in the process. There are few incentives for the owners of these channels to maintain their reputations by fact-checking the information they distribute.

I see three major reasons why it's so easy for false information to spread:

  • Economic incentives to distribute any information that gets attention, regardless of its truth.
  • The public's learned helplessness in the face of software, which makes it easy for service owners to claim there's nothing they can do about defamation. By treating the algorithms they themselves implemented as black boxes, their designers can disclaim responsibility for the actions of the machines they set into motion.
  • Algorithmic opacity, which keeps the public uninformed about how code works and makes it more likely they'll believe that it's "the computer's fault" and people can't change anything.

Incentives and Trade-Offs

Consider email spam as a cautionary tale. Spam and abuse are both economic problems. The problem of spam arose because the person who sends an email doesn't pay the cost of transmitting it to the recipient. This creates an incentive to use other people's resources to advertise your product for free. Likewise, harassers can spam the noosphere with lies, as they continue to do in the context of GamerGate, and never pay the cost of their mendacity. Even if your lies get exposed, they won't be billed to your reputation -- not if you're using a disposable identity, or if you're delegating the work to a crowd of people using disposable identities (proxy recruitment). The latter is similar to how spammers use botnets to get computers around the world to send spam for them, usually unbeknownst to the computers' owners -- except rather than using viral code to co-opt a machine into a botnet, harassers use viral ideas to recruit proxies.

In The Internet of Garbage, Sarah Jeong discusses the parallels between spam and abuse at length. She asks why the massive engineering effort that's been put towards curbing spam -- mostly successfully, at least in the sense of saving users from the time it takes to manually filter spam (Internet service providers still pay the high cost of transmitting it, only to be filtered out at the client side) -- hasn't been applied to the abuse problem. I think the reason is pretty simple: spam costs money, but abuse makes money. By definition, almost nobody wants to see spam (a tiny percentage of people do, which is why it's still rewarding for spammers to try). But lots of people want to see provocative rumors, especially when those rumors reinforce their sexist or racist biases. In "Trouble at the Koolaid Point", Kathy Sierra wrote about the incentives for men to harass women online: a belief that any woman who gets attention for her work must not deserve it, must have tricked people into believing her work has value. This doesn't create an economic incentive for harassment, but it does create an incentive -- meanwhile, if you get more traffic to your site and more advertising money because someone's using it to spread GamerGate-style lies, you're not going to complain. Unless you follow a strong ethical code, of course, but tech people generally don't. Putting ethics ahead of profit would betray your investors, or your shareholders.

If harassment succeeds because there's an economic incentive to let it pass through your network, we have to fight it economically as well. Moralizing about why you shouldn't let your platform enable harassment won't help, since the platform owners have no shame.

Creating these incentives matters. Currently, there's a world-writeable database with everyone's names as the keys, with no accounting and no authentication. A few people control it and a few people get the profits. We shrug our shoulders and say "how can we trace the person who injected this piece of false information into the system? There's no way to track people down." But somebody made the decision to build a system in which people can speak with no incentive to be truthful. Alternative designs are possible.

Autonomous Cars, Autonomous Code

Another reason why there's so little economic incentive to control libel is that the public has a sort of learned helplessness about algorithms... at least when it's "just" information that those algorithms manipulate. We wouldn't ask why a search engine returns the top results that it returns for a particular query (unless we study information retrieval), because we assume that algorithms are objective and neutral, that they don't reproduce the biases of the humans who built them.

In part 2, I talked about why "it's just an algorithm" isn't a valid answer to questions about the design choices that underlie algorithms. We recognize this better for algorithms that aren't purely about producing and consuming information. We recognize that despite being controlled by algorithms, self-driving cars have consequences for legal liability. It's easy to empathize with the threat that cars pose to our lives, and we're correctly disturbed by the idea that you or someone you love could be harmed or killed by a robot who can't be held accountable for it. Of course, we know that the people who designed those machines can be held accountable if they create software that accidentally harms people through bugs, or deliberately harms people by design.

Imagine a self-driving car designer who programmed the machines to act in bad faith: for example, to take risks to get the car's passenger to their destination sooner at the potential expense of other people on the road. You wouldn't say "it's just an algorithm, right?" Now, what if people died due to unforeseen consequences of how self-driving car designers wrote their software rather than deliberate malice? You still wouldn't say, "It's just an algorithm, right?" You would hold the software designers liable for their failure to test their work adequately. Clearly, the reason why you would react the same way in the good-faith scenario as in the bad-faith one is the effect of the poor decision, rather than whether the intent was malicious or merely careless.

Algorithms that are as autonomous as self-driving cars, and perhaps less transparent, control your reputation. Unlike with self-driving cars, no one is talking about liability for what happens when they turn your reputation into a pile of burning wreckage.

Algorithms are also incredibly flexible and changeable. Changing code requires people to think and to have discussions with each other, but it doesn't require much attention to the laws of physics, and other than paying humans for their time, it has little cost. Exploiting the majority's lack of familiarity with code in order to act as if having to modify software is a huge burden is a good way to avoid work, but a bad way to tend the garden of knowledge.

Plausible Deniability

Designers and implementors of information retrieval algorithms, then, enjoy a certain degree of plausible deniability that designers of algorithms to control self-driving cars (or robots or trains or medical devices) do not.

During the AmazonFail incident in which an (apparent) bug in Amazon's search software caused books on GLBT-related topics to be miscategorized as "adult" and hidden from searches, defenders of Amazon cried "It's just an algorithm." The algorithm didn't hate queer people, they said. It wasn't out to get you. It was just a computer doing what it had been programmed to do. You can't hold a computer responsible.

"It's just an algorithm" is the natural successor to the magical intent theory of communication. Since your intent cannot be known to someone else (unless you tell them -- but then, you could lie about it), citing your good intent is often an effective way to dodge responsibility for bad actions. Delegating actions to algorithms takes the person out of the picture altogether: if people with power delegate all of their actions to inanimate objects, which lack intentionality, then no one (no one who has power, anyway) has to be responsible for anything.

"It's just an algorithm" is also a shaming mechanism, because it implies that the complainer is naïve enough to think that computers are conscious. But nobody thinks algorithms can be malicious. So saying, "it's just an algorithm, it doesn't mean you harm" is a response to something nobody said. Rather, when we complain about the outcomes of algorithms, we complain about a choice that was made by not making a choice. In the context of this article, it's the choice to not design systems with an eye towards their potential use for harassment and defamation and possible ways to mitigate those risks. People make this decision all the time, over and over, including for systems being designed today -- when there's enough past experience that everybody ought to know better.

Plausible deniability matters because it provides the moral escape hatch from responsibility for defamation campaigns, on the part of people who own search engines and social media sites. (There's also a legal escape hatch from responsibility, at least in the US: CDA Section 230, which shields every "provider or user of an interactive computer service" from liability for "any information provided by another information content provider.") Plausible deniability is the escape hatch, and advertising is the economic incentive to use that escape hatch. Combined with algorithm opacity, they create a powerful set of incentives for online service providers to profit from defamation campaigns. Anything that attracts attention to a Web site (and, therefore, to the advertisements on it) is worth boosting. Since there are no penalties for boosting harmful, false information, search and recommendation algorithms are amplifiers of false information by design -- there was never any reason to design them not to elevate false but provocative content.


I've shown that information retrieval algorithms tend to be bad at limiting the spread of false information because doing the work to curb defamation can't be easily monetized, and because people have low expectations for software and don't hold its creators responsible for their actions. A third reason is that the lack of visibility of the internals of large systems has a chilling effect on public criticism of them.

Plausible deniability and algorithmic opacity go hand in hand. In "Why Algorithm Transparency is Vital to the Future of Thinking", Rachel Shadoan explains in detail what it means for algorithms to be transparent or opaque. The information retrieval algorithms I've been talking about are opaque. Indeed, we're so used to centralized control of search engines and databases that it's hard for us to imagine them being otherwise.

"In the current internet ecosystem, we–the users–are not customers. We are product, packaged and sold to advertisers for the benefit of shareholders. This, in combination with the opacity of the algorithms that facilitate these services, creates an incentive structure where our ability to access information can easily fall prey to a company’s desire for profit."
-- Rachel Shadoan

In an interview, Chelsea Manning commented on this problem as well:

"Algorithms are used to try and find connections among the incomprehensible 'big data' pools that we now gather regularly. Like a scalpel, they're supposed to slice through the data and surgically extract an answer or a prediction to a very narrow question of our choosing—such as which neighborhood to put more police resources into, where terrorists are likely to be hiding, or which potential loan recipients are most likely to default. But—and we often forget this—these algorithms are limited to determining the likelihood or chance based on a correlation, and are not a foregone conclusion. They are also based on the biases created by the algorithm's developer....

These algorithms are even more dangerous when they happen to be proprietary 'black boxes.' This means they cannot be examined by the public. Flaws in algorithms, concerning criminal justice, voting, or military and intelligence, can drastically affect huge populations in our society. Yet, since they are not made open to the public, we often have no idea whether or not they are behaving fairly, and not creating unintended consequences—let alone deliberate and malicious consequences."
-- Chelsea Manning, BoingBoing interview by Cory Doctorow

Opacity results from the ownership of search technology by a few private companies, and their desire not to share their intellectual property. If users were the customers of companies like Google, there would be more of an incentive to design algorithms that use heuristics to detect false information that damages people's credibility. Because advertisers are the customers, and because defamation generally doesn't affect advertisers negatively (unless the advertiser itself is being defamed), there is no economic incentive to do this work. And because people don't understand how algorithms work, and couldn't understand any of the search engines they used even if they wanted to (since the code is closed-source), it's much easier for them to accept the spread of false information as an inevitable consequence of technological progress.

Manning's comments, especially, show why the three problems of economic incentives, plausible deniability, and opacity are interconnected. Economics give Internet companies a reason to distribute false information. Plausible deniability means that the people who own those companies can dodge any blame or shame by assigning fault to the algorithms. And opacity means nobody can ask for the people who design and implement the algorithms to do better, because you can't critique the algorithm if you can't see the source code in the first place.

It doesn't have to be this way. In part 4, I'll suggest a few possibilities for making the Internet a more trustworthy, accountable, and humane medium.

To be continued.

Do you like this post? Support me on Patreon and help me write more like it.

This post is the second in a 4-part series. The first part was "Defame and Blame". The next part is "Server-Side Economics."

Phone Books and Megaphones

Think back to 1986. Imagine if somebody told you: "In 30 years, a public directory that's more accessible and ubiquitous than the phone book is now will be available to almost everybody at all times. This directory won't just contain your contact information, but also, a page anyone can write on, like a middle-school slam book but meaner. Whenever anybody writes on it, everybody else will be able to see what they wrote." I don't think you would have believed it, or if you found it plausible, you probably wouldn't have found this state of affairs acceptable. Yet in 2016, that's how things are. Search engine results have an enormous effect on what people believe to be true, and anybody with enough time on their hands can manipulate search results.

Antisocial Network Effects

When you search for my name on your favorite search engine, you'll find some results that I wish weren't closely linked to my name. People who I'd prefer not to think about have written blog posts mentioning my name, and those articles are among the results that most search engines will retrieve if you're looking for texts that mention me. But that pales in comparison with the experiences of many women. A few years ago, Skud wrote:

"Have you ever had to show your male colleagues a webpage that calls you a fat dyke slut? I don’t recommend it."

Imagine going a step further: have you ever had to apply for jobs knowing that if your potential manager searches for your name online, one of the first hits will be a page calling you a fat dyke slut? In 2016, it's pretty easy for anybody who wants to to make that happen to somebody else, as long as the target isn't unusually wealthy or connected. Not every potential manager is going to judge someone negatively just because someone called that person a fat dyke slut on the Internet, and in fact, some might judge them positively. But that's not the point -- the point is if you end up in the sights of a distributed harassment campaign, then one of the first things your potential employers will know about you, possibly for the rest of your life, might be that somebody called you a fat dyke slut. I think most of us, if we had the choice, wouldn't choose that outcome.

Suppose the accusation isn't merely a string of generic insults, but something more tangible: suppose someone decides to accuse you of having achieved your professional position through "sleeping your way to the top," rather than merit. This is a very effective attack on a woman's credibility and competence, because patriarchy primes us to be suspicious of women's achievements anyway. It doesn't take much to tip people, even those who don't consciously hold biases against women, into believing these attacks, because we hold unconscious biases against women that are much stronger than anyone's conscious bias. It doesn't matter if the accusation is demonstrably false -- so long as somebody is able to say it enough times, the combination of network effects and unconscious bias will do the rest of the work and will give the rumor a life of its own.

Not every reputation system has to work the way that search engines do. On eBay, you can only leave feedback for somebody else if you've sold them something or bought something from them. In the 17 years since I started using eBay, that system has been very effective. Once somebody accumulates social capital in the form of positive feedback, they generally don't squander that capital. The system works because having a good reputation on eBay has value, in the financial sense. If you lose your reputation (by ripping somebody off), it takes time to regain it.
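To make the design choice concrete, here is a minimal sketch of a transaction-gated feedback rule like the one described above. It is not eBay's actual implementation; the class `FeedbackLedger` and its method names are hypothetical, made up for illustration. The only idea it demonstrates is that a comment about someone is accepted only when the author has actually traded with them.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackLedger:
    """Hypothetical reputation store: feedback is accepted only between
    two parties who have completed a transaction with each other."""
    transactions: set = field(default_factory=set)  # {(buyer, seller)}
    feedback: dict = field(default_factory=dict)    # user -> [(author, score)]

    def record_transaction(self, buyer: str, seller: str) -> None:
        self.transactions.add((buyer, seller))

    def leave_feedback(self, author: str, target: str, score: int) -> bool:
        # Accept feedback only if author and target traded, in either role.
        traded = ((author, target) in self.transactions
                  or (target, author) in self.transactions)
        if not traded:
            return False  # strangers and sockpuppets cannot write on the page
        self.feedback.setdefault(target, []).append((author, score))
        return True

    def reputation(self, user: str) -> int:
        # Sum of all accepted feedback scores for this user.
        return sum(score for _, score in self.feedback.get(user, []))


ledger = FeedbackLedger()
ledger.record_transaction("alice", "bob")
accepted = ledger.leave_feedback("alice", "bob", 1)         # real trading partner
rejected = ledger.leave_feedback("sockpuppet", "bob", -1)   # never traded with bob
```

In this sketch, a disposable identity has to complete a real transaction before it can affect anyone's reputation, which is what makes defamation costly under this kind of design.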

On the broader Internet, you can use a disposable identity to generate content. Unlike on eBay, there is no particular reason to use a consistent identity in order to build up a good track record as a seller. If your goal is to build a #personal #brand, then you certainly have a reason to use the same name everywhere, but if your goal is to destroy someone else's, you don't need to do that. The ready availability of disposable identities ("sockpuppets") means that defaming somebody is a low-risk activity even if your accusations can be demonstrated false, because by the time somebody figures out you made your shit up, you've moved on to using a new name that isn't sullied by a track record of dishonesty. So there's an asymmetry here: you can create as many identities as you want, for no cost, to destroy someone else's good name, but having a job and functioning in the world makes it difficult to change identities constantly.

The Megaphone

For most of the 20th century, mass media consisted of newspapers, then radio and then TV. Anybody could start a newspaper, but radio and TV used the broadcast spectrum, which is a public and scarce resource and thus is regulated by governmental agencies. Because the number of radio and TV channels was limited, telecommunications policy was founded on the assumption that some amount of regulation of these channels' use was necessary and did not pose an intrinsic threat to free speech. The right to use various parts of the broadcast spectrum was auctioned off to various private companies, but this was a limited-scope right that could be revoked if those companies acted in a way that blatantly contravened the public interest. A consistent pattern of deception would have been one thing that went against the public interest. As far as I know, no radio or TV broadcaster ever embarked upon a deliberate campaign of defaming multiple people, because the rewards of such an activity wouldn't offset the financial losses that would be inevitably incurred when the lies were exposed.

(I'll use "the megaphone" as a shorthand for media that are capable of reaching a lot of people: formerly, radio and broadcast TV; then cable TV; and currently, the Internet. Not just "the Internet", though, but rather: Internet credibility. Access to the credible Internet (the content that search engine relevance algorithms determine should be centered in responses to queries) is gatekept by algorithms; access to old media was gatekept by people.)

At least until the advent of cable TV, then, the broader the reach of a given communication channel, the more closely access to that channel was monitored and regulated. It's not that this system always worked perfectly, because it didn't, just that there was more or less consensus that it was correct for the public to have oversight with respect to who could be entrusted with access to the megaphone.

Now that access to the Internet is widespread, the megaphone is no longer a scarce resource. In a lot of ways, that's a good thing. It has allowed people to speak truth to power and made it easier for people in marginalized groups to find each other. But it also means that it's easy to start a hate campaign based on falsehoods without incurring any personal risk.

I'm not arguing against anonymity here. Clearly, at least some people have total freedom to act in bad faith while using the names they're usually known by: Milo Yiannopoulos and Andrew Breitbart are obvious examples. If use of real names deters harassment, why are they two of the best-known names in harassment?

Algorithm as Excuse

Zoë Quinn pointed out on Twitter that she can no longer share content with her friends, even if she limits access to it, because her name is irrevocably linked to the harassment campaign that her ex-boyfriend started in order to defame her in 2014, otherwise known as GamerGate. If she uses YouTube to share videos, its recommendation engine will suggest to her friends that they watch "related" videos that -- at best -- attack her for her gender and participation in the game development community. There is no individual who works for Google (YouTube's parent company) who made an explicit decision to link Quinn's name with these attacks. Nonetheless, a pattern in YouTube's recommendations emerged because of a concerted effort by a small group of dedicated individuals to pollute the noosphere in order to harm Quinn. If you find this outcome unacceptable, and I do, we have to consider the chain of events that led to it and ask which links in the chain could be changed so this doesn't happen to someone else in the future.

There is a common line of response to this kind of problem: "You can't get mad at algorithms. They're objective and unbiased." Often, the implication is that the person complaining about the problem is expecting computers to be able to behave sentiently. But that's not the point. When we critique an algorithm's outcome, we're asking the people who design and maintain the algorithms to do better, whether the outcome is that it uses too much memory or that it causes a woman to be re-victimized every time someone queries a search engine for her name. Everything an algorithm does is because of a design choice that one or several humans made. And software exists to serve humans, not the other way around: when it doesn't do what we want, we can demand change, rather than changing ourselves so that software developers don't have to do their jobs. By saying "it's just an algorithm", we can avoid taking responsibility for our values as long as we encode those values as a set of rules executable by machine. We can automate disavowal.

How did we get here -- to a place where anyone can grab the megaphone, anyone can scribble in the phone book, and people who benefit from the dissemination of this false information are immune from any of the risks? I'll try to answer that in part 3.

To be continued.


This post is the first in a 4-part series. Part 2 is "Phone Books and Megaphones."

Defame and Blame

The Internet makes it cheap to damage someone else's reputation without risking your own. The asymmetry between the low cost of spreading false information and the high cost to victims of such attacks is an economic and architectural failure, an unintended consequence of a communications infrastructure that's nominally decentralized while actually largely centralized under the control of a few advertising-based companies.

We do not hear a lot of discussion of harassment and defamation as either an economic failure or an engineering failure. Instead, we hear that online harassment is sad but inevitable, or that it happens "because people suck." As Anil Dash wrote, "don't read the comments" normalizes the expectation that behavior online will sink to the lowest common denominator and stay there. People seem to take a similar approach to outright harassment as they do to comments expressing bad opinions.

The cases I'm talking about, like the defamation of Kathy Sierra or the Gamergate coordinated harassment campaign, are effective because of their use of proxy recruitment. Effective propagandists who have social capital have learned how to recruit participants for their harassment campaigns: by coming up with a good lie and relying on network effects to do the rest of the work. Spreading false information about a woman -- particularly a woman who is especially vulnerable because of intersecting marginalized identities -- is easy because it confirms sexist biases (conscious or not-so-conscious) that we all have. Since most of us have internalized the belief that women are less competent, convincing people that a woman slept her way to the top doesn't take much effort.

"Don't read the comments" isn't good advice for people who are merely being pestered. (And anyway, we might question the use of the word "merely", since having to manage a flood of unwanted comments in order to receive desired communication tends to have a certain isolating effect on a person.) But it's especially bad advice for people who are being defamed. What good does it do to ignore the people spreading lies about you when ignoring them won't change what set of web pages a search engine returns as the top ten hits for your name? When you advise targets of harassment to "ignore it" or to "not feed the trolls", you shift responsibility onto victims and away from the people who benefit from the spread of false information (and I don't just mean the people who initiate harassment campaigns). In short, you blame victims.

Algorithms, Advertising, and Accountability

We neither talk much about the democratization of defamation, nor know how to mitigate it. It happens for a reason. Online harassment and defamation campaigns are an inevitable consequence of a telecommunications infrastructure that is dominated by for-profit advertising-supported businesses governed by algorithms that function autonomously. However, neither the autonomy of algorithms nor the ad-supported business model that most social media and search engine companies share is inevitable. Both are a result of decisions made by people, and both can be changed if people have the will to do so. The combination of ads and unsupervised algorithms currently defines the political economy of telecommunications, but it's by no means inevitable, natural, or necessary.

Broadcast television is, or was, advertising-supported, but it didn't lend itself to harassment and defamation nearly as easily, since a relatively small group of people had access to the megaphone. Likewise, online services don't have to encourage bad-faith speech, and discouraging it doesn't necessarily require a huge amount of labor: for example, eBay functions with minimal human oversight by limiting its feedback function to comments that go with an actual financial transaction. However, online search engines and recommendation systems typically use an advertising-based business model where customers pay for services with their attention rather than with money, and typically function with neither human supervision nor any design effort paid to discouraging defamation. Because of these two properties, it's relatively easy for anyone who's sufficiently determined to take control of what shows up when somebody looks up your name in the global distributed directory known as your favorite popular search engine -- that is, as long as you can't afford the public relations apparatus it takes to guard against such attacks. Harassment campaigns succeed to the extent that they exploit the ad-based business model and the absence of editorial oversight that characterize new media.

What This Article is Not About

Three topics I'm not addressing in this essay are:
  • Holding public figures accountable. When people talk about wanting to limit access to the megaphone that search engines make freely available to sufficiently persistent individuals, a common response is, "Are you saying you want to limit people's ability to hold powerful people accountable?" I think it's important for private citizens to be able to use the Internet to expose wrongdoing by powerful people, such as elected officials. I don't agree with the assumption behind this question: the assumption that private citizens ought to be exposed to the same level of public scrutiny as public figures are.
  • "Public shaming." What some people call "public shaming" refers to criticism of a person for a thing that person actually said. When Jon Ronson wrote about Justine Sacco getting "publicly shamed", he didn't mean that people falsely accused her of using her public platform to make a joke at the expense of people with AIDS. He and Sacco's critics agree that she did freely choose to make that joke. I'm talking about something different: when people use technology to construct a false narrative that portrays their adversary as having said something the adversary didn't say. This is not an article about "public shaming".

    The difference between defamation and shaming is that defamation is defined by the behavior of the subject rather than the emotional reaction of the object; the notion of shaming sort of rests on this idea that it's wrong to make certain people feel certain ways, and I don't agree with that idea.

  • Censorship. I'm not advocating censorship when I ask how we entered a technological regime in which quality control for information retrieval algorithms is difficult or impossible without suppressing legitimate speech. I'm pointing out that we've designed ourselves into a system where no fine distinctions are possible, and the rapid dissemination of lies can't be curtailed without suppressing truth. As Sarah Jeong points out in her book The Internet of Garbage, the belief that discouraging harassment means encouraging censorship is founded on the false assumption that addressing harassment online means suppressing or deleting content. In fact, search engines already filter, prioritize, and otherwise implement heuristics about information quality. Some of the same technologies could be used to -- in Jeong's words -- dampen harassment and protect the targets of harassment. If you object to that, then surely you also object to the decisions encoded in information retrieval algorithms about what documents are most relevant to a query.

What's Next

So far, I've argued that social network infrastructure has two design flaws which serve to amplify rather than dampen harassment:
  • Lack of editorial oversight means that the barrier to entry to publishing has changed from being a journalist (while journalists have never been perfect, at least they're members of a profession with standards and ethics) to being someone with a little charisma and a lot of free time.
  • Advertising-supported business models mean that a mildly charismatic, very bored antihero can find many bright people eager to help disseminate their lies because lies are provocative and provocative stories get clicks.

In the next three installments, I'll elaborate on how we got into this situation and what we could do to change it.



Tim Chevalier
