Server-Side Economics
In "Phone Books and Megaphones", I talked about easy access to the megaphone. We can't just blame the people who eagerly pick up the megaphone when it's offered for the content of their speech -- we also have to look at the people who own the megaphone, and why they're so eager to lend it out.
It's not an accident that Internet companies are loath to regulate harassment and defamation. There are economic incentives for the owners of communication channels to disseminate defamation: they make money from doing it, and don't lose money or credibility in the process. There are few incentives for the owners of these channels to maintain their reputations by fact-checking the information they distribute.
I see three major reasons why it's so easy for false information to spread:
- Economic incentives to distribute any information that gets attention, regardless of its truth.
- The public's learned helplessness in the face of software, which makes it easy for service owners to claim there's nothing they can do about defamation. By treating the algorithms they themselves implemented as black boxes, their designers can disclaim responsibility for the actions of the machines they set into motion.
- Algorithmic opacity, which keeps the public uninformed about how code works and makes it more likely they'll believe that it's "the computer's fault" and people can't change anything.
Incentives and Trade-Offs
Consider email spam as a cautionary tale. Spam and abuse are both economic problems. The problem of spam arose because the person who sends an email doesn't pay the cost of transmitting it to the recipient. This creates an incentive to use other people's resources to advertise your product for free. Likewise, harassers can spam the noosphere with lies, as they continue to do in the context of GamerGate, and never pay the cost of their mendacity. Even if your lies get exposed, they won't be billed to your reputation -- not if you're using a disposable identity, or if you're delegating the work to a crowd of people using disposable identities (proxy recruitment). The latter is similar to how spammers use botnets to get computers around the world to send spam for them, usually unbeknownst to the computers' owners -- except rather than using viral code to co-opt a machine into a botnet, harassers use viral ideas to recruit proxies.
In The Internet of Garbage, Sarah Jeong discusses the parallels between spam and abuse at length. She asks why the massive engineering effort that's been put towards curbing spam -- mostly successfully, at least in the sense of saving users from the time it takes to manually filter spam (Internet service providers still pay the high cost of transmitting it, only to be filtered out at the client side) -- hasn't been applied to the abuse problem. I think the reason is pretty simple: spam costs money, but abuse makes money. By definition, almost nobody wants to see spam (a tiny percentage of people do, which is why it's still rewarding for spammers to try). But lots of people want to see provocative rumors, especially when those rumors reinforce their sexist or racist biases. In "Trouble at the Koolaid Point", Kathy Sierra wrote about the incentives for men to harass women online: a belief that any woman who gets attention for her work must not deserve it, must have tricked people into believing her work has value. This doesn't create an economic incentive for harassment, but it does create an incentive -- meanwhile, if you get more traffic to your site and more advertising money because someone's using it to spread GamerGate-style lies, you're not going to complain. Unless you follow a strong ethical code, of course, but tech people generally don't. Putting ethics ahead of profit would betray your investors, or your shareholders.
If harassment succeeds because there's an economic incentive to let it pass through your network, we have to fight it economically as well. Moralizing about why you shouldn't let your platform enable harassment won't help, since the platform owners have no shame.
Creating economic incentives to stop harassment matters. Currently, there's a world-writeable database with everyone's names as the keys, with no accounting and no authentication. A few people control it and a few people get the profits. We shrug our shoulders and say "how can we trace the person who injected this piece of false information into the system? There's no way to track people down." But somebody made the decision to build a system in which people can speak with no incentive to be truthful. Alternative designs are possible.
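To make the database metaphor concrete, here is a minimal sketch in Python. Every name in it (`OpenStore`, `AccountableStore`, `Claim`, the `author` field) is hypothetical and invented for illustration; it doesn't describe any real service's design. It contrasts a store that anyone can write to anonymously with one that records who wrote each claim, which is roughly the minimum needed before a lie can be billed to anyone's reputation.

```python
from collections import defaultdict
from dataclasses import dataclass


class OpenStore:
    """World-writeable: anyone can attach any claim to any name."""

    def __init__(self):
        self.claims = defaultdict(list)

    def write(self, name, claim):
        # No authentication and no record of who wrote the claim.
        self.claims[name].append(claim)


@dataclass(frozen=True)
class Claim:
    author: str
    text: str


class AccountableStore:
    """Hypothetical alternative: every claim is tied to a known author."""

    def __init__(self, known_authors):
        self.known_authors = set(known_authors)
        self.claims = defaultdict(list)

    def write(self, author, name, text):
        # Refuse writes from identities the system cannot account for.
        if author not in self.known_authors:
            raise PermissionError(f"unknown author: {author}")
        self.claims[name].append(Claim(author, text))

    def authors_of(self, name):
        # Accounting: every claim about a name can be traced to its source.
        return [c.author for c in self.claims[name]]
```

This is one point in the design space, not a recommendation; the only thing it is meant to show is that "no way to track people down" is a property somebody chose, not a law of nature.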
Autonomous Cars, Autonomous Code
Another reason why there's so little economic incentive to control libel is that the public has a sort of learned helplessness about algorithms... at least when it's "just" information that those algorithms manipulate. We wouldn't ask why a search engine returns the top results that it returns for a particular query (unless we study information retrieval), because we assume that algorithms are objective and neutral, that they don't reproduce the biases of the humans who built them.
In part 2, I talked about why "it's just an algorithm" isn't a valid answer to questions about the design choices that underlie algorithms. We recognize this better for algorithms that aren't purely about producing and consuming information. We recognize that despite being controlled by algorithms, self-driving cars have consequences for legal liability. It's easy to empathize with the threat that cars pose to our lives, and we're correctly disturbed by the idea that you or someone you love could be harmed or killed by a robot who can't be held accountable for it. Of course, we know that the people who designed those machines can be held accountable if they create software that accidentally harms people through bugs, or deliberately harms people by design.
Imagine a self-driving car designer who programmed the machines to act in bad faith: for example, to take risks to get the car's passenger to their destination sooner at the potential expense of other people on the road. You wouldn't say "it's just an algorithm, right?" Now, what if people died due to unforeseen consequences of how self-driving car designers wrote their software rather than deliberate malice? You still wouldn't say, "It's just an algorithm, right?" You would hold the software designers liable for their failure to test their work adequately. Clearly, the reason why you would react the same way in the good-faith scenario as in the bad-faith one is the effect of the poor decision, rather than whether the intent was malicious or merely careless.
Algorithms that are as autonomous as self-driving cars, and perhaps less transparent, control your reputation. Unlike with self-driving cars, no one is talking about liability for what happens when they turn your reputation into a pile of burning wreckage.
Algorithms are also incredibly flexible and changeable. Changing code requires people to think and to have discussions with each other, but it doesn't require much attention to the laws of physics, and other than paying humans for their time, it has little cost. Exploiting the majority's lack of familiarity with code in order to act as if having to modify software is a huge burden is a good way to avoid work, but a bad way to tend the garden of knowledge.
Plausible Deniability
Designers and implementors of information retrieval algorithms, then, enjoy a certain degree of plausible deniability that designers of algorithms to control self-driving cars (or robots or trains or medical devices) do not.
During the AmazonFail incident in which an (apparent) bug in Amazon's search software caused books on GLBT-related topics to be miscategorized as "adult" and hidden from searches, defenders of Amazon cried "It's just an algorithm." The algorithm didn't hate queer people, they said. It wasn't out to get you. It was just a computer doing what it had been programmed to do. You can't hold a computer responsible.
"It's just an algorithm" is the natural successor to the magical intent theory of communication. Since your intent cannot be known to someone else (unless you tell them -- but then, you could lie about it), citing your good intent is often an effective way to dodge responsibility for bad actions. Delegating actions to algorithms takes the person out of the picture altogether: if people with power delegate all of their actions to inanimate objects, which lack intentionality, then no one (no one who has power, anyway) has to be responsible for anything.
"It's just an algorithm" is also a shaming mechanism, because it implies that the complainer is naïve enough to think that computers are conscious. But nobody thinks algorithms can be malicious. So saying, "it's just an algorithm, it doesn't mean you harm" is a response to something nobody said. Rather, when we complain about the outcomes of algorithms, we complain about a choice that was made by not making a choice. In the context of this article, it's the choice to not design systems with an eye towards their potential use for harassment and defamation and possible ways to mitigate those risks. People make this decision all the time, over and over, including for systems being designed today -- when there's enough past experience that everybody ought to know better.
Plausible deniability matters because it provides the moral escape hatch from responsibility for defamation campaigns, on the part of people who own search engines and social media sites. (There's also a legal escape hatch from responsibility, at least in the US: CDA Section 230, which shields every "provider or user of an interactive computer service" from liability for "any information provided by another information content provider.") Plausible deniability is the escape hatch, and advertising is the economic incentive to use that escape hatch. Combined with algorithmic opacity, they create a powerful set of incentives for online service providers to profit from defamation campaigns. Anything that attracts attention to a Web site (and, therefore, to the advertisements on it) is worth boosting. Since there are no penalties for boosting harmful, false information, search and recommendation algorithms are amplifiers of false information by design -- there was never any reason to design them not to elevate false but provocative content.
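As an illustration of what "by design" means here, the following Python sketch ranks items purely by attention. The `Item` fields, the `flagged_false` label, and the `penalty` weight are all assumptions made up for this example; it isn't a description of any real search or recommendation system, and the hard question of how an item would come to be flagged is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    clicks: int
    flagged_false: bool  # hypothetical label; how it gets set is out of scope


def rank_by_attention(items):
    # Only attention counts; truth never enters the score.
    return sorted(items, key=lambda i: i.clicks, reverse=True)


def rank_with_penalty(items, penalty=0.1):
    # Same ranking, but flagged items keep only a fraction of their score.
    def score(i):
        return i.clicks * (penalty if i.flagged_false else 1.0)
    return sorted(items, key=score, reverse=True)


items = [
    Item("provocative rumor", clicks=900, flagged_false=True),
    Item("accurate report", clicks=300, flagged_false=False),
]
```

With these made-up numbers, `rank_by_attention(items)` puts the rumor first and `rank_with_penalty(items)` puts the accurate report first. The difference between the two functions is a few lines of code; what's missing in practice is a reason for anyone to write them.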
Transparency
I've shown that information retrieval algorithms tend to be bad at limiting the spread of false information because doing the work to curb defamation can't be easily monetized, and because people have low expectations for software and don't hold its creators responsible for their actions. A third reason is that the lack of visibility of the internals of large systems has a chilling effect on public criticism of them.
Plausible deniability and algorithmic opacity go hand in hand. In "Why Algorithm Transparency is Vital to the Future of Thinking", Rachel Shadoan explains in detail what it means for algorithms to be transparent or opaque. The information retrieval algorithms I've been talking about are opaque. Indeed, we're so used to centralized control of search engines and databases that it's hard for us to imagine them being otherwise.
"In the current internet ecosystem, we–the users–are not customers. We are product, packaged and sold to advertisers for the benefit of shareholders. This, in combination with the opacity of the algorithms that facilitate these services, creates an incentive structure where our ability to access information can easily fall prey to a company’s desire for profit."
-- Rachel Shadoan
Opacity results from the ownership of search technology by a few private companies, and their desire not to share their intellectual property. If users were the customers of companies like Google, there would be more of an incentive to design algorithms that use heuristics to detect false information that damages people's credibility. Because advertisers are the customers, and because defamation generally doesn't affect advertisers negatively (unless the advertiser itself is being defamed), there is no economic incentive to do this work. And because people don't understand how algorithms work, and couldn't understand any of the search engines they used even if they wanted to (since the code is closed-source), it's much easier for them to accept the spread of false information as an inevitable consequence of technological progress.
In an interview, Chelsea Manning commented on this problem as well:
"Algorithms are used to try and find connections among the incomprehensible 'big data' pools that we now gather regularly. Like a scalpel, they're supposed to slice through the data and surgically extract an answer or a prediction to a very narrow question of our choosing—such as which neighborhood to put more police resources into, where terrorists are likely to be hiding, or which potential loan recipients are most likely to default. But—and we often forget this—these algorithms are limited to determining the likelihood or chance based on a correlation, and are not a foregone conclusion. They are also based on the biases created by the algorithm's developer....
These algorithms are even more dangerous when they happen to be proprietary 'black boxes.' This means they cannot be examined by the public. Flaws in algorithms, concerning criminal justice, voting, or military and intelligence, can drastically affect huge populations in our society. Yet, since they are not made open to the public, we often have no idea whether or not they are behaving fairly, and not creating unintended consequences—let alone deliberate and malicious consequences."
-- Chelsea Manning, BoingBoing interview by Cory Doctorow
Manning's comments, especially, show why the three problems of economic incentives, plausible deniability, and opacity are interconnected. Economics give Internet companies a reason to distribute false information. Plausible deniability means that the people who own those companies can dodge any blame or shame by assigning fault to the algorithms. And opacity means nobody can ask for the people who design and implement the algorithms to do better, because you can't critique the algorithm if you can't see the source code in the first place.
It doesn't have to be this way. In part 4, I'll suggest a few possibilities for making the Internet a more trustworthy, accountable, and humane medium.
To be continued.