Feb. 1st, 2016

tim: "System Status: Degraded" (degraded)
This post is the last in a 4-part series. The first three parts were "Defame and Blame," "Phone Books and Megaphones," and "Server-Side Economics."

Harassment as Externality

In part 3, I argued that online harassment is not an accident: it's something that service providers enable because it's profitable for them to let it happen. To know how to change that, we have to follow the money. There will be no reason to stop abuse online as long as advertisers are the customers of the services we rely on. To enter into a contract with a service you use and expect that the service provider will uphold their end of it, you have to be their customer, not their product. As their product, you have no more standing to enter into such a contract than do the underground cables that transmit content.

Harassment, then, is good for business -- at least as long as advertisers are customers and end users are raw material. If we want to change that, we'll need a radical change to the business models of most Internet companies, not shallow policy changes.

Deceptive Advertising

Why is false advertising something we broadly disapprove of -- something that's, in fact, illegal -- but spreading false information in order to entice more eyeballs to view advertisements isn't? Why is it illegal to run a TV ad that says "This toy will run without electricity or batteries," but not illegal for a social media site to surface the message, "Alice is a slut, and while we've got your attention, buy this toy"? In either case, it's lying in order to sell something.

Advertising will affect decision-making by Internet companies as long as advertising continues to be their primary revenue source. If you don't believe in the Easter Bunny, you shouldn't believe it either when executives tell you that ad money is a big bag of cash that Santa Claus delivers with no strings attached. Advertising incentivizes ad-funded media to do whatever gets the most attention, regardless of truth. The choice to do what gets the most attention has ethical and political significance, because achieving that goal comes at the expense of other values.

Should spreading false information have a cost? Should dumping toxic waste have a cost? They both cost money and time to clean up. CDA 230 protects sites that profit from user-generated content from liability for any of the costs of that content, and maybe it's time to rethink that. A search engine is not like a common carrier -- one of the differences is that it allows one-to-many communication. There's a difference between building a phone system that any one person can use to call anyone else, and setting up an autodialer that lets the lucky 5th callee record a new message for it.

Accountability and Excuses

"Code is never neutral; it can inhibit and enhance certain kinds of speech over others. Where code fails, moderation has to step in."
-- Sarah Jeong, The Internet of Garbage

Have you ever gone to the DMV or called your health insurance company and been told "The computer is down" when, you suspected, the computer was working fine and it just wasn't in somebody's interest to help you right now? "It's just an algorithm" is "the computer is down," writ large. It's a great excuse for failure to do the work of making sure your tools don't reproduce the same oppressive patterns that characterize the underlying society in which those tools were built. And they will reproduce those patterns as long as you don't actively do the work of making sure they don't. Defamation and harassment disproportionately affect the most marginalized people, because those are exactly the people that you can bully with few or no consequences. Make it easier to harass people, to spread lies about them, and you are making it easier for people to perpetuate sexism and racism.

There are a number of tools that technical workers can use to help mitigate the tendency of the communities and the tools that they build to reproduce social inequality present in the world. Codes of conduct are one tool for reducing the tendency of subcultures to reproduce inequality that exists in their parent culture. For algorithms, human oversight could do the same -- people could regularly review search engine results in a way that includes verifying factual claims that are likely to have a negative impact on a person's life if the claims aren't true. It's also possible to imagine designing heuristics that address the credibility of a source rather than just its popularity. But all of this requires work, and it's not going to happen unless tech companies have an incentive to do that work.
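
As a rough illustration of the last idea, here is a minimal Python sketch of a ranking heuristic that blends popularity with source credibility. Everything in it is an assumption made up for the example: the `Result` type, the example URLs, the 0-to-1 scores, and the `credibility_weight` parameter. It doesn't describe how any real search engine works, and it says nothing about the hard part, which is deciding how a credibility score would be assigned in the first place.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    popularity: float   # hypothetical normalized popularity score, 0..1
    credibility: float  # hypothetical source-credibility score, 0..1


def rank(results, credibility_weight=0.7):
    """Sort results by a weighted blend of popularity and credibility.

    credibility_weight=0.0 reduces to ranking by popularity alone.
    """
    def score(r):
        return ((1 - credibility_weight) * r.popularity
                + credibility_weight * r.credibility)

    return sorted(results, key=score, reverse=True)


# Made-up data: a widely shared rumor and a less popular, better-sourced page.
results = [
    Result("rumor.example/post", popularity=0.9, credibility=0.1),
    Result("records.example/report", popularity=0.4, credibility=0.9),
]

by_popularity = rank(results, credibility_weight=0.0)
by_blend = rank(results)
```

With these made-up numbers, ranking by popularity alone puts the rumor first, while the blended score puts the better-sourced page first. The weight is a value judgment that someone has to make and answer for, which is the point.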

A service-level agreement (SLA) is a contract between the provider of a service and the service's users that outlines what the users are entitled to expect from the service in exchange for their payment. Because people pay for most Web services with their attention (to ads) rather than with money, we don't usually think about SLAs for information quality. For an SLA to work, we would probably have to shift from an ad-based model to a subscription-based model for more services. We can measure how much money you spend on a service -- we can't measure how much attention you provide to its advertisers. So attention is a shaky basis on which to found a contract. Assuming business models where users pay in a more direct and transparent way for the services they consume, could we have SLAs for factual accuracy? Could we have an SLA for how many death threats or rape threats it's acceptable for a service to transmit?

I want to emphasize one more time that this article isn't about public shaming. The conversation that uses the words "public shaming" is about priorities, rather than truth. Some people want to be able to say what they feel like saying and get upset when others challenge them on it rather than politely ignoring it. When I talk about victims of defamation, that's not who I'm talking about -- I'm talking about people against whom attackers have weaponized online media in order to spread outright lies about them.

People who operate search engines already have search quality metrics. Could one of them be truth -- especially when it comes to queries that impinge on actual humans' reputations? Wikipedia has learned this lesson: its policy on biographies of living persons (BLP) didn't exist from the site's inception, but arose as a result of a series of cases in which people acting in bad faith used Wikipedia to libel people they didn't like. Wikipedia learned that if you let anybody edit an article, there are legal risks; the risks were (and continue to be) especially real for Wikipedia due to how highly many search engines rank it. To some extent, content providers have been able to protect themselves from those risks using CDA 230, but sitting back while people use your site to commit libel is still a bad look... at least if the targets are famous enough for anyone to care about them.

Code is Law

Making the Internet more accountable matters because, in the words of Lawrence Lessig, code is law. Increasingly, software automates decisions that affect our lives. Imagine if you had to obey laws, but weren't allowed to read their text. That's the situation we're in with code.

We recognize that the passenger in a hypothetical self-driving car programmed to run over anything in its path has made a choice: they turned the key to start the machine, even if from then on, they delegated responsibility to an algorithm. We correctly recognize the need for legal liability in this situation: otherwise, you could circumvent laws against murder by writing a program to commit murder instead of doing it yourself. Somehow, when physical objects are involved it's easier to understand that the person who turns the key, who deploys the code, has responsibility. It stops being "just the Internet" when the algorithms you designed and deployed start to determine what someone's potential employers think of them, regardless of truth.

There are no neutral algorithms. An algorithmic blank slate will inevitably reproduce the violence of the social structures in which it is embedded. Software designers have the choice of trying to design counterbalances to structural violence into their code, or of building tools that will amplify structural violence and inequality. There is no neutral choice; all technology is political. People who say they're apolitical just mean their political interests align well with the status quo.

Recommendation engines like YouTube, or any other search engine with relevance metrics and/or a recommendation system, just recognize patterns -- right? They don't create sexism; if they recommend sexist videos to people who aren't explicitly searching for them, that's because sexist videos are popular, right? YouTube isn't to blame for sexism, right?

Well... not exactly. An algorithm that recognizes patterns will recognize oppressive patterns, like the determination that some people have to silence women, discredit them, and pollute their agencies. Not only will it recognize those patterns, it will reproduce those patterns by helping people who want to silence women spread their message, which has a self-reinforcing effect: the more the algorithm recommends the content, the more people will view it, which reinforces the original recommendation. As Sarah Jeong wrote in The Internet of Garbage, "The Internet is presently siloed off into several major public platforms" -- public platforms that are privately owned. The people who own each silo own so many computing resources that competing with them would be infeasible for all but a very few -- thus, the free market will never solve this problem.
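
The self-reinforcing effect described above can be illustrated with a toy simulation. This is a sketch under invented assumptions, not a model of any real recommendation system: the function name, the video names, and the `organic` and `boost` numbers are all hypothetical. Each round, every item picks up a few views on its own, and whichever item currently has the most views gets recommended and picks up many more.

```python
def simulate_feedback(initial_views, rounds, slots=1, organic=10, boost=100):
    """Toy popularity feedback loop.

    Each round, the `slots` most-viewed items are "recommended".
    Every item gains `organic` views; recommended items gain `boost` more.
    """
    views = dict(initial_views)
    for _ in range(rounds):
        # Pick the recommendations from the current view counts.
        recommended = sorted(views, key=views.get, reverse=True)[:slots]
        for item in views:
            views[item] += organic
        for item in recommended:
            views[item] += boost
    return views


# Two videos that start out almost equally popular (made-up numbers).
start = {"harassing_video": 105, "rebuttal_video": 100}
result = simulate_feedback(start, rounds=10)
print(result)  # {'harassing_video': 1205, 'rebuttal_video': 200}
```

A five-view head start turns into a gap of about a thousand views after ten rounds, and nothing about the content explains that gap. It comes entirely from the rule that popularity earns recommendation and recommendation earns popularity.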

Companies like Google say they don't want to "be evil", but intending to "not be evil" is not enough. Google has an enormous amount of power, and little to no accountability -- no one who manages this public resource was elected democratically. There's no process for checking the power they have to neglect and ignore the ways in which their software participates in reproducing inequality. This happened by accident: a public good (the tools that make the Internet a useful source of knowledge) has fallen under private control. This would be a good time for breaking up a monopoly.

Persistent Identities

In the absence of anti-monopoly enforcement, is there anything we can do? I think there is. Anil Dash has written about persistent pseudonyms, a way to make it possible to communicate anonymously online while still standing to lose something of value if you abuse that privilege in order to spread false information. The Web site Metafilter charges a small amount of money to create an account, in order to discourage sockpuppeting (the practice of responding to being banned from a Web site by coming back to create a new account) -- it turns out this approach is very effective, since people who are engaging in harassment for laughs don't seem to value their own laughs very highly in terms of money.

I think advertising-based funding is also behind the reason why more sites don't implement persistent pseudonyms. The advertising-based business model encourages service providers to make it as easy as possible for people to use their service; requiring the creation of an identity would put an obstacle in the way of immediate engagement. Such an obstacle is good from the perspective of nurturing quality content, but bad from the perspective that it limits the number of eyeballs that will be focused on ads. And thus, we see another way in which advertising enables harassment.

Again, this isn't a treatise against anonymity. None of what I'm saying implies you can't have 16 different identities for all the communities you participate in online. I am saying that I want it to be harder for you to use one of those identities for defamation without facing consequences.

A note on diversity

Twitter, Facebook, Google, and other social media and search companies are notoriously homogeneous, at least when it comes to their engineering staff and their executives, along gendered and racial lines. But what's funny is that Twitter, Facebook, and other sites that make money by using user-generated content to attract an audience for advertisements, are happy to use the free labor that a diversity of people do for them when they create content (that is, write tweets or status updates). The leaders of these companies recognize that they couldn't possibly hire a collection of writers who would generate better content than the masses do -- and anyway, even if they could, writers usually want to be paid. So they recognize the value of diversity and are happy to reap its benefits. They're not so enthusiastic to hire a diverse range of people, since that would mean sharing profits with people who aren't like themselves.

And so here's a reason why diversity means something. People who build complex information systems based on approximations and heuristics have failed to incorporate credibility into their designs. Almost uniformly, they design algorithms that will promote whatever content gets the most attention, regardless of its accuracy. Why would they do otherwise? Telling the truth doesn't attract an audience for advertisers. On the other hand, there is a limit to how much harm an online service can do before the people whose attention they're trying to sell -- their users -- get annoyed and start to leave. We're seeing that happen with Twitter already. If Twitter's engineers and product designers had included more people in demographics that are vulnerable to attacks on their credibility (starting with women, non-binary people, and men of color), then they'd have a more sustainable business, even if it would be less profitable in the short term. Excluding people on the basis of race and gender hurts everyone: it results in technical decisions that cause demonstrable harm, as well as alienating people who might otherwise keep using a service and keep providing attention to sell to advertisers.

Internalizing the Externalities

In the same way that companies that pollute the environment profit by externalizing the costs of their actions (they get to enjoy all the profit, but the external world -- the government and taxpayers -- gets saddled with the responsibility of cleaning up the mess), Internet companies get to profit by externalizing the cost of transmitting bad-faith speech. Their profits are higher because no one expects them to spend time incorporating human oversight into pattern recognition. The people who actually generate bad-faith speech get to externalize the costs of their speech as well. It's the victims who pay.

We can't stop people from harassing or abusing others, or from lying. But we can make it harder for them to do it consequence-free. Let's not let the perfect be the enemy of the good. Analogously, codes of conduct don't prevent bad actions -- rather, they give people assurance that justice will be done and harmful actions will have consequences. Creating a link between actions and consequences is what justice is about; it's not about creating dark corners and looking the other way as bullies arrive to beat people up in those corners.

"...the unique force-multiplying effects of the Internet are underestimated. There’s a difference between info buried in small font in a dense book of which only a few thousand copies exist in a relatively small geographic location versus blasting this data out online where anyone with a net connection anywhere in the world can access it."
-- Katherine Cross, "'Things Have Happened In The Past Week': On Doxing, Swatting, And 8chan"

When we protect content providers from liability for the content that they have this force-multiplying effect on, our priorities are misplaced. With power comes responsibility; currently, content providers have enormous power to boost some signals while dampening others, and the fact that these decisions are often automated and always motivated by profit rather than pure ideology doesn't reduce the need to balance that power with accountability.

"The technical architecture of online platforms... should be designed to dampen harassing behavior, while shielding targets from harassing content. It means creating technical friction in orchestrating a sustained campaign on a platform, or engaging in sustained hounding."
-- Sarah Jeong, The Internet of Garbage

That our existing platforms neither dampen nor shield isn't an accident -- dampening harassing behavior would limit the audience for the advertisements that can be attached to the products of that harassing behavior. Indeed, they don't just fail to dampen, they do the opposite: they amplify the signals of harassment. At the point where an algorithm starts to give a pattern a life of its own -- starts to strengthen a signal rather than merely repeating it -- it's time to assign more responsibility to companies that trade in user-generated content than we traditionally have. To build a recommendation system that suggests particular videos are worth watching is different from building a database that lets people upload videos and hand URLs for those videos off to their friends. Recommendation systems, automated or not, create value judgments. And the value judgments they surface have an irrevocable effect on the world. Helping content get more eyeballs is an active process, whether or not it's implemented by algorithms people see as passive.

There is no hope of addressing the problem of harassment as long as it continues to be an externality for the businesses that profit from enabling it. The remedy might be supporting subscription-based services with our money and declining to give our attention to advertising-based services, or expanding legal liability for the signals that a service selectively amplifies, or normalizing the use of persistent pseudonyms. Whichever we choose, people will continue to have their lives limited by Internet defamation campaigns as long as media companies can profit from such campaigns without paying their costs.


Do you like this post? Support me on Patreon and help me write more like it.

Tim Chevalier
