Hate speech on social media

Hardly anybody would march up to a crowd in person and announce, out of the blue, their hatred for all members of a certain ethnic group, race or religion. Online, however, this is an everyday occurrence: the anonymity of a profile affords users the confidence to make bold statements on social media.

One would think that in this day and age, we live in relative harmony at least in our real lives. While people may still have their differences, they have mostly learned to keep it civil in the community and at the workplace. Most countries have laws protecting minority groups from discrimination, and there are numerous organizations that champion freedom, tolerance and diversity. However, with the prevalence of hate speech in the online world, one may question if people really are living in harmony these days or if they have simply taken their quarrels elsewhere.

Social media is an ideal platform for connecting people, allowing them to network, share news and air their personal opinions. Unfortunately, it is also rife with negativity, including hate speech. To share one’s personal views on social media is to open oneself up to a barrage of online attacks, and many users respond by fighting fire with fire. Arguments over hate speech often escalate into “flame wars”, in which most commenters no longer remember what they were originally debating but instead resort to personal attacks and sometimes even threats. If words could kill, the participants would probably be dead many times over. Whether hate speech targets a single person or a group of people, it is undeniably destructive and violent.

What constitutes hate speech? There is no definite set of rules that can reliably identify hate speech for what it is. At its core, hate speech attacks people for certain characteristics, such as their race, skin color, ethnic group, religion, gender or sexual orientation – essentially harassing, intimidating or calling for violence against people for who they are. Some hate speech is general – for example, “I hate all members of a certain race”. Other instances target a single person or generalize about all members of a group that the person belongs to. Hate speech varies in severity, from voicing one’s opinion about a certain group or person to threatening them with violence, including statements about wanting to kill them.

There are also other forms of hate speech, such as the posting of violent images or images with such a connotation. For instance, some anti-Semitic groups post images of the Holocaust, including gas chambers, Nazi symbols or Adolf Hitler. Anti-black groups may post gory images from the Jim Crow era of African American victims. Even if no “speech” is technically used in such posts, the message is quite clear and is commonly regarded as hate speech.

The wealth of information on the Internet may have played its part in inciting hate speech. Many individuals still harbor prejudice towards members of certain demographic groups: racism, misogyny and homophobia very likely still lurk in the hearts of many, even if they are not visible in daily life. The Internet is home to critics, toxicity and “triggers” that can goad those so inclined toward violence. Additionally, word spreads quickly on social media, allowing people to disseminate their ideas, threats and even violent acts rapidly – most of which tend to go viral once the online population fans the flames.

Some sociologists believe that the way social media platforms work does little to curb the spread of hate speech either. Social media platforms make money from advertising, which depends on maximizing users’ browsing time so that more advertisements can be shown to them. Since it is in the platforms’ best interest to keep users on the page for as long as possible, algorithms direct users to pages and media they are likely to want to browse. This makes it easy for a user to chance upon violent or extremist views, which can then lead them to more of the same.

Unfortunately, it often turns out that the more time people spend around these kinds of negative sentiments on social media, the more inclined they may be to air their own honest views. In fact, the openness of the Internet encourages people to post and respond freely, especially on social media pages and comment sections that are not moderated. Hate speech is rarely incited on its own. Usually, media coverage of an ongoing hot topic provokes responses that are critical and condescending, which in turn provoke more responses and culminate in a chain reaction of hate speech. The media may also paint some easy targets in a more negative light, resulting in stronger feelings from the public and thus stronger responses.

In the United States, hate speech as such is not illegal; it is constitutionally protected under the First Amendment as long as it does not contain a threat of violence. Most social media platforms and online forums are privately owned, which allows the owners to restrict the content permitted on their platforms under their own terms of service. However, while forums may be more strictly policed, social media tends to be looser in what it defines as hate speech, since it deals with much higher volumes of content and is essentially each user’s own personal “space”.

Even platforms with the strictest policies on hate speech usually have to moderate it manually due to the many forms hate speech can take on. Systems can be trained to recognize word patterns, but it is more difficult to discern the context or intentions of those words. With the vast number of posts on popular social media platforms, it is usually up to individual users to report offensive posts and optionally block the offending user.

image banner

We Will Write an Essay for You Quickly

Some may say that hate speech is harmless as long as it remains a verbal assault on the Internet. After all, the Internet is not meant for those with thin skin. Since hate speech is technically protected under freedom of speech, why should people not be allowed to air their personal opinions publicly?

In reality, the prevalence of hate speech can easily provoke people to violence. In recent years, a number of hate crimes have come about as a result of hate speech on social media. In June 2015, Dylann Roof shot and killed nine African American worshippers at a church in Charleston, later describing his act in vivid detail and without remorse in a recorded confession. It was believed that Roof had “self-radicalized” through material containing violent white supremacist beliefs on the Internet, leading him to think that an appropriate course of action would be to resort to violence.

The Charleston church shooting is only one of many cases where perpetrators consumed online media that led them to develop extreme beliefs, leading to violent courses of action. While people are certainly entitled to their personal opinions, we should not underestimate the potential of social media in dividing mankind.

Hate Speech on Social Media: Global Comparisons

A memorial outside Al Noor mosque in Christchurch, New Zealand.

  • Hate speech online has been linked to a global increase in violence toward minorities, including mass shootings, lynchings, and ethnic cleansing.
  • Policies used to curb hate speech risk limiting free speech and are inconsistently enforced.
  • Countries such as the United States grant social media companies broad powers in managing their content and enforcing hate speech rules. Others, including Germany, can force companies to remove posts within certain time periods.

Introduction

A mounting number of attacks on immigrants and other minorities has raised new concerns about the connection between inflammatory speech online and violent acts, as well as the role of corporations and the state in policing speech. Analysts say trends in hate crimes around the world echo changes in the political climate, and that social media can magnify discord. At their most extreme, rumors and invective disseminated online have contributed to violence ranging from lynchings to ethnic cleansing.

The response has been uneven, and the task of deciding what to censor, and how, has largely fallen to the handful of corporations that control the platforms on which much of the world now communicates. But these companies are constrained by domestic laws. In liberal democracies, these laws can serve to defuse discrimination and head off violence against minorities. But such laws can also be used to suppress minorities and dissidents.

How widespread is the problem?

Incidents have been reported on nearly every continent. Much of the world now communicates on social media, with nearly a third of the world’s population active on Facebook alone. As more and more people have moved online, experts say, individuals inclined toward racism, misogyny, or homophobia have found niches that can reinforce their views and goad them to violence. Social media platforms also offer violent actors the opportunity to publicize their acts.

A bar chart shows the share of respondents agreeing that “people should be able to make statements that are offensive to minority groups publicly,” with the United States at 67 percent.

Social scientists and others have observed how social media posts, and other online speech, can inspire acts of violence:

  • In Germany, a correlation was found between anti-refugee Facebook posts by the far-right Alternative for Germany party and attacks on refugees. Scholars Karsten Müller and Carlo Schwarz observed that upticks in attacks, such as arson and assault, followed spikes in hate-mongering posts.
  • In the United States, perpetrators of recent white supremacist attacks have circulated among racist communities online, and also embraced social media to publicize their acts. Prosecutors said the Charleston church shooter , who killed nine black clergy and worshippers in June 2015, engaged in a “ self-learning process ” online that led him to believe that the goal of white supremacy required violent action.
  • The 2018 Pittsburgh synagogue shooter was a participant in the social media network Gab , whose lax rules have attracted extremists banned by larger platforms. There, he espoused the conspiracy that Jews sought to bring immigrants into the United States, and render whites a minority, before killing eleven worshippers at a refugee-themed Shabbat service. This “great replacement” trope, which was heard at the white supremacist rally in Charlottesville, Virginia, a year prior and originates with the French far right , expresses demographic anxieties about nonwhite immigration and birth rates.
  • The great replacement trope was in turn espoused by the perpetrator of the 2019 New Zealand mosque shootings, who killed forty-nine Muslims at prayer and sought to broadcast the attack on YouTube.
  • In Myanmar, military leaders and Buddhist nationalists used social media to slur and demonize the Rohingya Muslim minority ahead of and during a campaign of ethnic cleansing . Though Rohingya comprised perhaps 2 percent of the population, ethnonationalists claimed that Rohingya would soon supplant the Buddhist majority. The UN fact-finding mission said, “Facebook has been a useful instrument for those seeking to spread hate, in a context where, for most users, Facebook is the Internet [PDF].”
  • In India, lynch mobs and other types of communal violence, in many cases originating with rumors on WhatsApp groups , have been on the rise since the Hindu-nationalist Bharatiya Janata Party (BJP) came to power in 2014.
  • Sri Lanka has similarly seen vigilantism inspired by rumors spread online, targeting the Tamil Muslim minority. During a spate of violence in March 2018, the government blocked access to Facebook and WhatsApp, as well as the messaging app Viber, for a week, saying that Facebook had not been sufficiently responsive during the emergency.

Does social media catalyze hate crimes?

The same technology that allows social media to galvanize democracy activists can be used by hate groups seeking to organize and recruit. It also allows fringe sites, including peddlers of conspiracies, to reach audiences far broader than their core readership. Online platforms’ business models depend on maximizing reading or viewing times. Since Facebook and similar platforms make their money by enabling advertisers to target audiences with extreme precision, it is in their interests to let people find the communities where they will spend the most time.

Users’ experiences online are mediated by algorithms designed to maximize their engagement, which often inadvertently promote extreme content. Some web watchdog groups say YouTube’s autoplay function, in which the player, at the end of one video, tees up a related one, can be especially pernicious. The algorithm drives people to videos that promote conspiracy theories or are otherwise “ divisive, misleading or false ,” according to a Wall Street Journal investigative report. “YouTube may be one of the most powerful radicalizing instruments of the 21st century,” writes sociologist Zeynep Tufekci .

YouTube said in June 2019 that changes to its recommendation algorithm made in January had halved views of videos deemed “borderline content” for spreading misinformation. At that time, the company also announced that it would remove neo-Nazi and white supremacist videos from its site. Yet the platform faced criticism that its efforts to curb hate speech do not go far enough. For instance, critics note that rather than removing videos that provoked homophobic harassment of a journalist, YouTube instead cut off the offending user from sharing in advertising revenue.  

How do platforms enforce their rules?

Social media platforms rely on a combination of artificial intelligence, user reporting, and staff known as content moderators to enforce their rules regarding appropriate content. Moderators, however, are burdened by the sheer volume of content and the trauma that comes from sifting through disturbing posts , and social media companies don’t evenly devote resources across the many markets they serve.

A ProPublica investigation found that Facebook’s rules are opaque to users and inconsistently applied by its thousands of contractors charged with content moderation. (Facebook says there are fifteen thousand.) In many countries and disputed territories, such as the Palestinian territories, Kashmir, and Crimea, activists and journalists have found themselves censored , as Facebook has sought to maintain access to national markets or to insulate itself from legal liability. “The company’s hate-speech rules tend to favor elites and governments over grassroots activists and racial minorities,” ProPublica found.

Addressing the challenges of navigating varying legal systems and standards around the world—and facing investigations by several governments—Facebook CEO Mark Zuckerberg called for global regulations to establish baseline content, electoral integrity, privacy, and data standards.

Problems also arise when platforms’ artificial intelligence is poorly adapted to local languages and companies have invested little in staff fluent in them. This was particularly acute in Myanmar, where, Reuters reported, Facebook employed just two Burmese speakers as of early 2015. After a wave of anti-Muslim violence began in 2012, experts warned of the fertile environment ultranationalist Buddhist monks found on Facebook for disseminating hate speech to an audience newly connected to the internet after decades under a closed autocratic system.

Facebook admitted it had done too little after seven hundred thousand Rohingya were driven to Bangladesh and a UN human rights panel singled out the company in a report saying Myanmar’s security forces should be investigated for genocidal intent. In August 2018, it banned military officials from the platform and pledged to increase the number of moderators fluent in the local language.

How do countries regulate hate speech online?

In many ways, the debates confronting courts, legislatures, and publics about how to reconcile the competing values of free expression and nondiscrimination have been around for a century or longer. Democracies have varied in their philosophical approaches to these questions, as rapidly changing communications technologies have raised technical challenges of monitoring and responding to incitement and dangerous disinformation.

United States. Social media platforms have broad latitude [PDF], each establishing its own standards for content and methods of enforcement. Their broad discretion stems from the Communications Decency Act . The 1996 law exempts tech platforms from liability for actionable speech by their users. Magazines and television networks, for example, can be sued for publishing defamatory information they know to be false; social media platforms cannot be found similarly liable for content they host.

A list of data points on Americans' level of concern over online hate speech, including that 59% believe online hate and harassment make hate crimes more common.

Recent congressional hearings have highlighted the chasm between Democrats and Republicans on the issue. House Judiciary Committee Chairman Jerry Nadler convened a hearing in the aftermath of the New Zealand attack, saying the internet has aided white nationalism’s international proliferation. “The President’s rhetoric fans the flames with language that—whether intentional or not—may motivate and embolden white supremacist movements,” he said, a charge Republicans on the panel disputed. The Senate Judiciary Committee, led by Ted Cruz, held a nearly simultaneous hearing in which he alleged that major social media companies’ rules disproportionately censor conservative speech , threatening the platforms with federal regulation. Democrats on that panel said Republicans seek to weaken policies  dealing with hate speech and disinformation that instead ought to be strengthened.

European Union. The bloc’s twenty-eight members all legislate the issue of hate speech on social media differently, but they adhere to some common principles. Unlike in the United States, it is not only speech that directly incites violence that comes under scrutiny; so too does speech that incites hatred or that denies or minimizes genocide and crimes against humanity. Backlash against the millions of predominantly Muslim migrants and refugees who have arrived in Europe in recent years has made this a particularly salient issue, as has an uptick in anti-Semitic incidents in countries including France, Germany, and the United Kingdom.

In a bid to preempt bloc-wide legislation, major tech companies agreed to a code of conduct with the European Union in which they pledged to review posts flagged by users and take down those that violate EU standards within twenty-four hours. In a February 2019 review, the European Commission found that social media platforms were meeting this requirement in three-quarters of cases .

The Nazi legacy has made Germany especially sensitive to hate speech. A 2018 law requires large social media platforms to take down posts that are “manifestly illegal” under criteria set out in German law within twenty-four hours. Human Rights Watch raised concerns that the threat of hefty fines would encourage the social media platforms to be “overzealous censors.”

New regulations under consideration by the bloc’s executive arm would extend a model similar to Germany’s across the EU, with the intent of “preventing the dissemination of terrorist content online .” Civil libertarians have warned against the measure for its “ vague and broad ” definitions of prohibited content, as well as for making private corporations, rather than public authorities, the arbiters of censorship.

India. Under new social media rules, the government can order platforms to take down posts within twenty-four hours based on a wide range of offenses, as well as to obtain the identity of the user. As social media platforms have made efforts to stanch the sort of speech that has led to vigilante violence, lawmakers from the ruling BJP have accused them of censoring content in a politically discriminatory manner, disproportionately suspending right-wing accounts, and thus undermining Indian democracy . Critics of the BJP accuse it of deflecting blame from party elites to the platforms hosting them. As of April 2018, the New Delhi–based Association for Democratic Reforms had identified fifty-eight lawmakers facing hate speech cases, including twenty-seven from the ruling BJP. The opposition has expressed unease with potential government intrusions into privacy.

Japan. Hate speech has become a subject of legislation and jurisprudence in Japan in the past decade [PDF], as anti-racism activists have challenged ultranationalist agitation against ethnic Koreans. This attention to the issue attracted a rebuke from the UN Committee on the Elimination of Racial Discrimination in 2014 and inspired a national ban on hate speech in 2016, with the government adopting a model similar to Europe’s. Rather than specify criminal penalties, however, it delegates to municipal governments the responsibility “to eliminate unjust discriminatory words and deeds against People from Outside Japan.” A handful of recent cases concerning ethnic Koreans could pose a test: in one, the Osaka government ordered a website containing videos deemed hateful taken down , and in Kanagawa and Okinawa Prefectures courts have fined individuals convicted of defaming ethnic Koreans in anonymous online posts.

What are the prospects for international prosecution?

Cases of genocide and crimes against humanity could be the next frontier of social media jurisprudence, drawing on precedents set in Nuremberg and Rwanda. The Nuremberg trials in post-Nazi Germany convicted the publisher of the newspaper Der Sturmer ; the 1948 Genocide Convention subsequently included “ direct and public incitement to commit genocide ” as a crime. During the UN International Criminal Tribunal for Rwanda, two media executives were convicted on those grounds. As prosecutors look ahead to potential genocide and war crimes tribunals for cases such as Myanmar, social media users with mass followings could be found similarly criminally liable.

Recommended Resources

Andrew Sellars sorts through attempts to define hate speech .

Columbia University compiles relevant case law from around the world.

The U.S. Holocaust Memorial Museum lays out the legal history of incitement to genocide.

Kate Klonick describes how private platforms have come to govern public speech .

Timothy McLaughlin chronicles Facebook’s role in atrocities against Rohingya in Myanmar.

Adrian Chen reports on the psychological toll of content moderation on contract workers.

Tarleton Gillespie discusses the politics of content moderation .

4 - Online Hate Speech

Published online by Cambridge University Press:  24 August 2020

Introduction

Once relegated to the dark corners of the Internet, online hate speech has become increasingly visible on mainstream social media platforms. From targeted anti-Semitic attacks on Jewish journalists to reports of social media’s role in mobilizing ethnic violence in Myanmar and Sri Lanka, the offline consequences of online hate speech appear increasingly dire. Fearing that this harmful rhetoric is inciting violence and driving extremism, governments worldwide are passing regulation and pressuring social media companies to implement policies to stop the spread of online hate speech ( Reference Gagliardone, Pohjonen and Beyene Gagliardone et al. 2016 ).

However, these calls for action have rarely been motivated by comprehensive empirical evidence. Moreover, despite increased attention to online hate speech in the scientific literature, surprisingly little is known about the prevalence, causes, or consequences of different forms of harmful language across diverse platforms. Furthermore, researchers have only recently begun to examine the efficacy of approaches to countering online hate, and our understanding of the collateral costs of these interventions is especially limited.

This chapter examines the state of the literature – including scientific research, legal scholarship, and policy reports – on online hate speech. In particular, it explores ongoing debates and limitations in current approaches to defining and detecting online hate speech; provides an overview of what social media data and surveys can tell us about the producers, targets, and overall prevalence of harmful language; reviews empirical evidence of the offline consequences of online hate speech; and offers quantitative insights into what interventions might be most effective in combating harmful rhetoric online.

Defining Online Hate Speech

There is no single agreed-upon definition of hate speech – online or offline – and the topic has been hotly debated by academics, legal experts, and policymakers alike. Most commonly, hate speech is understood to be bias-motivated, hostile, and malicious language targeted at a person or group because of their actual or perceived innate characteristics (Cohen-Almagor 2011; Faris et al. 2016). However, as Sellars (2016) argues, “for all of the extensive literature about the causes, harms, and responses to hate speech, few scholars have endeavored to systematically define the term.”

A wide variety of content might or might not fit a definition of hate speech, depending on the context ( Reference Parekh, Herz and Molnar Parekh et al. 2012 ; Reference Sellars Sellars 2016 ). For example, while slurs and insults are easily identifiable, language containing epithets may not necessarily be considered hate speech by the speaker or target recipient ( Reference Delgado Delgado 1982 ). Conversely, more subtle language attacking an out-group, which can be harder for casual observers to identify, may have particularly damaging effects on individuals and group relations ( Reference Parekh, Herz and Molnar Parekh et al. 2012 ). This is especially true in the online sphere, where speech is rapidly evolving and can be highly specialized ( Reference Gagliardone, Gal, Alves and Martinez Gagliardone et al. 2015 ). The use of code words as stand-ins for racial slurs is also common in online communities, further complicating the definition of hate speech ( Reference Duarte, Llanso and Loup Duarte et al. 2018 ). For example, among members of the alt-right, journalists have documented the use of the term “googles” to refer to the n-word; “skypes” as an anti-Semitic slur; “yahoos” as a derogatory term for Hispanics; and “skittles” as an anti-Muslim term ( Reference Sonnad Sonnad 2016 ). Alt-right communities have also used steganography, such as triple brackets, to identify and harass Jews online ( Reference Fleishman and Smith Fleishman and Smith 2016 ). In this way, when defining hate speech – and online hate speech in particular – the well-known “I know it when I see it” classification famously applied to obscene content clearly falls short.

As a result, existing definitions of hate speech can be extremely broad or fairly narrow. At one end of the spectrum are definitions that capture a wide variety of speech that is directed against a specified or easily identifiable individual or group based on arbitrary or normatively irrelevant features ( Reference Parekh, Herz and Molnar Parekh et al. 2012 ). At the other end are definitions that require intended harm. The narrowest definitions imply that hate speech must be “dangerous speech” – language that is directly linked to the incitement of mass violence or physical harm against an out-group ( Reference Benesch Benesch 2013 ). This tension reflects the difficulty of developing a definition that adequately addresses the range of phenomena that might be considered hate speech, without losing valuable distinctions. Online hate speech can involve disparate instigators, targets, motives, and tactics. Sometimes perpetrators know those they attack, whereas others may galvanize anonymous online followers to target particular individuals. Speech that incites violence is distinct from speech that is “merely” offensive, and the use of harmful language by a single attacker is quite different from coordinated hate campaigns carried out by a digital mob ( Reference Sellars Sellars 2016 ). Recent work seeks to develop more comprehensive definitions and coding schemes for identifying hate speech that provide context and account for differences in severity and intent ( Reference Gagliardone, Pohjonen and Beyene Gagliardone et al. 2016 ; Reference Waseem, Hovy, Sahlgren and Knutsson Waseem and Hovy 2016 ; Reference Kennedy, Kogon and Coombs Kennedy et al. 2018 ; Reference Olteanu, Castillo, Boy and Varshney Olteanu et al. 2018 ). Yet despite these advances, there is still no consensus in the scientific literature on how to define online hate speech .

Legal definitions of hate speech are similarly murky. Governments are increasingly defining hate speech in their criminal codes in an attempt to directly regulate harmful rhetoric both on- and offline (Haraszti 2012). As with academic definitions, these range from the relatively broad, such as Canada’s characterization of hate speech as language that “willfully promotes hatred against any identifiable group,” to narrower definitions, like the European Union’s framework, which defines hate speech as: “Public incitement to violence or hatred directed against a group of persons or a member of such group defined on the basis of race, [color], descent, religion or belief, or national or ethnic origin” and “publicly condoning, denying or grossly trivializing crimes of genocide, crimes against humanity, and war crimes [as defined in EU law], when the conduct is carried out in a manner likely to incite violence or hatred against such group or a member of such group” (Sellars 2016). In the United Kingdom, it is a criminal offense to incite racial or religious hatred, and variations on this legislation – while unconstitutional in the United States – exist in the majority of developed democracies, including Australia, Denmark, France, Germany, India, South Africa, Sweden, and New Zealand (Howard 2019), as well as in authoritarian contexts, particularly in the Arab world, where laws banning online hate speech are often lumped together with laws countering extremism (Chetty and Alathur 2018). Yet despite the existence of laws explicitly banning hate speech, how these laws should be enforced in practice, particularly in the digital age, is a subject of ongoing debate.

More recently, online platforms themselves have developed definitions of hate speech for the purpose of moderating user-generated content. For example, YouTube’s Community Guidelines “hateful content” section states “we don’t support content that promotes or condones violence against individuals or groups based on race or ethnic origin, religion, disability, gender, age, nationality, veteran status, or sexual orientation/gender identity, or whose primary purpose is inciting hatred on the basis of these core characteristics” ( YouTube 2018 ). Similarly, Twitter’s terms of service state that the company prohibits “hateful conduct” including “promot[ing] violence against or directly attack[ing] or threaten[ing] other people on the basis of race, ethnicity, national origin, sexual orientation, gender, gender identity, religious affiliation, age, disability or disease.” The company also emphasizes that it does not allow accounts whose “primary purpose is inciting harm towards others on the basis of these categories” ( Twitter 2018 ). Facebook’s definition of hate speech does not contain the incitement to violence language employed by Twitter and YouTube, instead identifying hate speech as “content that directly attacks people based on their race; ethnicity; national origin; religious affiliation; sexual orientation; sex, gender, or gender identity; or serious disabilities or diseases” ( Facebook 2018 ).

Together, this absence of clear and consistent definitions of hate speech in academic research, legal scholarship, and among actors attempting to govern online spaces has meant that despite extensive research, and well-documented policy interventions, our knowledge of the causes, consequences, and effective means of combating online hate speech remains somewhat clouded by definitional ambiguity .

Detecting Online Hate Speech

Just as there is no clear consensus on the definition of hate speech, there is no consensus with regard to the most effective way to detect it across diverse platforms. The majority of automated approaches to identifying hate speech begin with a binary classification task in which researchers are concerned with coding a document as “hate speech or not,” though multiclass approaches have also been used ( Reference Davidson, Warmsley, Macy and Weber Davidson et al. 2017 ).

Automated hate speech detection tends to rely on natural language processing or text mining strategies ( Reference Fortuna and Nunes Fortuna and Nunes 2018 ) . The simplest of these approaches are dictionary-based methods, which involve developing a list of words that are searched and counted in a text. Dictionary-based approaches generally use content words – including insults and slurs – to identify hate speech ( Reference Dinakar, Reichart and Lieberman Dinakar et al. 2011 ; Reference Dadvar, de Jong, Ordelman and Trieschnigg Dadvar et al. 2012 ; Reference Liu and Forss Liu and Forss 2015 ; Reference Isbister, Sahlgren, Kaati, Obaidi and Akrami Isbister et al. 2018 ). These methods can also involve normalizing or taking the total number of words in each text into consideration ( Reference Dadvar, de Jong, Ordelman and Trieschnigg Dadvar et al. 2012 ). Recognizing that online hate speech may obscure offensive words using accidental or intentional misspellings, some researchers have used distance metrics, such as the minimum number of edits necessary to transform one term into another, to augment their dictionary-based methods ( Reference Warner, Hirschberg, Owsley Sood, Nagarajan and Gamon Warner and Hirschberg 2012 ). Furthermore, given that code words may be used to avoid detection of hateful terms, other researchers have included known anti–out-group code words in their dictionaries ( Reference Magu, Joshi and Luo Magu et al. 2017 ).
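
To make the dictionary-plus-edit-distance idea concrete, here is a minimal Python sketch. It is not code from any of the studies cited above: the lexicon is a hypothetical placeholder (real projects draw on curated resources such as Hatebase), and the one-edit threshold is an illustrative assumption.

```python
# Minimal sketch of a dictionary-based detector augmented with an edit-distance
# check. LEXICON is a hypothetical placeholder, not a real slur list.
LEXICON = {"badword", "slurword", "codeword"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def flag_post(text: str, max_edits: int = 1) -> bool:
    """Flag a post if any token matches a lexicon term exactly or within
    max_edits edits, catching accidental or deliberate misspellings."""
    tokens = text.lower().split()
    return any(edit_distance(tok, term) <= max_edits
               for tok in tokens for term in LEXICON)
```

Even this toy version shows the limitation noted throughout this section: matching on surface forms alone says nothing about the context or intent behind the words.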

Beyond pure dictionary-based methods, most state-of-the-art hate speech detection techniques involve supervised text classification tasks. These approaches, such as using Naive Bayes classifiers, linear support vector machines (SVM), decision trees, or random forest models, often rely on “bag-of-words” and “n-gram” techniques. In the bag-of-words method , a corpus is created based on words that appear in a training dataset, instead of a predefined dictionary. The frequencies of words appearing in text, which has been manually annotated as “hate speech or not,” are then used as features to train a classifier ( Reference Greevy and Smeaton Greevy and Smeaton 2004 ; Reference Kwok and Wang Kwok and Wang 2013 ; Reference Burnap and Williams Burnap and Williams 2016 ). To avoid misclassification, if words are used in different contexts or spelled incorrectly, some researchers use n-grams , a similar approach to bag-of-words, which combines sequential words into bigrams, trigrams, or lists of length “n” ( Reference Burnap and Williams Burnap and Williams 2016 ; Reference Waseem, Hovy, Sahlgren and Knutsson Waseem and Hovy 2016 ; Reference Badjatiya, Gupta, Gupta and Varma Badjatiya et al. 2017 ; Reference Davidson, Warmsley, Macy and Weber Davidson et al. 2017 ). More recent work has leveraged these approaches to improve the accuracy of dictionary-based methods – removing false positives by identifying which tweets containing slurs should indeed be classified as hate speech ( Reference Siegel, Nitikin and Barberá Siegel et al. 2020 ). Rule-based approaches and theme-based grammatical patterns, which incorporate sentence structure, have also been used ( Reference Fortuna and Nunes Fortuna and Nunes 2018 ).
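
As a concrete illustration of the bag-of-words and n-gram pipeline described above, the sketch below trains a linear support vector machine on TF-IDF-weighted unigrams and bigrams using scikit-learn. The tiny annotated “corpus” is an invented placeholder for a manually labeled dataset, and the settings are library defaults rather than values from any cited study.

```python
# Hedged sketch of a supervised n-gram hate speech classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Placeholder corpus: in practice, thousands of manually annotated posts
# (1 = hate speech, 0 = not hate speech).
texts = [
    "i hate <group> and they should all leave",
    "<group> are ruining this country",
    "lovely weather in the park today",
    "what a great match last night",
]
labels = [1, 1, 0, 0]

model = Pipeline([
    ("ngrams", TfidfVectorizer(ngram_range=(1, 2), lowercase=True)),  # unigrams + bigrams
    ("clf", LinearSVC()),
])
model.fit(texts, labels)

# Classify a new, unlabeled post.
print(model.predict(["they should all leave this country"]))
```

Such a classifier is only as good as its annotated training data, which is why the problems of low intercoder reliability and rapidly evolving slang noted in this chapter recur in practice.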

Other researchers have identified hate speech using topic modeling, aiming to identify posts belonging to a defined topic such as race or religion ( Reference Agarwal and Sureka Agarwal and Sureka 2017 ). Still others have incorporated sentiment into their analysis, with the assumption that hate speech is likely to be negative in tone ( Reference Liu and Forss Liu and Forss 2014 ; Reference Gitari, Zuping, Damien and Long Gitari et al. 2015 ; Reference Davidson, Warmsley, Macy and Weber Davidson et al. 2017 ; Reference Del Vigna, Cimino, Dell’Orletta, Petrocchi, Tesconi, Armando, Baldoni and Focardi Del Vigna et al. 2017 ). Word embedding or vector representations of text techniques including doc2vec, paragraph2vec, and FastText have also been used ( Reference Djuric, Zhou, Morris, Grbovic, Radosavljevic, Bhamidipati, Joinson, McKenna, Postmes and Reips Djuric et al. 2015 ; Reference Schmidt and Wiegand Schmidt and Wiegand 2017 ; Reference Siegel and Badaan Siegel et al. 2020 ), and deep learning techniques employing neural networks have become more common for both text classification and sentiment analysis related to detecting hate speech ( Reference Yuan, Xintao and Xiang Yuan et al. 2016 ; Reference Zhang, Robinson, Tepper, Gangemi, Navigli and Vidal Zhang et al. 2018 , Reference Al-Makhadmeh and Tolba Al-Makhadmeh and Tolba 2020 ).
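
The sketch below illustrates one embedding-based variant: documents are mapped to dense vectors with gensim’s Doc2Vec (one implementation of the paragraph2vec idea mentioned above), and a simple classifier is trained on those vectors. The corpus, labels, and hyperparameters are placeholders rather than settings from the cited papers; the same pattern applies to FastText or neural encoders.

```python
# Hedged sketch: doc2vec-style embeddings feeding a downstream classifier.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.linear_model import LogisticRegression

posts = [
    "i hate <group> and they should all leave",   # placeholder annotated posts
    "<group> are ruining this country",
    "lovely weather in the park today",
    "what a great match last night",
]
labels = [1, 1, 0, 0]

tagged = [TaggedDocument(words=p.split(), tags=[i]) for i, p in enumerate(posts)]
embedder = Doc2Vec(tagged, vector_size=50, min_count=1, epochs=40)  # toy settings

X = [embedder.infer_vector(p.split()) for p in posts]
clf = LogisticRegression().fit(X, labels)

print(clf.predict([embedder.infer_vector("they should leave this country".split())]))
```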

Recognizing that these techniques may not be well-suited to identifying subtle or indirect forms of online hate, researchers have also employed more theoretically motivated approaches. For example, Burnap and Williams (2016) and ElSherief, Kulkarni et al. (2018) incorporate the concept of othering or “us vs. them” language into their measure of hate speech. They find that hate speech often uses third-person pronouns, including expressions like “send them all home.” Other studies have incorporated declarations of in-group superiority – in addition to attacks directed at out-groups – into their measures (Warner and Hirschberg 2012). Another approach involves accounting for common anti–out-group stereotypes. For example, anti-Hispanic speech might make reference to border crossing, or anti-Semitic language might refer to banking, money, or the media (Alorainy et al. 2018). Additional work has distinguished between hate speech directed at a group (generalized hate speech) and hate speech directed at individuals (directed hate speech) to capture important nuances in the targets of online hate speech (ElSherief, Kulkarni et al. 2018). Beyond relying on textual features, researchers have also incorporated user characteristics, including network features and friend/follower counts, to improve the accuracy of hate speech detection (Unsvåg and Gambäck 2018).
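
A rough sketch of how such a theoretically motivated signal might be operationalized is shown below: a simple “othering” score based on third-person-plural framing, which could be appended as an extra column to the n-gram feature matrix used by the classifier sketched earlier. The marker list and scoring rule are illustrative assumptions, not the features used in the cited studies.

```python
# Illustrative "othering" feature in the spirit of the us-vs.-them heuristic
# described above. The marker set is a hypothetical placeholder.
OTHERING_MARKERS = {"they", "them", "their", "theirs", "those"}


def othering_score(text: str) -> float:
    """Share of tokens that are third-person-plural markers ("us vs. them" framing)."""
    tokens = [tok.strip(".,!?\"'") for tok in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(tok in OTHERING_MARKERS for tok in tokens) / len(tokens)


print(othering_score("send them all home, they do not belong here"))  # roughly 0.22
```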

Another more recent set of approaches leverages large pre-classified datasets from online platforms to detect online hate speech. These include the bag-of-communities technique (Chandrasekharan, Samory et al. 2017), which computes the similarity of a post to the language used in nine other known hateful communities from 4chan, Reddit, Voat, and MetaFilter. Similar techniques have been employed by Reference Saleem, Dillon, Benesch and Ruths Saleem et al. (2017) and Reference Siegel, Nitikin and Barberá Siegel et al. (2020) , using data from well-known hateful subreddits to classify hate speech on Twitter. An advantage of these methods is that they are not hindered by low intercoder reliability that can be found in training datasets or by the fact that rapidly evolving speech patterns online can make it difficult to use the same training data over time ( Reference Waseem, Bamman, Seza Doğruöz and Eisenstein Waseem 2016 ) .
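
The sketch below gives a simplified flavor of this approach: score a new post by how much more similar its vocabulary is to text gathered from known hateful communities than to a neutral baseline corpus. The corpora here are placeholders, and the actual bag-of-communities technique draws on nine reference communities and more careful modeling than this raw similarity difference.

```python
# Simplified, hedged sketch of a bag-of-communities-style similarity score.
# The two corpora below are placeholders for scraped reference data.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

hateful_corpus = ["placeholder posts collected from known hateful communities"]
baseline_corpus = ["placeholder posts collected from a neutral reference community"]

vectorizer = TfidfVectorizer()
vectorizer.fit(hateful_corpus + baseline_corpus)

# Average term vector ("centroid") for each reference corpus.
hate_centroid = np.asarray(vectorizer.transform(hateful_corpus).mean(axis=0))
base_centroid = np.asarray(vectorizer.transform(baseline_corpus).mean(axis=0))


def community_score(post: str) -> float:
    """Positive values: the post's vocabulary looks more like the hateful corpora."""
    v = vectorizer.transform([post])
    return (cosine_similarity(v, hate_centroid)[0, 0]
            - cosine_similarity(v, base_centroid)[0, 0])


print(community_score("a post about hateful communities"))
```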

Despite these major advances in the automatic detection of online hate speech, existing methods largely have not been tested across multiple platforms or diverse types of hate speech. Owing to ease of data collection, most existing studies have relied on Twitter data. While other works have incorporated data from Reddit, YouTube, Facebook, Whisper, Tumblr, Myspace, Gab, the comment sections of websites, and blogs, these are relatively rare ( Reference Fortuna and Nunes Fortuna and Nunes 2018 ; Reference Mathew, Saha and Tharad Mathew, Dutt et al. 2019 ). Additionally, the vast majority of studies examine English-language content, though some researchers have developed methods to detect hate speech in other languages . These include empirical examinations of hate speech in Amharic ( Reference Mossie and Wang Mossie and Wang 2018 ), Arabic ( Reference Siegel Siegel 2015 ; Reference De Smedt, De Pauw and Van Ostaeyen De Smedt et al. 2018 ; Reference Siegel, Tucker, Nagler and Bonneau Siegel et al. 2018 , Reference Albadi, Kurdi and Mishra Albadi et al. 2019 , Reference Chowdhury, Didolkar, Sawhney, Shah, Alva-Manchego, Choi and Khashabi Chowdhury et al. 2019 ), Dutch ( Reference Van Hee, Lefever and Verhoeven Van Hee et al. 2015 ), German ( Reference Ross, Rist, Carbonell, Cabrera, Kurowsky and Wojatzki Ross et al. 2017 ), Hindi ( Reference Santosh and Aravind Santosh and Aravind 2019 ), Indonesian ( Reference Aulia and Budi Aulia and Budi 2019 ), Italian ( Reference Lingiardi, Carone, Semeraro, Musto, D’Amico and Brena Lingiardi et al. 2019 ), Korean ( Reference Kang, Kim, Park and Cha Kang et al. 2018 ), Polish ( Reference Czapla, Gugger, Howard and Kardas Czapla et al. 2019 ), Romanian ( Reference Meza Meza 2016 ), and Spanish ( Reference Basile, Bosco, Fersini, May, Shutova, Herbelot, Zhu, Apidianaki and Mohammad Basile et al. 2019 ). Crowd-sourced multilingual dictionaries of online hate speech including Hatebase, the Racial Slur Database, and HateTrack have also been developed, demonstrating promising avenues for future work ( Reference ElSherief, Kulkarni, Nguyen, Wang and Belding ElSherief, Kulkarni et al. 2018 , Reference Siapera, Moreo and Zhou Siapera et al. 2018 ). Yet approaches to automated hate speech detection that are designed to scale across multiple languages are quite difficult to develop, and more work is needed in this area.

Additionally, the majority of studies of online hate speech seek to detect all types of hate speech at once, or “general hate speech” ( Reference Fortuna and Nunes Fortuna and Nunes 2018 ). However, other works have examined specific types of harmful language, including jihadist hate speech ( Reference De Smedt, De Pauw and Van Ostaeyen De Smedt et al. 2018 ), sectarian hate speech ( Reference Siegel Siegel 2015 ; Reference Siegel, Tucker, Nagler and Bonneau Siegel et al. 2018 ), anti-Muslim hate speech ( Reference Olteanu, Castillo, Boy and Varshney Olteanu et al. 2018 ), anti-black hate speech ( Reference Kwok and Wang Kwok and Wang 2013 ), misogynistic hate speech ( Reference Citron Citron 2011 ), and anti-immigrant hate speech ( Reference Ross, Rist, Carbonell, Cabrera, Kurowsky and Wojatzki Ross et al. 2017 ). Recent work has also explored differences in types of hate speech, comparing hate speech targeting diverse out-groups and distinguishing between more and less severe types of hate speech ( Reference Beauchamp, Panaitiu and Piston Beauchamp et al. 2018 ; Reference Saha, Chandrasekharan and De Choudhury Saha et al. 2019 ; Reference Siegel, Nitikin and Barberá Siegel et al. 2019 ).

Producers of Online Hate Speech

While extensive research has explored organized hate groups’ use of online hate speech, less is known about the actors in informal communities dedicated to producing harmful content, or the accounts that produce hate speech on mainstream platforms. Moreover, no empirical work has systematically examined how these actors interact within and across platforms .

Organized hate groups established an online presence shortly after the invention of the Internet ( Reference Bowman-Grieve Bowman-Grieve 2009 ) and have proliferated over time. More than a decade of primarily qualitative research has demonstrated that organized hate groups use the Internet to disseminate hate speech on their official websites ( Reference Adams and Roscigno Adams and Roscigno 2005 ; Reference Chau and Xu Chau and Xu 2007 ; Reference Douglas Douglas 2007 ; Reference Flores-Yeffal, Vidales and Plemons Flores-Yeffal et al. 2011 ; Reference Castle Castle 2012 ; Reference Parenti, Nixon, Rawal and Mercea Parenti 2013 ). This includes the use of interactive forums ( Reference Holtz and Wagner Holtz and Wagner 2009 ) such as chat boards and video games ( Reference Selepak Selepak 2010 ). Hate groups use these channels both to broaden their reach and to target specific audiences. For example, the explicitly racist video games that originate on far-right extremist websites are designed to appeal to ardent supporters and potential members alike, especially youth audiences ( Reference Selepak Selepak 2010 ). Along these lines, hate groups have used the Internet to recruit new members and reinforce group identity ( Reference Chau and Xu Chau and Xu 2007 ; Reference Parenti, Nixon, Rawal and Mercea Parenti 2013 ; Reference Weaver Weaver 2013 ). Online platforms are also especially well-suited to tailoring messages to specific groups or individuals ( Reference Castle Castle 2012 ). By providing efficient ways to reach new audiences and disseminate hateful language, the Internet enables hate groups to be well represented in the digital realm, fostering a sense of community among their members, and attracting the attention of journalists and everyday citizens alike ( Reference Bowman-Grieve Bowman-Grieve 2009 ; Reference McNamee, Peterson and Peña McNamee et al. 2010 ) .

In addition to the official websites of organized hate groups, the number of sites dedicated to producing hateful content operated by informal groups and individuals has also increased over time (Potok 2015). These include explicitly racist, misogynistic, or otherwise discriminatory pages, channels, or communities on mainstream social networking platforms like Facebook, Twitter, and YouTube, as well as forums on Reddit, 4chan, and 8chan, listservs, internet chat communities, discussion forums, and blogs designed to disseminate hateful rhetoric (Douglas 2007; Marwick 2017). These range from fake Facebook profiles designed to incite violence against minorities (Farkas and Neumayer 2017) to infamous (now banned) Reddit forums like /CoonTown and /fatpeoplehate (Chandrasekharan, Pavalanathan et al. 2017). Well-known white nationalists and hateful accounts have also operated openly on mainstream social media platforms. For example, Richard Spencer, who organized the “Unite the Right” alt-right Charlottesville rally, has more than 75,000 followers and was verified by Twitter up until November 2017, when he was stripped of his verified status. Twitter accounts such as @SageGang and @WhiteGenocide frequently tweet violent racist and anti-Semitic language (Daniels 2017).

However, such concentrations of hateful content are sometimes banned and removed from particular platforms. As a result, these communities often disappear and resurface in new forms. For example, in 2011, 4chan’s founder deleted the news board (/n/) due to racist comments and created /pol/ as a replacement forum for political discussion.

4chan’s /pol/ board quickly became a home for particularly hateful speech – even by 4chan standards (Hine et al. 2016). Similarly, banned subreddits like Coontown have moved to Voat, a platform with no regulations with regard to hate speech (Chandrasekharan, Pavalanathan et al. 2017). While survey data and ethnographic work suggest that users of 4chan and Reddit are overwhelmingly young, white, and male (Daniels 2017; Costello and Hawdon 2018), because of the anonymous nature of these sites we do not know very much about the users who produce the most hate speech. In particular, we do not know the degree to which their rhetoric represents their actual beliefs or is simply trolling or attention-seeking behavior, which is quite common in these communities (Phillips 2015).

Outside of these official and unofficial pages and forums dedicated to hateful content, hate speech is also prevalent in general online discussions across a variety of popular platforms, including Facebook, YouTube, Myspace, Tumblr, Whisper, and Yik Yak ( Reference Black, Mezzina and Thompson Black et al. 2016 ; Reference Fortuna and Nunes Fortuna and Nunes 2018 ). While little is known about the specific individuals that produce hate speech on these mainstream platforms, recent work has begun to measure and characterize their behavior. Examining the trajectory of producers of hate speech over time, Reference Beauchamp, Panaitiu and Piston Beauchamp et al. (2018) find that producers of misogynistic and racist hate speech on Twitter tend to start out expressing “softer,” more indirect hateful language and later graduate to producing more virulent hate. The authors posit that this may be due to gradually decreasing levels of social stigma as these users find themselves in increasingly extreme social networks. Reference ElSherief, Nilizadeh, Nguyen, Vigna and Belding ElSherief, Nilizadeh et al. (2018 ) find on Twitter that accounts that instigate hate speech tend to be new, very active, and express lower emotional awareness and higher anger and immoderation in the content of their tweets, compared to other Twitter users who did not produce such content. Similarly, using a manually annotated dataset of about 5,000 “hateful users,” Reference Ribeiro, Calais, Santos, Almeida and Meira Ribeiro et al. (2018 ) find that hateful users tweet more frequently, follow more people each day, and their accounts are more short-lived and recent. They also find that, although hateful users tend to have fewer followers, they are very densely connected in retweet networks. Hateful users are seventy-one times more likely to retweet other hateful users and suspended users are eleven times more likely to retweet other suspended users, compared to non-hateful users. Comparing users that produce hate speech to those that do not on Gab, Reference Mathew, Saha and Tharad Mathew, Dutt et al. (2019 ) also find that hateful users are densely connected to one another. As a result, they argue, content generated by hateful users tends to spread faster, farther, and reach a much wider audience as compared to the content generated by users that do not produce hate speech.

Such behavior may contribute to the overall visibility of hate speech on mainstream online platforms. For example, on Twitter, although tweets containing hate speech have lower numbers of replies and likes than non-hateful tweets, they contain a similar number of retweets ( Reference Klubicka and Fernandez Klubicka and Fernandez 2018 ). The highly networked structure of hateful Twitter users also dovetails with qualitative evidence suggesting that people are mobilized on explicitly hateful subreddits or communities like the /pol/ board on 4chan to engage in coordinated racist or sexist attacks on Twitter ( Reference Daniels Daniels 2017 ).

Studying the network structure of users who produce online hate speech, Reference Magdy, Darwish, Abokhodair, Rahimi and Baldwin Magdy et al. (2016 ) find that they can predict the likelihood that Twitter users tweet anti-Muslim messages after the 2015 Paris attacks with high levels of precision and accuracy based on their Twitter networks, even if they have never mentioned Muslims or Islam in their previous tweets. Twitter users who follow conservative media outlets, Republican primary candidates, evangelical Christian preachers, and accounts discussing foreign policy issues were significantly more likely to tweet anti-Muslim content following the Paris attacks than those that did not.
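
As a rough illustration of this kind of network-based prediction, the sketch below fits a simple classifier that predicts whether a user later posts anti-Muslim content from binary indicators of the kinds of accounts they already follow. The feature names, simulated data, and model choice are assumptions for illustration only; the original study relied on much richer network information.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score

rng = np.random.default_rng(0)
n_users = 1000
# Hypothetical binary follow features: conservative media outlet, primary candidate,
# evangelical preacher, foreign-policy account.
X = rng.integers(0, 2, size=(n_users, 4))
# Simulated labels loosely correlated with the follow features (toy data only).
logits = X @ np.array([1.2, 0.8, 0.9, 0.6]) - 2.0
y = (rng.random(n_users) < 1 / (1 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred, zero_division=0))
```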

In one of the few existing surveys of social media users exploring the use of hate speech, Reference Costello and Hawdon Costello and Hawdon (2018) find that people who spend more time on Reddit and Tumblr report disseminating more hate speech online. Moreover, individuals who are close to an online community, or who spend more time in communities where hate speech is common, are more inclined to produce hate material. Counter to their expectations, however, they find that spending more time online in general is not associated with the production of hate, and that there is no association between playing first-person shooter games and producing hate material online.

As with pages and forums explicitly dedicated to online hate speech, individual producers of online hate speech have increasingly been banned from Twitter and other mainstream platforms. While many of these users simply create new accounts after they have been suspended, others have moved to more specialized platforms where they can produce hate more freely. For example, in August 2016, the social network Gab was created as an alternative to Twitter. The platform stated that it was dedicated to “people and free speech first,” courting users banned or suspended from other social networks ( Reference Marwick and Lewis Marwick and Lewis 2017 ). Reference Zannettou, Bradlyn and De Cristofaro Zannettou et al. (2018) found that Gab is mainly used for the dissemination and discussion of news and world events, and that it predominantly attracts alt-right users, conspiracy theorists, and trolls. The authors find that hate speech is much more common on Gab than Twitter but less common on Gab than on 4chan’s /pol/ board . Similarly, Reference Lima, Reis and Melo Lima et al. (2018) found that Gab generally hosts banned users from other social networks, many of whom were banned due to their use of hate speech and extremist content .

Targets of Online Hate Speech

One of the few areas of consensus in defining hate speech, which separates it from other forms of harmful speech, is that hate speech targets groups or individuals as they relate to a group ( Reference Sellars Sellars 2016 ). A small body of literature has explicitly analyzed the targets of online hate speech. Studying the targets of online hate speech on Whisper (an anonymous online platform) and Twitter using a sentence-structure–based algorithm, Reference Silva, Mondal, Correa, Benevenuto and Weber Silva et al. (2016) find that targeted individuals on both platforms are primarily attacked on the basis of their ethnicity, physical characteristics, sexual orientation, class, or gender. Survey research suggests that victims of online hate speech tend to engage in high levels of online activity ( Reference Hawdon, Oksanen, Rasänen, Hawdon, Ryan and Lucht Hawdon et al. 2014 ), have less online anonymity, and engage in more online antagonism ( Reference Costello, Rukus and Hawdon Costello, Rukus, and Hawdon 2018 ). Examining the targets of hate speech on Twitter, Reference ElSherief, Nilizadeh, Nguyen, Vigna and Belding ElSherief, Nilizadeh et al. (2018) find that accounts targeted by hate speech were 60 percent more likely to be verified than the accounts of instigators and 40 percent more likely to be verified than those of general users. This suggests that more visible Twitter users (with more followers, retweets, and lists) are more likely to become targets of hate.

Along these lines, recent qualitative research suggests that journalists, politicians, artists, bloggers, and other public figures have been disproportionately targeted by hate speech ( Reference Isbister, Sahlgren, Kaati, Obaidi and Akrami Isbister et al. 2018 ). For example, when the all-female reboot of Ghostbusters was released in July 2016, white supremacist Milo Yiannopoulos instigated a Twitter storm following the publication of his negative movie review on Breitbart. White supremacists began to bombard African American actress Leslie Jones’s timeline with sexist and racist slurs and hateful memes, including rape and death threats. When the abuse escalated as Yiannopoulos began directly tweeting at Jones and egging on his followers, Jones left Twitter. After public pressure convinced the company to intervene, Yiannopoulos was banned from Twitter and Jones returned ( Reference Isaac Isaac 2016 ). Similarly, as the author Mikki Kendall describes, “I was going to leave Twitter at one point. It just wasn’t usable for me. I would log on and have 2,500 negative comments. One guy who seemed to have an inexhaustible energy would Photoshop my image on top of lynching pictures and tell me I should be ‘raped by dogs,’ that kind of thing.” Kendall was also doxxed – her address was made public online – and she received a picture of her and her family in a photo that “looked like it had been sighted through a rifle” ( Reference Isaac Isaac 2016 ).

In June 2016, several highly visible Jewish journalists began to report a barrage of online hate that involved steganography – triple parentheses placed around their names like (((this))) ( Reference Fleishman and Smith Fleishman and Smith 2016 ). As a result, the Anti-Defamation League (ADL) added the triple parentheses to their database of hateful symbols. This “digital equivalent of a yellow star” was intended to identify Jews as targets for harassment online ( Reference Gross Gross 2017 ). For example, Jonathan Weisman of the New York Times left Twitter after being subjected to anti-Semitic harassment beginning with a Twitter account known as @CyberTrump, which escalated to a barrage of hateful Twitter activity, voicemails, and emails containing slurs and violent imagery ( Reference Gross Gross 2017 ).

As these examples suggest, online hate speech may be most visible when it takes the form of coordinated attacks, and a growing body of work has sought to detect this behavior ( Reference Mariconti, Suarez-Tangil and Blackburn Mariconti et al. 2018 ). Such attacks draw a great deal of attention both online and through traditional media outlets, making these strategic targets useful for both extremists and trolls seeking to reach a broader audience and elevate their messages. Such coordinated harassment campaigns allow groups of anonymous individuals to work together to bombard particular users with harmful content again and again ( Reference Chess and Shaw Chess and Shaw 2015 ; Reference Chatzakou, Kourtellis, Blackburn, De Cristofaro, Stringhini and Vakali Chatzakou et al. 2017 ). One manifestation of this behavior is known as raiding, in which ad hoc digital mobs organize and orchestrate attacks aimed at disrupting other platforms and undermining users who advocate issues and policies with which they disagree ( Reference Hine, Onaolapo and De Cristofaro Hine et al. 2016 ; Reference Kumar, Hamilton, Leskovec and Jurafsky Kumar et al. 2018 ; Reference Mariconti, Suarez-Tangil and Blackburn Mariconti et al. 2018 ). However, while raiding receives a great deal of media attention, we have little understanding of how common or pervasive these attacks are or on which platforms they most commonly occur.

Prevalence of Online Hate Speech

While a great deal of research has been devoted to defining and detecting online hate speech, we know surprisingly little about the popularity of online hate speech on either mainstream or fringe platforms, or how the volume of hate speech shifts in response to events on the ground. Social media platforms have increased the visibility of hate speech, prompting journalists and academics alike to assert that hate speech is on the rise. As a result, there is a tendency to characterize entire mainstream social media platforms as bastions of online hate, without using empirical evidence to evaluate how pervasive the phenomenon truly is. For example, after becoming the target of a hateful online attack, Atlantic editor Jeffrey Goldberg called Twitter “a cesspool for anti-Semites, homophobes, and racists” ( Reference Lizza Lizza 2016 ). While any online hate speech is of course problematic, suggesting that a platform used by more than a quarter of Americans and millions more around the globe is dominated by such speech is misleading and potentially problematic – particularly in countries where civil and political liberties are already under threat and social media provides a valuable outlet for opposition voices ( Reference Gagliardone, Pohjonen and Beyene Gagliardone et al. 2016 ).

With regard to empirical evidence, a small handful of studies have begun to systematically evaluate the prevalence of hate speech on online platforms, though more work is needed. Analyzing the popularity of hate speech in more than 750 million political tweets and in 400 million tweets sent by a random sample of American Twitter users between June 2015 and June 2017, Reference Siegel, Nitikin and Barberá Siegel et al. (2020) find that, even on the most prolific days, only a fraction of a percent of tweets in the American Twittersphere contain hate speech. Similarly, studying the popularity of hate speech on Ethiopian Facebook pages, Reference Gagliardone, Pohjonen and Beyene Gagliardone et al. (2016) find that only 0.4 percent of statements in their representative sample were classified as hate speech, and 0.3 percent were classified as dangerous speech, which directly or indirectly calls for violence against a particular group.
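
To make prevalence estimates of this kind concrete, the sketch below shows the simplest version of such a measurement: the share of posts in a sample that match a hate-speech lexicon. The lexicon here is a placeholder, and published studies combine much larger dictionaries with machine-learning classifiers to filter out false positives.

```python
import re

HATE_LEXICON = {"slur1", "slur2"}  # placeholders, not real terms

def hate_speech_prevalence(posts, lexicon=HATE_LEXICON):
    """Return the fraction of posts containing at least one lexicon term."""
    token_re = re.compile(r"[a-z']+")
    flagged = 0
    for post in posts:
        tokens = set(token_re.findall(post.lower()))
        if tokens & lexicon:
            flagged += 1
    return flagged / len(posts) if posts else 0.0

sample = ["an ordinary tweet", "another harmless post", "contains slur1 here"]
print(f"{hate_speech_prevalence(sample):.2%}")  # -> 33.33% on this toy sample
```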

While these studies suggest that online hate speech is a relatively rare phenomenon, cross-national survey research suggests that large numbers of individuals have nonetheless been incidentally exposed to online hate speech. In a cross-national survey of internet users between the ages of fifteen and thirty, 53 percent of American respondents report being exposed to hate material online, while 48 percent of Finns, 39 percent of Brits, and 31 percent of Germans report exposure. Using online social networks frequently and visiting “dangerous sites” are two of the strongest predictors of such exposure ( Reference Hawdon, Oksanen and Rasänen Hawdon et al. 2017 ). Perhaps explaining the discrepancy between empirical findings that hate speech is quite rare on mainstream platforms and high rates of self-reported exposure, Reference Kaakinen, Räsänen, Näsi, Minkkinen, Keipi and Oksanen Kaakinen et al. (2018) find that, while hateful content is rarely produced, it is more visible than other forms of content. Hate speech is also more common in particular online demographic communities than others. For example, Reference Saha, Chandrasekharan and De Choudhury Saha et al. (2019) find that hate speech is more prevalent in subreddits associated with particular colleges and universities than popular subreddits that were not associated with colleges or universities .

In addition to exploring the prevalence of online hate speech, recent work has investigated how offline events may drive upticks in the popularity of such rhetoric. One avenue of research explores the impact of violent offline events on various types of hate speech. For example, studying the causal effect of terror attacks in Western countries on the use of hateful language on Reddit and Twitter, Reference Olteanu, Castillo, Boy and Varshney Olteanu et al. (2018 ) find that episodes of extremist violence lead to an increase in online hate speech, particularly messages directly advocating violence, on both platforms. The authors argue that this provides evidence that theoretical arguments regarding the feedback loop between offline violence and online hate speech are – unfortunately – well-founded. This finding supports other research suggesting hate speech and hate crimes tend to increase after “trigger” events, which can be local, national, or international, and often drive negative sentiments toward groups associated with suspected perpetrators of violence ( Reference Awan and Zempi Awan and Zempi 2015 ).

Similarly, seeking to assess the impact of diverse episodes of sectarian violence on the popularity of anti-Shia hate speech in the Saudi Twittersphere, Reference Siegel, Tucker, Nagler and Bonneau Siegel et al. (2018 ) find that both violent events abroad and domestic terror attacks on Shia mosques produce significant upticks in the popularity of anti-Shia language in the Saudi Twittersphere. Providing further insight into the mechanisms by which offline violent events lead to increases in the use of online hate speech, the authors demonstrate that, while clerics and other elite actors both instigate and spread derogatory rhetoric in the aftermath of foreign episodes of sectarian violence – producing the largest upticks in anti-Shia language – they are less likely to do so following domestic mosque bombings.

Exploring the effect of political – rather than violent – events on the popularity of online hate speech, Reference Siegel, Nitikin and Barberá Siegel et al. (2020) find, contrary to the popular journalistic narrative, that online hate speech did not increase either over the course of Donald Trump’s 2016 campaign or in the aftermath of his unexpected election. Using a dataset of more than 1 billion tweets, their results are robust whether they detect hate speech using a machine-learning–augmented dictionary-based approach or a community-based detection algorithm comparing the similarity of daily Twitter data to the content produced on hateful subreddits over time. Instead, hate speech was “bursty” – spiking in the aftermath of particular events and re-equilibrating shortly afterward. Similarly, Reference Faris, Ashar, Gasser and Joo Faris et al. (2016) demonstrate that spikes in online harmful speech are often linked to political events, whereas Reference Saleem, Dillon, Benesch and Ruths Saleem et al. (2017) find that hate speech rose in the aftermath of events that triggered strong emotional responses, such as the Baltimore protests and the US Supreme Court decision on same-sex marriage.
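
The community-based approach mentioned above can be illustrated with a small sketch: score each day's Twitter text by its textual similarity to language drawn from known hateful communities. The corpora below are placeholders, and the TF-IDF/cosine-similarity setup is only one plausible implementation, not necessarily the one used in the study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

hateful_corpus = ["example text from a banned hateful subreddit"]  # placeholder corpus
daily_tweets = {
    "2016-11-07": "ordinary chatter about the weather and sports",
    "2016-11-08": "example text echoing the hateful subreddit language",
}

# Fit a shared vocabulary, then compare each day's text to the reference corpus.
vectorizer = TfidfVectorizer().fit(hateful_corpus + list(daily_tweets.values()))
reference = vectorizer.transform(hateful_corpus)

for day, text in daily_tweets.items():
    score = cosine_similarity(vectorizer.transform([text]), reference)[0, 0]
    print(day, round(score, 3))  # higher score = closer to hateful-community language
```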

Together, these studies demonstrate the importance of examining both the prevalence and the dynamics of online hate speech systematically over time and using large representative samples. More work is needed to better understand how different types of online hate speech gain traction in diverse global contexts and how their relative popularity shifts on both mainstream and specialized social media platforms over time .

Offline Consequences of Online Hate Speech

Systematically measuring the impact of online hate speech is challenging ( Reference Sellars Sellars 2016 ), but a diverse body of research suggests that online hate speech has serious offline consequences both for individuals and for groups. Surveys of internet users indicate that exposure to online hate speech may cause fear ( Reference Hinduja and Patchin Hinduja and Patchin 2007 ), particularly in historically marginalized or disadvantaged populations. Other work suggests that such exposure may push people to withdraw from public debate both on- and offline, therefore harming free speech and civic engagement ( Reference Henson, Reyns and Fisher Henson et al. 2013 ). Indeed, observational data indicate that exposure to hate speech may have many of the same consequences as being targeted by hate crimes, including psychological trauma and communal fear ( Reference Gerstenfeld Gerstenfeld 2017 ). Along these lines, human rights groups have argued that failure to monitor and counter hate speech online can reinforce the subordination of targeted minorities, making them vulnerable to attacks, while making majority populations more indifferent to such hatred ( Reference Izsak Izsak 2015 ). That being said, recent work demonstrates that interpretations of hate speech – what is considered hateful content as well as ratings of the intensity of content – differ widely by country ( Reference Salminen, Almerekhi, Kamel, Jung and Jansen Salminen et al. 2019 ), and men and political conservatives tend to find hate material less disturbing than women, political moderates, and liberals ( Reference Costello, Hawdon, Bernatzky and Mendes Costello et al. 2019 ).

On the individual level, qualitative research suggests that Muslims living in the West who are targeted by online hate speech fear that online threats may materialize offline ( Reference Awan and Zempi Awan and Zempi 2015 ). Furthermore, surveys of adolescent internet users have found that large numbers of African American respondents have experienced individual or personal discrimination online, and such exposure is associated with depression and anxiety, controlling for measures of offline discrimination ( Reference Tynes, Giang, Williams and Thompson Tynes et al. 2008 ). In studying the differential effects of exposure to online hate speech, Reference Tynes and Markoe Tynes and Markoe (2010) find from a survey experiment conducted on college-age internet users that African American participants were most bothered by racist content (images) on social networking sites, whereas European Americans – especially those who held “color-blind” attitudes – were more likely to be “not bothered” by those images. Similarly, individuals exposed to hate speech on university-affiliated subreddits exhibited higher levels of stress than those who were not ( Reference Saha, Chandrasekharan and De Choudhury Saha et al. 2019) . Survey data suggest that youth who have been exposed to online hate speech have weaker attachment to family and report higher levels of unhappiness, though this relationship is not necessarily causal ( Reference Hawdon, Oksanen, Rasänen, Hawdon, Ryan and Lucht Hawdon et al. 2014 ). Exposure to hate speech online is also associated with an avoidance of political talk over time ( Reference Barnidge, Kim, Sherrill, Luknar and Zhang Barnidge et al. 2019 ). At the group level, online hate speech has fueled intergroup tensions in a variety of contexts, sometimes leading to violent clashes and undermining social cohesion ( Reference Izsak Izsak 2015 ). For example, Facebook has come under fire for its role in mobilizing anti-Muslim mob violence in Sri Lanka and for inciting violence against the Rohingya people in Myanmar ( Reference Vindu, Kumar and Frenkel Vindu, Kumar, and Frenkel 2018 ). Elucidating the mechanisms by which exposure to hate speech drives intergroup tension, survey data and experimental evidence from Poland suggest that frequent and repetitive exposure to hate speech leads to desensitization to hateful content, lower evaluations of populations targeted by hate speech, and greater distancing – resulting in higher levels of anti–out-group prejudice ( Reference Soral, Bilewicz and Winiewski Soral et al. 2018 ).

A diverse body of literature suggests that hate speech may foster an environment in which bias-motivated violence is encouraged either subtly or explicitly ( Reference Herek, Berrill and Berrill Herek et al. 1992 ; Reference Greenawalt Greenawalt 1996 ; Reference Calvert Calvert 1997 ; Reference Tsesis Tsesis 2002 ; Reference Matsuda Matsuda 2018 ). Intergroup conflict is more likely to occur and spread when individuals and groups have the opportunity to publicly express shared grievances and coordinate collective action ( Reference Weidmann Weidmann 2009 ; Reference Cederman, Wimmer and Min Cederman et al. 2010 ). Digital technology is thought to reduce barriers to collective action among members of the same ethnic or religious group by improving access to information about one another’s preferences. This is thought to increase the likelihood of intergroup conflict and accelerate its spread across borders ( Reference Pierskalla and Hollenbach Pierskalla and Hollenbach 2013 ; Reference Bailard Bailard 2015 ; Reference Weidmann Weidmann 2015 ).

Moreover, while hate speech is just one of many factors that interact to mobilize ethnic conflict, it plays a powerful role in intensifying feelings of mass hate ( Reference Vollhardt, Coutin, Staub, Weiss and Deflander Vollhardt et al. 2007 ; Reference Gagliardone, Patel and Pohjonen Gagliardone et al. 2014 ). This may be particularly true in the online sphere, where the anonymity of online communication can drive people to express more hateful opinions than they might otherwise ( Reference Cohen-Almagor Cohen-Almagor 2017 ). As individuals come to believe that “normal” rules of social conduct do not apply ( Reference Citron Citron 2014 ; Reference Delgado and Stefancic Delgado and Stefancic 2014 ), intergroup tensions are exacerbated. Along these lines, online hate speech places a physical distance between speaker and audience, emboldening individuals to express themselves without repercussions ( Reference Citron Citron 2014 ). Perhaps more importantly, online social networks create the opportunity for individuals to engage with like-minded others that might otherwise never connect or be aware of one another’s existence ( Reference Posner, Stone and Bollinger Posner 2001 ). Recognizing the importance of online hate speech as an early warning sign of ethnic violence, databases of multilingual hate speech are increasingly used by governments, policymakers, and NGOs to detect and predict political instability, violence, and even genocide ( Reference Gagliardone, Patel and Pohjonen Gagliardone et al. 2014 ; Reference Tuckwood Tuckwood 2014 ; Reference Gitari, Zuping, Damien and Long Gitari et al. 2015 ).

Many have argued that there is a direct connection between online hate and hate crimes, and perpetrators of offline violence often cite the role online communities have played in driving them to action ( Reference Citron Citron 2014 ; Reference Cohen-Almagor Cohen-Almagor 2017 ; Reference Gerstenfeld Gerstenfeld 2017 ). For example, on June 17, 2015, twenty-one-year-old Dylann Roof entered the Emanuel African Methodist Episcopal Church and murdered nine people. In his manifesto, Roof wrote that he drew his first racist inspiration from the Council of Conservative Citizens (CCC) website ( Reference Cohen-Almagor Cohen-Almagor 2018 ). Similarly, the perpetrator of the 2018 Pittsburgh synagogue attack was allegedly radicalized on Gab, and the perpetrator of the 2019 New Zealand mosque shootings was reportedly radicalized on online platforms and livestreamed his attack on Facebook, with the footage quickly spreading to YouTube and other sites.

While it is very difficult to causally examine the link between online hate speech and hate crimes, recent empirical work has attempted to do so. This work builds on a larger literature exploring how hate speech in traditional media can be used to trigger violent outbursts or ethnic hatred. This includes work exploring the effect of hate radio on levels of violence during the Rwandan genocide ( Reference Yanagizawa-Drott Yanagizawa-Drott 2014 ), research on how radio propaganda incited anti-Semitic violence in Nazi Germany ( Reference Adena, Enikolopov, Petrova, Santarosa and Zhuravskaya Adena et al. 2015 ), and a study of how nationalist Serbian radio was used to incite violence in Croatia in the 1990s ( Reference DellaVigna, Enikolopov, Mironova, Petrova and Zhuravskaya DellaVigna et al. 2014 ).

Examining the effects of online hate, Reference Chan, Ghose and Seamans Chan et al. (2015) find that broadband availability increases racial hate crimes in areas with higher levels of segregation and a higher proportion of racially charged Google search terms. Their work suggests that online access is increasing the incidence of racial hate crimes executed by lone wolf perpetrators. Similarly, Reference Stephens-Davidowitz Stephens-Davidowitz (2017 ) finds that the search rate on Google for anti-Muslim words and phrases, including violent terms like “kill all Muslims,” can be used to predict the incidence of anti-Muslim hate crimes over time. Other studies show an association between hateful speech on Twitter and hate crimes in the US context, but the causal links are not well identified ( Reference Williams, Burnap, Javed, Liu and Ozalp Williams et al. 2019 ; Reference Chyzh, Nieman and Webb Chyzh et al. 2019 ).
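
The search-data approach described above ultimately relates a time series of search intensity to a time series of hate crimes. The sketch below illustrates the idea with simulated monthly data and a simple bivariate regression; the published analyses are substantially more careful, controlling for trends, seasonality, and local characteristics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
months = 48
search_index = rng.gamma(shape=2.0, scale=10.0, size=months)      # Google Trends-style index (simulated)
hate_crimes = 2 + 0.3 * search_index + rng.normal(0, 2, months)   # simulated monthly hate-crime counts

result = stats.linregress(search_index, hate_crimes)
print(f"slope = {result.slope:.2f}, r = {result.rvalue:.2f}, p = {result.pvalue:.4f}")
```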

In one of the only existing studies that explicitly examines the causal link between online hate and offline violence, Reference Muller and Schwarz Muller and Schwarz (2017) exploit exogenous variation in major internet and Facebook outages to show that anti-refugee hate crimes increase disproportionately in areas with higher Facebook usage during periods of high anti-refugee sentiment online. They find that this effect is particularly pronounced for violent incidents against refugees, including arson and assault. Similarly, in a second paper, Reference Muller and Schwarz Muller and Schwarz (2019) exploit variation in early adoption of Twitter to show that higher Twitter usage is associated with an increase in anti-Muslim hate crimes since the start of Trump’s campaign. Their results provide preliminary evidence that social media can act as a propagation mechanism between online hate speech and offline violent crime. Together, this work suggests that online hate speech may have powerful real-world consequences, ranging from negative psychological effects at the individual level to violent attacks offline .

Combating Online Hate Speech

Rising concern regarding these real-world effects of online hate speech has prompted researchers, policymakers, and online platforms to develop strategies to combat it. These approaches have generally taken two forms: content moderation and counter-speech.

One strategy to combat online hate speech has been to moderate content, which involves banning accounts or communities that violate platforms’ terms of service or stated rules ( Reference Kiesler, Kraut, Resnick, Kittur, Kraut and Resnick Kiesler et al. 2012 ). On May 31, 2016, the European Commission in conjunction with Facebook, Twitter, YouTube, and Microsoft issued a voluntary Code of Conduct on Countering Illegal Hate Speech Online that required the removal of any hate speech, as defined by the European Union (EU). This was spurred by fears over a rise in intolerant speech against refugees as well as worries that hate speech fuels terror attacks ( Reference Aswad Aswad 2016 ). Additionally, beginning in December 2017, facing pressure in the aftermath of the deadly August 2017 “Unite the Right” march in Charlottesville, Virginia, Twitter announced a new policy to ban accounts that affiliate with groups “that use or promote violence against civilians to further their causes” ( Twitter 2017 ). The platform began by suspending several accounts with large followings involved in white nationalism or in organizing the Charlottesville march. In this period, Twitter also suspended a far-right British activist who had been retweeted by President Trump, as well as several other accounts affiliated with her ultranationalist group ( Reference Nedig Nedig 2017 ). The company announced that their ban on violent threats would also be extended to include any content that glorifies violence ( Twitter 2017 ). Similarly, in April 2018, Facebook announced its twenty-five-page set of rules dictating what types of content are permitted on Facebook ( 2018 ). The section on hate speech states, “We do not allow hate speech on Facebook because it creates an environment of intimidation and exclusion and in some cases may promote real-world violence.” The goal of banning hate speech from more mainstream online platforms is to reduce the likelihood that everyday internet users are incidentally exposed to online hate speech.

However, little is known about how these bans are actually implemented in practice or how effective they have been in reducing online hate speech on these platforms or exposure to such speech more broadly. Moreover, the use of automatic hate speech detection has come under fire in the media as the limits of these methods have been highlighted by embarrassing mistakes – like when Facebook’s proprietary filters flagged an excerpt from the Declaration of Independence as hate speech ( Reference Lapin Lapin 2018 ). While a February 2019 review by the European Commission suggests that social media platforms including Facebook and Google were successfully removing 75 percent of posts flagged by users that violate EU standards within 24 hours, we do not know what portion of hate speech is flagged or how this may be biased against or in favor of certain types of political speech ( Reference Laub Laub 2019 ).

Empirical work on the effectiveness of banning hateful content yields mixed results. Studying the effect of banning the r/fatpeoplehate and r/CoonTown subreddits on Reddit in 2015, Chandrasekharan, Pavalanathan et al. (2017) find that the ban was successful. Analyzing more than 100 million Reddit posts and comments, the authors found that many accounts discontinued using the site after the ban, and those that stayed decreased their hate speech usage by at least 80 percent. Although many of these users migrated to other subreddits, the new subreddits did not experience an increase in hate speech usage, suggesting that the ban was effective in limiting online hate speech on Reddit. Also on Reddit, Reference Saleem and Ruths Saleem and Ruths (2019) find that banning a large hateful subreddit (r/fatpeoplehate) prompted users of that subreddit to stop posting on Reddit. Similarly, other work suggests that banning accounts on Twitter disrupts extremist social networks, as users who are frequently banned suffer major drops in follower counts when they rejoin a particular platform ( Reference Berger and Perez Berger and Perez 2016 ).
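
The before-and-after comparison at the heart of these ban studies can be sketched in a few lines: for each user active before the ban, compare the rate of hateful posts before and after the ban date. The field names, ban date, and keyword-based is_hateful labeler below are assumptions for illustration; the published analyses used matched control users and far more robust labeling.

```python
from datetime import date
from collections import defaultdict

BAN_DATE = date(2015, 6, 10)  # placeholder date for the 2015 subreddit bans

def pre_post_hate_rates(posts, is_hateful):
    """posts: iterable of dicts with 'user', 'created' (a date), and 'text'."""
    stats = defaultdict(lambda: {"pre": [0, 0], "post": [0, 0]})  # [hateful posts, total posts]
    for p in posts:
        period = "pre" if p["created"] < BAN_DATE else "post"
        bucket = stats[p["user"]][period]
        bucket[1] += 1
        bucket[0] += int(is_hateful(p["text"]))
    rates = {}
    for user, s in stats.items():
        rates[user] = tuple(
            (s[k][0] / s[k][1]) if s[k][1] else None for k in ("pre", "post")
        )
    return rates  # {user: (pre_ban_rate, post_ban_rate)}

# Toy usage with a trivial keyword labeler:
posts = [
    {"user": "a", "created": date(2015, 5, 1), "text": "slur1 here"},
    {"user": "a", "created": date(2015, 7, 1), "text": "nothing hateful"},
]
print(pre_post_hate_rates(posts, is_hateful=lambda t: "slur1" in t))
```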

That being said, although bans may have decreased the overall volume of hate speech on Reddit and disrupted extremist activity on Twitter, such activity may have simply migrated to other platforms. In response to the 2015 bans, Reference Newell, Jurgens and Saleem Newell et al. (2016) find that disgruntled users sought out alternative platforms such as Voat, Snapzu, and Empeopled. Users who migrate to these fringe platforms often keep their usernames and attempt to recreate their banned communities in a new, less regulated domain (Chandrasekharan, Pavalanathan et al. 2017). In addition to moving hate speech from one platform to another, other work suggests that producers of harmful content simply become more creative about how to continue using hate speech on their preferred platforms. For example, seeking to avoid content moderation, as previously described, members of online communities often use code words to circumvent detection ( Reference Chancellor, Pater, Clear, Gilbert and De Choudhury Chancellor et al. 2016 ; Reference Sonnad Sonnad 2016 ).

Additionally, attempts to ban user accounts may sometimes be counterproductive, galvanizing support from those who are sympathetic to hateful communities. When well-known users come under fire, people who hold similar beliefs may be motivated to rally to their defense and/or to express views that are opposed by powerful companies or organizations. For example, empirical studies of extremist behavior online examining pro-ISIS accounts suggest that online extremists view the blocking of their accounts as a badge of honor, and individuals who have been blocked or banned are often able to reactivate their accounts under new names ( Reference Vidino and Hughes Vidino and Hughes 2015 ; Reference Berger and Perez Berger and Perez 2016 ). Moreover, banning users often prompts them to move to more specialized platforms, such as Gab or Voat, which may further radicalize individuals who produce online hate. Indeed, banning hateful users removes them from diverse settings where they may come into contact with moderate or opposing voices, elevating their grievances and feelings of persecution and pushing them into hateful echo chambers where extremism and calls for offline violence are normalized and encouraged ( Reference Marwick and Lewis Marwick and Lewis 2017 ; Reference Lima, Reis and Melo Lima et al. 2018 ; Reference Zannettou, Bradlyn and De Cristofaro Zannettou et al. 2018 ; Reference Jackson Jackson 2019 ). While this is a compelling theoretical argument against banning users from mainstream platforms, more empirical work is needed to track the extent to which banned users migrate to more extreme platforms, as well as whether they indeed become further radicalized on these platforms ( Reference Jackson Jackson 2019 ).

In this way, existing empirical work on the effectiveness of content moderation suggests that, while it may reduce hate speech on particular platforms, as disgruntled users migrate to other corners of the Internet, it is unclear whether such efforts reduce hate speech overall. Moreover, thorny legal, ethical, and technical questions persist with regard to the benefits of banning hate speech on global social media platforms, particularly outside of Western democracies. For example, a recent ProPublica investigation found that Facebook’s rules are not transparent and inconsistently applied by tens of thousands of global contractors charged with content moderation. In many countries and disputed territories, such as the Palestinian territories, Kashmir, and Crimea, activists and journalists have been censored for harmful speech as Facebook has responded to government concerns and worked to insulate itself from legal liability. The report concluded that Facebook’s hate speech content moderation standards “tend to favor elites and governments over grassroots activists and racial minorities.” Along these lines, governments may declare opposition speech to be hateful or extremist in order to manipulate content moderation to silence their critics ( Reference Laub Laub 2019 ). Moreover, automated hate speech detection methods have not been well adapted to local contexts, and very few content moderators are employed that speak local languages – including those that are used to target at-risk minority groups who are often targeted by hate speech. In a famous example, in 2015, despite rising ethnic violence and rampant reports of hate speech on Facebook and other social media platforms targeting Muslims in Myanmar, Facebook allegedly just employed two Burmese-speaking content moderators ( Reference Stecklow Stecklow 2018 ).

Recognizing that censoring hate speech may come into conflict with legal protections of free speech or may be manipulated by governments to target critics, international agencies such as UNESCO have generally maintained that “the free flow of information should always be the norm.” As a result, they often argue that counter-speech is usually preferable to the suppression of hate speech ( Reference Gagliardone, Gal, Alves and Martinez Gagliardone et al. 2015 ). Counter-speech is a direct response to hate speech intended to influence discourse and behavior ( Reference Benesch Benesch 2014a , Reference Benesch and Grant 2014b ). Counter-speech campaigns have long been used to combat the public expression of hate speech and discrimination through traditional media channels . Examples of this in the US context include the use of anti-KKK billboards in the Deep South ( Reference Richards and Calvert Richards and Calvert 2000 ), and the dissemination of information about US hate groups by the Southern Poverty Law Center ( Reference McMillin McMillin 2014 ). Interventions designed to prevent the incitement of violence have also been deployed, including the use of soap operas to counter intergroup tensions in Rwanda and the use of television comedy in Kenya to discourage the use of hate speech ( Reference Staub, Pearlman and Miller Staub et al. 2003 ; Reference Paluck Paluck 2009 ; Reference Kogen Kogen 2013 ). Experimental evaluations of these interventions have found that they may make participants better able to recognize and resist incitement to anti–out-group hatred .

More recent work has explored the use of counter-speech in the online sphere. For example, fearing violence in the lead-up to the 2013 Kenyan elections, international NGOs, celebrities, and local businesses helped to fund “peace propaganda” campaigns to deter the spread of online hate speech – and offline violence – in Kenya. For example, one company offered cash and cell phone time to Kenyans who sent peace messages to each other online, including photos, poems, and stories ( Reference Benesch Benesch 2014a ). Demonstrating that counter-speech occurs organically on online platforms, in the aftermath of the 2015 Paris attacks, Reference Magdy, Darwish, Abokhodair, Rahimi and Baldwin Magdy et al. (2016) estimate that the vast majority of tweets posted following the attacks were defending Muslims, while anti-Muslim hate tweets represented a small fraction of content in the Twittersphere. Similarly, examining online hate speech in Nigerian political discussions, Reference Bartlett, Krasodomski-Jones, Daniel, Fisher and Jesperson Bartlett et al. (2015) find that extreme content is often met with disagreement, derision, and counter-messages.

A nascent strand of literature experimentally evaluates what types of counter-speech messages are most effective in reducing online hate speech. Reference Munger Munger (2017) shows that counter-speech using automated bots can reduce instances of racist speech if instigators are sanctioned by a high-status in-group member – in this case, a white male with a large number of Twitter followers. Similarly, Reference Siegel and Badaan Siegel and Badaan (2020) deployed a sockpuppet account to counter sectarian hate speech in the Arab Twittersphere. They find that simply receiving a sanctioning message reduces the use of hate speech, particularly for users in networks where hate speech is relatively uncommon. Moreover, they show that messages priming a common Muslim religious identity and containing endorsements from elite actors are particularly effective in decreasing users’ post-treatment levels of hate speech. Additional research is needed to further evaluate what types of counter-speech from what sources are most effective in reducing online hate in diverse contexts. Recognizing the potential of counter-speech bots, Reference Leetaru Leetaru (2017) proposed deploying AI bots en masse to fight online hate speech, though the feasibility and consequences of such an intervention are not well understood. Simulating how much counter-speech might be necessary to “drown out” hate speech on Facebook, Reference Schieb and Preuss Schieb and Preuss (2016) find that counter-speech can have a considerable impact on reducing the visibility of online hate speech, especially when producers of hate speech are in the minority of a particular community. In some of the only studies that explicitly detect naturally occurring counter-speech on social media, Mathew et al. ( Reference Mathew, Kumar, Goyal and Mukherjee 2018 ; Reference Mathew, Saha and Tharad 2019 ) find that counter-speech comments receive many more likes and greater engagement than other comments and may prompt producers of hate speech to apologize or change their behavior. More empirical work is needed, however, to see how this dynamic plays out more systematically on real-world social media platforms over time.

Explicitly comparing censorship or content monitoring to counter-speech interventions, Reference Alvarez-Benjumea and Winter Alvarez-Benjumea and Winter (2018 ) test whether decreasing social acceptability of hostile comments in an online forum decreases the use of hate speech. They first designed an online forum and invited participants to join and engage in conversation on current social topics. They then experimentally manipulated the comments participants observed before posting their own comments. They included a censoring treatment in which participants observed no hate comments and a counter-speech treatment in which hate speech comments were uncensored but were presented alongside posts highlighting the fact that hate speech was not considered acceptable on the platform. Comparing the level of hostility of the comments and instances of hate across the treatment conditions, they find that the censoring treatment was the most effective in reducing hostile comments. However, the authors note that the fact that they do not observe a statistically significant effect of the counter-speech treatment may be due to their small sample sizes and inability to monitor repeated interactions over time in their experimental setup. Together, this growing body of literature on the effects of censoring and counter-speech on online hate speech provides some optimism, particularly regarding the impact of content moderation on reducing hate speech on mainstream platforms and the ability of counter-speech campaigns to decrease the reach, visibility, and harm of online hate speech. However, we know very little about the potential collateral damage of these interventions. Future work should not only provide larger scale empirical tests of these types of interventions in diverse contexts but seek to evaluate the longer-term effects of these approaches .
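
Analyses of experiments like the one described above ultimately come down to comparing outcomes, here the hostility of participants' comments, across treatment conditions. The sketch below illustrates that comparison on simulated data with simple t-tests; the actual study used different measures, sample sizes, and estimation strategies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated hostility scores per participant under three hypothetical conditions.
hostility = {
    "control":        rng.normal(0.50, 0.2, 200),
    "censoring":      rng.normal(0.35, 0.2, 200),
    "counter_speech": rng.normal(0.45, 0.2, 200),
}

for condition in ("censoring", "counter_speech"):
    t, p = stats.ttest_ind(hostility[condition], hostility["control"])
    diff = hostility[condition].mean() - hostility["control"].mean()
    print(f"{condition} vs control: mean difference = {diff:+.3f}, p = {p:.3f}")
```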

Conclusions and Steps for Future Research

As online hate speech has become increasingly visible on social media platforms, it has emerged at the center of academic, legal, and policy agendas. Despite increased attention to online hate speech, as this chapter demonstrates, the debate over how to define online hate speech is far from settled. Partly as a consequence of these definitional challenges, and partly as a result of the highly context-specific and evolving nature of online hate speech, detecting hateful content systematically is an extremely difficult task.

While state-of-the-art techniques employing machine learning, neural networks, and contextual features have improved our ability to measure and monitor online hate speech, most existing empirical work is fairly fragmented – often detecting a single type of hate speech on one platform at a single moment in time. Moreover, because of the ease of data collection, the vast majority of studies have been conducted using English-language Twitter data and therefore do not necessarily tell us very much about other platforms or cultural contexts. Adding further complications, definitions of hate speech and approaches to detecting it are highly politicized, particularly in authoritarian contexts and conflict settings. Though some research has explored multiple types of hate speech, used several datasets, conducted research on multiple platforms, or examined trends in hate speech over time, these studies are the exception rather than the rule ( Reference Fortuna Fortuna 2017 ). Drawing on the rich literature of hate speech detection techniques in computer science and social science, future work should attempt more systematic comparative analysis to improve our ability to detect online hate speech in its diverse forms.

Though less developed than the literature on defining and measuring online hate speech, recent work has explored both the producers of online hate speech and their targets. A large body of literature has evaluated how hate groups strategically use the Internet to lure recruits and foster a sense of community among disparate members, using primarily small-scale qualitative analysis of data from hate groups’ official websites ( Reference Selepak Selepak 2010 ). Other work has conducted large-scale observational studies of the users that produce hate speech on mainstream social media platforms like Twitter and Reddit, including their demographic characteristics and network structures. These users tend to be young, male, very active on social media, and members of tightly networked communities in which producers of hate speech frequently retweet and like each other’s posts ( Reference Costello and Hawdon Costello and Hawdon 2018 ; Reference Ribeiro, Calais, Santos, Almeida and Meira Ribeiro et al . 2018 ).

With regard to the targets of hate speech, researchers have used both big data empirical analyses and surveys of the users targeted online to demonstrate that targets of hate speech are often prominent social media users with large followings ( Reference ElSherief, Nilizadeh, Nguyen, Vigna and Belding ElSherief, Nilizadeh et al. 2018 ). Additionally, qualitative and quantitative work demonstrates that one targeting strategy on mainstream social media platforms is for well-organized groups of users to launch coordinated hate attacks or “raids” on bloggers, celebrities, journalists, or other prominent actors ( Reference Mariconti, Suarez-Tangil and Blackburn Mariconti et al. 2018 ). This may be one reason why online hate speech has received so much attention in the mainstream media, despite empirical evidence suggesting that hate speech is actually quite rare on mainstream social media platforms in aggregate.

Indeed, quantitative work evaluating the prevalence of online hate speech suggests that it may represent only a fraction of a percentage point of overall posts on sites like Facebook and Twitter ( Reference Gagliardone, Pohjonen and Beyene Gagliardone et al. 2016 ; Reference Siegel, Nitikin and Barberá Siegel et al. 2020 ). Moreover, studies exploring the dynamics of online hate speech over time on Twitter suggest that it is quite bursty – it increases in response to emotional or violent events and then tends to quickly re-equilibrate ( Reference Awan and Zempi Awan and Zempi 2015 ; Reference Olteanu, Castillo, Boy and Varshney Olteanu et al. 2018 ; Reference Siegel, Nitikin and Barberá Siegel et al. 2020 ).

Although hate speech may be rare, it can still have severe offline consequences. Survey data suggest that online hate speech negatively impacts the psychological well-being of individuals who are exposed to it and can have detrimental consequences for intergroup relations at the societal level ( Reference Tynes, Giang, Williams and Thompson Tynes et al. 2008 ). A growing body of empirical evidence also suggests that online hate speech can incite people to violence and that it may be playing a particularly devastating role in fueling attacks on Muslim immigrants and refugees. Recent work exploring the causal effect of online hate speech on offline attitudes and behaviors ( Reference Chan, Ghose and Seamans Chan et al. 2015 ; Reference Muller and Schwarz Muller and Schwarz 2017 ; Reference Muller and Schwarz Muller and Schwarz 2019 ) should be replicated, expanded, and adapted to enable us to better understand these dynamics in other contexts and over longer periods of time .

Scientific studies have also assessed what strategies might be most effective to combat online hate speech. Empirical evidence suggests that banning hateful communities on Reddit, for example, reduced the volume of hate speech on the platform overall (Chandrasekharan, Pavalanathan et al. 2017). However, other work indicates that users who are banned from discussing particular topics on mainstream platforms simply move elsewhere to continue their hateful discourse ( Reference Newell, Jurgens and Saleem Newell et al. 2016 ). Additionally, content and account bans could have galvanizing effects for certain extremist actors who view the sanction as a badge of honor ( Reference Vidino and Hughes Vidino and Hughes 2015 ). More optimistically, experimental research using counter-speech to combat online hate speech suggests that receiving sanctioning messages from other Twitter users – particularly fellow in-group members, high-status individuals, or trusted elite actors – discourages users from tweeting hateful content ( Reference Munger Munger 2017 ; Reference Siegel and Badaan Siegel and Badaan 2020 ). Moreover, large-scale empirical studies suggest that counter-speech is quite common in the online sphere, and the same events that trigger upticks in online hate speech often trigger much larger surges in counter-speech ( Reference Magdy, Darwish, Abokhodair, Rahimi and Baldwin Magdy et al. 2016 ; Reference Olteanu, Castillo, Boy and Varshney Olteanu et al. 2018 ). Future work should continue to explore what kinds of counter-speech might be most effective in diverse cultural contexts and on different platforms, as well as how counter-speech can be encouraged among everyday social media users. Given the dangerous offline consequences of online hate speech in diverse global contexts, academics and policymakers should continue to build on this existing literature to improve hate speech detection, gain a more comprehensive understanding of how hate speech arises and spreads, develop further understanding of hate speech’s offline consequences, and build better tools to effectively combat it.


  • Online Hate Speech
  • By Alexandra A. Siegel
  • Edited by Nathaniel Persily , Stanford University, California , Joshua A. Tucker , New York University
  • Book: Social Media and Democracy
  • Online publication: 24 August 2020



Democracy, Social Media, and Freedom of Expression: Hate, Lies, and the Search for the Possible Truth


This Essay is a critical reflection on the impact of the digital revolution and the internet on three topics that shape the contemporary world: democracy, social media, and freedom of expression. Part I establishes historical and conceptual assumptions about constitutional democracy and discusses the role of digital platforms in the current moment of democratic recession. Part II discusses how, while social media platforms have revolutionized interpersonal and social communication and democratized access to knowledge and information, they also have led to an exponential spread of mis- and disinformation, hate speech, and conspiracy theories. Part III proposes a framework that balances regulation of digital platforms with the countervailing fundamental right to freedom of expression, a right that is essential for human dignity, the search for the possible truth, and democracy. Part IV highlights the role of society and the importance of media education in the creation of a free, but positive and constructive, environment on the internet.

I. Introduction

Before the internet, few actors could afford to participate in public debate due to the barriers that limited access to its enabling infrastructure, such as television channels and radio frequencies. 1 Digital platforms tore down this gate by creating open online communities for user-generated content, published without editorial control and at no cost. This exponentially increased participation in public discourse and the amount of information available. 2 At the same time, it led to an increase in disinformation campaigns, hate speech, slander, lies, and conspiracy theories used to advance antidemocratic goals. Platforms’ attempts to moderate speech at scale while maximizing engagement and profits have led to an increasingly prominent role for content moderation algorithms that shape who can participate and be heard in online public discourse. These systems play an essential role in the exercise of freedom of expression and in democratic competence and participation in the 21st century.

In this context, this Essay is a critical reflection on the impacts of the digital revolution and of the internet on democracy and freedom of expression. Part I establishes historical and conceptual assumptions about constitutional democracy; it also discusses the role of digital platforms in the current moment of democratic recession. Part II discusses how social media platforms are revolutionizing interpersonal and social communication, and democratizing access to knowledge and information, but also lead to an exponential spread of mis- and disinformation, hate speech and conspiracy theories. Part III proposes a framework for the regulation of digital platforms that seeks to find the right balance with the countervailing fundamental right to freedom of expression. Part IV highlights the role of society and the importance of media education in the creation of a free, but positive and constructive, environment on the internet.

II. Democracy and Authoritarian Populism

Constitutional democracy emerged as the predominant ideology of the 20th century, rising above the alternative projects of communism, fascism, Nazism, military regimes, and religious fundamentalism. 3 Democratic constitutionalism centers around two major ideas that merged at the end of the 20th century: constitutionalism, heir of the liberal revolutions in England, America, and France, expressing the ideas of limited power, rule of law, and respect for fundamental rights; 4 and democracy, a regime of popular sovereignty, free and fair elections, and majority rule. 5 In most countries, democracy only truly consolidated throughout the 20th century through universal suffrage guaranteed with the end of restrictions on political participation based on wealth, education, sex, or race. 6

Contemporary democracies are made up of votes, rights, and reasons. They are not limited to fair procedural rules in the electoral process, but demand respect for substantive fundamental rights of all citizens and a permanent public debate that informs and legitimizes political decisions. 7 To ensure protection of these three aspects, most democratic regimes include in their constitutional framework a supreme court or constitutional court with jurisdiction to arbitrate the inevitable tensions that arise between democracy’s popular sovereignty and constitutionalism’s fundamental rights. 8 These courts are, ultimately, the institutions responsible for protecting fundamental rights and the rules of the democratic game against any abuse of power attempted by the majority. Recent experiences in Hungary, Poland, Turkey, Venezuela, and Nicaragua show that when courts fail to fulfill this role, democracy collapses or suffers major setbacks. 9

In recent years, several events have challenged the prevalence of democratic constitutionalism in many parts of the world, in a phenomenon characterized by many as democratic recession. 10 Even consolidated democracies have endured moments of turmoil and institutional discredit, 11 as the world witnessed the rise of an authoritarian, anti-pluralist, and anti-institutional populist wave posing serious threats to democracy.

Populism can be right-wing or left-wing, 12 but the recent wave has been characterized by the prevalence of right-wing extremism, often racist, xenophobic, misogynistic, and homophobic. 13 While in the past the far left was united through the Communist International, today it is the far right that has a major global network. 14 The hallmark of right-wing populism is the division of society into “us” (the pure, decent, conservatives) and “them” (the corrupt, liberal, cosmopolitan elites). 15 Authoritarian populism flows from democracy’s unfulfilled promises of opportunity and prosperity for all. 16 Three aspects undergird this democratic frustration: political (people do not feel represented by the existing electoral systems, political leaders, and democratic institutions); social (stagnation, unemployment, and the rise of inequality); and cultural identity (a conservative reaction to the progressive identity agenda of human rights that prevailed in recent decades with the protection of the fundamental rights of women, African descendants, religious minorities, LGBTQ+ communities, indigenous populations, and the environment). 17

Extremist authoritarian populist regimes often adopt similar strategies to capitalize on the political, social, and cultural identity-based frustrations fueling democratic recessions. These tactics include bypassing or co-opting the intermediary institutions that mediate the interface between the people and the government, such as the legislature, the press, and civil society. They also involve attacks on supreme courts and constitutional courts and attempts to capture them by appointing submissive judges. 18 The rise of social media amplifies these strategies by creating a free and instantaneous channel of direct communication between populists and their supporters. 19 This unmediated interaction facilitates the use of disinformation campaigns, hate speech, slander, lies, and conspiracy theories as political tools to advance antidemocratic goals. The instantaneous nature of these channels invites impulsive reactions, which facilitate verbal attacks by supporters and polarization, feeding back into the populist discourse. These tactics threaten democracy and free and fair elections because they deceive voters and silence the opposition, distorting public debate. Ultimately, this form of communication undermines the values that justify the special protection of freedom of expression in the first place. The “truth decay” and “fact polarization” that result from these efforts discredit institutions and consequently foster distrust in democracy. 20

III. Internet, Social Media, and Freedom of Expression 21

The third industrial revolution, also known as the technological or digital revolution, has shaped the world we live in today. 22 Among its main features are the mass adoption of personal computers, the near-universal spread of smartphones and, most importantly, the internet. One of the main byproducts of the digital revolution and the internet was the emergence of social media platforms such as Facebook, Instagram, YouTube, and TikTok, and of messaging applications like WhatsApp and Telegram. We live in a world of apps, algorithms, artificial intelligence, and innovation occurring at breakneck speed, where nothing seems truly new for very long. This is the background for the narrative that follows.

A. The Impact of the Internet

The internet revolutionized the world of interpersonal and social communication, exponentially expanded access to information and knowledge, and created a public sphere where anyone can express ideas, opinions, and disseminate facts. 23 Before the internet, one’s participation in public debate was dependent upon the professional press, 24 which investigated facts, abided by standards of journalistic ethics, 25 and was liable for damages if it knowingly or recklessly published untruthful information. 26 There was a baseline of editorial control and civil liability over the quality and veracity of what was published in this medium. This does not mean that it was a perfect world. The number of media outlets was, and continues to be, limited in quantity and perspectives; journalistic companies have their own interests, and not all of them distinguish fact from opinion with the necessary care. Still, there was some degree of control over what became public, and there were costs to the publication of overtly hateful or false speech.

The internet, with the emergence of websites, personal blogs, and social media, revolutionized this status quo. It created open, online communities for user-generated texts, images, videos, and links, published without editorial control and at no cost. This advanced participation in public discourse, diversified sources, and exponentially increased available information. 27 It gave a voice to minorities, civil society, politicians, public agents, and digital influencers, and it allowed demands for equality and democracy to acquire global dimensions. This represented a powerful contribution to political dynamism, resistance to authoritarianism, and stimulation of creativity, scientific knowledge, and commercial exchanges. 28 Increasingly, the most relevant political, social, and cultural communications take place on the internet’s unofficial channels.

However, the rise of social media also led to an increase in the dissemination of abusive and criminal speech. 29 While these platforms did not create mis- or disinformation, hate speech, or speech that attacks democracy, the ability to publish freely, with no editorial control and little to no accountability, increased the prevalence of these types of speech and facilitated their use as political tools by populist leaders. 30 Additionally, and more fundamentally, platform business models compounded the problem through the algorithms that moderate and distribute online content. 31

B. The Role of Algorithms

The ability to participate and be heard in online public discourse is currently defined by the content moderation algorithms of a handful of major technology companies. Although digital platforms initially presented themselves as neutral media where users could publish freely, they in fact exercise legislative, executive, and judicial functions: they unilaterally define speech rules in their terms and conditions, and their algorithms decide how content is distributed and how these rules are applied. 32

Specifically, digital platforms rely on algorithms for two different functions: recommending content and moderating content. 33 First, a fundamental aspect of the service they offer involves curating the content available to provide each user with a personalized experience and increase time spent online. They resort to deep learning algorithms that monitor every action on the platform, draw from user data, and predict what content will keep a specific user engaged and active based on their prior activity or that of similar users. 34 The transition from a world of information scarcity to a world of information abundance generated fierce competition for user attention—the most valuable resource in the Digital Age. 35 The power to modify a person’s information environment has a direct impact on their behavior and beliefs. Because AI systems can track an individual’s online history, they can tailor specific messages to maximize impact. More importantly, they monitor whether and how the user interacts with the tailored message, using this feedback to influence future content targeting and progressively becoming more effective in shaping behavior. 36 Given that humans engage more with content that is polarizing and provocative, these algorithms elicit powerful emotions, including anger. 37 The power to organize online content therefore directly impacts freedom of expression, pluralism, and democracy. 38
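
To make the engagement-optimization logic concrete, the sketch below ranks a toy feed by predicted engagement. It is a minimal illustration, not any platform’s actual system: the Post structure, the weights, and the “provocativeness” signal are invented stand-ins for the far richer behavioral features real deep learning recommenders use.

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    topic: str
    provocativeness: float  # 0.0-1.0; a stand-in for "emotionally charged" signals

def predicted_engagement(post: Post, topic_affinity: dict) -> float:
    """Toy score: the user's affinity for the topic plus a boost for provocative content."""
    affinity = topic_affinity.get(post.topic, 0.1)
    return 0.7 * affinity + 0.3 * post.provocativeness

def rank_feed(candidates: list, topic_affinity: dict) -> list:
    # Highest predicted engagement first: the optimization target described above.
    return sorted(candidates, key=lambda p: predicted_engagement(p, topic_affinity), reverse=True)

if __name__ == "__main__":
    user_history = {"politics": 0.9, "sports": 0.2}  # hypothetical affinities learned from prior activity
    candidates = [
        Post("a", "sports", 0.1),
        Post("b", "politics", 0.8),  # provocative political post
        Post("c", "politics", 0.2),
    ]
    print([p.post_id for p in rank_feed(candidates, user_history)])  # ['b', 'c', 'a']
```

Even in this toy version, the provocative post takes the top slot whenever topic affinity is comparable, which is the dynamic described above: optimizing for engagement structurally favors content that elicits strong emotions.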

In addition to recommendation systems, platforms rely on algorithms for content moderation, the process of classifying content to determine whether it violates community standards. 39 As mentioned, the growth of social media and its use by people around the world allowed for the spread of lies and criminal acts with little cost and almost no accountability, threatening the stability of even long-standing democracies. Inevitably, digital platforms had to enforce terms and conditions defining the norms of their digital community and moderate speech accordingly. 40 But the potentially infinite amount of content published online means that this control cannot be exercised exclusively by humans.

Content moderation algorithms optimize the scanning of published content to identify violations of community standards or terms of service at scale and apply measures ranging from removal to reducing reach or adding clarifications or references to alternative information. Platforms often rely on two algorithmic models for content moderation. The first is the reproduction detection model, which uses unique identifiers to catch reproductions of content previously labeled as undesired. 41 The second, the predictive model, uses machine learning techniques to identify potential illegalities in new and unclassified content. 42 Machine learning is a subtype of artificial intelligence that extracts patterns from training datasets, learning from data without being explicitly programmed to do so. 43 Although helpful, both models have shortcomings.
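
A minimal sketch of the two models follows, under simplifying assumptions: exact hashing stands in for the fingerprinting used in reproduction detection, and a tiny keyword scorer stands in for a trained predictive classifier. All strings, markers, and thresholds are invented for illustration.

```python
import hashlib

# Reproduction detection: fingerprints of content already labeled as violating.
KNOWN_BAD_HASHES = {
    hashlib.sha256("previously removed hateful text".encode()).hexdigest(),
}

def matches_known_content(text: str) -> bool:
    return hashlib.sha256(text.encode()).hexdigest() in KNOWN_BAD_HASHES

# Predictive model: here a toy keyword scorer stands in for a trained classifier.
TOXIC_MARKERS = {"hate", "hateful", "kill"}

def predicted_violation_score(text: str) -> float:
    tokens = text.lower().split()
    return sum(t in TOXIC_MARKERS for t in tokens) / len(tokens) if tokens else 0.0

def moderate(text: str, threshold: float = 0.2) -> str:
    if matches_known_content(text):
        return "remove (reproduction of known content)"
    if predicted_violation_score(text) >= threshold:
        return "flag for human review (predicted violation)"
    return "allow"

print(moderate("previously removed hateful text"))   # caught by the hash lookup
print(moderate("previously removed hateful text!"))  # one-character edit defeats the hash;
                                                     # only the predictive scorer catches it
```

The last line also previews the shortcomings discussed next: a trivial edit defeats exact matching, so only the (imperfect) predictive layer can catch the altered copy.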

The reproduction detection model is inefficient for content such as hate speech and disinformation, where the potential for new and different publications is virtually unlimited and users can deliberately make changes to avoid detection. 44 The predictive model is still limited in its ability to address situations to which it has not been exposed in training, primarily because it lacks the human ability to understand nuance and to factor in contextual considerations that influence the meaning of speech. 45 Additionally, machine learning algorithms rely on data collected from the real world and may embed prejudices or preconceptions, leading to asymmetrical applications of the filter. 46 And because the training data sets are so large, it can be hard to audit them for these biases. 47

Despite these limitations, algorithms will continue to be a crucial resource in content moderation given the scale of online activity. 48 In the last two months of 2020 alone, Facebook applied a content moderation measure to 105 million publications, and Instagram to 35 million. 49 YouTube, which sees 500 hours of video uploaded per minute, removed more than 9.3 million videos. 50 In the first half of 2020, Twitter analyzed complaints related to 12.4 million accounts for potential violations of its rules and took action against 1.9 million. 51 This data supports the claim that human-only moderation is impossible and that algorithms are a necessary tool to reduce the spread of illicit and harmful content. On the one hand, holding platforms accountable for occasional errors in these systems would create the wrong incentive to abandon algorithmic content moderation altogether, significantly increasing the spread of undesired speech. 52 On the other hand, broad demands for platforms to implement algorithms to optimize content moderation, or laws that impose very short deadlines to respond to removal requests submitted by users, can create excessive pressure to use these imprecise systems on a larger scale. Acknowledging the limitations of this technology is fundamental for precise regulation.

C. Some Undesirable Consequences

One of the most striking impacts of this new informational environment is the exponential increase in the scale of social communications and the circulation of news. Around the world, few newspapers, print publications, and radio stations cross the threshold of having even one million subscribers and listeners. This suggests the majority of these publications have a much smaller audience, possibly in the thousands or tens of thousands of people. 53 Television reaches millions of viewers, although diluted among dozens or hundreds of channels. 54 Facebook, on the other hand, has about 3 billion active users. 55 YouTube has 2.5 billion accounts. 56 WhatsApp, more than 2 billion. 57 The numbers are bewildering. However, and as anticipated, just as the digital revolution democratized access to knowledge, information, and public space, it also introduced negative consequences for democracy that must be addressed. Three of them include:

a) the increased circulation of disinformation, deliberate lying, hate speech, conspiracy theories, attacks on democracy, and inauthentic behavior, made possible by recommendation algorithms that optimize for user engagement and content moderation algorithms that are still incapable of adequately identifying undesirable content;
b) the tribalization of life, with the formation of echo chambers where groups speak only to themselves, reinforcing confirmation bias, 58 making speech progressively more radical, and contributing to polarization and intolerance; and
c) a global crisis in the business model of the professional press. Although social media platforms have become one of the main sources of information, they do not produce their own content. They hire engineers, not reporters, and their interest is engagement, not news. 59 Because advertising spending has migrated away from traditional news publications to technology platforms with broader reach, the press has suffered a loss of revenue that has forced hundreds of major publications, national and local, to close their doors or reduce their journalist workforce. 60 But a free and strong press is more than just a private business; it is a pillar of an open and free society. It serves a public interest in the dissemination of facts, news, opinions, and ideas, indispensable preconditions for the informed exercise of citizenship. Knowledge and truth—never absolute, but sincerely sought—are essential elements for the functioning of a constitutional democracy. Citizens need to share a minimum set of common objective facts from which to inform their own judgments. If they cannot accept the same facts, public debate becomes impossible. Intolerance and violence are byproducts of the inability to communicate—hence the importance of “knowledge institutions,” such as universities, research entities, and the institutional press. The value of a free press for democracy is illustrated by the fact that, in different parts of the world, the press is one of the only private businesses specifically referred to in constitutions. Despite its importance for society and democracy, surveys reveal a concerning decline in its prestige. 61

In the beginning of the digital revolution, there was a belief that the internet should be a free, open, and unregulated space in the interest of protecting access to the platform and promoting freedom of expression. Over time, concerns emerged, and a consensus gradually grew for the need for internet regulation. Multiple approaches for regulating the internet were proposed, including: (a) economic, through antitrust legislation, consumer protection, fair taxation, and copyright rules; (b) privacy, through laws restricting collection of user data without consent, especially for content targeting; and (c) targeting inauthentic behavior, content control, and platform liability rules. 62

Devising the proper balance between the indispensable preservation of freedom of expression on the one hand, and the repression of illegal content on social media on the other, is one of the most complex issues of our generation. Freedom of expression is a fundamental right incorporated into virtually all contemporary constitutions and, in many countries, is considered a preferential freedom. Several reasons have been advanced for granting freedom of expression special protection, including its roles: (a) in the search for the possible truth 63 in an open and plural society, 64 as explored above in discussing the importance of the institutional press; (b) as an essential element for democracy 65 because it allows the free circulation of ideas, information, and opinions that inform public opinion and voting; and (c) as an essential element of human dignity, 66 allowing the expression of an individual’s personality.

The regulation of digital platforms cannot undermine these values; rather, it must aim at their protection and strengthening. In the digital age, however, the same values that historically justified the reinforced protection of freedom of expression can also justify its regulation. As U.N. Secretary-General António Guterres thoughtfully stated, “the ability to cause large-scale disinformation and undermine scientifically established facts is an existential risk to humanity.” 67

Two aspects of the internet business model are particularly problematic for the protection of democracy and free expression. The first is that, although access to most technological platforms and applications is free, users pay for access with their privacy. 68 As Lawrence Lessig observed, we watch television, but the internet watches us. 69 Everything each individual does online is monitored and monetized. Data is the modern gold. 70 Thus, those who pay for the data can more efficiently disseminate their message through targeted ads. As previously mentioned, the power to modify a person’s information environment has a direct impact on behavior and beliefs, especially when messages are tailored to maximize impact on a specific individual. 71

The second aspect is that algorithms are programmed to maximize time spent online. This often leads to the amplification of provocative, radical, and aggressive content. This in turn compromises freedom of expression because, by targeting engagement, algorithms sacrifice the search for truth (with the wide circulation of fake news), democracy (with attacks on institutions and defense of coups and authoritarianism), and human dignity (with offenses, threats, racism, and others). The pursuit of attention and engagement for revenue is not always compatible with the values that underlie the protection of freedom of expression.

IV. A Framework for the Regulation of Social Media

Platform regulation models can be broadly classified into three categories: (a) state or government regulation, through legislation and rules drawing a compulsory, encompassing framework; (b) self-regulation, through rules drafted by platforms themselves and materialized in their terms of use; and (c) regulated self-regulation or coregulation, through standards fixed by the state that grant platforms flexibility in materializing and implementing them. This Essay argues for the third model, with a combination of governmental and private responsibilities. Compliance should be overseen by an independent committee, with a minority of its representatives coming from the government and the majority from the business sector, academia, technology entities, users, and civil society.

The regulatory framework should aim to reduce the asymmetry of information between platforms and users, safeguard the fundamental right to freedom of expression from undue private or state interventions, and protect and strengthen democracy. The current technical limitations of content moderation algorithms explored above and normal substantive disagreement about what content should be considered illegal or harmful suggest that an ideal regulatory model should optimize the balance between the fundamental rights of users and platforms, recognizing that there will always be cases where consensus is unachievable. The focus of regulation should be the development of adequate procedures for content moderation, capable of minimizing errors and legitimizing decisions even when one disagrees with the substantive result. 72 With these premises as background, the proposal for regulation formulated here is divided into three levels: (a) the appropriate intermediary liability model for user-generated content; (b) procedural duties for content moderation; and (c) minimum duties to moderate content that represents concrete threats to democracy and/or freedom of expression itself.

A. Intermediary Liability for User-Generated Content

There are three main regimes for platform liability for third-party content. In strict liability models, platforms are held responsible for all user-generated posts. 73 Since platforms have limited editorial control over what is posted and limited human oversight over the millions of posts made daily, this would be a potentially destructive regime. In knowledge-based liability models, platform liability arises if they do not act to remove content after an extrajudicial request from users—this is also known as a “notice-and-takedown” system. 74 Finally, a third model would make platforms liable for user-generated content only in cases of noncompliance with a court order mandating content removal. This latter model was adopted in Brazil with the Civil Framework for the Internet (Marco Civil da Internet). 75 The only exception in Brazilian legislation to this general rule is revenge porn: if there is a violation of intimacy resulting from the nonconsensual disclosure of images, videos, or other materials containing private nudity or private sexual acts, extrajudicial notification is sufficient to create an obligation for content removal under penalty of liability. 76

In our view, the Brazilian model is the one that most adequately balances the fundamental rights involved. As mentioned, in the most complex cases concerning freedom of expression, people will disagree on the legality of speech. Rules holding platforms accountable for not removing content after mere user notification create incentives for over-removal of any potentially controversial content, excessively restricting users’ freedom of expression. If the state threatens to hold digital platforms accountable whenever it disagrees with their assessment, companies will have an incentive to remove any content that courts could potentially consider illicit, in order to avoid liability. 77

Nonetheless, this liability regime should coexist with a broader regulatory structure imposing principles, limits, and duties on content moderation by digital platforms, both to increase the legitimacy of platforms’ application of their own terms and conditions and to minimize the potentially devastating impacts of illicit or harmful speech.

B. Standards for Proactive Content Moderation

Platforms have free enterprise and freedom of expression rights to set their own rules and decide the kind of environment they want to create, as well as to moderate harmful content that could drive users away. However, because these content moderation algorithms are the new governors of the public sphere, 78 and because they define the ability to participate and be heard in online public discourse, platforms should abide by minimum procedural duties of transparency and auditing, due process, and fairness.

1. Transparency and Auditing

Transparency and auditing measures serve mainly to ensure that platforms are accountable for content moderation decisions and for the impacts of their algorithms. They provide users with greater understanding and knowledge about the extent to which platforms regulate speech, and they provide oversight bodies and researchers with information to understand the threats of digital services and the role of platforms in amplifying or minimizing them.

Driven by demands from civil society, several digital platforms already publish transparency reports. 79 However, the lack of binding standards means that these reports have significant gaps, no independent verification of the information provided, 80 and no standardization across platforms, preventing comparative analysis. 81 In this context, regulatory initiatives that impose minimum requirements and standards are crucial to make oversight more effective. On the other hand, overly broad transparency mandates may force platforms to adopt simpler content moderation rules to reduce costs, which could negatively impact the accuracy of content moderation or the quality of the user experience. 82 A tiered approach to transparency, where certain information is public and certain information is limited to oversight bodies or previously qualified researchers, ensures adequate protection of countervailing interests, such as user privacy and business confidentiality. 83 The Digital Services Act, 84 recently passed in the European Union, contains robust transparency provisions that generally align with these considerations. 85

The information that should be publicly provided includes clear and unambiguous terms of use, the options available to address violations (such as removal, amplification reduction, clarifications, and account suspension), and the division of labor between algorithms and humans. More importantly, public transparency reports should include information on the accuracy of automated moderation measures and the number of content moderation actions broken down by type (such as removal, blocking, and account deletion). 86 There must also be transparency obligations toward researchers, giving them access to crucial information and statistics, including the content underlying content moderation decisions. 87
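
As an illustration of the aggregate figures such a report could carry, the sketch below computes a breakdown of moderation actions by type and a simple accuracy proxy from a hypothetical decision log; the fields and numbers are invented, not any platform’s actual schema.

```python
from collections import Counter

# Hypothetical log of moderation decisions: (measure, decided_by, upheld_on_appeal)
decisions = [
    ("removal", "algorithm", True),
    ("removal", "algorithm", False),          # overturned on appeal
    ("reach_reduction", "human", True),
    ("account_suspension", "algorithm", True),
]

measures_by_type = Counter(measure for measure, _, _ in decisions)
automated = [d for d in decisions if d[1] == "algorithm"]
share_upheld = sum(1 for d in automated if d[2]) / len(automated)

print(dict(measures_by_type))                                        # actions broken down by type
print(f"automated decisions upheld on appeal: {share_upheld:.0%}")   # 67%
```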

Although valuable, transparency requirements are insufficient in promoting accountability because they rely on users and researchers to actively monitor platform conduct and presuppose that they have the power to draw attention to flaws and promote changes. 88 Legally mandated third-party algorithmic auditing is therefore an important complement to ensure that these models satisfy legal, ethical, and safety standards and to elucidate the embedded value tradeoffs, such as between user safety and freedom of expression. 89 As a starting point, algorithm audits should consider matters such as how accurately they perform, any potential bias or discrimination incorporated in the data, and to what extent the internal mechanics are explainable to humans. 90 The Digital Services Act contains a similar proposal. 91
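
The sketch below illustrates two checks an audit of this kind might run on a hypothetical labeled sample: overall accuracy, and the false positive rate broken down by speaker group as a simple signal of asymmetric application. The data and groupings are invented for illustration.

```python
from collections import defaultdict

# Hypothetical labeled audit sample: (predicted_violation, actual_violation, speaker_group)
audit_sample = [
    (True, True, "group_a"), (True, False, "group_a"), (False, False, "group_a"),
    (True, True, "group_b"), (False, False, "group_b"), (False, False, "group_b"),
]

accuracy = sum(pred == truth for pred, truth, _ in audit_sample) / len(audit_sample)

false_positives, negatives = defaultdict(int), defaultdict(int)
for pred, truth, group in audit_sample:
    if not truth:                      # content that did not actually violate the rules
        negatives[group] += 1
        if pred:                       # ...but was flagged anyway
            false_positives[group] += 1

fpr_by_group = {g: false_positives[g] / negatives[g] for g in negatives}
print(f"overall accuracy: {accuracy:.0%}")  # 83%
print(fpr_by_group)                         # {'group_a': 0.5, 'group_b': 0.0} -> asymmetric filter
```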

The market for algorithmic auditing is still emergent and replete with uncertainty. In attempting to navigate this scenario, regulators should: (a) define how often the audits should happen; (b) develop standards and best practices for auditing procedures; (c) mandate specific disclosure obligations so auditors have access to the required data; and (d) define how identified harms should be addressed. 92

2. Due Process and Fairness

To ensure due process, platforms must inform users affected by content moderation decisions of the allegedly violated provision of the terms of use, as well as offer an internal system of appeals against these decisions. Platforms must also create systems that allow for the substantiated denunciation of content or accounts by other users, and notify reporting users of the decision taken.

As for fairness, platforms should ensure that the rules are applied equally to all users. Although it is reasonable to suppose that platforms may adopt different criteria for public persons or information of public interest, these exceptions must be clear in the terms of use. This issue has recently been the subject of controversy between the Facebook Oversight Board and the company. 93

Due to the enormous amount of content published on the platforms and the inevitability of using automated mechanisms for content moderation, platforms should not be held accountable for a violation of these duties in specific cases, but only when the analysis reveals a systemic failure to comply. 94

C. Minimum Duties to Moderate Illicit Content

The regulatory framework should also contain specific obligations to address certain types of especially harmful speech. The following categories are considered by the authors to fall within this group: disinformation, hate speech, anti-democratic attacks, cyberbullying, terrorism, and child pornography. Admittedly, defining and consensually identifying the speech included in these categories—except in the case of child pornography 95 —is a complex and largely subjective task. Precisely for this reason, platforms should be free to define how the concepts will be operationalized, as long as they guide definitions by international human rights parameters and in a transparent manner. This does not mean that all platforms will reach the same definitions nor the same substantive results in concrete cases, but this should not be considered a flaw in the system, since the plurality of rules promotes freedom of expression. The obligation to observe international human rights parameters reduces the discretion of companies, while allowing for the diversity of policies among them. After defining these categories, platforms must establish mechanisms that allow users to report violations.

In addition, platforms should develop mechanisms to address coordinated inauthentic behaviors, which involve the use of automated systems or deceitful means to artificially amplify false or dangerous messages by using bots, fake profiles, trolls, and provocateurs. 96 For example, if a person publishes a post for his twenty followers saying that kerosene oil is good for curing COVID-19, the negative impact of this misinformation is limited. However, if that message is amplified to thousands of users, a greater public health issue arises. Or, in another example, if the false message that an election was rigged reaches millions of people, there is a democratic risk due to the loss of institutional credibility.
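
One simple heuristic for the amplification problem is to flag messages that many distinct accounts post in identical form within a short time window. The sketch below, with invented records and thresholds, illustrates the idea; real detection combines many more signals, such as account age, posting cadence, and network structure.

```python
from collections import defaultdict

def flag_coordinated(posts, min_accounts=3, window_seconds=600):
    """posts: iterable of (account_id, text, unix_timestamp). Returns texts that at least
    `min_accounts` distinct accounts posted within `window_seconds` of the earliest
    occurrence, as a crude proxy for coordinated amplification."""
    by_text = defaultdict(list)
    for account, text, timestamp in posts:
        by_text[text].append((timestamp, account))
    flagged = []
    for text, entries in by_text.items():
        entries.sort()
        first_seen = entries[0][0]
        accounts = {acct for ts, acct in entries if ts - first_seen <= window_seconds}
        if len(accounts) >= min_accounts:
            flagged.append(text)
    return flagged

posts = [
    ("bot_1", "the election was rigged", 0),
    ("bot_2", "the election was rigged", 30),
    ("bot_3", "the election was rigged", 55),
    ("user_9", "nice weather today", 40),
]
print(flag_coordinated(posts))  # ['the election was rigged']
```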

The role of oversight bodies should be to verify that platforms have adopted terms of use that prohibit the sharing of these categories of speech and ensure that, systemically, the recommendation and content moderation systems are trained to moderate this content.

V. Conclusion

The World Wide Web has provided billions of people with access to knowledge, information, and the public space, changing the course of history. However, the misuse of the internet and social media poses serious threats to democracy and fundamental rights. Some degree of regulation has become necessary to confront inauthentic behavior and illegitimate content. It is essential, however, to act with transparency, proportionality, and adequate procedures, so that pluralism, diversity, and freedom of expression are preserved.

In addition to the importance of regulatory action, the responsibility for the preservation of the internet as a healthy public sphere also lies with citizens. Media education and user awareness are fundamental steps for the creation of a free but positive and constructive environment on the internet. Citizens should be conscious that social media can be unfair, perverse, and can violate fundamental rights and basic rules of democracy. They must be attentive not to uncritically pass on all information received. Alongside states, regulators, and tech companies, citizens are also an important force to address these threats. In Jonathan Haidt’s words, “[w]hen our public square is governed by mob dynamics unrestrained by due process, we don’t get justice and inclusion; we get a society that ignores context, proportionality, mercy, and truth.” 97

  • 1 Tim Wu, Is the First Amendment Obsolete? , in The Perilous Public Square 15 (David E. Pozen ed., 2020).
  • 2 Jack M. Balkin, Free Speech is a Triangle , 118 Colum. L. Rev. 2011, 2019 (2018).
  • 3 Luís Roberto Barroso, O Constitucionalismo Democrático ou Neoconstitucionalismo como ideologia vitoriosa do século XX , 4 Revista Publicum 14, 14 (2018).
  • 4 Id. at 16.
  • 7 Ronald Dworkin, Is Democracy Possible Here?: Principles for a New Political Debate xii (2006); Ronald Dworkin, Taking Rights Seriously 181 (1977).
  • 8 Barroso, supra note 3, at 16.
  • 9 Samuel Issacharoff, Fragile Democracies: Contested Power in the Era of Constitutional Courts i (2015).
  • 10 Larry Diamond, Facing up to the Democratic Recession , 26 J. Democracy 141 (2015). Other scholars have referred to the same phenomenon using other terms, such as democratic retrogression, abusive constitutionalism, competitive authoritarianism, illiberal democracy, and autocratic legalism. See, e.g. , Aziz Huq & Tom Ginsburg, How to Lose a Constitutional Democracy , 65 UCLA L. Rev. 91 (2018); David Landau, Abusive Constitutionalism , 47 U.C. Davis L. Rev. 189 (2013); Kim Lane Scheppele, Autocratic Legalism , 85 U. Chi. L. Rev. 545 (2018).
  • 11 Dan Balz, A Year After Jan. 6, Are the Guardrails that Protect Democracy Real or Illusory? , Wash. Post (Jan. 6, 2022), https://perma.cc/633Z-A9AJ; Brexit: Reaction from Around the UK , BBC News (June 24, 2016), https://perma.cc/JHM3-WD7A.
  • 12 Cas Mudde, The Populist Zeitgeist , 39 Gov’t & Opposition 541, 549 (2004).
  • 13 See generally Mohammed Sinan Siyech, An Introduction to Right-Wing Extremism in India , 33 New Eng. J. Pub. Pol’y 1 (2021) (discussing right-wing extremism in India). See also Eviane Leidig, Hindutva as a Variant of Right-Wing Extremism , 54 Patterns of Prejudice 215 (2020) (tracing the history of “Hindutva”—defined as “an ideology that encompasses a wide range of forms, from violent, paramilitary fringe groups, to organizations that advocate the restoration of Hindu ‘culture’, to mainstream political parties”—and finding that it has become mainstream since 2014 under Modi); Ariel Goldstein, Brazil Leads the Third Wave of the Latin American Far Right , Ctr. for Rsch. on Extremism (Mar. 1, 2021), https://perma.cc/4PCT-NLQJ (discussing right-wing extremism in Brazil under Bolsonaro); Seth G. Jones, The Rise of Far-Right Extremism in the United States , Ctr. for Strategic & Int’l Stud. (Nov. 2018), https://perma.cc/983S-JUA7 (discussing right-wing extremism in the U.S. under Trump).
  • 14 Sergio Fausto, O Desafio Democrático [The Democratic Challenge], Piauí (Aug. 2022), https://perma.cc/474A-3849.
  • 15 Jan-Werner Muller, Populism and Constitutionalism , in The Oxford Handbook of Populism 590 (Cristóbal Rovira Kaltwasser et al. eds., 2017).
  • 16 Ming-Sung Kuo, Against Instantaneous Democracy , 17 Int’l J. Const. L. 554, 558–59 (2019); see also Digital Populism , Eur. Ctr. for Populism Stud., https://perma.cc/D7EV-48MV.
  • 17 Luís Roberto Barroso, Technological Revolution, Democratic Recession and Climate Change: The Limits of Law in a Changing World , 18 Int’l J. Const. L. 334, 349 (2020).
  • 18 For the use of social media, see Sven Engesser et al., Populism and Social Media: How Politicians Spread a Fragmented Ideology , 20 Info. Commc’n & Soc’y 1109 (2017). For attacks on the press, see WPFD 2021: Attacks on Press Freedom Growing Bolder Amid Rising Authoritarianism , Int’l Press Inst. (Apr. 30, 2021), https://perma.cc/SGN9-55A8. For attacks on the judiciary, see Michael Dichio & Igor Logvinenko, Authoritarian Populism, Courts and Democratic Erosion , Just Sec. (Feb. 11, 2021), https://perma.cc/WZ6J-YG49.
  • 19 Kuo, supra note 16, at 558–59; see also Digital Populism , supra note 16.
  • 20 Vicki C. Jackson, Knowledge Institutions in Constitutional Democracy: Reflections on “the Press” , 15 J. Media L. 275 (2022).
  • 21 Many of the ideas and information on this topic were collected in Luna van Brussel Barroso, Liberdade de Expressão e Democracia na Era Digital: O impacto das mídias sociais no mundo contemporâneo [Freedom of Expression and Democracy in the Digital Era: The Impact of Social Media in the Contemporary World] (2022), which was recently published in Brazil.
  • 22 The first industrial revolution is marked by the use of steam as a source of energy in the middle of the 18th century. The second started with the use of electricity and the invention of the internal combustion engine at the turn of the 19th to the 20th century. There are already talks of the fourth industrial revolution as a product of the fusion of technologies that blurs the boundaries among the physical, digital, and biological spheres. See generally Klaus Schwab, The Fourth Industrial Revolution (2017).
  • 23 Gregory P. Magarian, The Internet and Social Media , in The Oxford Handbook of Freedom of Speech 350, 351–52 (Adrienne Stone & Frederick Schauer eds., 2021).
  • 24 Wu, supra note 1, at 15.
  • 25 Journalistic ethics include distinguishing fact from opinion, verifying the veracity of what is published, having no self-interest in the matter being reported, listening to the other side, and rectifying mistakes. For an example of an international journalistic ethics charter, see Global Charter of Ethics for Journalists , Int’l Fed’n of Journalists (June 12, 2019), https://perma.cc/7A2C-JD2S.
  • 26 See, e.g. , New York Times Co. v. Sullivan, 376 U.S. 254 (1964).
  • 27 Balkin, supra note 2, at 2018.
  • 28 Magarian, supra note 23, at 351–52.
  • 29 Wu, supra note 1, at 15.
  • 30 Magarian, supra note 23, at 357–60.
  • 31 Niva Elkin-Koren & Maayan Perel, Speech Contestation by Design: Democratizing Speech Governance by AI , 50 Fla. State U. L. Rev. (forthcoming 2023).
  • 32 Thomas E. Kadri & Kate Klonick, Facebook v. Sullivan: Public Figures and Newsworthiness in Online Speech , 93 S. Cal. L. Rev. 37, 94 (2019).
  • 33 Elkin-Koren & Perel, supra note 31.
  • 34 Chris Meserole, How Do Recommender Systems Work on Digital Platforms? , Brookings Inst. (Sept. 21, 2022), https://perma.cc/H53K-SENM.
  • 35 Kris Shaffer, Data versus Democracy: How Big Data Algorithms Shape Opinions and Alter the Course of History xi–xv (2019).
  • 36 See generally Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control (2019).
  • 37 Shaffer, supra note 35, at xi–xv.
  • 38 More recently, with the advance of neuroscience, platforms have sharpened their ability to manipulate and change our emotions, feelings and, consequently, our behavior in accordance not with our own interests, but with theirs (or of those who they sell this service to). Kaveh Waddell, Advertisers Want to Mine Your Brain , Axios (June 4, 2019), https://perma.cc/EU85-85WX. In this context, there is already talk of a new fundamental right to cognitive liberty, mental self-determination, or the right to free will. Id .
  • 39 Content moderation refers to “systems that classify user generated content based on either matching or prediction, leading to a decision and governance outcome (e.g. removal, geoblocking, account takedown).” Robert Gorwa, Reuben Binns & Christian Katzenbach, Algorithmic Content Moderation: Technical and Political Challenges in the Automation of Platform Governance , 7 Big Data & Soc’y 1, 3 (2020).
  • 40 Jack M. Balkin, Free Speech in the Algorithmic Society: Big Data, Private Governance, and New School Speech Regulation , 51 U.C. Davis L. Rev. 1149, 1183 (2018).
  • 41 See Carey Shenkman, Dhanaraj Thakur & Emma Llansó, Do You See What I See? Capabilities and Limits of Automated Multimedia Content Analysis 13–16 (May 2021), https://perma.cc/J9MP-7PQ8.
  • 42 See id. at 17–21.
  • 43 See Michael Wooldridge, A Brief History of Artificial Intelligence: What It Is, Where We Are, and Where We Are Going 63 (2021).

  • 44 Perceptual hashing has been the primary technology utilized to mitigate the spread of CSAM, since the same materials are often repeatedly shared, and databases of offending content are maintained by institutions like the National Center for Missing and Exploited Children (NCMEC) and its international analogue, the International Centre for Missing & Exploited Children (ICMEC).

  • 45 Natural language understanding is undermined by language ambiguity, contextual dependence of words of non-immediate proximity, references, metaphors, and general semantics rules. See Erik J. Larson, The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do 52–55 (2021). Language comprehension in fact requires unlimited common-sense knowledge about the actual world, which humans possess and is impossible to code. Id . A case decided by Facebook’s Oversight Board illustrates the point: the company’s predictive filter for combatting pornography removed images from a breast cancer awareness campaign, a clearly legitimate content not meant to be targeted by the algorithm. See Breast Cancer Symptoms and Nudity , Oversight Bd. (2020), https://perma.cc/U9A5-TTTJ. However, based on prior training, the algorithm removed the publication because it detected pornography and was unable to factor the contextual consideration that this was a legitimate health campaign. Id .
  • 46 See generally Adriano Koshiyama, Emre Kazim & Philip Treleaven, Algorithm Auditing: Managing the Legal, Ethical, and Technological Risks of Artificial Intelligence, Machine Learning, and Associated Algorithms , 55 Computer 40 (2022).
  • 47 Elkin-Koren & Perel, supra note 31.
  • 48 Evelyn Douek, Governing Online Speech: From “Posts-as-Trumps” to Proportionality and Probability , 121 Colum. L. Rev. 759, 791 (2021).
  • 53 See Martha Minow, Saving the Press: Why the Constitution Calls for Government Action to Preserve Freedom of Speech 20 (2021). For example, the best-selling newspaper in the world, The New York Times , ended the year 2022 with around 10 million subscribers across digital and print. Katie Robertson, The New York Times Company Adds 180,000 Digital Subscribers , N.Y. Times (Nov. 2, 2022), https://perma.cc/93PF-TKC5. The Economist magazine had approximately 1.2 million subscribers in 2022. The Economist Group, Annual Report 2022 24 (2022), https://perma.cc/9HQQ-F7W2. Around the world, publications that reach one million subscribers are rare. These Are the Most Popular Paid Subscription News Websites , World Econ. F. (Apr. 29, 2021), https://perma.cc/L2MK-VPNX.
  • 54 Lawrence Lessig, They Don’t Represent Us: Reclaiming Our Democracy 105 (2019).
  • 55 Essential Facebook Statistics and Trends for 2023 , Datareportal (Feb. 19, 2023), https://perma.cc/UH33-JHUQ.
  • 56 YouTube User Statistics 2023 , Glob. Media Insight (Feb. 27, 2023), https://perma.cc/3H4Y-H83V.
  • 57 Brian Dean, WhatsApp 2022 User Statistics: How Many People Use WhatsApp , Backlinko (Jan. 5, 2022), https://perma.cc/S8JX-S7HN.
  • 58 Confirmation bias, the tendency to seek out and favor information that reinforces one’s existing beliefs, presents an obstacle to critical thinking. Sachin Modgil et al., A Confirmation Bias View on Social Media Induced Polarisation During COVID-19 , Info. Sys. Frontiers (Nov. 20, 2021).
  • 59 Minow, supra note 53, at 2.
  • 60 Id. at 3, 11.
  • 61 On the importance of the role of the press as an institution of public interest and its “crucial relationship” with democracy, see id. at 35. On the press as a “knowledge institution,” the idea of “institutional press,” and data on the loss of prestige by newspapers and television stations, see Jackson, supra note 20, at 4–5.
  • 62 See , e.g. , Jack M. Balkin, How to Regulate (and Not Regulate) Social Media , 1 J. Free Speech L. 71, 89–96 (2021).
  • 63 By possible truth we mean that not all claims, opinions and beliefs can be ascertained as true or false. Objective truths are factual and can thus be proven even when controversial—for example, climate change and the effectiveness of vaccines. Subjective truths, on the other hand, derive from individual normative, religious, philosophical, and political views. In a pluralistic world, any conception of freedom of expression must protect individual subjective beliefs.
  • 64 Eugene Volokh, In Defense of the Marketplace of Ideas/Search for Truth as a Theory of Free Speech Protection , 97 Va. L. Rev. 595, 595 (May 2011).
  • 66 Steven J. Heyman, Free Speech and Human Dignity 2 (2008).
  • 67 A Global Dialogue to Guide Regulation Worldwide , UNESCO (Feb. 23, 2023), https://perma.cc/ALK8-HTG3.
  • 68 Can We Fix What’s Wrong with Social Media? , Yale L. Sch. News (Aug. 3, 2022), https://perma.cc/MN58-2EVK.
  • 69 Lessig, supra note 54, at 105.
  • 71 See supra Part III.B.
  • 72 Douek, supra note 48, at 804–13; see also John Bowers & Jonathan Zittrain, Answering Impossible Questions: Content Governance in an Age of Disinformation , Harv. Kennedy Sch. Misinformation Rev. (Jan. 14, 2020), https://perma.cc/R7WW-8MQX.
  • 73 Daphne Keller, Systemic Duties of Care and Intermediary Liability , Ctr. for Internet & Soc’y Blog (May 28, 2020), https://perma.cc/25GU-URGT.
  • 75 Lei No. 12.965, de 23 de abril de 2014, Diário Oficial da União [D.O.U.] de 4.14.2014 (Braz.) art. 19. In order to ensure freedom of expression and prevent censorship, providers of internet applications can only be held civilly liable for damages resulting from content generated by third parties if, after a specific court order, they fail to take steps, within the scope and technical limits of their service and within the indicated time, to make the content identified as infringing unavailable, except as otherwise provided by law. Id.
  • 76 Id. art. 21. The internet application provider that provides content generated by third parties will be held liable for the violation of intimacy resulting from the disclosure, without authorization of its participants, of images, videos, or other materials containing nude scenes or private sexual acts when, upon receipt of notification by the participant or its legal representative, fail to diligently promote, within the scope and technical limits of its service, the unavailability of this content. Id .
  • 77 Balkin, supra note 2, at 2017.
  • 78 Kate Klonick, The New Governors: The People, Rules, and Processes Governing Online Speech , 131 Harv. L. Rev. 1598, 1603 (2018).
  • 79 Transparency Reporting Index, Access Now (July 2021), https://perma.cc/2TSL-2KLD (cataloguing transparency reporting from companies around the world).
  • 80 Hum. Rts. Comm., Rep. of the Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression, ¶¶ 63–66, U.N. Doc A/HRC/32/35 (2016).
  • 81 Paddy Leerssen, The Soap Box as a Black Box: Regulating Transparency in Social Media Recommender Systems , 11 Eur. J. L. & Tech. (2020).
  • 82 Daphne Keller, Some Humility About Transparency , Ctr. for Internet & Soc’y Blog (Mar. 19, 2021), https://perma.cc/4Y85-BATA.
  • 83 Mark MacCarthy, Transparency Requirements for Digital Social Media Platforms: Recommendations for Policy Makers and Industry , Transatlantic Working Grp. (Feb. 12, 2020).
  • 84 2022 O.J. (L 277) 1 [hereinafter DSA].
  • 85 The DSA was approved by the European Parliament on July 5, 2022, and on October 4, 2022, the European Council gave its final acquiescence to the regulation. Digital Services: Landmark Rules Adopted for a Safer, Open Online Environment , Eur. Parliament (July 5, 2022), https://perma.cc/BZP5-V2B2. The DSA increases transparency and accountability of platforms, by providing, for example, for the obligation of “clear information on content moderation or the use of algorithms for recommending content (so-called recommender systems); users will be able to challenge content moderation decisions.” Id .
  • 86 MacCarthy, supra note 83, 19–24.
  • 87 To this end, American legislators recently introduced a bill in the U.S. Congress that proposes a model for conducting research on the impacts of digital communications in a way that protects user privacy. See Platform Accountability and Transparency Act, S. 5339, 117th Congress (2022). The bill mandates that digital platforms share data with researchers previously authorized by the Federal Trade Commission and publicly disclose certain data about content, algorithms, and advertising. Id.
  • 88 Yifat Nahmias & Maayan Perel, The Oversight of Content Moderation by AI: Impact Assessment and Their Limitations , 58 Harv. J. on Legis. 145, 154–57 (2021).
  • 89 Auditing Algorithms: The Existing Landscape, Role of Regulator and Future Outlook , Digit. Regul. Coop. F. (Sept. 23, 2022), https://perma.cc/7N6W-JNCW.
  • 90 See generally Koshiyama et al., supra note 46.
  • 91 In Article 37, the DSA provides that digital platforms of a certain size should be accountable, through annual independent auditing, for compliance with the obligations set forth in the Regulation and with any commitment undertaken pursuant to codes of conduct and crisis protocols.
  • 92 Digit. Regul. Coop. F., supra note 89.
  • 93 In a transparency report published at the end of its first year of operation, the Oversight Board highlighted the inadequacy of the explanations presented by Meta on the operation of a system known as cross-check, which apparently gave some users greater freedom on the platform. In January 2022, Meta explained that the cross-check system grants an additional degree of review to certain content that internal systems mark as violating the platform’s terms of use. Meta submitted a query to the Board on how to improve the functioning of this system and the Board made relevant recommendations. See Oversight Board Published Policy Advisory Opinion on Meta’s Cross-Check Program , Oversight Bd. (Dec. 2022), https://perma.cc/87Z5-L759.
  • 94 Evelyn Douek, Content Moderation as Systems Thinking , 136 Harv. L. Rev. 526, 602–03 (2022).
  • 95 The illicit nature of child pornography is objectively apprehended and does not implicate the same subjective considerations that the other referenced categories entail. Not surprisingly, several databases have been created to facilitate the moderation of this content. See Ofcom, Overview of Perceptual Hashing Technology 14 (Nov. 22, 2022), https://perma.cc/EJ45-B76X (“Several hash databases to support the detection of known CSAM exist, e.g. the National Center for Missing and Exploited Children (NCMEC) hash database, the Internet Watch Foundation (IWF) hash list and the International Child Sexual Exploitation (ICSE) hash database.”).
  • 97 Jonathan Haidt, Why the Past 10 Years of American Life Have Been Uniquely Stupid , Atlantic (Apr. 11, 2022), https://perma.cc/2NXD-32VM.

Hate speech in social media: How platforms can do better

  • Morgan Sherburne

With all of the resources, power and influence they possess, social media platforms could and should do more to detect hate speech, says a University of Michigan researcher.


In a report from the Anti-Defamation League, Libby Hemphill, an associate research professor at U-M’s Institute for Social Research and an ADL Belfer Fellow, explores social media platforms’ shortcomings when it comes to white supremacist speech and how it differs from general or nonextremist speech, and recommends ways to improve automated hate speech identification methods.

“We also sought to determine whether and how white supremacists adapt their speech to avoid detection,” said Hemphill, who is also a professor at U-M’s School of Information. “We found that platforms often miss discussions of conspiracy theories about white genocide and Jewish power and malicious grievances against Jews and people of color. Platforms also let decorous but defamatory speech persist.”

How platforms can do better

White supremacist speech is readily detectable, Hemphill says, detailing the ways it is distinguishable from commonplace speech in social media, including:

  • Frequently referencing racial and ethnic groups using plural noun forms (whites, etc.)
  • Appending “white” to otherwise unmarked terms (e.g., power)
  • Using less profanity than is common in social media to elude detection based on “offensive” language
  • Being congruent on both extremist and mainstream platforms
  • Keeping complaints and messaging consistent from year to year
  • Describing Jews in racial, rather than religious, terms

“Given the identifiable linguistic markers and consistency across platforms, social media companies should be able to recognize white supremacist speech and distinguish it from general, nontoxic speech,” Hemphill said.

The research team used commonly available computing resources, existing algorithms from machine learning and dynamic topic modeling to conduct the study.
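
The sketch below is not the researchers’ actual pipeline, but it illustrates the kind of classifier such commonly available tools (here, scikit-learn) make possible: a bag-of-words model trained to separate posts drawn from an extremist forum from mainstream posts. The example texts are invented and deliberately echo the linguistic markers listed above.

```python
# Requires scikit-learn (pip install scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented examples echoing the markers above (plural racial nouns, "white" appended to
# unmarked terms, racial rather than religious framing); labels mark the source forum.
train_texts = [
    "whites must organize to defend white power",               # extremist-forum style (invented)
    "they describe jews in racial terms and push grievance",    # extremist-forum style (invented)
    "great game last night, what a comeback",                   # mainstream style (invented)
    "anyone have tips for learning python quickly",             # mainstream style (invented)
]
train_labels = [1, 1, 0, 0]  # 1 = extremist source, 0 = mainstream source

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# With shared features like "white power", the toy model leans toward the extremist label.
print(model.predict(["white power rally planned this weekend"]))  # likely [1]
```

As the report’s recommendations suggest, a model like this is only useful if its training data actually come from extremist sites and if profanity is deemphasized as a toxicity signal.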

“We needed data from both extremist and mainstream platforms,” said Hemphill, noting that mainstream user data comes from Reddit and extremist website user data comes from Stormfront.

What should happen next?

Even though the research team found, using commonly available resources, that white supremacist speech is identifiable and consistent, social media platforms, despite their more sophisticated computing capabilities and additional data, still miss a lot and struggle to distinguish nonprofane, hateful speech from profane but innocuous speech.

“Leveraging more specific training datasets, and reducing their emphasis on profanity can improve platforms’ performance,” Hemphill said.

The report recommends that social media platforms: 1) enforce their own rules; 2) use data from extremist sites to create detection models; 3) look for specific linguistic markers; 4) deemphasize profanity in toxicity detection; and 5) train moderators and algorithms to recognize that white supremacists’ conversations are dangerous and hateful.

“Social media platforms can enable social support, political dialogue and productive collective action. But the companies behind them have civic responsibilities to combat abuse and prevent hateful users and groups from harming others,” Hemphill said. “We hope these findings and recommendations help platforms fulfill these responsibilities now and in the future.”

More information:

  • Report: Very Fine People: What Social Media Platforms Miss About White Supremacist Speech
  • Related: Video: ISR Insights Speaker Series: Detecting white supremacist speech on social media
  • Podcast: Data Brunch Live! Extremism in Social Media
