Automation in Online Content Moderation: In Search of Lost Legitimacy and the Risks of Censorship

By Charis Papaevangelou, Jiahong Huang, Lucia Mesquita, and Sara Creta

At a recent workshop, JOLT Early Stage Researchers (ESRs) worked in multi-disciplinary teams to develop ideas for research projects that address a major issue surrounding media and technology. As the European Commission prepares to announce its much-anticipated Digital Services Act, four ESRs review the lack of accountability in content moderation and propose to interrogate platform policies and to develop an alternative, human-rights-based evaluation system.

Online platforms play a significant role in framing public discourse. Acting as infomediaries, they host and curate the exchange of third-party or user-generated content, and their policies determine the visibility of that content. Given the transnational and instantaneous nature of Internet communications, platforms have the capacity to regulate content in the online environment while state power is more limited. In recent years, the major online platforms have adopted various transparency mechanisms to communicate the modus operandi of their content moderation processes and to regain some of their lost public trust and legitimacy. A prominent example is the ‘transparency report’, through which online platforms disclose information to the public. For example, some reports relate to “state-backed information operations” that, according to the platforms, have attempted to manipulate the public conversation.

Platforms have specific policies for content moderation – including removing content or reducing its visibility – as well as a plethora of responsive mechanisms such as warning messages or temporary account suspensions. Platforms operationalise their content moderation practices through a complex set of nebulous rules and opaque procedures, while governments clamour for tighter controls on some material, and members of civil society demand greater freedoms for online expression.

Given the vast amount of data that these platforms have to deal with, it is understandable that they rely on automated technologies to police speech. To do so, the platforms are trying to define patterns of “coordinated and inauthentic behaviour” in order to single out users who exhibit unusual behaviour and may be bots or bad actors spreading misleading or inflammatory content. However, these algorithms are context-blind, leading to decisions that may cause collateral damage to freedom of expression. The Internet has become dangerously centralised and platforms will, more often than not, opt to over-remove and censor content rather than analyse it in detail, because of time and state pressure. The time pressure stems from the virality effect; the state pressure stems from the fear of stricter regulation, which could affect not only their content moderation procedures but also their business model.

Moreover, while internet companies are attempting to regulate speech at a global scale, online platforms are, directly or indirectly, politically pressured to take action against ‘problem’ content. This pressure undoubtedly informs their content moderation decisions: complying with requests from governments and law enforcement can have unfair or discriminatory effects on different groups, restrict free speech, and target political dissidents. Under these circumstances, it is important to consider key ethical and political issues including transparency in decision-making, the definition of just rules and redress policies, and the [re]politicisation of key concepts including freedom of expression.

Currently, only Twitter has made unhashed datasets on state-backed information operations publicly available. In contrast, other major platforms have shared their data only with a few privileged third-party researchers, including the Stanford Internet Observatory and the Atlantic Council’s Digital Forensic Research Lab. These datasets therefore remain largely inaccessible to independent researchers, which means public interest scrutiny is a scarce resource. In recent years, platforms have further restricted access to their public Application Programming Interfaces (APIs), making it nearly impossible to hold companies accountable for illegal or unethical behaviour.

Holding platforms accountable

Reviewing these issues during a recent JOLT workshop, we considered ways to challenge the platforms’ assumptions regarding networks of disinformation. We believe that the transparency reports reveal only a narrow perspective on content moderation and one that reflects the platforms’ biases as well as state pressure. In other words, we need to consider whether content-takedown decisions are made on a fair and transparent basis or whether they are weaponised by governments to suppress political dissent. State pressure on social media platforms, direct or indirect, to engage in ex-ante automated content moderation has surely pushed platforms to remove more content than they otherwise would, to avoid being penalised. Notably, the risk of “over-censoring” was one of the reasons that the French Constitutional Council struck down most of the “law against hateful content online” (broadly known as “Loi Avia”), which was proposed by the current French administration.

By delving deeper into the platforms’ available datasets, we propose to reassess the ‘big data divide’ in order to: (i) discern the patterns that constitute “coordinated inauthentic” and “state-linked behaviour”, so as to interrogate platforms’ algorithmic accountability in moderating content, and (ii) develop our own evaluation system, which could be applied to future datasets to identify similar networks and patterns. To do so, we aim to develop a machine-learning algorithm, which will be trained using our own formulated metrics, building on existing datasets. By critically exploring the human rights dimension of the digital platform debate, we aim to suggest novel solutions for the new roles assumed by online platforms, identifying key challenges in the area of intermediary liability and interrogating algorithmic accountability.
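To make the idea of a "formulated metric" concrete, here is a minimal sketch in Python of one possible coordination signal: how often two accounts post the same text within a short time window. It is purely illustrative and is not the evaluation system proposed above, nor any platform's actual detection method. The function name `co_posting_pairs`, the `(account, text, timestamp)` record format, and the 60-second window are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def co_posting_pairs(posts, window_seconds=60):
    """Count, per pair of accounts, how often both posted the same text
    within `window_seconds` of each other.

    `posts` is an iterable of (account, text, timestamp_in_seconds) tuples.
    This record format is a hypothetical one chosen for illustration.
    """
    # Group posts by normalised text (case-insensitive, whitespace-trimmed).
    by_text = defaultdict(list)
    for account, text, timestamp in posts:
        by_text[text.strip().lower()].append((timestamp, account))

    pair_counts = defaultdict(int)
    for entries in by_text.values():
        # Compare every pair of posts that share the same text.
        for (t1, a1), (t2, a2) in combinations(entries, 2):
            if a1 != a2 and abs(t2 - t1) <= window_seconds:
                pair_counts[tuple(sorted((a1, a2)))] += 1
    return dict(pair_counts)


# Toy data, invented for this example.
posts = [
    ("acct_a", "Breaking news!", 0),
    ("acct_b", "breaking news!", 30),
    ("acct_c", "Breaking news!", 500),
    ("acct_a", "Hello world", 1000),
    ("acct_b", "hello world", 1010),
    ("acct_d", "Something else", 20),
]
scores = co_posting_pairs(posts)
print(scores)
```

A signal like this is exactly as context-blind as the automated systems discussed earlier: it would flag a grassroots campaign sharing a slogan just as readily as a bot network. Any metric of this kind would therefore need to be one input among several and be reviewed against human rights criteria, rather than used as a decision rule on its own.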

With large parts of our media and communication infrastructure governed by algorithmic systems, we need better tools to understand how these systems are impacting our democracies. While there is no single silver bullet to address all the challenges linked to the platform economy, we are convinced that the proposals outlined above serve as a critical baseline to demand improved accountability in the digital public sphere. The embedding of content moderation – along with its automation – in public policy and law enforcement seems likely to increase, as public discourse continues to play out largely on private online platforms. There is a spectrum of responses that States can take to ensure appropriate protection of fundamental rights, ranging from “command and control” regulation to secondary liability regimes, co-regulation, and ultimately self-regulation. Within this spectrum, we contend that the encouragement of platform responsibility through commitment and transparency mechanisms constitutes the least intrusive type of regulatory intervention.


This article was originally published on jolt.eu. JOLT is a Marie Skłodowska-Curie European Training Network, which aims to harness digital and data technologies for journalism by providing a framework for the training and career development of 15 Early Stage Researchers. On Twitter @JOLT_EU.

Charis Papaevangelou (Université Toulouse III – Paul Sabatier), Jiahong Huang (University of Amsterdam), Lucia Mesquita and Sara Creta (Dublin City University) are Early Stage Researchers with the JOLT project.