The Future of Detecting Extreme-right Sentiment Online

By Tiana Gaudette, Ryan Scrivens, and Garth Davies

Since the advent of the Internet, far-right extremists – amongst other extremist movements – from across the globe have exploited online resources to build a transnational ‘virtual community’. The Internet is a fundamental medium that facilitates these radical communities, not only on ‘traditional’ hate sites such as Stormfront, but also on widely used social media platforms such as Twitter, Facebook, YouTube, and Reddit. Researchers and practitioners have attempted to identify and monitor extreme-right content online but have been overwhelmed by the sheer volume of data in these growing spaces; simply put, the manual analysis of online content has become increasingly less feasible.

As a result, researchers and practitioners have sought to develop different methods of managing this ‘big data’ phenomenon to sift through and detect extremist content. A relatively novel machine learning tool, sentiment analysis, has sparked the interest of some researchers in the field of terrorism and extremism studies who are faced with new challenges in detecting the spread of far-right extremism online. Though this area of research is in its infancy, sentiment analysis is showing signs of success and may represent the future of how researchers and practitioners study extremism online – particularly on a large scale.

Sentiment analysis and right-wing extremism online

Sentiment analysis, also known as ‘opinion mining’, has become increasingly popular in terrorism and extremism studies because, as the amount of ‘opinionated data’ online grows exponentially, sentiment analysis software offers a wide range of applications that can address previously untapped and challenging research problems. This software can identify and classify opinions found in a piece of text through a two-step process that produces a ‘polarity value’:

  1. A body of text is split into sections (sentences) to determine subjective and objective content, and
  2. Subjective content is classified by the software as being either positive, neutral, or negative.
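The two-step process above can be sketched as a minimal lexicon-based classifier. This is an illustrative toy, not the internals of any particular sentiment analysis program: the tiny word lists and the simple counting rule are assumptions made for the example, whereas real tools ship large, weighted lexicons or trained models.

```python
import re

# Illustrative mini-lexicons (an assumption for this sketch);
# real sentiment tools use far larger, weighted word lists.
POSITIVE = {"good", "great", "support", "love", "welcome"}
NEGATIVE = {"bad", "hate", "threat", "destroy", "attack"}

def split_sentences(text):
    """Step 1: split a body of text into sections (sentences)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def polarity(sentence):
    """Step 2: classify content as positive, neutral, or negative."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no subjective cues found, or they cancel out

def classify(text):
    """Produce a 'polarity value' for each sentence in a text."""
    return [(s, polarity(s)) for s in split_sentences(text)]
```

For example, `classify("I love this forum. They are a threat.")` labels the first sentence positive and the second negative, which is the kind of sentence-level polarity output the studies below build on.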

A number of researchers who study how terrorists and extremists use the Internet have turned to sentiment analysis to find content of interest on a large scale, but a large proportion of this work has focused on Jihadists rather than the radical right.

Scholars, for example, have used sentiment analysis to detect radical content and users on Islamic-based discussion forums and evaluate the communications found on Jihadi forums, as well as measure the evolution of the sentiment found in Dabiq through text analysis of the content in the magazine.

Other researchers have used sentiment analysis to study people’s reactions to a terrorist attack on Twitter, examine the opinions – and geolocations – found on Twitter accounts using hashtags associated with the so-called “Islamic State” (IS), and identify extremist content on Twitter using hashtags associated with IS. Other scholars have combined sentiment analysis with social network analysis to identify users on YouTube who may have a Jihadi radicalizing agenda.

In terms of the radical right, sentiment analysis has been combined with other techniques to analyze the discourse on hate forums and evaluate how cyberhate develops on Twitter following a terrorist attack. Others have drawn on sentiment analysis tools to differentiate large volumes of content found in extreme right-wing sites and forums from the content found on news sites and counter-extremist sites.

The future of sentiment analysis and right-wing extremism online

Drawing from the recommendations of previous studies, combined with our own experience, we suggest that the future of sentiment analysis in detecting and measuring right-wing extremist content online should: (1) consider a combination of analyses or additional features to increase classifiers’ effectiveness, and (2) continue to validate machine learning tools.

First, research has shown that combining machine learning with other techniques and/or semantic-oriented approaches improves the detection of extremist content – particularly right-wing extremist content. For example, affect analysis, in combination with sentiment analysis, may be helpful in identifying the intensity levels associated with a broad range of emotions in text found in online spaces of the extreme-right. Not only that, but the effectiveness of the classifiers in detecting extremist content is significantly boosted with additional feature sets – for example, syntactic, stylistic, content-specific, and lexicon features. In addition, combining sentiment analysis with temporal analyses (to note but one example) could be used to measure radical online discourse over time and identify significant spikes that coincide with real-world events.
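As a concrete illustration of pairing sentiment output with a temporal analysis, the sketch below counts negative-sentiment posts per day and flags days whose counts spike well above the baseline. The two-standard-deviation cutoff and the input format are assumptions made for this example, not a standard from the literature.

```python
from collections import Counter
from statistics import mean, stdev

def flag_spikes(posts, threshold=2.0):
    """Flag days whose negative-post count exceeds mean + threshold * stdev.

    `posts` is a list of (date_string, polarity) pairs; the 2-sigma
    cutoff is an illustrative choice. Flagged dates could then be
    checked against real-world events.
    """
    daily = Counter(day for day, pol in posts if pol == "negative")
    if len(daily) < 2:
        return []  # not enough days to estimate a baseline
    counts = list(daily.values())
    cutoff = mean(counts) + threshold * stdev(counts)
    return sorted(day for day, n in daily.items() if n > cutoff)
```

Run over a forum archive, a sudden jump in negative posts on one date would stand out against weeks of steady background chatter, giving researchers a shortlist of dates to investigate.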

Second, a common thread that binds the aforementioned studies is the need to assess and potentially improve the classification accuracy and content identification offered by sentiment analysis. Researchers, for example, have proposed that future work include a ‘comparative human evaluation’ component to validate a sentiment program’s classifications. This technique would have humans rate the opinions in a set of sentences and compare the results to those produced by a sentiment analysis program. By comparing how humans classify a piece of text with how a sentiment analysis program classifies it, researchers can gain better insight into the accuracy of the program’s classifications. Also, it is not yet clear which sentiment analysis program is the most accurate or effective overall in detecting extreme-right content online. Future work should continue to explore and test the wide variety of programs currently available to determine if there is indeed one ‘superior’ method, or if the appropriate methodology is context-specific.
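A comparative human evaluation of this kind reduces to scoring how often the program agrees with human raters. The sketch below computes simple percent agreement alongside Cohen’s kappa (a standard chance-corrected agreement measure); the example labels are invented for illustration.

```python
from collections import Counter

def agreement(human, program):
    """Fraction of texts where the program matches the human rating."""
    if len(human) != len(program):
        raise ValueError("label lists must be the same length")
    return sum(h == p for h, p in zip(human, program)) / len(human)

def cohen_kappa(human, program):
    """Chance-corrected agreement between human and program labels."""
    n = len(human)
    po = agreement(human, program)          # observed agreement
    ch, cp = Counter(human), Counter(program)
    pe = sum(ch[k] * cp[k] for k in ch) / (n * n)  # agreement by chance
    return 1.0 if pe == 1 else (po - pe) / (1 - pe)
```

Reporting both numbers matters: raw agreement can look high simply because one label dominates the data, which is exactly the situation kappa corrects for.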

Research on detecting far-right extremist content will undoubtedly grow. Far-right extremists have proven to be at least as likely as jihadists to learn and communicate on the Web. At the same time, a number of political events, including the pronouncements and policies of the Trump administration in the U.S. and the immigration and refugee crises being experienced across the West, are providing fuel for far-right discourse. This discourse, in turn, creates facilitating contexts for far-right violence, rendering more efficient means of identifying such content a matter of increasing urgency.

Tiana Gaudette is an MA student in the School of Criminology at Simon Fraser University.

Ryan Scrivens is a Horizon Postdoctoral Research Fellow at Concordia University and a VOX-Pol visiting researcher. Follow him on Twitter: @r_scrivens.

Garth Davies is an associate professor in the School of Criminology at Simon Fraser University and the Co-director of the Terrorism, Risk, and Security Studies Professional Master’s Program at SFU.