The Ebb and Flow of Controversial Debates on Social Media

By Kiran Garimella and Michael Mathioudakis

Our recent paper titled ‘The Effect of Collective Attention on Controversial Debates on Social Media’ (arXiv link) won the best student paper award at the 9th ACM Web Science conference held in Troy, New York.

The paper studies the evolution of long-lived controversial debates on Twitter – i.e., discussions on topics such as ‘gun control’ or ‘abortion’ that reveal a split of opinion between people who support different sides of the argument.

The main goal of this work is to study dynamic aspects of controversial debates, in particular: (i) whether controversy around the debates has increased over time; and (ii) whether controversy increases or decreases when major associated events occur.


The dataset consists of a 1% sample of all tweets generated on Twitter between September 2011 and September 2016, as published by Twitter and stored on the Internet Archive (link). For the purposes of the study, we focus on subsets of tweets related to major controversial topics in the USA, including Obamacare, Abortion, and Gun Control.

Measuring Controversy

For each topic in the study, we measure the controversy surrounding the topic for each day spanned by the dataset. To do so, we employ the Random Walk Controversy (RWC) method we developed in earlier work [1]. The RWC score essentially quantifies the degree to which the retweet network of a given topic and day is polarized: the higher the RWC score, the higher the controversy around the topic. For more details on the RWC score, we refer the interested reader to the full paper [1].
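As a rough illustration of the idea, and not the implementation used in the paper, the sketch below estimates a random-walk polarization score by simulation. It assumes that the retweet graph has already been split into two sides, that walks start from a random user on either side and stop at one of the k highest-degree users, and that the score is P_XX * P_YY - P_XY * P_YX, where P_AB is the fraction of walks ending on side B that started on side A. The function name, its parameters, and the toy graph are hypothetical; see [1] for the exact definition.

```python
import random


def rwc_score(adj, side_x, side_y, k=1, walks=1000, max_steps=1000, seed=0):
    """Monte Carlo estimate of a random-walk controversy score.

    adj: dict mapping each user to a list of neighbours (undirected graph).
    side_x, side_y: the two sides of the debate (assumed to be given).
    k: number of highest-degree users per side at which walks stop.
    """
    rng = random.Random(seed)

    def hubs(side):
        # The k highest-degree users on one side act as walk end points.
        return set(sorted(side, key=lambda v: len(adj[v]), reverse=True)[:k])

    hubs_x, hubs_y = hubs(side_x), hubs(side_y)
    # counts[(start_side, end_side)] = number of finished walks
    counts = {(s, e): 0 for s in "XY" for e in "XY"}

    for start_label, side in (("X", sorted(side_x)), ("Y", sorted(side_y))):
        for _ in range(walks):
            node = rng.choice(side)
            for _ in range(max_steps):
                if node in hubs_x:
                    counts[(start_label, "X")] += 1
                    break
                if node in hubs_y:
                    counts[(start_label, "Y")] += 1
                    break
                if not adj[node]:
                    break  # dead end: the walk is discarded
                node = rng.choice(adj[node])

    def p(start, end):
        # Fraction of walks that ended on side `end` and started on `start`.
        total = counts[("X", end)] + counts[("Y", end)]
        return counts[(start, end)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("X", "Y") * p("Y", "X")


# Toy graph (made up): two triangles with no edges between them.
toy_adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
toy_score = rwc_score(toy_adj, {"a", "b", "c"}, {"d", "e", "f"})
```

On this toy graph no walk can cross sides, so the score takes its maximum value of 1.0; a graph in which walks cross sides freely would score close to 0.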

Controversy over Time

Having obtained a controversy score for each topic and day in the dataset, we can now ask whether controversy has increased over the five years covered in the dataset.

The answer to this question is shown in the plot below. The X-axis of the plot spans time at daily granularity, from September 2011 to September 2016; and the Y-axis spans values of the RWC score.


As we see from the figure, even though RWC appears to fluctuate over time, there is no clear trend for increasing or decreasing controversy over time.

Controversy and Collective Attention

Even so, we wish to better understand the fluctuations of controversy over time. Our hypothesis is that the level of controversy around a controversial topic increases or decreases with the collective attention the topic attracts. In plain terms, we hypothesize that, when a controversial debate makes headlines, the level of controversy around it increases. For instance,

To test that hypothesis, we follow two steps.

First, we quantify the collective attention a topic receives on a given day as the number of users who post a tweet about it on that day. As we see in the figure below, this level of attention coincides well with the occurrence of important events related to the topics.
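The attention measure described above can be sketched in a few lines. This is a minimal illustration, assuming the tweets of a topic are available as (day, user id) pairs; the function name and the sample records are made up.

```python
from collections import defaultdict
from datetime import date


def daily_attention(tweets):
    """Return {day: number of distinct users who tweeted on that day}.

    tweets: iterable of (day, user_id) pairs for a single topic.
    """
    users_per_day = defaultdict(set)
    for day, user_id in tweets:
        users_per_day[day].add(user_id)  # a set counts each user once per day
    return {day: len(users) for day, users in users_per_day.items()}


# Made-up records for illustration only.
sample_tweets = [
    (date(2016, 1, 5), "u1"),
    (date(2016, 1, 5), "u1"),  # same user again: counted once
    (date(2016, 1, 5), "u2"),
    (date(2016, 1, 6), "u3"),
]
attention = daily_attention(sample_tweets)
```

Counting distinct users rather than tweets keeps a handful of very active accounts from dominating the measure.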

Second, we juxtapose the RWC score with collective attention, both measured at daily granularity. The results are shown in the figure below. Larger values on the X-axis of the plots correspond to higher levels of collective attention, and larger values on the Y-axis correspond to higher levels of RWC score.


The figures reveal a clear trend: the higher the level of collective attention on a controversial topic, the larger the controversy as measured by the RWC score.
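One simple way to put a number on such a trend is a correlation coefficient between the two daily series. The sketch below computes Pearson's r by hand; it is only an illustration, not necessarily the analysis used in the paper, and the values in it are placeholders rather than results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, one entry per day (not taken from the paper).
attention_per_day = [120, 340, 90, 800, 410]
rwc_per_day = [0.31, 0.42, 0.28, 0.66, 0.47]
r = pearson(attention_per_day, rwc_per_day)
```

A value of r close to 1 would indicate that days with more attention also tend to have higher RWC scores.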

It is important to note that this trend was not observed for non-controversial topics.

Other Measures and Future Work

In addition to the discussion above, the full paper studies the behavior of other network- and content-based measures over time.

With this work, we dived deeper into the study of controversial debates and the complex interactions they encompass. In future work, we plan to study the interplay between controversy and echo chamber phenomena.

Stay tuned!

[1] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. 2016. Quantifying Controversy in Social Media. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM ’16). ACM, New York, NY, USA, 33-42. DOI:




Reducing Controversy by Connecting Opposing Views

Lately, several people have expressed concern about high levels of polarization in society. For example, the World Economic Forum’s report on global risks lists increasing societal polarization as a threat – and others have suggested that social media might be contributing to this phenomenon.

In a recent paper, published at the Tenth International Conference on Web Search and Data Mining (WSDM 2017), we build algorithmic techniques to mitigate the rising polarization by connecting people with opposing views – and evaluate them on Twitter.

In more detail, our approach is to Continue reading

Extracting Skills from Personal Communication Data using StackExchange Dataset

This blog post is a summary of our work published at ACM CIKM. The project is about automatically profiling the skills of users by analyzing their personal communication data. We treated this as a prediction problem: given the messages of a user, predict the skills of that user. As a training set, we made use of the Stack Exchange dataset, which is freely available here. There are many Stack Exchange websites, such as stackoverflow, cs, datascience, physics, history, and so on. This dataset covers a diverse set of skills and will be automatically updated if new technologies come to the fore.

Continue reading

Using Instagram images to monitor public health

Our recent paper on ‘Social media image analysis for public health’ will appear as a short paper at CHI 2016. The question we ask in this paper is whether images uploaded to social media can be used to predict public health variables and lifestyle diseases, such as obesity, diabetes, depression, etc.

Lifestyle diseases are of major concern in the developed world. NYTimes estimates that in addition to costing almost a trillion dollars, lifestyle diseases kill more people than contagious diseases. With the ubiquitous use of social-media platforms in recent years, it has never been easier to collect and analyze the lifestyle choices of large populations. For this reason, social-media data has been used in the past to study or monitor public health. Continue reading

Quantifying controversy on social media

Controversies are everywhere on social media. Studying and understanding the structure and evolution of these controversies is an important area of research. Although previous studies have examined controversy on social media, they are either too domain-specific (e.g., politics) or need prior labeled data.

To address these shortcomings, in our recent WSDM 2016 paper, we designed a fully automatic way to detect ad-hoc controversial issues in the wild, with no prior information or domain knowledge. We represent a topic of discussion with a conversation graph. In this graph, vertices represent people and edges represent conversation activity, such as posts, comments, mentions, or endorsements. Our goal is to examine whether there are distinguishable patterns in the way conversations are shaped during a controversial event.
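To make the graph representation concrete, here is a minimal sketch of how a conversation graph could be built from pairs of interacting users. The function name and the sample pairs are hypothetical, and treating the graph as undirected and unweighted is a simplifying assumption.

```python
from collections import defaultdict


def build_conversation_graph(interactions):
    """Build an undirected graph from (user_a, user_b) interaction pairs.

    Each pair stands for one conversation activity between two users,
    e.g. a mention or an endorsement. Returns {user: sorted neighbour list}.
    """
    neighbours = defaultdict(set)
    for user_a, user_b in interactions:
        if user_a == user_b:
            continue  # ignore self-interactions
        neighbours[user_a].add(user_b)
        neighbours[user_b].add(user_a)
    return {user: sorted(others) for user, others in neighbours.items()}


# Made-up interactions for illustration only.
graph = build_conversation_graph([
    ("alice", "bob"),
    ("bob", "alice"),    # repeated pair: still a single edge
    ("carol", "carol"),  # self-interaction: skipped
    ("bob", "dave"),
])
```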

Continue reading

Scalable facility location for massive graphs on Pregel-like systems

Our paper on designing a distributed algorithm for solving the facility-location problem was accepted at the CIKM 2015 conference, and will be presented in Melbourne next week.

What is the facility-location problem? Facility location is a classic problem, first studied in the field of operations research. In the problem setting, we are given a set of ‘facilities’ and a set of ‘locations’ and the goal is to find a mapping of the locations to the facilities such that a certain objective function is minimized. The objective function models the operating cost of serving the locations with a set of selected facilities, and it includes two terms: a cost term for opening a new facility, and a cost term for serving a location with an open facility. Continue reading
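The two-term objective described above can be sketched as follows. This is a minimal illustration that assumes each location is served by its cheapest open facility, as in the common uncapacitated formulation; the function name, the cost tables, and the numbers are hypothetical.

```python
def facility_location_cost(open_facilities, opening_cost, service_cost, locations):
    """Total cost of serving all locations with a set of open facilities.

    opening_cost: {facility: cost of opening it}
    service_cost: {(location, facility): cost of serving location from facility}
    Assumes every location is served by its cheapest open facility.
    """
    if not open_facilities:
        raise ValueError("at least one facility must be open")
    # First term: cost of opening the selected facilities.
    total_opening = sum(opening_cost[f] for f in open_facilities)
    # Second term: cost of serving each location from an open facility.
    total_service = sum(
        min(service_cost[(loc, f)] for f in open_facilities)
        for loc in locations
    )
    return total_opening + total_service


# Made-up instance: two candidate facilities, two locations.
opening = {"A": 5, "B": 3}
service = {("x", "A"): 1, ("x", "B"): 4, ("y", "A"): 6, ("y", "B"): 2}
cost_both = facility_location_cost({"A", "B"}, opening, service, ["x", "y"])
cost_only_a = facility_location_cost({"A"}, opening, service, ["x", "y"])
```

In this instance, opening both facilities costs 11 in total while opening only A costs 12, which shows the trade-off between the two terms.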