Several people have expressed their concern, lately, about high levels of polarization in society. For example, the World Economic Forum’s report on global risks lists the increasing societal polarization as a threat – and others have suggested that social media might be contributing to this phenomenon.
In a recent paper, published at the Tenth International Conference on Web Search and Data Mining (WSDM 2017), we build algorithmic techniques to mitigate the rising polarization by connecting people with opposing views – and evaluate them on Twitter.
In more detail, our approach is to model user interactions around a given topic (e.g., Obamacare or US elections) on Twitter as an endorsement graph. Nodes of the graph represent Twitter users. An edge between two nodes, say A and B, indicates that one user has retweeted the other (i.e., user A has reposted a tweet generated by B on the topic). One commonly observed feature of such graphs is that, for controversial topics, the structure of the graph is strongly biclustered – with each cluster corresponding to users who make up one of two sides of a controversy (e.g., in favor of and against Obamacare, or Democrat and Republican — see figure below). The main task addressed in the paper is to suggest edges between users of opposing sides, so as to effectively minimize the polarization represented in the endorsement graph. Such edges can then be used, for example, to provide users with recommendations about who to retweet.
While there exist studies that aim to connect opposing viewpoints, they are mostly limited to small-scale surveys or manually-curated recommendations. We are the first to propose a thoroughly algorithmic solution, which can be applied on a large-scale and is language- and domain-independent.
At a more technical level of detail, we make use of the Random Walk Controversy (RWC) score defined in our earlier work that appeared at WSDM’16. The objective is to find the k best edges to add to a polarized network so as to reduce the RWC score the most.
Our main algorithm is based on the finding that for a special type of network simulating a polarized network, the best edges we can add to the network are between the nodes with the highest degrees on either side.
Since we are dealing with retweet networks, high-degree users usually are the ones who are well known and have many followers. For instance, in the case of US politics, the two sides would be the democrats (left) and republicans (right) and the highest degree users would be Obama and Trump on either side, respectively. It is not practical to recommend that Obama follow Trump, so even though in theory these are the best edges, they might not materialize in the real world.
To overcome this fact, we also include a factor that considers how likely a user is to accept a recommendation into our model. We compute this as the probability that a user will endorse a recommendation, based on the data. After we take this factor into account, the two recommendations produced are more meaningful and likely to materialize, for instance, as shown in the table below (for two of our datasets, obamacare and guncontrol). Vertex 1 and vertex2 are the two ends of the edge added, so we are recommending vertex 2 to vertex 1. The top shows the recommendations with out taking into account the acceptance probability.
Since we are adding k edges, we need to recompute the RWC score after each edge is added. Computing this score is a costly operation, especially for large graphs. We propose a method to compute the incremental score based on the fact that the change in graph only depends on one vertex, using the Sherman Morrison formula.
We compare our approach with other existing approaches on edge addition and show that our approach performs better than existing ones, both in terms of bringing the two sides closer and the time taken.
For more details, please refer to the paper here.