Crowdsourcing social circle discovery on Twitter

In a short paper accepted to the 9th International Conference on Web and Social Media (ICWSM 2015), we explored how lists already created on Twitter can be used to organize content for new users. A list is a way of organizing contacts and content on Twitter; for example, a user might create a list “Data Science” that includes the accounts of prominent data scientists such as Andrew Ng and Hilary Mason. The functionality is available to all users, who can create a list by choosing a title and selecting members from the pool of all Twitter users. Once a list is created, it can be used to selectively view the tweets of its members. Rather than making every user start from scratch when organizing their friends, we wanted to find a way to use the lists already created by other users to recommend groupings automatically.

Frequently asked questions about malware

In the post-Snowden era, computer security and privacy are a growing concern for Internet users. At the same time, the Internet of Things (IoT) is emerging, in which more and more devices become interconnected. Still, most users have little knowledge of how to protect themselves online.

Before returning to grad school, I had the privilege of working for a few years in the labs of F-Secure, one of the top 3 data-security companies fighting malware (malicious software). Collaborating with some of the world’s top experts in the field was certainly very exciting. In this post, I attempt to answer some very common and basic questions about computer malware. The following list of questions is by no means exhaustive; it only aims to get across a few basic and necessary facts.

How does Shakespeare compare against modern rap artists?

A couple of months ago, Eric Malmi wrote about his work on Raplyzer, a method for analyzing Finnish rap lyrics. With the use of a speech synthesizer, Eric has now extended the method to English rap lyrics. Using the new version of the analyzer, he ranked 94 rap artists based on their rhyme factor, and even threw Shakespeare into the mix. He describes the results in a new blog post.

Additionally, if you are looking for more action, you may want to battle rap against BattleBot.

Finding similar neighborhoods across cities

By Geraud Le Falher, Michael Mathioudakis, and Aris Gionis

Our friend Oliver lives in London, where he works as a consultant for a big financial company. Occasionally, he takes a business trip to another major city, to seal a major deal and make major bucks for his boss. Of course, Oliver being Oliver, he always finds the time to enjoy whatever city he happens to be in. When in London, he likes to suit up and hang out in Soho, a “predominantly fashionable district of upmarket restaurants.” He would like to do the same in Rome, where he’s flying next week, but he doesn’t know much about that city. Where is the Soho of Rome? What neighborhood of Rome is most similar to Soho?

Overlapping community detection in labeled graphs

In a recent paper with Esther Galbrun and Nikolaj Tatti, published in the journal Data Mining and Knowledge Discovery, we worked on the problem of discovering overlapping communities in networks with labeled vertices. The model is motivated by social networks, where vertex labels represent information about individuals, such as occupation, hobbies, and preferences. The hypothesis is that the vertex labels can be used to derive and explain the community structure in the network.

From “I love you babe” to “leave me alone” - romantic relationship breakups on Twitter

Our recent paper (together with Ingmar Weber and Sonia Dal Cin) studying romantic relationship breakups on Twitter was accepted at the 6th International Conference on Social Informatics (SocInfo 2014). In the paper, we identified pairs of Twitter users who were in a romantic relationship and studied various psychological processes surrounding relationship dissolution.

How graph algorithms can help to find interesting events

In our recent KDD paper, with Polina Rozenshtein, Aris Anagnostopoulos, and Nikolaj Tatti, we worked on the problem of finding events in graphs. We abstracted the event-finding problem with the following simple formulation: given a graph with node weights and edge distances, find a subset of nodes (the event) that has a large sum of weights and is well connected. In the paper we showed how to use this formulation to find interesting events in real-world datasets.
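
To make the formulation a bit more concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: the scoring function (total node weight minus `lam` times the sum of pairwise distances), the trade-off parameter `lam`, the toy data, and the names `event_score` and `best_event` are all assumptions made for illustration. The search is brute force, so it only works on tiny graphs.

```python
from itertools import combinations


def event_score(nodes, weights, dist, lam=1.0):
    """Assumed score: total node weight minus lam times the sum of pairwise distances."""
    total_weight = sum(weights[v] for v in nodes)
    total_distance = sum(dist[frozenset(pair)] for pair in combinations(nodes, 2))
    return total_weight - lam * total_distance


def best_event(weights, dist, lam=1.0):
    """Try every non-empty subset of nodes and keep the highest-scoring one."""
    nodes = sorted(weights)
    best_subset, best_value = None, float("-inf")
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            value = event_score(subset, weights, dist, lam)
            if value > best_value:
                best_subset, best_value = set(subset), value
    return best_subset, best_value


# Toy graph (hypothetical data): a, b, c are heavy and close together; d is light and far away.
weights = {"a": 5.0, "b": 4.0, "c": 4.0, "d": 1.0}
dist = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 1.0,
    frozenset(("b", "c")): 1.0,
    frozenset(("a", "d")): 6.0,
    frozenset(("b", "d")): 6.0,
    frozenset(("c", "d")): 6.0,
}

event, value = best_event(weights, dist, lam=1.0)
print(sorted(event), value)
```

On this toy input the heavy, tightly connected nodes a, b, and c are selected, while the distant, low-weight node d is left out. Raising `lam` penalizes spread-out subsets more strongly and so favors smaller, more compact events.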