Who Let the DAGs Out?

Mining for Meaning

We did – in our paper titled “Beyond rankings: comparing directed acyclic graphs” (pdf) which I’ll be presenting at the ECML PKDD conference in Portugal next month. This was the first project of my PhD, but there’s also something else that makes it fundamentally different from the other research projects I’ve been involved with.

Typically, when I undertake a research project, I have a concrete question, like what is the next location a person will visit, for which I start looking for different solutions. In other words, I begin with a nail and start looking for a suitable hammer. However, this time we started by developing a cool new hammer with some neat theoretical properties before we had any idea whether a suitable nail even existed.

Apartment prices in Helsinki relate to accessibility by public transport

This post is shared with I.Ž. research blog.

Traditionally, apartment prices are considered to depend on an apartment's characteristics and its location. We hypothesized that the accessibility of a neighbourhood may be even more important than its location, so we ran a pilot study in the Helsinki region to check.

First we define static and dynamic points of interest in the city. Static points of interest are meant to capture community centers; we find them by locating H&M stores in the Helsinki region. Dynamic points of interest are meant to capture where people go at different times of day; we find them by clustering Foursquare check-ins.
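As a rough illustration of the clustering step, here is a minimal sketch in Python. The excerpt does not say which clustering algorithm was used, so plain k-means is an assumption here, as are the function name `kmeans`, the choice of `k`, and the check-in coordinates, which are made up. Distances are computed directly on latitude and longitude, which is only a crude approximation over a small area.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster (lat, lon) points into k groups with plain Lloyd's k-means.

    Hypothetical helper for illustration; not the method from the study.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assign every check-in to its nearest current center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                lat = sum(p[0] for p in group) / len(group)
                lon = sum(p[1] for p in group) / len(group)
                new_centers.append((lat, lon))
            else:
                new_centers.append(center)
        if new_centers == centers:
            break  # assignments are stable, so the centers have converged
        centers = new_centers
    return centers


# Made-up check-in coordinates (not real data): two loose groups.
checkins = [
    (60.170, 24.940), (60.171, 24.941), (60.169, 24.939),
    (60.205, 24.655), (60.206, 24.657), (60.204, 24.656),
]
dynamic_pois = sorted(kmeans(checkins, k=2))
```

Each returned center would then serve as one candidate dynamic point of interest. Capturing different times of day, as described above, would presumably mean running the clustering separately on check-ins from each time window.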

Crowdsourcing social circle discovery on Twitter

In a short paper accepted to the 9th International Conference on Web and Social Media (ICWSM 2015), we explored how lists already created on Twitter can be used to organize content for new users. A list is a way of organizing contacts and content on Twitter; for example, a user might create a list “Data Science” to include the accounts of prominent data scientists like Andrew Ng and Hilary Mason. The functionality is available to all users, who can create a list by coming up with a title and selecting list members from the pool of all Twitter users. Once a list is created, it can be used to selectively view the tweets of its members. Rather than making every user start from scratch when organizing their friends, we wanted to find a way to use the lists already created by other users to recommend groupings automatically.

Frequently asked questions about malware

In the post-Snowden era, computer security and privacy are a growing concern for Internet users. At the same time, the Internet of Things (IoT) is emerging, in which more and more devices become interconnected. Still, most users have little knowledge of how they could protect themselves online.

Before returning to grad school, I had the privilege of working for a few years in the labs of F-Secure, one of the top three data-security companies in the fight against malware (malicious software). Collaborating with some of the world’s top experts in the field was certainly very exciting. In this post, I attempt to answer some very common and basic questions regarding computer malware. The following list of questions is by no means exhaustive; it only aims to get across a few basic and necessary facts.

How does Shakespeare compare against modern rap artists?

A couple of months ago Eric Malmi wrote about his work on Raplyzer, a method for analyzing Finnish rap lyrics. With the use of a speech synthesizer, Eric has now extended the method to English rap lyrics. Using the new version of the analyzer, he ranked 94 rap artists based on their rhyme factor, and even threw Shakespeare into the mix. He describes the results in a new blog post.

Additionally, if you are looking for more action, you may want to battle rap against BattleBot.