Commuter Traffic: Can Big Data solve the problem?

AI of Things    23 November, 2016
When we sit in our daily traffic jams, many of us may wonder: Where do the other commuters come from? Where are they going? Are we all heading in the same direction? Perhaps the lady sitting in the next car actually lives two doors away and has exactly the same commute as me every day. Here at LUCA, we decided to take a data-driven approach, using our mobile data insights to show the huge potential of carsharing and to demonstrate that we commuters have a lot more in common than you may think.
Smart Steps is part of the LUCA portfolio and enables us to extract actionable mobility insights from our customers' mobile event data. After anonymizing and aggregating the data, we are able to understand the demographic profile of groups of mobile phones (which act as a proxy for groups of people) and to identify their home and work locations.
In this analysis, we decided to use Smart Steps to have a social impact, looking at how our product could contribute to sustainability goals. First of all, we pre-process our anonymized raw data so that useful information is available on a recurring basis. An example of this is the assignment of POIs (points of interest), which can be either “work” or “home”.
These POIs are calculated from mobility patterns. With home and work locations available, we began to think about use cases related to the sharing economy, public transport planning, environmental impact and infrastructure construction. In the end, we decided to focus our analysis on a typical day in Madrid, which, like many cities, suffers tremendous traffic jams at rush hour and has recently been affected by considerable pollution issues.

Deciding how to address such problems is a real challenge for the authorities, so we have developed a simple tool which explores how many people share both their home postcode and their work postcode.
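
The core of such a tool can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records and the postcode labels ("A", "B", "C") are made up, and the function names are placeholders rather than anything from Smart Steps.

```python
from collections import Counter

# Made-up example: one (home_postcode, work_postcode) pair per anonymized
# commuter. The labels "A", "B", "C" are placeholders, not real postcodes.
commutes = [
    ("A", "C"), ("A", "C"), ("A", "B"),
    ("B", "C"), ("B", "C"), ("B", "C"),
]


def count_home_work_pairs(pairs):
    """Count how many commuters share each (home, work) postcode pair."""
    return Counter(pairs)


def shared_commutes(pair_counts, min_people=2):
    """Keep only the pairs shared by at least `min_people` commuters."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_people}


pair_counts = count_home_work_pairs(commutes)
candidates = shared_commutes(pair_counts)

for (home, work), n in sorted(candidates.items()):
    print(f"home {home} -> work {work}: {n} commuters")
```

Any pair that survives the `min_people` filter represents a group of people who could, in principle, share a car.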

Where do we work?

First of all, we extracted a heat map showing the density of workers in every postcode in Madrid. You can check the map below against your own expectations. The darker the colour, the greater the number of workers in that area:
We then dug deeper by looking at the catchment area of the Telefónica Headquarters, establishing where its workers commute from every day. If you find this interesting, our demo video expands on the heat map below:
Figure 4: Where do people commute from to the Telefónica Headquarters postcode?
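
Both views in this section (workers per postcode and the catchment of a single workplace) are simple aggregations of home-to-work counts. The sketch below assumes a table of aggregated flows. The numbers and the labels, including "HQ", are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Made-up aggregated flows: (home_postcode, work_postcode) -> commuters.
# Postcode labels are placeholders, not real Madrid postcodes.
flows = {
    ("A", "HQ"): 120,
    ("B", "HQ"): 80,
    ("C", "HQ"): 40,
    ("A", "B"): 60,
    ("C", "B"): 30,
}


def workers_per_postcode(flows):
    """Total commuters arriving at each work postcode (heat map input)."""
    totals = defaultdict(int)
    for (_home, work), n in flows.items():
        totals[work] += n
    return dict(totals)


def catchment(flows, work_postcode):
    """Home postcodes feeding one work postcode, largest first."""
    origins = [(home, n) for (home, work), n in flows.items()
               if work == work_postcode]
    return sorted(origins, key=lambda item: item[1], reverse=True)


print(workers_per_postcode(flows))
print(catchment(flows, "HQ"))
```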

How do we move around the city?

Fully engaged by this part of the analysis, we decided to build a simple dashboard that shows the home-to-work relationships between all postcodes in Madrid. By doing so, we ended up creating a tool that could help, for example, to find carsharing partners who share both home and work postcodes on a daily basis, providing a unique opportunity for companies and the public sector to encourage this green initiative (we’ll elaborate on this next week in a more specific post on carsharing).
Not satisfied by this, we took the analysis one step further, looking in greater detail at how large groups of people move every day. We grouped the postcodes into four areas: (1) Internal-North, (2) Internal-South, (3) External-North and (4) External-South, where “internal” refers to the area inside the M-30 motorway and “external” refers to the rest of the Autonomous Community of Madrid. Based on these areas, we can see how the masses move. The video below explains how this dashboard works.
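
Rolling postcode-level flows up to the four areas can be sketched as follows. The postcode-to-area lookup, the labels "P1" to "P5" and all the counts are hypothetical, and the real assignment would depend on where each postcode lies relative to the M-30.

```python
from collections import Counter

# Hypothetical lookup from postcode to one of the four areas. In practice
# this would come from the geography of the M-30; these labels are made up.
AREA_OF = {
    "P1": "Internal-North",
    "P2": "Internal-South",
    "P3": "External-North",
    "P4": "External-South",
    "P5": "External-South",
}

# Made-up aggregated flows: (home_postcode, work_postcode) -> commuters.
flows = {
    ("P3", "P1"): 500,
    ("P4", "P1"): 300,
    ("P5", "P1"): 200,
    ("P5", "P2"): 150,
    ("P1", "P1"): 100,
}


def flows_between_areas(flows, area_of):
    """Sum postcode-level flows into (home_area, work_area) totals."""
    totals = Counter()
    for (home, work), n in flows.items():
        totals[(area_of[home], area_of[work])] += n
    return totals


area_flows = flows_between_areas(flows, AREA_OF)
for (home_area, work_area), n in area_flows.most_common():
    print(f"{home_area} -> {work_area}: {n}")
```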

Another great way to understand and visualize how people move in Madrid is to use graph analytics. The commuting dataset contains an aggregated count of the number of people moving from home (postcode A) to work (postcode B). This can be seen as a directed graph in which the nodes are postcodes and each edge is weighted by the number of people commuting between them. We then performed some pre-processing and ran the community detection algorithm in Gephi to find out which groups of postcodes are intrinsically connected (communities). The algorithm produced five groups, or communities, which you can see below:
Figure 5: We used Gephi to detect 5 communities of highly connected postcodes.
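
As a rough illustration of the graph view (not the Gephi workflow itself), the sketch below treats each aggregated home-to-work count as a directed, weighted edge and scores a candidate grouping with directed modularity, a standard quality measure for community partitions. The flows, the labels and the choice of this particular measure are assumptions made for the example.

```python
from collections import defaultdict

# Made-up aggregated flows: (home_postcode, work_postcode) -> commuters.
# Each entry is one directed, weighted edge; labels are placeholders.
flows = {
    ("A", "B"): 10,
    ("B", "A"): 8,
    ("C", "D"): 12,
    ("D", "C"): 9,
    ("B", "C"): 1,
}


def directed_modularity(flows, community_of):
    """Directed, weighted modularity of a partition.

    Q = sum over communities c of  W_c / m  -  (out_c * in_c) / m**2
    where m is the total edge weight, W_c the weight of edges with both
    ends in c, and out_c / in_c the total out- and in-weight of c's nodes.
    """
    m = sum(flows.values())
    inside = defaultdict(float)
    out_w = defaultdict(float)
    in_w = defaultdict(float)
    for (home, work), n in flows.items():
        out_w[community_of[home]] += n
        in_w[community_of[work]] += n
        if community_of[home] == community_of[work]:
            inside[community_of[home]] += n
    communities = set(community_of.values())
    return sum(inside[c] / m - out_w[c] * in_w[c] / m**2
               for c in communities)


two_groups = {"A": 0, "B": 0, "C": 1, "D": 1}
one_group = {"A": 0, "B": 0, "C": 0, "D": 0}

q_two = directed_modularity(flows, two_groups)
q_one = directed_modularity(flows, one_group)
print(q_two, q_one)  # the two-group split scores higher
```

A community detection algorithm searches for the grouping that makes a score of this kind as high as possible. Here the split into {A, B} and {C, D} beats lumping everything together because almost all commuting stays inside each pair.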
Then, we can easily represent nodes and colours (communities) on a map using Spotfire:
As you can see in figure 6 and the second video, there are very clearly defined mobility areas and, in general, people appear to live and work in the same area (North, East, South, Central and West), as the colours are relatively compact. This is beneficial for the quality of our air, as it implies shorter journeys to work. However, the “blue” community (Madrid East, also known as the “Corredor del Henares”) is an exception, as it shows a much sparser pattern than the others.

Another interesting approach is to investigate the “poles of attraction” of each community, that is, the postcodes with the highest number of commuters within their community (the biggest circles in the Gephi graph and map in figure 5), which are really the “busy areas” of each community. This is also demonstrated in our second video.
Of course, our analysis is a relatively simple and straightforward approach to what could be a much more complete tool. There is plenty of fine-tuning to be done, including greater capabilities and more extensive data which could complete the analysis. However, it is a first step in understanding how individuals, companies, NGOs and public administrations could use this data to improve our lives in an anonymized and aggregated way, prioritising security every step of the way.
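
One possible way to compute such poles is sketched below. It assumes that “number of commuters” means arrivals at a work postcode from home postcodes in the same community, which is an interpretation rather than a definition taken from the analysis. The community labels and counts are invented.

```python
from collections import defaultdict

# Made-up inputs; labels are placeholders, not real postcodes or results.
community_of = {"A": "north", "B": "north", "C": "south", "D": "south"}
flows = {  # (home_postcode, work_postcode) -> commuters
    ("A", "B"): 50,
    ("B", "B"): 20,
    ("B", "A"): 30,
    ("C", "D"): 40,
    ("D", "C"): 70,
    ("A", "D"): 500,  # crosses communities, so it is ignored below
}


def poles_of_attraction(flows, community_of):
    """For each community, the postcode receiving most internal commuters.

    Assumption: "number of commuters" means arrivals at a work postcode
    from home postcodes in the same community.
    """
    arrivals = defaultdict(int)
    for (home, work), n in flows.items():
        if community_of[home] == community_of[work]:
            arrivals[work] += n
    poles = {}
    for postcode, n in arrivals.items():
        community = community_of[postcode]
        if community not in poles or n > poles[community][1]:
            poles[community] = (postcode, n)
    return poles


poles = poles_of_attraction(flows, community_of)
print(poles)
```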
After reading this, do you have further ideas on how to apply our data? Let us know by dropping us an email here or commenting on this post. We’d love to hear from you.
By Javier Carro and Pedro de Alarcón, Data Scientists at LUCA.


  1. There is great value in mapping intrinsically connected communities.

    We did similar work with mobile data for the Estonian Ministry of Regional Development back in 2010. The question was whether development should be seen as something to be done within strict administrative borders, or whether people's own activities should determine the functional regions that should be looked at as a whole. This leads to a more focused effort on community investment: making sure that each functional community has enough hospitals, schools, extracurricular activity spaces, even shops.

    It is also important to look at this data separately by age group.
