#StayHome is the call from governments, companies and individuals for all citizens to help end the pandemic. However, since not everyone stays at home all the time, many questions arise: Do social distancing measures increase or decrease population flows between territories? Are some areas more crowded, or receiving more inflow, than others? How does the concentration of population in each area compare with its health capacity?
In order to answer these questions, the Spanish Government has launched a study, called DataCOVID-19, to measure the daily variation of population flows at the municipal or provincial level during the health crisis. The power of Big Data is thus placed at the service of public health:
To analyse large volumes of information and extract useful conclusions, thus gaining in efficiency in evidence-based decision-making that is better coordinated and adapted to each region.
The Government Doesn’t Want to Track You Down and Fine You if You Don’t Comply with the Confinement
For many years, companies have been using Big Data and Business Intelligence for data-driven decision-making. For example, LUCA (Telefónica’s Data Unit) has developed the Smart Steps platform. This platform analyses anonymised data on access to its network and generates aggregated insights on the overall trends of groups of people, thus helping organisations to optimise their value proposition.
It is therefore not surprising, and indeed to be welcomed, that the Spanish State has also decided to use these powerful tools to make effective data-based decisions, without any risk to our freedom. This advanced use of data was proposed by the Government on 27 March in Orden SND/297/2020:
To entrust the Secretary of State for Digitalisation and Artificial Intelligence of the Ministry of Economic Affairs and Digital Transformation, following the model undertaken by the National Institute of Statistics in its mobility study and through data crossing of mobile operators, in an aggregated and anonymised way, the analysis of the mobility of people in the previous days and during the confinement.
However, some people interpreted this paragraph as individual, personalised tracking of each and every citizen, to monitor whether or not they were complying with the confinement and to fine them accordingly. As expected, hoaxes flooded social networks with Orwellian messages: loss of freedom, violation of the right to privacy, espionage by the State in collusion with telephone operators, and much more.
Although this study has caused such a stir now, in the middle of the coronavirus crisis, the controversy is not new. As the National Institute of Statistics (INE) explained in a statement in October 2019 about another controversial mobility study, on holiday travel:
Operators will not provide individual data on telephone numbers, nor on the owners of the lines, so in no case will the INE be able to track the location of any terminal.
How Then Can the INE Know Everything About the Movement of the Population as a Whole?
You probably already know that mobile phones communicate via antennas, each serving a cell. Each antenna provides service to all terminals within its coverage area at any given time. In densely populated areas, operators deploy many closely spaced cells to serve the large number of subscribers, at distances as small as 400 m, while in sparsely populated areas the antennas can be up to 8 km apart.
As users move around, their phones switch from one cell to another. For a mobile phone to work, the operator needs to know at all times which cell it is in, so that it can deliver incoming calls and let the user place outgoing ones. Because a cell typically spans anywhere from a few hundred metres to several kilometres, a user cannot be located with the accuracy of GPS. The location known to the operator is always approximate, with a typical error of between 400 and 8,000 m. With that margin, it is possible to work out the neighbourhood where a terminal is located, but not whether it is inside a house, in the park or in the supermarket.
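To make the idea of cell-level location concrete, here is a minimal sketch. It assumes a flat grid in metres, three made-up antenna positions, and a "nearest antenna wins" rule, which is a simplification of how real networks select a serving cell. The names ANTENNAS, serving_cell and operator_view are hypothetical and do not come from any operator's system.

```python
import math

# Hypothetical antenna positions, in metres, on a flat local grid (illustrative only).
ANTENNAS = {
    "cell_A": (0.0, 0.0),
    "cell_B": (400.0, 0.0),    # dense area: next antenna 400 m away
    "cell_C": (0.0, 8000.0),   # sparse area: next antenna 8 km away
}

def serving_cell(x, y):
    """Return the id of the nearest antenna (a simplification of real cell selection)."""
    return min(ANTENNAS, key=lambda cell_id: math.dist((x, y), ANTENNAS[cell_id]))

def operator_view(x, y):
    """What the network knows about a terminal: its cell id, not its exact position."""
    return serving_cell(x, y)

# Two different places in the same neighbourhood look identical at cell level.
home = (50.0, 30.0)
supermarket = (120.0, -80.0)
print(operator_view(*home), operator_view(*supermarket))
```

In this toy setup both positions resolve to the same cell, which is why cell-level data can reveal a neighbourhood but not a specific building.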
Therefore, the operators provide the INE with anonymised and approximate location data, without any personal data. The approximate location data of the three major operators provides a sample of more than 40 million mobile phones throughout Spain, dividing the entire national territory into some 3,200 mobility areas. The location data are not extracted from the mobile terminals but are taken from the mobile network and assigned at census district level (the minimum geographical unit used).
In addition, operators prevent subsequent re-identification of terminals through extra privacy protection measures, such as defining very large areas in sparsely populated regions and excluding areas with fewer than 5,000 subscribers. The study will continue for as long as the health crisis lasts, until normality is restored.
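The exclusion of small areas can be pictured as a simple threshold filter. The sketch below is only an illustration under stated assumptions: the area names and subscriber counts are invented, the function name suppress_small_areas is hypothetical, and the operators' actual procedure is not described in this article beyond the 5,000-subscriber figure.

```python
MIN_SUBSCRIBERS = 5000  # threshold quoted in the text

def suppress_small_areas(subscribers_by_area, minimum=MIN_SUBSCRIBERS):
    """Keep only areas whose subscriber count reaches the minimum."""
    return {
        area: count
        for area, count in subscribers_by_area.items()
        if count >= minimum
    }

# Hypothetical subscriber counts per mobility area (invented for illustration).
counts = {"area_001": 12400, "area_002": 4300, "area_003": 5000}

published = suppress_small_areas(counts)
print(published)  # area_002 falls below the threshold and is left out
```

The smaller an area's population, the easier it is to single out one person in it, so dropping or enlarging such areas reduces the risk of re-identification.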
After these operations, the INE receives only aggregated information that allows it to draw general conclusions: for example, that 3.5% of the inhabitants of a given neighbourhood go to work every day, compared to 17.9% in another neighbourhood. In summary, based on the data provided by the operators, neither the INE nor the State has any way of determining whether or not you stay at home.
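A rough sketch of what "aggregated information" means in practice is shown below. It assumes, purely for illustration, that each observation is reduced to a pair of area labels (where a terminal usually spends the night and where it is seen during the day) and that only per-area percentages are shared. The records, area names and the function share_leaving_home_area are hypothetical, not the actual DataCOVID-19 pipeline.

```python
from collections import Counter

# Hypothetical observations: (home_area, daytime_area).
# No phone number or line owner appears anywhere in the records.
observations = [
    ("north", "north"), ("north", "centre"), ("north", "north"), ("north", "north"),
    ("south", "centre"), ("south", "centre"), ("south", "south"), ("south", "centre"),
]

def share_leaving_home_area(records):
    """Return, per home area, the percentage of terminals seen elsewhere during the day."""
    residents = Counter()
    leavers = Counter()
    for home_area, daytime_area in records:
        residents[home_area] += 1
        if daytime_area != home_area:
            leavers[home_area] += 1
    return {area: 100.0 * leavers[area] / residents[area] for area in residents}

shares = share_leaving_home_area(observations)
print(shares)  # only these per-area percentages would be shared
```

The output says how mobile each area is as a whole, but nothing in it points back to any individual terminal.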
Does This Study Comply with Data Protection Regulations?
The answer is a resounding “Yes”. As no personal data or any other type of data identifying users is used or provided, no cross-checking with other data sources (such as those referring to the health of individuals or their home address) is possible.
If you still have doubts about whether the study complies with the law, note that anonymising data for statistical or research purposes is lawful processing under the current legal framework, both for data protection and for telecommunications services. Nor should we forget Recital 4 of the GDPR:
The processing of personal data should be designed to serve mankind.
Recital 46, in turn, provides that some types of processing may serve:
Both important grounds of public interest and the vital interests of the data subject as for instance when processing is necessary for humanitarian purposes, including for monitoring epidemics and their spread or in situations of humanitarian emergencies.
In short: There is no reason for alarm. DataCOVID-19 does not conflict with the General Data Protection Regulation (GDPR). On the contrary, it follows the guidelines set by the Spanish Data Protection Agency.
Technologies at the Service of Health and the Preservation of Life
Big Data and the approximate location of mobile phones are two more technological weapons available to the State in its fight against the pandemic, and they can be used while guaranteeing the right to privacy. Thanks to this information, health agencies will be able to analyse the effects of confinement, make forecasts about the evolution of the pandemic, better understand the use of health facilities, and draw other useful conclusions in the fight against the coronavirus. With everyone’s cooperation, we will make this possible.