Bed time will never be the same

Beatriz Sanz Baños    31 January, 2019

The Internet of Things has proven to be very useful in improving people's lives in areas such as health and personal rest. This technology has also been proven to help babies fall asleep. This delicate task, crucial for the well-being of both parents and children, can be much easier thanks to smart cribs, because they monitor babies' sleep and improve the quality of the child's rest.

One example of an IoT cradle is SNOO. Designed by engineers from the prestigious Massachusetts Institute of Technology (MIT), it is equipped with Wi-Fi, sensors, a microphone and speakers to help babies settle down automatically. The device is connected to a mobile app that receives information from the sensors installed in the cradle. In this way, the child's movements, sounds and sleep patterns are monitored, and all the statistics are sent to the smartphone for analysis.

When the cradle detects movement and/or crying, it activates a rocking mechanism as well as a relaxing sound that simulates the beats a fetus hears in the womb during pregnancy. This is possible thanks to circular plates underneath the mattress and an optimized motor that moves them at low speed.

Internet of Things has also been proven to help your baby fall asleep

Another example of a smart crib is Max Motor Dreams, designed by the automobile company Ford and inspired by babies' tendency to fall asleep during car rides. To help induce sleep, this connected cradle prototype aims to emulate the experience of travelling inside a moving vehicle.

Max Motor Dreams has an electric motor, a peripheral system of LED lights and a speaker. In parallel, the car is connected to a service that records on the smartphone the movement, sound and lighting parameters of an actual trip. In this way, the cradle can reproduce the exact movements the child is accustomed to on their usual journeys. The crib vibrates and moves smoothly, imitating the sound and the level of light produced inside a car.

In addition to smart cribs, other connected devices specific to children are also available. For example, sensors installed in cribs can monitor the baby at all times, sending images and statistics about the child's activity to the parents' smartphone through an app.

In addition to smart cribs, other connected devices specific to children are also available

There are also smart pajamas with built-in graphene microsensors that monitor the baby's vital signs and store them in an app, as well as car seats connected to the smartphone that warn drivers leaving the vehicle if their baby is still inside, preventing children from being left behind in the car.

Likewise, installing trackers in baby strollers can give parents real-time GPS tracking of their location. One example of a smart stroller is the Smartbe Stroller, which connects to the smartphone to propel itself autonomously and has a built-in camera to watch the child at all times.

All these IoT devices, along with others such as smart mattresses and pillows designed to help users sleep better, are good examples of the Internet of Things' ability to improve the lives of people of all ages.

“The client seeks to live a different experience with each purchase”

Salmerón Uribes Marina    30 January, 2019

“The ‘El Almendro’ nougat jingle is recognized by one hundred percent of respondents”

We all take it for granted when entering a hotel, a cafeteria or a supermarket that, in addition to the usual sound generated by the activity of other customers, we will enjoy music in the background, which makes the stay more pleasant and welcoming.

When we talk about ambient music in establishments, we are talking about enjoying music in a natural way. It is a great ally of marketing, not only in advertising campaigns but also at the point of sale itself. This sound environment generates value beyond the purely commercial, entering fully into the emotional field.

“The client seeks to live a different experience with each purchase and music is one of the first perceptions that boost innovation to improve this experience”

We enjoy music in different ways and in different places, because it is something that, in general, everyone likes and that we associate with leisure time. The choice of music is therefore a decisive element for a good customer experience in the establishment, for the construction of the corporate image in the consumer's mind, and for the retailer, who can see sales increase thanks to factors such as rhythm or sonority. Today, the choice of music can increase or decrease the chances of success of any business.

Spotmusic knows the value of music at the point of sale and was born with the aim of creating an appropriate musical environment that really influences the activity and response of customers. The service offers the establishment musical ambience 24 hours a day, without interruptions or advertising. In addition, it gives access to more than 100 music channels, prepared by a team of musicologists, so that each establishment can choose the style that best suits its type of customers and its brand identity.

It is a solution that encourages a positive experience in clients, brightening situations that might otherwise be annoying, such as waiting for stock to be brought up from the warehouse or standing in a long checkout line.

Audio marketing takes advantage of the emotional power of music to link users with the brand and encourage consumption. Well-chosen music, appropriate to the product on offer and to the public that demands it, makes the brand image easier to recall and significantly enriches the customer journey. In addition, a good musical atmosphere encourages users to perceive consumption time as leisure time, stimulating purchases and, therefore, commerce.

In addition, we must take the volume of the melodies into account, since an excessively high volume can have a negative effect: far from increasing customers' enjoyment, it will annoy them.

According to an Audio Branding study conducted by the IAB, the Windows startup sound is recognized as such by 73.77 percent of respondents, and the “El Almendro” nougat jingle by one hundred percent

This retail solution is aimed at both small companies and large multinationals, regardless of sector, although in practice it is most requested by companies in tourism, hospitality and offices.

The spotmusic service has already been adopted by major companies such as Gocco and Sodexo, which have at their disposal not only the technical tools that make simple, autonomous online management possible, but also a team of experts in musicology who create specific channels according to the type of business.

Expert advice is a great help, since the user experience depends on customers' expectations when they visit the point of sale. For a clinic, for example, it will be necessary to choose calm music that helps the patient relax. It will not be like that in a supermarket, where music is an essential factor to encourage a purchase that can sometimes be tedious for customers.

By incorporating this service, establishments can remotely activate the musical channels they want to use in their businesses through a management panel. A cafeteria, for example, can select different channels by time slot, interspersing them with promotional announcements or current campaigns, so that customers having lunch there can enjoy them.

All this places us in a scenario in which communication between the brand and the client is no longer unidirectional, since music is a language both understand very well. If a business plays dance and deep house at full volume, those who identify with that style, and who therefore feel linked to the brand, will come into the shop.

Python for all (2): What are Jupyter Notebooks?

Paloma Recuero de los Santos    30 January, 2019
What are Jupyter Notebooks? The Jupyter Notebook is perhaps the best-known application of Project Jupyter, created in 2014 with the objective of developing open-source software, open standards and interactive computing services compatible with different programming languages.

Jupyter Notebook is an open-source web application that allows us to create and share documents containing live code, equations, visualizations and explanatory text. These documents record the whole development process and, most interestingly, they can easily be shared with other people through email, Dropbox, version control systems such as git/GitHub, and nbviewer.

Amongst its uses are:
  • The cleaning and transformation of data
  • Numerical simulation
  • Statistical modelling
  • Machine learning
  • And much more

Why Jupyter?

As a curiosity, it was christened 'Jupyter' for several reasons. On the one hand, for the scientific connotations of the allusion to the planet Jupiter, whose moons were the protagonists of what is considered one of the first scientific publications supported by data, thereby guaranteeing its reproducibility. One of the objectives that inspired the Jupyter project is precisely this: to facilitate the sharing and reproducibility of projects and experiments (scientific and otherwise), and its creators decided to reflect this in the name. On the other hand, although Jupyter is not exactly an acronym, it pays tribute to Julia, Python and R, the programming languages that support the environment. In particular, the central 'y' was chosen to pay homage to the Python heritage, since Jupyter emerged as an evolution of IPython.

Who uses it?

Anyone who works in software development or in the technological environment in the broadest sense, from secondary school students taking their first steps in programming to the most specialised scientists and engineers. And, now, us too!

And so, Jupyter Notebooks are used in academic environments (UC Berkeley, Stanford, UW, NYU, Cal Poly, etc.), in public sector research (NASA, JPL, KBase), and also in the private sector (IBM, Facebook, Microsoft, Bloomberg, JP Morgan, WhatsApp, Quantopian, GraphLab, Enthought, etc.). As an architecture of open modules, they are widely used to create all types of solutions and services, both commercial and non-profit.

How do we access it?

We can use it remotely or locally. From the project's own web page, we can choose to try it in our browser or install it locally ('Install the Notebook'). We can also install it directly using pip, the installation tool included with Python. Even so, the simplest way to install it (in fact, the one we used in the previous post) is with Anaconda.
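If you choose the pip route, a minimal sketch (assuming Python and pip are already installed on your machine) would be the following two commands, the first installing Jupyter Notebook and the second launching it in your browser:

  pip install notebook
  jupyter notebook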
Figure 2: Project web page. We can try it from the browser or install it locally.
Thus we now have everything ready to start using this new working environment that is revolutionising the way we work in the world of data.

Accessing the environment.

In the previous post, we installed Anaconda. If everything worked correctly, we can now open Jupyter Notebook directly from the Windows menu:
Figure 3: Access from the Windows menu
A command window opens and automatically launches the interface, which is divided into three tabs: 'Files', 'Running' and 'Clusters'. By default, it opens 'Files', which is where we can create new notebooks or open an existing one.

Figure 4: Jupyter Interface

We create a notebook and name it.

To create a notebook, we don't need to do anything more than select the option 'New' that appears in the top right corner.
Figure 5: From the menu 'New' we create a new Python notebook.
Once we have created it, we give it a name by clicking on 'Untitled'.

Figure 6: It's created as 'Untitled'

We can call it MyFirstNotebookPython (or whatever we like!).

Figure 7: We re-name the notebook.
Now we can see our new notebook at the end of the list. We can also see its status: 'Running'.
 
Figure 8: The new notebook appears at the end of the list
We open it by simply clicking on the name, and we see that it consists of a series of cells in which we can write code directly. By clicking in a cell, we observe the border change colour from blue to green. This means we have changed from command mode (blue) to edit mode (green). Switching back to command mode is as easy as pressing Escape, and clicking inside the cell returns us to edit mode.
 
 
Figure 9: Command mode cell (blue).
Once in edit mode, we can start to write whatever commands we like. You can try one you used in the previous tutorial post, or something simple like this. Write:

  print("whatever you want")
 
Figure 10: We try the print command.
 
To run the code, you can choose the option 'Run Cells' in the 'Cell' menu, or press 'Ctrl+Enter'; the result appears just below:
Figure 11: Result of carrying out the cell
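A cell can also contain several lines of code; when it runs, the value of the last expression is displayed below the cell even without an explicit print. A small example you can try (any code of your own works just as well):

  numbers = [1, 2, 3, 4]
  sum(numbers) / len(numbers)

Running this cell shows 2.5 as its output.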

Creating a “Checkpoint”

Another of the most interesting functionalities of Jupyter Notebook is the chance to create 'checkpoints', or reference points. When you create a checkpoint, what you actually do is save the state of the notebook at that exact moment, so that you can later return to that point and undo any changes made afterwards. This, evidently, is very useful when you are doing tests or when something doesn't turn out well: you can go back to the point where everything was right without having to start again from the beginning.

To create a checkpoint, you only need to select the option 'Save and Checkpoint' from the 'File' menu. To return to a previous checkpoint, just select the one you want from the menu 'File/Revert to Checkpoint'.

Figure 12: How to get back to the previous checkpoint

Exporting a Notebook

Finally, to export a notebook, select the option that interests you most from the menu 'File/Download as'. You can choose from the formats notebook (ipynb), python (py), html, markdown, latex, pdf, etc.
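As a side note, the same export can also be done from the command line with the nbconvert tool that ships with Jupyter. For example, to convert the notebook we created earlier into an HTML page:

  jupyter nbconvert --to html MyFirstNotebookPython.ipynb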

In the next post we will talk about libraries and finish preparing the environment for our Machine Learning experiment. Until then, we recommend you explore the help menus in Jupyter Notebook and test out some simple commands to start getting to know the environment a little. In this video by CodingtheSmartWay you will find some examples to practice with.

Post-Quantum Future Is Around the Corner and We Are Still Not Prepared

Gonzalo Álvarez Marañón    30 January, 2019

Every year we have more powerful computers with higher calculation capacity. Is that fact good or bad? Think twice before giving an answer.

It depends. If global information security is based on the computational complexity of certain functions, then the fact that computers are becoming ever faster is very bad news.

In fact, a sword of Damocles hangs over our public-key encryption systems: RSA, DSA, ECDSA. Their security relies on the difficulty of certain mathematical problems currently considered intractable, such as factoring large integers or solving discrete logarithms. The big promise of quantum computing is that these mathematical problems will be solved rapidly. Is cryptography mortally wounded? Is there hope for the world? Will we be able to keep communicating securely after the first quantum computers? Let's see it step by step.

What do we mean when we assert that a quantum computer is faster than a classical one?

In computer science, the objective is to find the most efficient (that is, the fastest) algorithm to solve a given computational problem. To compare the efficiency of different algorithms, mechanisms are needed for classifying computational problems according to the resources required to solve them. Such a classification must be capable of measuring the inherent difficulty of the problem, regardless of any particular computing model. The resources measured may include time, storage space, number of processors, etc. The main focus is usually time and, sometimes, space as well.

Unfortunately, predicting an algorithm's exact execution time for any kind of input is often difficult. In such situations, this time is approximated: we examine how the algorithm's execution time grows as the input size scales without limit.

To represent these execution times and make comparison easy, big O notation is usually used: O(f(n)). According to this asymptotic notation, if a running time is O(f(n)), then for large enough n the running time is at most k·f(n) for some constant k, although it may, of course, grow more slowly. The function f(n) represents the worst-case running time. To get a clearer idea of efficiency, the following list of functions in asymptotic notation is ordered from slowest to fastest growing:

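(The original figure is not reproduced here; a standard such ordering, our reconstruction, is:)

  O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(2^n) < O(n!)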

The secret of the speed of future quantum computers lies in quantum algorithms, which are capable of tapping into qubit superposition. As has been said, execution time, for both classical and quantum algorithms, is measured by the number of basic operations the algorithm uses. In the case of quantum computing, this efficiency may be measured using the quantum circuit model: a quantum circuit is a sequence of basic quantum operations called quantum gates, each applied to a small number of qubits.

The nemesis of cryptography is the so-called Shor's algorithm: a quantum algorithm exponentially faster at calculating discrete logarithms and factoring large integers, and thereby capable of breaking RSA, DSA and ECDSA. Regarding the factoring challenge: given an integer n = p × q for some prime numbers p and q, our task is to determine p and q. The best classical algorithm known, the general number field sieve, runs in time exp(O((ln n)^(1/3) (ln ln n)^(2/3))), while Shor's quantum algorithm solves this problem substantially faster, in time O((log n)^3).
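To make the division of labour concrete, here is a minimal Python sketch (our own illustration, not from the original analysis) of the classical post-processing step of Shor's algorithm: assuming the quantum part has already found the period r of f(x) = a^x mod n, the factors follow from two gcd computations.

  from math import gcd

  def factor_from_period(n, a, r):
      """Recover factors of n from the period r of a^x mod n (Shor's classical step)."""
      if r % 2 != 0:
          return None                 # need an even period; retry with another a
      y = pow(a, r // 2, n)           # a^(r/2) mod n
      if y == n - 1:
          return None                 # trivial square root; retry with another a
      p = gcd(y - 1, n)
      return (p, n // p) if 1 < p < n else None

  # Toy example: n = 15, a = 7; the period of 7^x mod 15 is r = 4.
  print(factor_from_period(15, 7, 4))  # -> (3, 5)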

Grover's algorithm is also well-known. This algorithm can speed up searches over unordered data sets of n elements. Indeed, one of the most basic problems in computer science is unstructured search. It can be formalized as follows: given the ability to evaluate a function f: {0,1}^n → {0,1}, find x such that f(x) = 1, if such an x exists; otherwise, output "not found".

It is easy to see that, with no prior information about f, any classical algorithm which solves the unstructured search problem with certainty must evaluate f a total of N = 2^n times in the worst case. Even if we settle for a randomized algorithm which succeeds, say, with probability 1/2 in the worst case, the number of evaluations required is still of order N. However, Grover's quantum algorithm solves this problem using O(N^(1/2)) evaluations of f in the worst case. As you can see, it is not as dramatically fast as Shor's algorithm, nor does it constitute such a serious threat, since it can be counteracted just by doubling the key size.

Currently, it seems there are no more algorithms in the quantum cryptanalyst's repertoire. So let's see how Shor and Grover would affect current cryptography if they could be executed today.

What would happen if currently we really had a super-powerful quantum computer?

In the last entry we saw that the first error-corrected quantum computers with several thousand logical qubits are expected to be built in 10 years at the earliest. What would be the state of classical cryptography when facing quantum attacks?

  • Asymmetric-key cryptography

Let's provide some context for these data: breaking a 1024-bit RSA key would require a quantum computer with around 2,300 logical qubits and less than 4 hours of computing time. The matter is serious enough that the U.S. National Institute of Standards and Technology (NIST) has launched a project called Post-Quantum Cryptography to look for options that could replace the current algorithms. The selection process for candidates is expected to conclude in 5 or 6 years.

  • Symmetric-key cryptography

The current standard for symmetric encryption is the algorithm AES-GCM (Advanced Encryption Standard-Galois/Counter Mode), which supports three key sizes: 128, 192 or 256 bits. For a 128-bit key, for instance, a brute-force attack requires trying all possible values of the key, in other words 2^128 combinations. Grover's algorithm can speed this search up quadratically, meaning it would require 2^64 operations. Consequently, executing it on a quantum computer would require around 3,000 qubits and more than 10^12 years, an extremely long time. Even if such a quantum computer already existed, the solution would be as simple as doubling the key size, something that could be done in practice at any time.
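The arithmetic behind "just double the key size" is easy to check; a quick back-of-the-envelope sketch in Python (our own illustration):

  import math

  # Grover halves the effective key length: a space of 2**k keys takes
  # about 2**(k/2) quantum evaluations instead of 2**k classical tries.
  for key_bits in (128, 192, 256):
      search_space = 2 ** key_bits
      grover_evals = math.isqrt(search_space)
      print(key_bits, "->", int(math.log2(grover_evals)), "effective bits")

This prints 64, 96 and 128 effective bits respectively: doubling a 128-bit key to 256 bits restores the original 128-bit security margin against Grover.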

Of course, someone could discover a quantum algorithm much more efficient than Grover's. In such a case, AES-GCM would have to be replaced.

  • Hash

Hash functions are used in all types of applications: password hashing in databases, mathematical puzzles for bitcoin mining, building data structures such as Merkle trees, message digests for digital signatures, password-based key derivation functions, etc. SHA-256 remains the most frequently used algorithm nowadays. Current hashing algorithms are not expected to be impacted by quantum computing, since Grover's algorithm is not considered capable of breaking a hash like SHA-256.

However, password hashing is at higher risk, because the space of user passwords is not very large. For example, given a 100-symbol alphabet, the set of all 10-character passwords contains only about 100^10 ≈ 2^66 entries. Using Grover's algorithm, the running time shrinks to only about 2^33 evaluations, so a hypothetical quantum computer might take just a few seconds. The shadow of the quantum threat is one more reason to move towards alternatives to passwords. Examples of this are the projects our ElevenPaths Innovation and Labs team is currently working on: CapaciCard, SmartPattern, PulseID and Mobile Connect.
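The password-space numbers above are easy to reproduce (a quick check in Python, our own illustration):

  import math

  alphabet_size, length = 100, 10
  space = alphabet_size ** length          # all 10-character passwords
  print(math.log2(space))                  # ~66.4, i.e. about 2**66
  print(math.log2(math.isqrt(space)))      # Grover: ~2**33 evaluations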

Another widespread application of hashing is proof-of-work systems, used by cryptocurrencies such as Bitcoin or Ethereum. To validate a block of the chain, miners must solve a mathematical puzzle that involves calculating millions of hashes. Fortunately for blockchain, a quantum computer would need more than 10 minutes to solve the current challenges, so cryptocurrencies will remain safe, at least on that side (though not on the elliptic-curve cryptography side).
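As an aside, the mining puzzle itself is easy to sketch. A toy proof-of-work in Python (our own illustration, far from Bitcoin's real parameters): find a nonce whose SHA-256 digest starts with a given number of zero hex digits; each extra digit multiplies the expected work by 16.

  import hashlib

  def mine(block_data: str, difficulty: int = 4) -> int:
      nonce = 0
      while True:
          digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
          if digest.startswith("0" * difficulty):
              return nonce             # proof found
          nonce += 1

  print(mine("block 42"))              # tens of thousands of hashes on average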

Summary table on the estimations for breaking classical cryptography with quantum algorithms, and potential countermeasures (Source: Quantum Computing: Progress and Prospects)
The long and tortuous cryptographic migration path

It can be said that modern cryptography was born in the mid-70s, with algorithms such as DES, for symmetric encryption, and Diffie-Hellman, for key establishment. Since then, there have been a handful of changeovers from one widely established algorithm to another. Some of these migrations have been: from DES and Triple-DES to AES; from MD5 and SHA-1 to the SHA-2 family; from RSA key transport and finite-field Diffie-Hellman to elliptic-curve Diffie-Hellman key exchange; and from RSA and DSA certificates to ECDSA certificates.

Some of these changeovers have been successful: AES can be found almost everywhere, and modern communication protocols mainly use ECDH key exchange. However, the changeovers involving public key infrastructure have been less successful: browser vendors and Certification Authorities went through a lengthy transition from SHA-1 to SHA-2 certificates, with repeated delays; the changeover to elliptic-curve certificates has been even slower and, despite it, most certificates issued for the web still use RSA.

In the medium term, it is likely we will be forced to experience a new changeover: towards the post-quantum public key cryptography.

There is cryptographic life beyond RSA, DSA and ECDSA

First of all, it is important to spell out that when we talk about 'the end of RSA, DSA and ECDSA' and 'the end of cryptography', we are talking about two different things. There is cryptographic life beyond RSA, DSA and ECDSA. Since the mid-80s, there have been cryptographic algorithms based on the difficulty of mathematical problems other than integer factorization and the discrete logarithm. The three best-studied alternatives are:

  • Hash-based cryptography: as its name suggests, it uses secure hash functions that resist quantum algorithms. The disadvantage is that it generates relatively long signatures, which limits its use scenarios. The Leighton-Micali Signature Scheme (LMSS) is one of the strongest candidates to replace RSA and ECDSA.
  • Code-based cryptography: coding theory is a mathematical specialty concerned with information-coding rules. Some coding systems are quite difficult to decode, in the sense that they often require exponential time, even for quantum computers. The best-studied cryptosystem so far is McEliece, another bright candidate for key exchange.
  • Lattice-based cryptography: it might be considered the most active research field in post-quantum cryptography. A lattice is a discrete set of points in space with the property that the sum of two points on the lattice is also on the lattice. A hard problem is the Shortest Vector Problem: finding the shortest non-zero vector of a given lattice. All known classical algorithms for solving it require time that grows exponentially with the lattice size, and it is thought that the same will apply to quantum algorithms. Currently, there are several cryptosystems based on the Shortest Vector Problem.

Consequently, the great difficulty is not the lack of alternatives. The painful problem will be the transition to one of the alternatives described above:

  • Firstly, post-quantum algorithms must be selected and standardized by institutions such as NIST.
  • Then, the standard must be incorporated into the cryptographic libraries currently in use by the most popular programming languages, cryptographic chips and hardware modules.
  • Afterwards, they must be integrated within the cryptographic standards and protocols, such as PKCS#1, TLS, IPSec, etc.
  • After that, all vendors must include these standards and protocols in their products: from hardware manufacturers to software developers.
  • Once all software and hardware products have been updated, it will be necessary to reissue all certificates, re-encrypt all stored information, re-sign and redistribute code, get rid of all the old copies, etc.
  • How long will this process take? Considering previous migrations, such as the one from SHA-1 to SHA-2, and taking into account the additional complexity: no less than 20 years. When are the first quantum computers capable of attacking RSA, DSA and ECDSA expected to be available? In no less than 15 years.

This is the current outlook. Let's hope that the changeover process gains momentum. Nobody knows for certain how far away quantum computers are. However, just in case, it is better to be prepared.

Gonzalo Álvarez
Innovation and Labs (Elevenpaths)

Towards the Fourth Industrial Revolution

Carlos Alberto Lorenzo Sandoval    28 January, 2019
The customization of mass production is the catalyst for change that, among other things, gives rise to Industry 4.0

A lot has happened since the steam engine changed the course of our civilization's history with the mechanization of production in the First Industrial Revolution. Scientific advances at the end of the 19th century enabled the Second Industrial Revolution, with the discovery of electricity that would become the basis for mass production, followed by a Third Industrial Revolution in the 20th century thanks to the power of information technology and electronics in the automation of production processes.

Today, a Fourth Industrial Revolution is brewing, the product of the convergence of a series of exponential technologies such as Big Data, Artificial Intelligence, the Internet of Things (IoT), additive manufacturing and augmented reality, among others, which are blurring the barriers between the physical and the digital.

 
A lot is said about this new stage of industry. Large corporations are already moving to adapt to the changes it entails and are beginning to extract benefits from Industry 4.0. Hence, to understand why this revolution is so unique, we need to recognize the role we play as individuals in this new paradigm shift.

The great difference of the Fourth Industrial Revolution is that it is not a cause of change but a consequence. The Second Industrial Revolution, with the introduction of mass production, was the revolution that changed the way we consume; the Fourth Revolution, by contrast, is being forged to a large extent as a consequence of people's new consumption habits. The customization of mass production is the catalyst for the change that, among other things, gives rise to Industry 4.0.

Change in consumption and the manufacturing process

The change in consumption means that it is no longer enough for a product to be as cheap as possible; the product must also adapt to everything we need as consumers and demand according to our tastes. All in one click and with free shipping. This is precisely the great change that has forced companies to find more agile and faster solutions to respond to their customers, who are now hyperconnected and accustomed to booking a flight, hailing a taxi or making a payment, all remotely.

From the industry's point of view, the change consists not only of the eternal quest to cut production costs, but of responding personally and quickly to consumers. The solution has been found in the digitization of everything, from design to manufacturing.

Digitization brings a key fact: with everything connected, an immense amount of data is generated, extremely useful for knowing both the production chain and the customer more deeply. This has an impact not only on operational efficiency but also on the generation of new revenues from the final consumer.

Previously, if a problem arose during the manufacturing process, the whole production chain had to stop so the problem could be located and fixed before starting up again. This involved substantial economic losses for every minute the production chain was stopped. The same thing happened with machine maintenance, always on fixed dates and without considering whether it was really necessary. The new connected industry prevents processes from stopping: if everything is communicated and information is available in real time, we can modify actions and predict failures before they pose a real problem that would halt production.

Thanks to Big Data technologies we are able to carry out predictive maintenance of machines and processes, thus guaranteeing greater efficiency within the industry.

But not so fast …

Of course, these changes do not come with fully comprehensive insurance, and there is a price to pay for automation: employment. According to the World Bank's World Development Report (2016), machines could replace 57% of jobs on average in OECD countries, 69% in India and up to 77% in China. The impact is therefore greater in countries where traditional industry plays a leading role. Another study, The Future of Employment (2013), looks at individual occupations: a telemarketer has a 99% chance of being replaced by automation, a supermarket cashier 98%, a legal assistant 94%, a taxi driver 89% and even a fast-food chef 81%.

These forecasts show only static data, one side of the coin. The reality is that we will seek to adapt to the change and master it; new professional profiles will emerge. Profiles such as Community Manager, for example, were unthinkable a decade ago, as was the discipline of Data Science. In the not-so-distant future we will speak of lawyers specialized in drones and cybersecurity, or of organ designers. What is clear is that in the future, human skills will prevail over knowledge of any particular subject. A person will be valued at hiring time more for their leadership or management skills, or their creativity, than for knowledge that, apart from changing at great speed, is accessible to everyone through a mobile device.

You can also follow us on Twitter, YouTube and LinkedIn

The hugest collection of usernames and passwords has been filtered…or not (I)

ElevenPaths    28 January, 2019
Sometimes, someone releases by mistake (or not) an enormous set of text files with millions of passwords inside: an almost endless list of e-mail accounts with their passwords or their equivalent hashes. Consequently, headlines start to appear again and again in the media: "Millions of passwords have been leaked…". Even if such a headline is not fake, it may be misleading. In particular, we are talking about the latest massive leak, named "Collection #1".

We have analyzed this huge leak. Beyond the "Collection #1" that has reached the media, we obtained a superset with more than 600 GB of passwords. It is so large that in our analyses we could count more than 12,000,000,000 raw combinations of usernames and passwords. It is an astronomical figure. However, the important point is that they are "raw". What remains interesting after some cleaning? We must consider that a leak of a leak is not a leak. If, months or years ago, someone leaked the database of a given website, that is called a "leak". Conversely, if someone concatenates that file with others and publishes the result, that is not a leak: they are simply making their particular collection of leaks available on the Internet.

Demystifying the leak: Repetitions

Repetitions are classified into two types:

  • Occurrence of the same account and password
  • Finding the same account but with a different password 

In both cases, it can simply be the reuse of an e-mail account and password on multiple sites, the result of merging different leaked databases. In both cases (regardless of whether they are valid or out of context) we can reduce the amount of "different" data. A quick glance at these 600 GB of information shows a lot of repeated accounts. Although this information may be valid, it lowers the number of potentially affected users.

Data expiration
How valuable is a 6-month-old leak? What about a 5-year-old one? And a 10-year-old one? Getting an e-mail account and password does not mean having permanent access to the secrets hidden behind the authentication process. These data become less valuable every single day. In general, this kind of data is like fish: it must be eaten fresh, otherwise it rots very fast. When someone gains access to an account with valid credentials, they have a window of time until the account's owner is alerted and changes the password, or until the service itself detects the leak and proceeds to disable or preventively delete the account.
This tight time frame, or access lifetime, is the account's initial value (then other properties come into play, such as the domain it belongs to or, even better, its owner). Afterwards, the e-mail account and credentials will only be useful for trying one's luck on other services, or for sending spam and other frauds; but that is another matter.
We have performed a simple test. We concatenated all the files containing e-mails within the mega-leak and removed all the passwords. The result: a "todos.txt" of around 200 GB. From it, we selected a group of accounts on a pseudorandom basis (as random as mathematics and system generators allow):
The '0.0001' extracts a minimal sample; even so, it means more than a thousand e-mail accounts. Moreover, "salida.txt" is filtered to discard e-mails with non-existent domains, duplicates and servers that do not allow verifying an account through VRFY (an SMTP command).
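The screenshot of the exact command is not reproduced here, but the idea is easy to sketch in Python (our own illustration of a streaming pseudorandom sample, not the actual command used): keep each line with probability 0.0001, so the 200 GB file never has to fit in memory.

  import random

  with open("todos.txt") as src, open("salida.txt", "w") as out:
      for line in src:
          if random.random() < 0.0001:   # keep ~0.01% of the lines
              out.write(line)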
Based on that sample of more than a thousand e-mails, we verified their existence. The result: 9.8% did not exist or never existed in that domain. Nearly 10% of the "working" e-mail addresses are no longer available on their corresponding mail servers. We dare say this result can be extrapolated to the 12,000,000,000 combinations mentioned above. And all this without considering that in many cases the passwords are not even valid.
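For the verification step, one possible approach (a hedged sketch; many servers disable VRFY, which is precisely why such servers were filtered out beforehand) is the SMTP VRFY command exposed by Python's smtplib:

  import smtplib

  def address_exists(mailserver: str, email: str):
      try:
          with smtplib.SMTP(mailserver, timeout=10) as smtp:
              code, _ = smtp.verify(email)   # SMTP VRFY
              return code == 250             # 250: the server knows the address
      except (smtplib.SMTPException, OSError):
          return None                        # could not check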
Fictitious data?
Let's look at some entries. Pay attention to the domains that do not exist and never existed, since they are not TLDs delegated by IANA.
This is an illustrative example. There are thousands of non-existent TLDs within the multiple files that constitute the leak.
Another suspicious example is the content of some of the files itself. Let's examine it:
Figure: excerpt from one of the files (data covered by a grey rectangle)
The grey rectangle we placed there to avoid exposing the data may be misleading, but the file is a list where each [email]:[password] string is exactly 32 characters long; no more, no less. Perhaps because of the length of the e-mails or the passwords, all the lines have the same size, forming a suspiciously perfect column. The attacker may have sorted them this way, but it is curious in any case, and it is not a single file with thousands of e-mails of exactly the same length: within the leak there are other files where the string length is higher or lower, but always homogeneous within the file. We cannot imagine the practical utility of lists of same-length e-mail and password strings. Might we venture that they were generated this way by some means?
So, is it serious?
Theoretically, it would be necessary to validate a number of factors; but with 12,000,000,000 combinations, the operation is complex, to say the least. From these samples and examples alone, we could venture that this collection constitutes a valuable set of data, not in terms of privacy or the destruction of users' privacy, but as a dictionary for attacks against account systems.
We think that concluding that "a leaked account means access to someone's e-mail or data" is reckless reasoning. The number of useful accounts is much smaller, due to their expiration or simply because they never existed. We think the leak contains out-of-date or unverified information and that, even so, it has been artificially enlarged.
In any case, the good side of these announcements is that they prompt a small proportion of the general public to change their passwords, an even smaller proportion to get a password manager, and just a few to enable a second authentication factor. It's better than nothing.
In the second part we will look at more curiosities of this huge file.
David García
Innovation and Labs (ElevenPaths)

Python for all (1): Installation of the Anaconda environment

Paloma Recuero de los Santos    25 January, 2019
Now that we have defined the objective of this tutorial in the previous post ("Dare with Python: an experiment for everyone"), we will really begin to work. We kick off our first Python project with the installation of the environment, the SciPy platform and the scikit-learn library. We could do it directly from the Python page; however, we will do it in the simplest way (recommended for beginners), which is through Anaconda. Anaconda installs Python, the development environment and the most important libraries for mathematics, science and engineering, avoiding the difficulty of installing the different packages independently and the compatibility problems that may arise.

In this first tutorial post, we will learn how to install an environment for developing in Python, with which we will then begin to learn, practice and develop Machine Learning and Deep Learning software.

Downloading the Anaconda software

Anaconda is an easy and free Python environment for Data Science. The first step consists of visiting the Anaconda web page and selecting 'Download' on the downloads page.

Figure 2: Anaconda Homepage

We will download the version for Python 3.6, selecting the 32-bit or 64-bit installer according to our operating system. (If in doubt, this information should be in the system settings menu, under 'About' or similar.)

Figure 3: We download the installer for our OS.

Once the software is downloaded, we can launch the installer. We will go through the usual questions:

Figure 4: We launch the installer.


Whether the installation is only for the user who started the session or for everyone who will use the machine (in this case, administrator privileges will be required):

Figure 5: Installation per user or per machine

The installation path:

Figure 6: We select the installation path

How we want to integrate the software with our operating system (in this instance Windows); we select the recommended default option.

Figure 7: We select how we want it to register in Windows

 And that’s all. Simple, right?

Figure 8: End of the installation
Now we can access Anaconda Cloud, a management service where we can search for packages and access, store and share private notebooks, environments, and conda packages. Anaconda Cloud also makes it easier to keep up with updated versions of the packages and environments we are using.
Figure 9: Anaconda Cloud.
Now we will make sure that Python has been installed correctly. To do so, we open Anaconda Prompt (from the Windows menu, under 'Recently added') and launch the Python interpreter with the python command. We obtain the following result, which shows us the installed version.
 
Figure 10: The Python interpreter shows the installed version.
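Once inside the interpreter (the >>> prompt), a quick extra check of the version from Python itself (optional, just to practice):

  import sys
  print(sys.version)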

If you write 'help()' you can access the interactive help feature. You can also practice with this tutorial; in particular, we recommend section '3. An Informal Introduction to Python'.

In the following post, we will learn about Jupyter Notebooks, create one of our own and get everything ready to start working with the libraries. Don't miss it!

All the posts in this tutorial are here:

It’s one thing to be told you’re safe and it’s another thing to feel safe

Beatriz Sanz Baños    24 January, 2019

We can confirm that Smart Cities are nowadays already a reality and not a futuristic promise yet to come. Although there is still a long way to go in this area, many municipalities are already carrying out their digital transformation process based on Smart City initiatives.

The fundamental objective of these initiatives is to connect urban elements to obtain information about them and their environment, thus allowing municipal teams to make better decisions.

For some years now, we have been observing how municipalities implement technological initiatives to optimize public resources on the one hand and, on the other, to offer a better service to citizens.

Smart Cities are already a reality

Although each municipality prioritizes and implements the different initiatives based on its needs and situation, some projects are already a reality. Among them:

  • Light management: optimize the use of public lights to reduce the cost of electricity by more than 30% thanks to telemanagement and predictive maintenance.
  • Waste management: achieve a more efficient management of urban waste, optimize service costs and reduce the ecological footprint.
  • Parking management: optimize the occupation of public parking spaces while reducing traffic and pollution levels in the city.
  • Management of security and emergency services: distribute agents effectively in the field while reducing the response time to any incident.

Many of these initiatives have a common factor: the management of assets in mobility, whether vehicles, waste containers or personnel in the field. That is why mobility solutions are fundamental in any project of this nature.

Knowing the situation of all these assets at every moment, programming the most efficient routes based on the work required, ensuring the correct use of vehicles or responding quickly to an unexpected need are some of the benefits that this type of Internet of Things solution brings.

Some of our clients, such as the municipality of San Nicolás de los Arroyos in Argentina, have already managed to recover their investment quickly thanks to fuel savings, lower maintenance costs and a reduction in traffic incidents.

The Municipality has obtained real benefits from the integration of digital solutions, such as:

– Patrol vehicles have saved up to 20% fuel.

– The proportion of streets patrolled has increased from 80% to 100%.

– The response time of the patrols has been reduced by 15%.

On the other hand, city residents have been able to improve the way they make queries, suggestions and complaints to the municipality, as it has opened channels such as social networks, SMS, web forms and telephone contact. All this has made it possible to serve residents in a centralized and fast way.

"It is one thing to be told that you are safe, and another thing to feel safe," says Soledad Belaza, a resident of the municipality, now that 100% of the city's streets are patrolled thanks to the installation of GPS devices.

“It is one thing to be told that you are safe, and another thing is to feel safe”

In addition, the Municipality decided to launch an information campaign on the routes of garbage trucks, thanks to the GPS devices installed in the vehicles. In this way, the residents of San Nicolás de los Arroyos know when to take the garbage out, and thanks to that the city is cleaner and tidier.

"In a city nothing is easy," says Cesar Mediavilla of Telefónica, "but connecting things gives us information as simple as knowing where a vehicle is at each moment." And from there, we can change many other things and, most importantly, improve people's lives and their safety.

Learn more about the digital transformation of San Nicolás de los Arroyos.

The acceleration of the insurance sector in Big Data

AI of Things    23 January, 2019

Of all the applications of Big Data, insurance companies are especially keen to exploit the value of their data to enhance the relationship with their customers, from acquisition to loyalty.

In the insurance sector, there are few moments of interaction with the customer and therefore few occasions to obtain information from them, so offering a personalized and agile service has become crucial. For this, data is fundamental as the raw material of business intelligence, which, far from being a complementary tool as in other sectors, is an asset to be exploited in a highly competitive and customer-focused sector.

Today practically all companies in the sector are hiring experts to drive the digital transformation process, many of them without clear business objectives to mark their roadmap. A company's turnover is not always linked to its Big Data maturity, and few companies have managed to successfully exploit the value of their data.

Insurance companies have a huge amount of data generated over the years, and one of the main problems they face is precisely knowing how to manage their own information. This information is vital for the organization, since profiling the client is the essence of insurance and of the sector. By analyzing the client's profile and past behaviour, their interaction with the brand and its products, and their use of different policies, we can discover patterns that allow us to predict the client's future behaviour. Thus, greater knowledge of the client serves as a basis for data-driven initiatives that can result in the generation of new revenues, improved operational efficiency, or fraud and risk detection.

Big Data applied to the life cycle of the client in the insurance sector

The data that insurance companies naturally possess is related to the different phases of the customer’s life cycle, which is why there are clear work areas in which the exploitation of data plays a differential value:

1. Acquisition: a typical use case is dynamic pricing, which allows the company to calculate the risk index in real time from the data provided by the potential client. Based on this risk, the customer profile is determined (according to its potential value) and the appropriate premium is calculated.

2. Loyalty: through cross-selling and up-selling actions, the client's lifetime can be extended to maximize the commercial relationship. Once the client's value is identified, the insurance company's data can be cross-referenced with external data: for example, data and statistics from the INE, the land registry or meteorology, as well as traffic data and trip types (in the case of automobile insurance). Currently, the treatment of external data is a plus, but in the future those who do not use it will be at a disadvantage.

3. Risk prediction: Big Data also allows predicting possible defaults, and even the churn of customers to another insurance company before they make the decision. This is possible thanks to the collection of customer dissatisfaction information and the search for correlations that identify variables or events that signal and predict customer churn (a toy sketch of this idea follows the list below).

4. Fraud detection: finally, Big Data helps at a key point for any organization: the management of anomalies that can occur around an accident. Analyzing all sources of information makes it possible to identify irregular patterns in customer behaviour, and also helps optimize the management of suppliers (cranes, fleets, etc.), adjusting the quality and cost of the service the insurer receives from its suppliers when a client has an accident. Additionally, data analysis can serve to detect irregularities in the company's commercial team or in the agency network.
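As a purely illustrative sketch of the churn-prediction idea mentioned in point 3 (hypothetical features and labels, not a real insurer's model), a minimal churn classifier in Python with scikit-learn could look like this:

  import numpy as np
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split

  # Fake data: columns might stand for complaints, claim delays, premium rises...
  rng = np.random.default_rng(0)
  X = rng.random((1000, 3))                     # hypothetical feature matrix
  y = (X[:, 0] + X[:, 1] > 1.1).astype(int)     # hypothetical churn label

  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
  model = LogisticRegression().fit(X_tr, y_tr)  # learn churn correlations
  print(model.score(X_te, y_te))                # accuracy on held-out clients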

Steps to become a data-driven insurance company

The main barrier for insurance companies is usually the disorganization of the information they have. Normally, data is distributed in independent silos depending on the department it comes from, without homogeneity or any connection between them. One of the main challenges is precisely to collect and share all the data at corporate level so that it becomes part of a unified repository serving as a starting point for further analysis.

Knowing the preparation and training of the company's people, the work done in identifying business initiatives, the characteristics of the existing databases, and the infrastructure and technologies available will determine the company's Big Data maturity and define future objectives. A company's Big Data strategy must be led by the business. It is not about technical issues or the implementation of technologies: the strategy has to be associated with clear business objectives, where data responds to specific problems and is key to setting the direction of the company's strategy.

Although the insurance sector is only beginning its contact with digital transformation, 2018 will undoubtedly be the year of Big Data's takeoff in the sector.

By Alfredo Martinez

Don’t miss out on a single post. Subscribe to LUCA Data Speaks.

You can also follow us on Twitter, YouTube and LinkedIn

Movistar Car: transform your vehicle into a connected car

Beatriz Sanz Baños    22 January, 2019

The application of the Internet of Things to the devices around us has brought about an authentic revolution in our lives. Connectivity gives us a more efficient digital life, both at work and in our private lives, including leisure time.

The trend of connected cars is beginning to become a reality and is one of the areas with the greatest potential for growth. In fact, Gartner estimates that by 2020 there will be 250 million connected cars worldwide.

It’s in this context that Movistar Car was born, the new Telefónica service that converts vehicles into connected cars.

With Movistar Car we can connect our car to a 4G or Wi-Fi network to make it as safe and smart as possible. This connectivity allows more efficient management of our vehicle's tasks, from driving to maintenance or the purchase of fuel.

How does it work?

Movistar Car consists of a small device (the whole installation can be done by the driver in a simple way) and a mobile application to manage the services associated with the product.

After installing the device in the vehicle and downloading the application on the smartphone, the user just has to register to enjoy the following advantages:

Connectivity

Movistar Car provides a Wi-Fi network exclusively for the car (with 3 GB available per month). This network allows up to five devices to connect simultaneously and browse without consuming mobile data.

In this way, the passengers accompanying the driver can browse on their smartphones or tablets, or enjoy their favorite series and movies during journeys, turning the trip into a leisure experience.

Security

If the car suffers an impact, Movistar Car automatically places a call to a platform that initiates the assistance protocol. This includes a call to the driver to check their status and, if necessary, contacting the 112 emergency services.

Maintenance

With Movistar Car, the driver is aware of all the details of their vehicle and receives notices of possible faults, making maintenance easier. It is also possible to program reminders through the application, such as the date of the ITV (the Spanish vehicle inspection) or upcoming services.

Location

Movistar Car gives access to the historical record of the journeys made by the car and lets you set up alerts reporting the vehicle's movements and its location at all times.

In addition, thanks to the navigator included in the application, the driver can head to selected or stored destinations following the most efficient routes depending on the state of the roads.

Saving

Movistar Car also makes saving easier through agreements with third parties: exclusive fuel offers, discounts in workshops, better conditions on insurance rates and other advantages associated with the car and its journeys.

The connected car is destined to consolidate itself as one of the most relevant spaces in people's lives. Movistar Car is a step further, enhancing the user experience with the vehicle and promoting road safety, with a positive impact on society as a whole.

Movistar Car can already be ordered on the Movistar website. The service is available for gasoline cars manufactured from 2004 and diesel cars manufactured from 2005.