Do not miss what happened at NID 2018: WE ARE IN

AI of Things    6 July, 2018
Under the slogan "We are in", Telefónica held its Network Innovation Day 2018 (#NID2018) on July 14th. This was the first edition of the event, which aims to become the company's annual reference event on network innovation.

Figure 1. The first edition of the NID hosted over 15 speakers  
In this edition we had the participation of great speakers, such as Chema Alonso (Chief Data Officer), Enrique Blanco (Global CTIO) and Ignacio Cirac, who presented the company's strategic lines of technological innovation and responded to the new challenges posed by the industry.

Telefónica also presented the innovation roadmap that is driving the true transformation of its network, thanks to the application of technologies such as Artificial Intelligence (AI), Big Data, 5G and Security, which are revolutionizing its processes and technical infrastructure on the way to its clients' digital future.
Telefónica wants to take its customers to a digital future, providing them with all the capabilities and uses that technology can offer them. To do this, it continues to develop new capabilities that will positively impact them.
Take a look at the agenda below to get a better idea of the topics covered throughout the event. In the following link, we share all of the event's presentations.
 Figure 2. The opening keynote by Chema Alonso and Guillermo Ansaldo focused on innovation and client centricity.

Deep Learning vs Atari: train your AI to dominate classic videogames (Part III)

AI of Things    2 July, 2018

Written by Enrique Blanco (CDO Researcher) and Fran Ramírez (Security Researcher at Eleven Paths)

In this post, we will offer details about the architectures chosen for our models, the logic that the agent follows during the training, the results of the project and our conclusions. This article concludes our Deep Learning and Reinforcement Learning experiment in games generated by OpenAI Gym. If you haven’t yet read the first two parts you can do so here:

Convolutional Networks: The Architecture of our Model

As we have explained previously, our agent must use an appropriate control policy that allows us to satisfactorily approximate the Q(s, a) function in order to maximize the reward obtained from an action a in a state s. In order to deal with the complexity that results from combining many complex states and to approximate the function, we need to apply Reinforcement Learning (RL) algorithms to Deep Neural Networks (DNNs). These networks are also known as Deep Q-Networks (DQNs).

For this type of training, the best neural networks to use are those called “Convolutional Neural Networks”. Throughout the history of Deep Learning, these networks have proven to be architectures that behave excellently when recognizing and learning patterns in images, as in the case of this White Paper. 

The Deep Neural Network takes the pixel values of the frames it receives as its input data. Generally, a Deep Neural Network begins with a layer whose dimensionality is close to that of the input data, and ends with a layer that reduces the number of dimensions to the size of the action space. The representations of the inputs become more abstract as they go deeper into the architecture, finishing in a dense final layer with a number of outputs equal to the action space of the environment (4 in the case of Breakout-v0, 6 for SpaceInvaders-v0).

By having an architecture with several layers, we are able to extract structures that are difficult to identify in complex inputs. One should carefully choose the number of layers and the dimensionality of the architecture. As we will show later, using an increased number of layers in a model can become counterproductive if we are seeking to optimize the training time.

Although one would normally place pooling operations between the convolutional layers, note that in this case they have not been included between the first three layers. This is because, when pooling is included between convolutional layers, the representations that the model learns do not preserve spatial information, since it becomes difficult for the network to determine the location of an object in the image. This characteristic is helpful when the location of the object in the image is not particularly important (as is the case when classifying images). However, in our case, the relative location of the objects in the game is a vital factor when determining which action to take in order to maximize the reward.

For Breakout-v0, we use an architecture with several convolutional layers, each of which takes the output of the previous layer and applies a rectified linear unit (ReLU) activation function:

  • The first convolutional layer is made up of 16 filters with a 3×3 kernel and a stride of 2
  • The second convolutional layer has 32 filters with a 3×3 kernel and a stride of 2
  • The third convolutional layer increases to 64 filters with a 3×3 kernel and a stride of 1

We then included the following layers, with the same activation as the convolutional layers (except for the final layer):

  • A dense, fully-connected layer of 1024 units
  • Another dense, fully-connected layer of 516 units
  • A final output layer with 4 units (one unit for each possible action)

For SpaceInvaders-v0, we used a convolutional architecture similar to the one used in Breakout-v0:

  • The first convolutional layer is made up of 16 filters with a 3×3 kernel and a stride of 2
  • The second convolutional layer has 32 filters with a 3×3 kernel and a stride of 2
  • The third convolutional layer increases to 64 filters with a 3×3 kernel and a stride of 2

With the same activation function as the convolution layers (except the final layer), we also have:

  • A dense, fully-connected layer with 516 units
  • A final output layer with 6 units (one unit for each possible action)

The hyperparameters of this model are the same as those used for the Breakout-v0 environment.
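As a rough sketch (not the exact code used in the project), the Breakout-v0 architecture described above could be written with TensorFlow's Keras API as follows; the input shape depends on the chosen pre-processing and is only an assumption here:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_breakout_dqn(input_shape=(80, 80, 4), n_actions=4):
    """Sketch of the DQN described above: three convolutional layers followed by
    two dense layers. The input shape (80x80 pixels, 4 stacked frames) is an
    assumption; it depends on the pre-processing strategy applied to the frames."""
    model = models.Sequential([
        layers.Conv2D(16, kernel_size=3, strides=2, activation="relu",
                      input_shape=input_shape),
        layers.Conv2D(32, kernel_size=3, strides=2, activation="relu"),
        layers.Conv2D(64, kernel_size=3, strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(1024, activation="relu"),
        layers.Dense(516, activation="relu"),
        layers.Dense(n_actions, activation="linear"),  # one Q-Value per action
    ])
    return model
```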

Logic of the Training of the Agent

Broadly speaking, the algorithm we have used follows the following steps:

  1. Initialize all the Q-Values to values close to zero. This generates the model by calling the neural network class that is dedicated to estimating the Q-Values for a given image of the game.
  2. Generate a game state from the class dedicated to preprocessing the images of the game environment. The structure of this class depends on the chosen strategy. If a single image of the environment is used, it is impossible to determine the direction of movement or the speed of the ball and the paddle. One alternative is to include in the game state a second processed image showing the traces of the most recent movements in the game environment. Another alternative is to stack the previous four images of the game into the state, which allows the agent to deduce the direction, velocity and acceleration of the elements in the game environment. Regardless of the chosen strategy, the aim of this class is to generate a state with which to "feed" the model in order to obtain the Q-Values.
  3. Once the input is generated, it can be introduced into the model.
  4. Take either a random action with probability epsilon (ϵ) or one based on the highest Q-Value. This control policy is defined in the EpsilonGreedy class.
  5. Add the state obtained in the second step, the action taken, and the reward obtained to the ReplayMemory.
  6. When the ReplayMemory is sufficiently full, work back through the memory and update all the Q-Values according to the rewards obtained.
  7. Carry out an optimization of the model by taking random batches from the ReplayMemory in order to improve the estimation of the Q-Values. This prevents overfitting during the initial phases of the training, and guarantees an efficient mapping of all the possible states of the environment.
  8. Save a checkpoint of the model.
  9. Introduce a recently pre-processed image into the model and repeat from step 3).
Figure 1: Logic of the agent during the training stage in OpenAI Gym
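In simplified pseudocode (the `preprocessor`, `epsilon_greedy`, `replay_memory`, `model_optimizer` and `saver` objects stand in for the classes described in the steps above; their exact interfaces are assumptions, not the project's actual code), the loop could look like this:

```python
import gym

env = gym.make("Breakout-v0")
state = preprocessor.reset(env.reset())             # step 2: build the initial game state

for step in range(num_training_steps):
    q_values = model.predict(state[None, ...])[0]   # step 3: feed the state to the model
    action = epsilon_greedy.choose(q_values)        # step 4: random or greedy action
    frame, reward, done, _ = env.step(action)
    next_state = preprocessor.process(frame)        # pre-process the new frame
    replay_memory.add(state, action, reward, done)  # step 5: store the transition

    if replay_memory.is_sufficiently_full():        # steps 6-7: update Q-Values and optimize
        replay_memory.update_all_q_values()
        model_optimizer.optimize(replay_memory.random_batch())
        saver.save_checkpoint(model)                # step 8: save a checkpoint

    # step 9: feed the next pre-processed state and repeat from step 3
    state = preprocessor.reset(env.reset()) if done else next_state
```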

Results of the Training

In order to have a point of comparison for our agent, we took as a reference an alternative training (with a modified architecture) carried out following the first approximation in [4]. That model was trained for up to 1.2e8 episodes and achieved very good agent performance.

For the second approximation of the input to the model, we proceeded to log, for each episode of the training, the following information:

  • the reward obtained for each episode
  • the average reward of the last thirty episodes
  • the evolution of the Q-Values estimated by the model, including the maximum value of the Q-Values; the minimum value of the Q-Values; the average estimation of the action; the typical deviation of the Q-Values

In the same way, we stored all the values of the relevant hyperparameters before starting to optimize the network:

  • ϵ
  • learning rate
  • the loss-limit allowed for each optimization of the network
  • the maximum number of epochs allowed to optimize the model within
  • the percentage of states in the ReplayMemory that produced a bad estimation of the Q-Values

The trend of the average Q-Values over the number of episodes is shown in the following graph, which presents the two models as separate lines so that they can be compared. As can be seen, the average Q-Values of the two models are similar, although they increase slightly faster when using the "4 Frames Stacked" input strategy. The same happens with the evolution of the average reward and the training time. The learning of both models is similar, and they end up converging towards the end of our training.

Figure 2: Graph showing the average score of the previous 30 episodes (Breakout-v0)
Figure 3: Average Q-Values (Breakout-v0)

In terms of the scores for SpaceInvaders-v0, one can see how the agent learns during the first 2e4 episodes, but from then on the average score remains around 300 points.

The following graph represents the average Q-Values plotted against the number of episodes. One can see a rapid growth up to 8e3 episodes. This trend of growth continues, but less rapidly after the first phase of the training.

Figure 4: The average score of the previous 30 episodes (SpaceInvaders-v0)
Figure 5: Average Q-values (SpaceInvaders-v0)

The graph above (figure 5) gives us a clearer idea of the difficulties that our agent had when learning to manage this environment. As can be seen, the evolution of the percentage of states from the ReplayMemory that led to an incorrect estimation of the Q-Values is not as good as was expected. At the beginning of the training, this error percentage reached 72% and it scarcely decreases as the agent explores the game. It is true that the relative drop from this point is drastic once the learning rate stabilizes and the decision-making policy becomes less random. However, the fact that the error rate does not fall below 50% does not inspire much confidence in the prediction capabilities of the model.

After letting our model train in the Breakout-v0 environment for almost 4e4 episodes, we considered the network sufficiently trained to proceed to test the capacity of the agent. We carried out 200 test episodes with ϵ = 0.05, aimed at minimizing the randomness of the actions taken. The maximum score that our AI obtained during the tests was 340 points, and the highest during the entire training was 361 points (obtained in episode 6983). These very high scores are achievable when our agent manages to open a tunnel in the layer of bricks; a strategy that would likely be used more often were we to extend the training. In the video below, you can watch the agent achieve its highest score.

https://www.youtube.com/watch?v=TNAyvU81wMQ

For SpaceInvaders, after leaving the model training for over 5e4 episodes, we decided to test the capabilities of the agent in SpaceInvaders-v0. We launched 200 test episodes with ϵ = 0.05, as in the case of Breakout-v0. The maximum score obtained was 715 points, while during the entire training the agent managed to score 1175 points in episode 15187. Below, you can see an example of what our agent was capable of achieving in this environment.

https://www.youtube.com/watch?v=D4WfTLJ0Q2I
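For reference, the test protocol used for both environments (200 episodes with ϵ = 0.05) could be sketched roughly as follows; the `preprocessor` and `model` objects are placeholders for the components described earlier, not the authors' exact code:

```python
import numpy as np

# Hedged sketch of the test phase: 200 episodes with epsilon = 0.05, so the agent
# almost always follows the greedy action suggested by the trained model.
test_scores = []
for episode in range(200):
    state, done, score = preprocessor.reset(env.reset()), False, 0
    while not done:
        if np.random.rand() < 0.05:                          # small residual exploration
            action = env.action_space.sample()
        else:
            action = int(np.argmax(model.predict(state[None, ...])[0]))
        frame, reward, done, _ = env.step(action)
        state = preprocessor.process(frame)
        score += reward
    test_scores.append(score)

print("max score:", max(test_scores), "mean score:", np.mean(test_scores))
```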

Conclusions

We have shown how it is possible to train an AI, or agent, in the Breakout-v0 environment generated by OpenAI Gym. After a long training period, making use of 4 consecutive frames as the input data, we have been able to achieve an acceptable score for an AI in this environment, even surpassing the average scores obtained by other well-respected models.

We have reached the conclusion that it is advisable to carry out the training on a machine with a fair amount of memory and with a GPU.

We decided to train the agent in the SpaceInvaders-v0 environment. For this final environment, we reduced the size of the architecture but followed the same input strategy that was used in Breakout-v0. In this case, the results were not bad, since the agent managed to score an average of 310 points, but the improvement over time was not as notable as it was in the case of Breakout-v0.

This brief project has left certain questions open, and some possible improvements have been noted, including:

  • It would be helpful to research more efficient architectures in order to accelerate the convergence of the solution. We have observed that accumulating an excessive number of layers slows down the training, despite equipping our machine with a GPU. One could explore the possibility of reducing the number of layers, in particular the dense layers at the end of the architecture
  • One could change the focus of the architecture and make use of a Double DQN (DDQN)
  • The modifications to the architecture and the values of the hyperparameters were minimal when testing the model in SpaceInvaders-v0. We ought not to forget that Space Invaders is a much more complex game than Breakout, where the number of events that can cost you a life is higher, and you also start with two fewer lives
  • One could explore new hyperparameter values, e.g. reducing the batch size, the discount factor, the starting point of the learning rate, etc.
  • One could investigate the effect of making the decay of the epsilon (ϵ) value slower
  • From the experience obtained in Breakout, it would be helpful to improve the control policy, removing the option to “FIRE” (action 1). This could save training time and avoid skewing the learning
  • For both environments, during the training, one could set a minimum percentage of actions that the agent may take
  • In SpaceInvaders-v0, one could eliminate the actions “NOOP”, “RIGHTFIRE” and “LEFTFIRE”, with the intention of improving the exploration of the environment and accelerating the learning process
  • One could attempt more aggressive pre-processing techniques, in particular for more complex games. One alternative that remains to be studied, and which could give good results and accelerate processing speeds (even allowing us to include a larger number of frames in an individual state), is the use of Principal Component Analysis (PCA) to speed up the Machine Learning algorithm. Applying it would allow us to drastically reduce the dimensionality of the input, making it possible to reduce the number of layers and the size of each layer in the neural network; a minimal sketch of this idea is shown below.
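As a purely illustrative sketch of that last idea (the frame buffer, image size and number of components are assumptions, not tested values), PCA could be applied to flattened, pre-processed frames with scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical illustration of the PCA suggestion: compress flattened, pre-processed
# frames (e.g. 80x80 = 6400 pixels) into a much smaller number of components before
# feeding them to a smaller, fully-connected network.
frames = np.random.rand(5000, 6400)            # placeholder for a buffer of observed frames
pca = PCA(n_components=128)
pca.fit(frames)                                # learn the principal components once

compressed_state = pca.transform(frames[:4])   # 4 stacked frames -> 4 x 128 values
print(compressed_state.shape)                  # (4, 128)
```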

Don’t miss out on a single post. Subscribe to LUCA Data Speaks.

You can also follow us on Twitter at: @Telefonica, @LUCA_D3 and @ElevenPaths

LUCATalk Recap: 6 challenges for Artificial Intelligence’s sustainability and what we should do about it

AI of Things    28 June, 2018
If you're interested in Artificial Intelligence, you already know that it has many benefits and successful uses. It can improve the diagnosis and treatment of cancer, optimize the management of natural disasters and catastrophes, improve education, and help with automatic translation. With more people embracing this technology, it's also important to be aware of the issues surrounding it, before we are faced with negative consequences.
When using AI and Machine Learning, one of its subfields, fair practice must be ensured to avoid problems.

You can find the complete webinar below, straight from our YouTube channel:

The 6 challenges discussed in depth during this session were: non-desired effects, liability, unknown consequences of AI automation, the relationship between people and robots, concentration of power and wealth, and intentional bad uses. Dr. Richard Benjamins covered each of these challenges, explained what is being done to deal with them, and gave some examples to illustrate their impact.

1. Non-desired effects


In one of the first examples, we saw how racial bias is one of the challenges that AI can present. In this particular example, an automatic system was used to rate two people charged with petty crimes as more or less likely to commit another one. The system rated an African American person as higher risk and a Caucasian person as lower risk, even though in this particular case the reality was the opposite. The conclusion was that some of the data used to train the ML algorithm was biased.

Another non-desired effect has to do with privacy. Current AI works with data, and if this data is private there can be risks, which you can read about in more depth here.

2. Liability
One of the major challenges regarding liability comes from autonomous, self-learning systems. These systems are designed to take decisions on their own and they learn over time; however, there may come a point when designers cannot predict what the system will decide, and if the outcome is negative, who is responsible for the consequences? An example of this is a self-driving car, which is an autonomous, self-learning system. One of the proposed solutions is to create a monitor for these systems, along with laws and regulation for them.

3. Unknown consequences of AI automation
One of the main concerns is the workplace: will we still have jobs? Many people fear that all jobs will become automated. Even though we cannot truly know what will happen in the future, a takeaway is that the nature of many jobs will change: some tasks will become automated rather than entire jobs, and new jobs will likely surface.

Taxation is another area that has received attention, specifically whether there will be enough tax collection when certain jobs disappear. Bill Gates suggested creating a robot tax, which was rejected, but discussions continue in order to find a solution so that welfare systems and governments still have enough funds to help their citizens.

Figure 1. The Japanese government hopes that by 2020 four in five care recipients will have support from robots.

4. Relation between People and Robots
The relationship between robots and people is a trending topic, as many fear that robots will take many of our tasks and jobs as mentioned above, but there are also positive sides to this relationship.
The first example is robot caretakers. In Japan, many hospitals have implemented robot caretakers, and robots that help elderly people feel less lonely and live happier lives. It has to be mentioned, however, that Japanese society is more advanced in the sense that it has had more contact with robots, has robot-themed cafés and restaurants, and sees this more frequently than other areas of the world. Someone even married a robot!

Many articles mention how people leave bad managers rather than bad jobs, which touches on the idea of robot managers. Some have suggested that, in order to avoid favouritism and bias at work, robots could become managers and help people stay in jobs they would otherwise leave. This is still under a lot of discussion, but the idea is out there.

5. Concentration of power and wealth
There are three main challenges when dealing with power and wealth: economic impact, the danger of bias, and AI as a service.

The first is economic impact: can other companies one day compete with the powerhouses of the US and China? In the US and China, a handful of huge companies hold massive amounts of data: Google, Amazon, Facebook, Apple and Microsoft in the US, and Baidu, Alibaba and Tencent in China. This makes competition hard, as companies with less data will not be at the same level, and wealth inequality will continue to grow.

The second is the danger of bias: no one knows whether the data coming from these companies is biased, or whether it is fully representative of all genders and populations. Amazon, for example, offers facial recognition to police departments; if the data is not representative and algorithms are trained with it, the results will be more negative than positive.

Last but not least, there is AI as a service that is still a black box: how do we explain its results, and what do we do about accountability?

6. Intentional bad uses

Any technology can be used for good, but also for bad, and malicious use is one of the risks of massively applying AI. The report The Malicious Use of Artificial Intelligence: Forecasting, Prevention and Mitigation identifies three areas of concern: digital security, physical security and political security. Cyber attacks on critical infrastructure, such as hacking into government systems, fall under digital security. Self-driving cars and autonomous drones fall under physical security, since without proper use and control they could be used as weapons if hacked. Mass surveillance and fake news are examples of political security.


It is important to understand these challenges, but also to remember that AI is used for good: it can help with natural disaster and catastrophe relief, and improve many processes. It is also key to distinguish marketing and advertising applications from critical systems when thinking about applying rules to all applications of AI. Receiving a biased response related to your gender or race is not the same as receiving an automated ad that does not correspond to your preferences.


One of the questions asked during the live Q&A was about AI in Spain and where Spain stands with regard to this technology, since France, the UK and the EU were mentioned during the webinar. At the moment, Spain is in the process of writing a "White Book" on AI, covering where the strengths of applying AI lie and what ethical issues surround it.

#CyberSecurityPulse: New proposal to adapt U.S. Marine Corps capabilities to the new times

ElevenPaths    26 June, 2018

The head of the U.S. Marine Corps wants to remodel his team. The Marine Corps is considering offering bonuses and other benefits to attract older, more experienced Marines to re-enlist and develop cybersecurity capabilities as well. The measure marks a historic change that could transform a force composed primarily of high school graduates. “It’s going to be a little bit older, a little bit more experienced because as much as we love our young Marines, we need a little more age because it takes time to acquire these kinds of skills”, General Robert Neller told defense leaders at a conference in San Diego.

The 2018 defense budget earmarked money for the Marine Corps to add 1,000 Marines, many of whom will work in cyberwarfare and electronics. The manipulation of the networks that control air defence operations, for example, could be equal to or more lethal than the firepower in the future. Extremists have also been able to use mobile technology and social media to recruit members and raise money to become a real threat.

The Marine Corps will open up these kinds of jobs this October. However, this new occupational field does not avoid the fact that it is subject to the rigors of physical training. On the other hand, the Marine Corps is also developing plans to recruit and retain professionals from the cyberspace in the reserve, and in May unveiled new badges for the enlisted troops and officers working as remote-controlled aircraft operators. “These measures are going to change the Marine Corps and the way we fight”, said Neller.

More information available at Marine Corps Times

Highlighted News

Apple just banned cryptocurrency mining on iOS devices


Apple has added new language to its App Store review guidelines related to cryptocurrency. Under the Hardware Compatibility section, Apple now states that “apps, including any third party advertisements displayed within them, may not run unrelated background processes, such as cryptocurrency mining”. As of late May, the only mentions of cryptocurrencies in the guidelines were that apps were allowed to facilitate such transactions “provided that they do so in compliance with all state and federal laws for the territories in which the app functions”. But Apple’s new policy seems to go beyond obviously abusive cases of surreptitious cryptocurrency mining. The guidelines ban any on-device mining—even if users deliberately download an app whose explicit purpose is to mine.

More information available at Arstechnica

Microsoft reveals which bugs it won’t patch


Microsoft has put out initial clarification around which bugs it will rapidly patch, and which ones must wait for a new product release – and which ones it won’t address at all. In a draft document posted online on Tuesday, the software giant laid out the criteria that the Microsoft Security Response Center (MSRC) uses when deciding what to patch and when. There are two litmus tests that broadly guide these decisions, as the company explained in the document: “Does the vulnerability violate a promise made by a security boundary or a security feature that Microsoft has committed to defending?”. And secondly, “does the severity of the vulnerability [as determined by Microsoft’s five-tier rating system] meet the bar for servicing?”. The “bar for servicing” in Microsoft parlance means that the flaw is rated Critical (i.e., allowing for remote code execution) or Important (privilege escalation, information disclosure, security bypasses and RCE), according to the document details. If the answer to both questions is yes, then the prescribed action is to issue a patch, either on Patch Tuesday or, in rare cases, in an out-of-band release. If the answer to either question is no, then the bug is relegated to back-burner status in most cases, with a fix coming in a subsequent release of the product or service.

More information available at Windows

News from the rest of the week

macOS still leaks secrets stored on encrypted drives

A macOS feature that caches thumbnail images of files can leak highly sensitive data stored on password-protected drives and encrypted volumes. The automatically generated caches can be viewed only by someone who has physical access to a Mac or infects the Mac with malware, and the behavior has existed on Macs for almost a decade. Still, the caching is triggered with minimal user interaction and causes there to be a permanent record of files even after the original file is deleted or the USB drive or encrypted volume that stored the data is disconnected from the Mac.

More information available at Objective-See

Google to fix location data leak in Google Home, Chromecast

Google in the coming weeks is expected to fix a location privacy leak in two of its most popular consumer products. New research shows that Web sites can run a simple script in the background that collects precise location data on people who have a Google Home or Chromecast device installed anywhere on their local network. “The only real limitation is that the link needs to remain open for about a minute before the attacker has a location. The attack content could be contained within malicious advertisements or even a tweet”, researcher told KrebsOnSecurity.

More information available at Krebs on Security

Android gets new anti-spoofing feature to make biometric authentication secure

Currently, the Android biometric authentication system uses two metrics borrowed from machine learning (ML): False Accept Rate (FAR), and False Reject Rate (FRR). In Android 8.1, they introduced two new metrics that more explicitly account for an attacker in the threat model: Spoof Accept Rate (SAR) and Imposter Accept Rate (IAR). As their names suggest, these metrics measure how easily an attacker can bypass a biometric authentication scheme. Spoofing refers to the use of a known-good recording (e.g. replaying a voice recording or using a face or fingerprint picture), while impostor acceptance means a successful mimicking of another user’s biometric (e.g. trying to sound or look like a target user).

More information available at the Google blog

Other news

North Korea’s new trojan is called Typeframe

More information available at US Cert

Google developer discovers a critical bug in modern web browsers

More information available at The Hacker News

Magento credit card stealer "Reinfector" allows reinfection of sites with malicious code

More information available at Security Affairs

Hackers steal $31 million from South Korean cryptocurrency exchange Bithumb

More information available at Bithumb

Lessons learned from The Cambridge Analytica / Facebook scandal

Richard Benjamins    25 June, 2018
It has now been some time since the Cambridge Analytica / Facebook scandal was first revealed, on March 17, 2018, by The Guardian and The New York Times. Much has been written in the press about the scandal since then. Cambridge Analytica has since closed its business, Facebook lost billions in market value, and Mark Zuckerberg was summoned to appear before the US Senate and the European Parliament to answer all kinds of questions about this case and about privacy at Facebook in general.

Part of the reason that the situation exploded into a scandal is that it might have influenced, in a so-far unknown way, the 2016 American elections and the Brexit vote. Facebook suspended the Canadian data firm AggregateIQ from its platform due to its involvement with Cambridge Analytica. Nobody knows yet whether this scandal will finally set off the privacy time bomb, with a profound impact on the data industry, or whether, some time after the storm, everybody will forget about it and life will go on as before.
Figure 1. The 2016 US Presidential elections were intertwined with Cambridge Analytica
But what is it exactly that happened and why has it become such a large scandal? And is what has happened exceptional? Or are similar things happening all the time, but they go by unnoticed?
In this post, we will analyse step by step what has happened and then compare this to how the Obama administration used Facebook during his 2012 campaign. We leave it then to the reader to make up his or her mind.
The steps leading to the scandal have been amply described, so we will just summarize it here.
·      In 2013, Cambridge University researcher Aleksandr Kogan and his company Global Science Research created an app that asked users questions in order to establish their psychological profile. The app also asked permission to access the users' Facebook information, including that of their friends. About 300,000 users reportedly agreed to use the app in return for a small economic compensation. Through those 300,000 opted-in users, Kogan got access to the information of tens of millions of users. Using data science, the 300,000 users allowed a relation to be established between Facebook data and psychological profiles, which was then extrapolated to the tens of millions of users. Those profiles could then be used for "political advertising" or for influencing voting behaviour.
·      Kogan reportedly sold the profiles of those tens of millions of users to Cambridge Analytica.
·      Trump’s campaign team hired the services of Cambridge Analytica to launch targeted Facebook ads to influence the voting behaviour for Americans against Clinton and in favour of Trump.
·      In March 2018, whistleblower Christopher Wylie, a former employee of Cambridge Analytica, revealed to The Guardian and The New York Times the activity of Cambridge Analytica and how it got access to the psychological profiles of tens of millions of American citizens, which were then used for the US elections.

Two other events are important in understanding what happened and what went wrong.
·      In 2014, Facebook changed its API so that the consent of an individual user no longer extended to their friends. That is, a user could still give consent for apps to access his or her personal information, but not the information of their friends. However, Facebook did not apply this policy retroactively.
·      In 2015, through a publication in The Guardian, Facebook learned about the Kogan/Cambridge Analytica relationship and its use for political influencing in Senator Ted Cruz's campaign against Trump for the Republican nomination for the 2016 US elections. Based on this publication, Facebook asked Kogan and Cambridge Analytica to delete the data, as it violated their T&Cs. Facebook claimed that both Kogan and Cambridge Analytica certified that the data had been deleted.
Figure 2. Tech stocks including Facebook’s, took a hit after the scandal came to light
What did Facebook do wrong?
In my opinion, Facebook made two mistakes:
1.     The fact that, through their API, they gave access, after an opt-in, not only to a specific user's information but also to the information of that user's friends. It now seems strange that one person could give permission to access the personal Facebook data of 300 other persons, even if those 300 persons are "friends". In a world where "privacy was no longer a social norm" this may have seemed normal, but now we know it is not. Notice that this phenomenon comes back in the GDPR with the "right to data portability", as we will see later in this post.
2.     When Facebook learned from The Guardian about the data transfer from Kogan to Cambridge Analytica and asked both parties to delete all the data, it did not sufficiently check whether this had been done. Facebook was satisfied with a letter stating so and did not require more serious measures.
Where was the violation of the law?
The real violation of the law has been in Kogan transferring the data to Cambridge Analytica, and thereby violating the terms and conditions of the Facebook API.
Facebook and Obama’s re-election in 2012
While through this scandal and all the issues around Fake News the use of Facebook for influencing important world events is now questioned, Obama was praised for using Facebook and social media for his re-election campaign in 2012. However, while there are some differences – in the end, Obama didn’t violate the law – there are many commonalities.
People who wanted to contribute to Obama's re-election were encouraged to organize and/or notify all their activities by logging in on the Obama website or using Obama's campaign app, using Facebook Connect. This resulted in the person consenting to inject his or her personal Facebook data (home location, date of birth, interests, network of friends) into a central Obama campaign database, along with all the personal data of their friends. Once stored in the central Obama database, all this data was combined with other available voting data, so that the campaign could send targeted political ads to people it believed could be mobilised to vote for Obama.
While Obama was transparent to encourage people to log in onto the campaign website, or use the App, with Facebook connect, it remains to be seen whether the volunteering individuals were aware of what happened to their personal data, let alone to the personal data of their friends.
Obama exploited the same Facebook API as Kogan, which at that time was publicly available to any developer. Neither Obama nor the press anticipated the far-reaching impact this action had on people's privacy. Another question is whether they should have realized this… The press, in fact, praised Obama for pioneering a successful digital-first presidential campaign. But, as we have seen, while the data used for Trump's election was obtained illegally, the data for Obama's re-election was obtained in a legal way, complying with the T&Cs of Facebook's API.
Both campaign teams then used the Facebook ad platform to send targeted messages to clusters of voters. But there is also a difference in how Facebook was used. Using all the profiling information, Obama sent political ads on Facebook to clusters of people who the algorithms thought could be mobilised to vote for Obama, and many messages were sent by the supporters themselves. The Trump campaign team distributed targeted stories on Facebook to mobilise potential voters, but also distributed stories to discredit the opponent, Clinton, and some of those stories were claimed to be untrue (Fake News).
The table below summarizes the commonalities and differences between the use of Facebook in the Obama and Trump campaign.
Data | Obama campaign | Trump campaign | Comments
Consent for use in election | Through Obama App, or login on campaign website with Facebook Connect | No consent for this usage, but for scientific research |
Access to | Individual & friends' data | Individual & friends' data | Both Obama and Trump exploited Facebook's Open Graph API
Usage through Facebook's Ad platform | Centrally designed political ads and volunteering user messages | Centrally designed political ads and, reportedly, stories to discredit Clinton | There is debate on whether Cambridge Analytica spread "Fake News" to influence the elections

Lessons we should learn from this
As said, some consider that we are sitting on a privacy time bomb. Could this scandal be the bomb that changes the data industry forever? No one knows yet. On the positive side, this scandal has helped people and societies become more aware of the use of personal data for advertising; even though this particular scandal is related to political advertising, the techniques are similar to those of general online advertising. The lesson for people is that we must be more careful when granting consent for personal data usage to companies whose services are free: "If You're Not Paying For It, You Become The Product".
Another lesson we can learn is that people should not be able to give consent for the usage of data belonging to the people they communicate with. Consent should only be considered given if both parties agree. This is, however, easier said than done. For example, the GDPR gives citizens a new right to "data portability", whereby a user can ask any of his or her service providers for a copy of personal data, or can ask to transfer (port) that data to another organization. But what happens when third parties are included in this personal data? A bank transaction always includes an origin and a destination user or organization. Likewise, a telephone communication includes a caller (the user) and a callee (the destination). Is it allowed to port data on/about the "destination"? Or, in Facebook, if I port my Facebook data to LinkedIn, should I be allowed to convert my "friends" into "connections"? Or should the destinations (receiver of a transaction, callee, friends, etc.) be anonymized? Or asked for consent? The Information Commissioner's Office of the UK gives some advice, but this is not enough in case data portability happens massively.
Keep up to date with all things LUCA! Check out our website, and don't forget to follow us on Twitter, LinkedIn and YouTube.

Deep Learning vs Atari: train your AI to dominate classic videogames (Part II)

AI of Things    22 June, 2018

Written by Enrique Blanco (CDO Researcher) and Fran Ramírez (Security Researcher at Eleven Paths)
In this article, the second about our experiment using Reinforcement Learning (RL) and Deep Learning in OpenAI environments, we continue on from the previous post that you can read here if you haven’t done so already. This post presents the results obtained after training our agent in the Breakout-v0 and SpaceInvaders-v0 environments. Before continuing, you may want to also catch up on our recent webinar in which we went into more detail about the results you will read about in this blog.




Introduction



Reinforcement Learning (RL) is the area of Machine Learning that we use to train artificial intelligences to play videogames in environments developed in OpenAI Gym. It provides an agent with algorithms that allow it to examine and understand the environment it is working in, in order to achieve an objective in exchange for a reward. These algorithms help the agent learn, through trial and error, to maximize the reward it can obtain based on the variables it observes in the game, all without human intervention.

Below, we briefly define some of the common concepts of Reinforcement Learning (RL):

  • Environment: this describes the game in which the agent must act and learn to develop.
  • Reward: the incentive that the agent obtains after carrying out a determined action. In the case of Breakout-v0, the agent receives a positive reward when it manages to return the ball and destroy one of the bricks.
  • State: this is usually a tensor obtained from the observation space of the environment. In this case, the states consist of a collection of preprocessed images, with the aim of helping to train the model.
  • Action: this is a possible move in the action space that the agent can carry out, based on the current game state or the historic states that it has studied. For example, in our case it would be to move left, move right or stay still, and to shoot the ball.
  • Control policy: this determines how the agent chooses the action that it will take. The programmer can choose the control policy at the time of carrying out the training of the neural network. Normally, you can choose a random action to start with, and once the model trains sufficiently, it will act based on the maximum value that the model has obtained up to that point.

Figure 1: Diagram showing the learning process of an agent during the training.
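To make these concepts concrete, this is roughly what a bare interaction with an OpenAI Gym environment looks like; a minimal sketch with a purely random control policy, before any learning is involved:

```python
import gym

env = gym.make("Breakout-v0")           # the environment
observation = env.reset()               # the first raw observation (an RGB frame)
total_reward = 0

for _ in range(1000):
    action = env.action_space.sample()  # a random action from the action space
    observation, reward, done, info = env.step(action)
    total_reward += reward              # reward: positive whenever a brick is destroyed
    if done:                            # the episode ends when all lives are lost
        observation = env.reset()

print("Total reward:", total_reward)
env.close()
```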


Beginning the Training

    
The algorithm used in this project, which we will explain in the following sections, aims to maximize the reward at each step. The agent receives images of the game environment and feeds them into a neural network, which allows it to estimate the best action to take based on the input data. We will use the TensorFlow library to build the architecture of the deep network as well as to make the relevant calculations.

The values of the actions that the model estimates from a given input are normally referred to as Q-Values. When an agent knows these values beforehand, it only has to select the action that maximizes the corresponding Q-Value for each game state that it observes. However, these Q-Values have to be learned through an extensive training process, due to the large number of possible states that can occur.




Control Policy



At the beginning, the action values are initialized to zero, so the agent takes random actions in the game. Each time an action returns a positive reward (destroying a brick), the weights and biases of the model's layers are updated, which means that the estimation of the Q-Values becomes increasingly refined.

When approximating the map between states and actions, Reinforcement Learning techniques are often quite unstable when a deep neural network is used. This is due to the nonlinearity of neural networks and the fact that small changes in the Q-Values, under an inappropriate control policy, can drastically change the action taken and therefore lead to very different game states.

Due to all this, and with the aim of reducing the instabilities that could arise during training, one usually draws a random sample from a large number of states, actions and rewards, in order to explore the greatest number of possibilities of the current casuistry and avoid divergences and blockages in the model's training.

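As a minimal sketch of such a control policy (a hypothetical EpsilonGreedy class; the decay schedule and default values are assumptions, not the exact implementation used in the project):

```python
import numpy as np

class EpsilonGreedy:
    """Minimal sketch of an epsilon-greedy control policy with linear decay."""
    def __init__(self, epsilon_start=1.0, epsilon_end=0.1,
                 decay_steps=1_000_000, n_actions=4):
        self.epsilon = epsilon_start
        self.epsilon_end = epsilon_end
        self.step_size = (epsilon_start - epsilon_end) / decay_steps
        self.n_actions = n_actions

    def choose(self, q_values):
        # Decay epsilon a little on every call, down to its final value.
        self.epsilon = max(self.epsilon_end, self.epsilon - self.step_size)
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)  # explore: random action
        return int(np.argmax(q_values))               # exploit: best estimated Q-Value
```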

Q-Function

    
The objective of the agent is to interact with the emulator with the intention of learning which action to take in a given game state – or set of game states – in order to maximize the reward of said action.

A function that returns the optimum action given a certain game state is defined as:

Q(s, a) = reward(s, a) + γ · max_{a'} Q(s', a')

This function is known as the Bellman Equation. It states that the value of the Q function for a given state s and action a equals the current reward r for that state s and action a, plus the maximum expected future reward Q(s', a') over the actions a' available in the next state s', corrected by a discount factor γ ∈ [0, 1].

This discount hyperparameter allows us to decide how important future rewards are in relation to the current reward. Values close to γ ≃ 1 are better suited to Breakout, because the rewards are not obtained immediately after the action: various subsequent actions may take place before it becomes clear whether the initial action was successful or not. In other words, it takes several frames after bouncing the ball for a brick to break.
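In code, the Bellman target for a single transition could be computed as in the following sketch (the value of γ shown is only illustrative):

```python
def q_target(reward, next_q_values, done, gamma=0.97):
    """Bellman target for one transition: r + gamma * max_a' Q(s', a').
    A gamma close to 1 keeps future rewards (the brick broken a few frames later)
    almost as valuable as the immediate reward. The value 0.97 is only illustrative."""
    if done:                            # no future reward once the episode has ended
        return reward
    return reward + gamma * max(next_q_values)
```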

    

The Loss and Optimization Functions

 
 
Given the large number of frames per second to process, and the elevated dimensionality of the game states, it is impractical to directly map the causality between action and state. This forces us to approximate the Q function through our random sample of states, rewards and actions.


Usually, the loss function chosen aims to minimize the root mean-squared error between the Q-Values that we obtain using our model and the expected Q-Values.

sqrt(loss) = reward(s, a) + γ · max_{a'} Q(s', a') − Q(s, a)


In order to find the minimum of the previous function, one can use the iterative optimization algorithm known as Gradient Descent. This algorithm calculates the gradients of the loss function with respect to each weight and moves the weights in the direction that minimizes the function. However, finding the minimum of a nonlinear function can be complicated, especially due to the possibility of getting stuck on a local minimum rather than the global minimum that we want, or of carrying out many iterations on a flat part of the curve.
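A hedged sketch of this optimization step, using the Keras API of current TensorFlow rather than the exact code of the project, could look like this:

```python
import tensorflow as tf

# Sketch: mean squared error between the model's Q-Value estimates and the Bellman
# targets, minimized with plain gradient descent. The learning rate is illustrative.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)

def train_step(model, states, actions, targets):
    with tf.GradientTape() as tape:
        q_values = model(states)                                 # Q(s, a) for every action
        action_mask = tf.one_hot(actions, q_values.shape[-1])
        q_taken = tf.reduce_sum(q_values * action_mask, axis=1)  # Q-Value of the action taken
        loss = tf.reduce_mean(tf.square(targets - q_taken))      # squared Bellman error
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```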

Optimizing a neural network is a complicated task, which is highly dependent on the quality and quantity of the data with which the model is trained. The difficulty of optimizing the network is also a result of its architecture, which consists of a larger number of layers and has greater dimensionality than usual, and therefore requires a greater number of weights and biases.
    
One of the main deciding factors in a good training of the model, given the long computing times required, is the pre-processing of the image and the nature of the input to the neural network. This also directly affects the routines that need to be developed for interacting with the environment. In general, it is advisable to process the image generated by the Gym environment before it is fed to the model, aiming to reduce its dimensionality by eliminating information that is not useful for training the neural network. In particular, the colour information that OpenAI Gym returns in its three colour channels does not contain valuable information for training our model, and is therefore discarded before the states are introduced to the model.

The images returned by the OpenAI Gym environment are arrays of 210×160 pixels grouped in three RGB layers. This increases the memory usage. Therefore, it is vitally important to preprocess the images in order to reduce the dimensions of the inputs, to eliminate unnecessary information and to reduce memory usage.

The tests carried out in this project are based on two approximations regarding the processing of images:

  • As a first approximation, we take images of the game environment and process them: converting them to greyscale, resizing them, removing the background and applying a simple image filter to detect movement. The resulting state is the latest image of the environment together with recent traces of the movement of the objects.
  • In the second alternative, we have opted for using a stack of four images as the input, with the intention of allowing the model to learn to detect movement. This is necessary since an individual state offers little information about the velocity and direction of the ball and paddle.



We are only interested in the area of the game where the ball and paddle move and where the bricks are. The borders of the screenshots do not offer valuable information to the model, so we eliminate these areas. Furthermore, we reduce the resolution of the image by 50% and turn it black and white (on a binary scale), since the RGB channels offer little information of interest.
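A minimal sketch of this pre-processing and of the "four stacked frames" strategy might look as follows; the crop margins, threshold and output size are illustrative assumptions, not the exact values used in the project:

```python
import numpy as np
from collections import deque

def preprocess(frame, threshold=0.3):
    """Sketch of the pre-processing described above: crop the borders, drop the
    RGB channels, downsample by 50% and binarize. All values are illustrative."""
    cropped = frame[32:194, 8:152]                  # keep only the playing area
    gray = cropped.mean(axis=2) / 255.0             # discard the colour channels
    small = gray[::2, ::2]                          # reduce the resolution by 50%
    return (small > threshold).astype(np.float32)   # binary (black and white) image

# Second alternative: stack the four most recent frames into a single state.
frame_buffer = deque(maxlen=4)

def build_state(frame):
    frame_buffer.append(preprocess(frame))
    while len(frame_buffer) < 4:                    # pad at the start of an episode
        frame_buffer.append(frame_buffer[-1])
    return np.stack(frame_buffer, axis=-1)          # shape: (height, width, 4)
```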


In the next post, we will offer a description of the architecture of the model with which we have trained our agents in Breakout-v0 and SpaceInvaders-v0. We will also explain in greater detail the logic of the training, describe the testing phase, and offer some conclusions about the project.



Business Messaging – From humble SMS to conversational commerce

AI of Things    20 June, 2018

Written by Jenny Whelan, Head of Business Affairs – LUCA Advertising

We all know SMS. With 5 billion users, the reach of SMS as a messaging channel is unparalleled. Combined with its reliability, this is why brands continue to use SMS to connect with their customers: 2.21 trillion Application-to-Person (A2P) messages are sent per year.

A2P SMS is a big business and is still growing (~5% CAGR), forecast to reach $90bn by 2021

Historically, the SMS channel has been used for alerts, for Premium SMS (PSMS), as a carrier for simple content and services, and as a billing mechanism. With the transition to an app-based economy, many such use cases for A2P SMS have declined, offset by the emergence of alternative opportunities such as authentication, notifications, loyalty and awareness, and booking and delivery information, to name a few.

Figure 1: Infographic – SMS: The World’s Best Inbox.

Despite the ubiquity and reach of SMS, it does have some limitations, and new technologies have disrupted the messaging market. Messenger has 1.3 billion monthly active users (MAUs), WhatsApp boasts 1.2 billion MAUs, Slack has three million people sending messages every day, and iMessage, LINE and WeChat are also contributing to an upsurge of messaging users. These platforms were originally created as person-to-person (P2P) communication channels, but businesses are in the early innings of using them to engage customers and prospects. They offer more features for brands and customers to interact, giving rise to a whole new concept in messaging known as Conversational Commerce. Although it's early days, we are seeing huge adoption and growing ARPUs (Average Revenues Per User).

Figure 2: Average Revenue Per User of various messaging services (Source: Samsung).

WeChat, for example, has taken China by storm and is now a basic necessity for any business or individual there. It has 1 billion Monthly Active Users (MAUs) and had a monthly ARPU of approximately $7 in 2017.

New entrants and new ways of interacting between brands and customers mean that messaging is undergoing a significant transformation. Telecommunications companies (MNOs) are affected by this transformation, and they have an important role to play in shaping the future of messaging. MNOs have to evolve their A2P SMS business: to defend revenues, to provide solutions that both brands and customers want to engage through, and to ensure an accelerated move towards new business and increased revenue opportunities.

Rich Communication Suite (RCS) will provide an update to the SMS channel 

Figure 3: A comparison of SMS and Rich Communication Suites (RCS).

The RCS experience will involve more colour, more interaction, group chat, photo sharing, carousels and more. It will have chatbots so that customers can engage with the brands they want to speak to, as well as discovery functions to find companies to engage with.

According to a study conducted by Nielsen (in 2018), 67% of people say they will message with businesses more over the next two years, and 53% say they are more likely to shop with a business they can contact via chat.

Figure 4: Example of a Rich Communication Suite (RCS).

RCS is an operator service that will work on any RCS-enabled smart device or network, and will give customers the experience they’ve come to expect from OTTs, natively in their handset.

According to the GSMA, 50 operators have launched RCS in 40 markets so far, serving 159 million monthly active users. RCS launched across all operators in the Japanese market in May 2018 and we are seeing new launches constantly arise. 2019 will see the commercial launch of RCS into Business Messaging.

Figure 5: Press announcement from GSMA (Source GSMA).

We are living through a moment of rapid transformation in Business Messaging. MNOs have an important role to play in this transformation, shifting the humble SMS experience towards the engaging conversational commerce (enabled by RCS) that customers and brands have come to demand. Exciting times ahead, watch this space!

Don’t miss out on a single post. Subscribe to LUCA Data Speaks.

ElevenPaths Announces Strategic Security Alliance with Devo

ElevenPaths    14 June, 2018

Provides Telefónica Customers Advanced Cybersecurity Monitoring and Protection Services Through Devo Data Operations Platform.

Madrid, Thursday 14th of June, 2018. ElevenPaths, Telefónica's Cybersecurity Unit, specialized in the development of innovative security solutions, today announced a strategic alliance with Devo, the Data Operations Company formerly known as Logtrust. The Devo Data Operations Platform will provide ElevenPaths with end-to-end security monitoring and data management services to rapidly detect and effectively track cyber incidents, and to reconstruct incident timelines with all forensic evidence securely stored in a central location. In addition, ElevenPaths services such as Clean Pipes and Web Application Firewall (WAF) will feature embedded applications built on top of the Devo platform for developing customer portals with real-time enriched dashboards.
The partnership between ElevenPaths and Devo provides Telefónica customers with access to a suite of security services that will help them to comply with security and privacy regulations, including GDPR, and store customer data securely in Telefonica’s  data centers.  Solutions include:
  • Security Monitoring – a self-managed service that provides customers with a comprehensive view of the security status of their cloud, IT and security resources, and enables monitoring through customized alert generation.
  • Data Management – a managed security service that enables Telefónica customers to collect, store, search and analyze in real-time any IT, security, network or application log data, regardless of its size or the retention period.
This partnership also enhances existing Telefónica security services, including WAF and Clean Pipes. Devo provides an extra visibility layer to the customer for these Telefónica security services, which control traffic, enable web filtering to limit exposure to malicious content and implement corporate internet usage policies, and protect against threats and unauthorized access. Customer portals for these services have been developed over the Devo platform, providing customers with real-time dashboards and reporting. 
“Devo complements our existing security services, allowing us to provide Telefónica customers with real-time enriched customer portals to improve their overall security status,” said Pedro Pablo Pérez, Telefónica Global Security VP & ElevenPaths CEO. “Providing security teams with real-time operational intelligence, significantly accelerates their security analysis and forensics to address today’s rapidly expanding attack surface.”
Devo solutions delivered through ElevenPaths enable Telefónica customers to manage and retain full visibility into every aspect of their security, helping them achieve their business security objectives, keep operational costs predictable, and enable IT teams to stay ahead of security issues. 
“We are excited about the partnership with ElevenPaths and the advanced security capabilities we can offer together,” said Pedro Castillo, Founder and CTO at Devo. “Combining Telefónica service offerings with the Devo Data Operations Platform provides customers with industry-leading security capabilities in an easy-to-use and easy-to-consume delivery model that scales to meet the growing performance and data demands of the world’s largest enterprises to more effectively secure their businesses.”
About Devo
Devo, formerly Logtrust, is the leading Data Operations Platform for the digital enterprise. Devo delivers real-time business value from analytics on streaming and historical data to help Fortune 1000 enterprises drive sustained performance and growth. The Devo Data Operations Platform collects, enhances and analyzes machine, business and operational data from across the enterprise. Devo provides real-time analytics and insight for IT operations, security analytics, business analytics, customer insight and log management for the world’s leading organizations. For more information visit www.devo.com, or follow Devo on twitter @devo_inc.
Press Contact
Ann Dalrymple
About Telefónica
Telefónica is one of the largest telecommunications companies in the world by market capitalization and number of customers with a comprehensive offering and quality of connectivity that is delivered over world class fixed, mobile and broadband networks. As a growing company it prides itself on providing a differential experience based both on its corporate values and a public position that defends customer interests. The company has a significant presence in 21 countries and over 322 million accesses around the world. Telefónica has a strong presence in Spain, Europe and Latin America, where the company focuses an important part of its growth strategy. Telefónica is a 100% listed company, with more than 1.5 million direct shareholders. Its share capital currently comprises 4,975,199,197 ordinary shares traded on the Spanish Stock Market and on those in London, New York, Lima, and Buenos Aires.

More information:
telefonica.com
@Telefonica
pressoffice.telefonica.com

About ElevenPaths

At ElevenPaths we believe in the idea of challenging the current state of security, an attribute that must always be present in technology. We’re always redefining the relationship between security and people, with the aim of creating innovative security products which can transform the concept of security, thus keeping us one step ahead of attackers, who are increasingly present in our digital life.

More information:
www.elevenpaths.com
@ElevenPaths
blog.elevenpaths.com

A world-champion IoT

Beatriz Sanz Baños    13 June, 2018

Are you ready for the soccer World Cup? If not, you should probably start getting ready: this event, which along with the Olympics is considered the most important sporting event in the world, is drawing near.

But the spectacle isn’t limited to just sports. For some time now, this kind of event has been surrounded by a range of industries which enhance its development, and technology is a good example. Today, any event of this kind is also a veritable technological extravaganza.

And within this trend, the Internet of Things is one of the stars of the World Cup, as we show in our infographic on the IoT and the King of Sports. We invite you to take a tour and learn about several of the IoT innovations that will appear in this year’s World Cup.

1.- The entire stadium, connected

Soccer is the main event, but an entire exhibition of this kind requires external technological assistance. You are already familiar with the most frequent case: the goal line technology which FIFA uses to analyze whether or not a goal was scored.

And if you’re a romantic, this other advance may seem like an aberration, but it’s also essential: the famous video assistant referees (VARs) not only help referees to do their jobs but will also end up directly benefitting players, clubs, and fans themselves, since they will be able to enjoy the game without any outside element getting in the way of the celebration.

Furthermore, more and more huge events like this one are being held, and security at them is increasingly important. With alerts activated should any problems arise, stadiums hosting football games have to be completely prepared. Some stadiums, in fact, already have facial recognition technology in order to identify possible international security dangers or risks.

In Spain, we have one clear example: the stadium of Atlético de Madrid, the Wanda Metropolitano, which is the first 100% connected stadium in Europe with technology from Telefónica. This enables it to have communications and connectivity infrastructure, Ribbon board 360 technology, safety at the entrances, anti-intrusion systems, a multi-service network, and connection access points, among many other features.

2.- Wearables: Innovation on the jerseys themselves

For some time now, the players’ jerseys have ceased being just pieces of fabric. Today, jerseys, interior accessories and even bracelets are beginning to become powerful technological weapons that improve all aspects of the game.

Things started with training, when football teams began to wear jerseys that were specially designed to measure countless parameters, such as players’ body temperature, top speed, acceleration, heart rate, pulse, level of hydration and sweating… This veritable army of technology tools helps teams analyze their players’ performance and helps players reach their peak. One early example can be found at Villarreal, one of the pioneering teams in the world to use technologies that all the major clubs and national teams use today.

But in no way does this stop with training. FIFA is already implementing and developing a range of technologies which not only help with refereeing but can also be applied during the games to help measure an entire series of parameters among the players themselves.

There are other examples of connected devices that can be present during a football game. Propelland, a Spanish tech company headquartered in Silicon Valley, knows a lot about this. It has a smart bottle that not only adapts to the way it is used but is also capable of measuring parameters like the players’ hydration level. In fact, this technology was already used at the World Cup in Brazil.

3.- The apps that see it all

Gathering vital statistics such as speed, heart rate and blood parameters from players’ bodies is essential, but the data aren’t worth much unless they are used and processed correctly.

To do so, more and more clubs are using apps connected to wearables, which transmit the data gathered and then store, process and analyze them to draw conclusions that help plan for the future. The classic examples are apps that use technology quite similar to what we ‘mortals’ have in our fitness bracelets, applied to improving soccer players’ performance.

But there are examples that go much further: if coaches are capable of predicting the situations in which one of their footballers is fastest or reaches peak speed or top endurance, they can decide whether it is best to play them from the start, in easy games, in competitions that are extremely physically taxing, at key junctures like relieving their teammates, etc.

The AI Hunger Games – Why is modern Artificial Intelligence so data hungry? (Part II)

AI of Things    12 June, 2018
Guest Post written by Paulo Villegas – Head of Cognitive Computing at AURA in Telefónica CDO
Modern AI can achieve the impressive performance in perception and recognition tasks mentioned in Part I because of advances in several areas: algorithm improvements (especially in the area of Deep Learning), increases in computing power (cloud computing, GPUs) and, very notably, the Internet.

The Internet is what made it possible to amass the 14 million example images available in the ImageNet database mentioned in the previous part. In the case of supervised learning, it makes available the millions of annotated examples that algorithms need in order to learn. It gives a new perspective to Newton’s quote “standing on the shoulders of giants”, transforming it into “standing on the sources of millions of dwarfs”.
Another example is given by the famous AlphaGo case. How did AlphaGo beat Go world champions? One answer is by training with far more examples than human masters can possibly manage in their lifetime. It used 30 million moves from 160,000 actual games in a database. Then it improved by using reinforcement learning (a branch of machine learning that helps systems learn by optimizing a reward function, in this case winning the game), playing against itself with, again, tens of millions of positions.
Figure 1. ImageNet alone has over 14 million examples of images
Its recent successor, AlphaGo Zero (or the even more recent AlphaZero, which can also play chess and shogi), seems to have overcome this restriction: it learns without data. The only data provided are the game rules, and then it uses reinforcement learning to improve its playing abilities. However, the way it works is by playing against itself: AlphaGo Zero played almost 5 million games against itself during its initial training. You could argue that this does not actually change the picture: what its creators have achieved is a very clever way of generating synthetic data (the games AlphaGo Zero plays against itself) to train it, but the amount of data needed is still huge. Nothing beats practice.
But although collecting great amounts of data is one of the reasons for the recent advances in Machine Learning, it can only take you so far: there is almost always a limit to the amount of training data we can obtain. And for certain tasks it is inherently difficult to come up with enough good examples. One way humans cope with that is by using transfer learning, by which knowledge learned in one task, domain or class of data can be reused in another context.
Figure 2. You have likely never seen a babirusa (Babyrousa celebensis) before, so there is no specific training in your brain for recognizing it. However, you do have generic training for recognizing other animal shapes and parts, so by watching (and remembering) this single image, next time you see another picture of a babirusa you will recognize it. Thanks, transfer learning. (Source: By Masteraah at German Wikipedia)
Deep Learning systems can use transfer learning too. A typical use is employing a pre-trained network (or parts of it) for a different task. For instance, a big deep net trained for image recognition commonly starts with a few convolutional layers, which together learn a representation of the input data (image) in a higher-level latent space. Additional layers then perform the desired task (e.g. classification). We could take the first layers of the trained net, hence taking advantage of the learned representation, stack new layers on top of them, and train the resulting network for a different task. Given that a big share of the new net has already been pre-trained in the original net, and assuming the representation is also good for the new task, the training time can be greatly reduced and far fewer examples are needed to achieve good results. We have therefore performed transfer learning from the original network to the new one, as the sketch below illustrates.
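As an illustration only (the original post contains no code), the following minimal sketch assumes a hypothetical 5-class target task: the convolutional base of a network pre-trained on ImageNet is frozen and reused, and only a small new classification head is trained on the new data.

```python
# Minimal transfer-learning sketch (illustrative; dataset and class count are hypothetical).
import tensorflow as tf

NUM_NEW_CLASSES = 5  # hypothetical target task

# Pre-trained convolutional base: its layers already encode a useful
# representation of natural images learned from ImageNet.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet", pooling="avg"
)
base.trainable = False  # freeze the transferred layers

# New task-specific head stacked on top of the reused representation.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_NEW_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(new_task_images, new_task_labels, epochs=5)  # a small dataset can suffice
```

Freezing the transferred layers is what keeps training cheap; unfreezing a few of the top convolutional layers afterwards (fine-tuning) is a common refinement when somewhat more data is available.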
This approach (using a pre-trained network for a new task) is already well established as a standard procedure for image classification, though it still requires a dataset of some size for the new task. Those requirements could be further reduced by using techniques such as meta-learning, in which the system learns the best procedure for learning the new task.
There is also a more extreme version of transfer learning called zero-shot learning. In this modality, it is possible to correctly identify classes without having ever seen a single instance of them. How can we achieve that? It may be possible by domain transfer (a variant of transfer learning). If I say that “a maned wolf is an animal similar to a fox but with unusually long legs”, then you may be able to identify which of the three animals in the following figure is a maned wolf without having seen a single one before.

Figure 3. Zero-shot learning: pick the maned wolf, please.
(Source image 1, Source image 2, Source image 3)
There are also machine learning techniques developed to perform zero-shot learning. Some of them employ the same domain transfer technique: by adequately mapping between visual features of objects and word semantic representations (extracted from text corpora), a system trained with dogs, frogs, fish, etc. but not with cats might be able to find cat instances by evaluating how “cat” relates to other terms in the word domain and mapping them to the visual domain, which is not very different from what a human would do. A toy sketch of this idea follows below.
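Purely as a hedged illustration (the word vectors and the image embedding below are invented for the example, and in practice would come from a text-corpus model and a visual encoder trained only on the seen classes), this toy sketch shows the nearest-neighbour search in a shared semantic space that such a zero-shot classifier performs:

```python
# Toy zero-shot classification sketch: assign an image to the class whose word
# embedding is closest to the image's embedding in a shared semantic space.
import numpy as np

# Hypothetical word embeddings for class names (in practice, taken from a
# model such as word2vec or GloVe trained on a large text corpus).
word_embeddings = {
    "dog":  np.array([0.9, 0.1, 0.0]),
    "frog": np.array([0.0, 0.8, 0.2]),
    "fish": np.array([0.1, 0.2, 0.9]),
    "cat":  np.array([0.8, 0.2, 0.1]),   # class never seen during visual training
}

def classify_zero_shot(image_embedding: np.ndarray) -> str:
    """Pick the class whose word embedding has the highest cosine similarity
    with the image embedding projected into the word space."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(word_embeddings,
               key=lambda name: cosine(image_embedding, word_embeddings[name]))

# Suppose the visual-to-semantic mapping (trained only on dogs, frogs and fish)
# places a cat photo near the "cat" word vector:
print(classify_zero_shot(np.array([0.79, 0.21, 0.12])))  # -> "cat"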
In summary, the overall learning process used by modern machine learning systems might rely on features not that far from what human brains do. And going beyond standard supervised learning, there is a growing set of tools and procedures, such as transfer learning or zero-shot learning (but also reinforcement learning or unsupervised learning), that might increase their capabilities for identifying patterns and entities in the reality around us, which is a great part (though by no means all) of our cognitive baggage as humans. At least concerning perception.

First post of this series: The AI Hunger Games: Why is modern AI so data hungry? (I)