Attention: Data leak! (In search of lost data)

Martiniano Mallavibarrena    3 November, 2022

We have been hearing about “data leaks” on a regular basis for years, both in the media and in our professional or even personal environment. The concept actually covers several different scenarios, but, in general terms, we could say that the consequences are similar and that the main lessons learned are common. In this article we are going to explain what kind of situations can provoke these leaks, their multidimensional impact and some best practices that can help us avoid these crises.

Doctrine and theoretical definitions aside, in this sector we tend to use the expressions “data leak” and “data breach” interchangeably to refer to situations where, for various reasons, a significant amount of data (it can be hundreds of gigabytes or even terabytes) belonging to an organisation ends up outside its control in terms of both privacy and location: the data is directly accessible on the Internet, put up for auction, or exposed on restricted-access sites that have no connection to the original organisation. Such situations are often referred to, in simplified form, as ‘data leaks’.

As an example, INCIBE (the Spanish National Cybersecurity Institute) defines this situation as: “the loss of confidentiality, so that privileged information is accessed by unauthorised personnel”.

Let us first look at the three main scenarios in which data leaks occur and then comment on the consequences that are common to all of them.

The first scenario: Negligence

For years now, the widespread use of cloud-based data storage services for organisations has led to an immense concentration of information in the form of millions of files classified in thousands and thousands of folders at international service providers of this type (the famous “OneDrive” or folders in “Sharepoint” or “Teams” are already part of many people’s routine).

Such services, combined with the latest generation of office applications, clearly make processing and sharing within workgroups easier, but at the same time they generate (unintentionally) a sense of overall security that is generally justified but does not extend to the classification of information (the digital labelling of a document as containing public, internal, classified or secret information). In some environments, this classification may occur automatically (e.g., if the system detects bank account details or credit card numbers, the document is classified as confidential without asking for confirmation), but this is not the most common scenario.
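To make the idea of automatic classification more concrete, the following is a minimal sketch of how a system might label a document as confidential when it finds a plausible credit card number or bank account (IBAN). It is an illustration under stated assumptions, not the way any particular product works: the function names, the two labels and the simplified patterns are all hypothetical. Checksums (Luhn for cards, mod-97 for IBANs) are used to reduce false positives from arbitrary digit strings.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
# Simplified IBAN shape: 2 letters, 2 check digits, 11 to 30 alphanumerics.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """Return True if the string passes the IBAN mod-97 check."""
    rearranged = iban[4:] + iban[:4]
    # Letters map to numbers: A=10 ... Z=35; digits stay as they are.
    numeric = "".join(str(int(c, 36)) for c in rearranged)
    return int(numeric) % 97 == 1


def classify(text: str) -> str:
    """Hypothetical labelling rule: 'confidential' if financial identifiers appear."""
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            return "confidential"
    for match in IBAN_RE.finditer(text):
        if iban_valid(match.group()):
            return "confidential"
    return "internal"
```

For instance, `classify("Card: 4111 1111 1111 1111")` returns `"confidential"` (this is a well-known test card number), while a meeting agenda with no such identifiers stays `"internal"`.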

A common example in many companies is that of hermetic systems containing highly sensitive financial and human resources information that “no one not entitled” can access and, on the other hand, dozens of files (almost always spreadsheets) with summaries of this information specially prepared for internal meetings and decision making that, unfortunately, are not usually classified or treated in a specific way beyond storing them in shared folders for restricted use.

Although this is not the only case, the most representative one is when we mistakenly share a folder with a client, auditor or supplier through an online storage service while the control measures are inadequate and/or the information is not correctly classified. In that case, the files (maybe tens or hundreds, maybe thousands) will be exposed on the Internet, and the probability that they end up for sale on the Dark Web or shared in bulk elsewhere is high.

Beyond the general consequences that we will see at the end of the article, in these specific cases the organisation usually ends up becoming aware of the problem. Although it is not unusual for disciplinary measures to be taken against specific individuals, most of the actions are aimed at deploying or reinforcing the use of specific platforms such as DLP (Data Loss Prevention) or, more broadly, SASE (Secure Access Service Edge).

The absence of proper classification of information in this type of situation (your manager asks you to review your team’s salary increases using a spreadsheet that is shared by email) undermines other automatic protection measures (such as DLP-type functions), which then have to fall back on various techniques (such as searching for patterns in files using machine learning) to try to maintain their level of effectiveness.
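The dependency between classification and automatic protection can be sketched as follows. This is only an illustration: the `Attachment` type, the label names and the decision rule are hypothetical, and the content inspection uses simple keyword and amount patterns as a stand-in for the machine learning techniques a real DLP platform might apply. The point is that a labelled file gets a reliable decision, while an unlabelled one depends on a heuristic guess.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Labels that should never leave the organisation (assumed naming).
BLOCKED_LABELS = {"confidential", "secret"}

# Crude content heuristics used only when no label is present.
SENSITIVE_TERMS = re.compile(r"\b(salary|salaries|payroll|bonus)\b", re.IGNORECASE)
AMOUNT = re.compile(r"\b\d{1,3}(?:[.,]\d{3})+(?:[.,]\d{2})?\b")


@dataclass
class Attachment:
    name: str
    text: str
    label: Optional[str] = None  # None means nobody classified the file


def looks_sensitive(text: str) -> bool:
    """Heuristic guess: HR-related wording plus several money-like amounts."""
    return bool(SENSITIVE_TERMS.search(text)) and len(AMOUNT.findall(text)) >= 3


def may_leave_organisation(att: Attachment) -> bool:
    """Decide whether an attachment may be sent outside the organisation."""
    if att.label is not None:
        # Reliable path: the decision follows the explicit classification.
        return att.label.lower() not in BLOCKED_LABELS
    # Fallback path: no label, so the tool has to inspect the content.
    return not looks_sensitive(att.text)
```

In this sketch an unlabelled salary spreadsheet is only blocked if the heuristics happen to recognise it, which is exactly the loss of effectiveness described above.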

The second scenario: Insider

Another case, less likely statistically, but more lethal in terms of impact, involves employees (or any internal staff) who deliberately act against the interests of the company. This is often referred to as an “insider”.

Disloyal employees, extorted by third parties or people with labour disputes can follow this behavioural profile and generate very significant damage to organisations when they calculatedly expose or steal (and then share/sell) data to the outside world (always seeking to maximise reputational or intellectual property damage, among others), again causing a data leak.

In this case, most of the comments of the previous scenario apply, both because of the possible ineffectiveness of DLP/SASE type platforms and the lack of strict control of information classification.

If the action can be attributed to particular individuals, the consequences are usually of a criminal nature, since offences such as those covered by article 197 of the Spanish penal code may apply. If they are not direct employees of the organisation, penalties, cancellation of service contracts, etc. may also be applied.

These types of leaks are not always known to the public or even to the organisation itself, although on occasion there have been cases of extortion in exchange for not publishing or selling the data (in the case of sensitive financial, human resources or intellectual property information, for example).

The third scenario: Security incidents

This is the best known and most common scenario, especially in incidents involving ransomware (where the victim’s data is encrypted and a ransom is demanded in exchange for a decryption mechanism). The actor compromises the organisation’s infrastructure, accesses certain volumes of data (not always sensitive; most of the time they seek volume in attacks that last a few days) and, before encrypting the data, exfiltrates it outside the organisation’s perimeter. While this practice is not common to all actors, many of them follow it, as it offers a second pressure factor for the payment of the ransom.

Once the malicious actor has exfiltrated a certain volume of data (the techniques for doing so are diverse and fall outside the scope of this article), it will usually take a few days (perhaps weeks) before the victim hears from the actor again. The data is almost always made public in one of the following ways:

  • Advance announcement of the future file sharing on some kind of “blog” (there are several famous “Happy blogs” by these actors). This seeks to increase the pressure on the victim, again aiming for the payment of the ransom.
    • If they announce it beforehand, they usually follow through: after some time they share the stolen data (on another page, usually in TOR to avoid police or judicial action), either a sample or the whole of it in successive deliveries.
    • If, in some cases, the actor publishes the data on websites on the “shallow Internet” (the surface web), the victim organisation or the law enforcement agency in charge of the case can usually have the content taken down by contacting the legitimate owners of the relevant web portal.
  • In other cases, with or without prior notice on a blog, the exfiltrated data appear on a TOR page either in “auction” mode (restricted access but the victim can see the auctioned object as a third measure of pressure) or in public access mode (mentioned above).

In all these cases, our organisation’s data (of any kind) can end up uncontrollably on the Internet.

The overall impact of information leaks

Thinking about the more general cases, a number of direct consequences of data leaks in organisations should be taken into account.

  • Legal consequences (the best known, though not necessarily the one carrying the heaviest sanctions, is the GDPR/LOPD line).
    • They apply to cases where it is certain or highly likely that personal data of EU citizens are held in such files.
      — In other regions, similar regulations of local or regional scope may apply (as far as their citizens are concerned), although not in the same way as the GDPR.
      — In all these cases there is a sanctioning regime that may be applicable (including financial penalties and disqualification from holding public office in cases where it applies).
    • Automated tools are usually necessary to analyse the hundreds of gigabytes or even terabytes of a leak and to characterise the type of data inside (which will be the focus of the data protection agency’s reasoning when deciding on the sanction, as discussed in the previous point).
    • Contractual or NDA issues: In many cases, these data leaks contain confidential information about private companies, audits or sensitive intellectual property. This type of situation is often associated with confidential contracts covered by an NDA (Nondisclosure agreement) which, if not respected, can lead to significant financial penalties, cancellation of contracts, etc.
  • Reputational damage: In the context of data leaks, it is obvious that many people visit TOR (or monitor it with automatic tools) and profit from these situations: either by commenting on social networks (they position themselves as experts), or by alerting third parties (almost always on commission), downloading the data and trading with them, etc. In all these cases, the situation will end up in the media and, depending on the case, perhaps in the press and on TV (with a very significant deterioration of brand image). Therefore:
    • Some organisations have been tempted to pay the ransom in a ransomware incident (or in a case of extortion by an insider) just to avoid this situation, even if they have a good recovery plan: severe reputational damage, disclosure of secrets, loss of trust of their main customers, etc., may be motivation enough.
    • Beyond the sensitive information that a data leak may contain, much other information (including personal files of users themselves at any level of the organisation) may end up being downloaded anywhere and by any individual or group, which should be taken into account again, perhaps, for communication measures, legal action with third parties, disciplinary action against clearly incompetent employees, etc.
      — In some anecdotal cases, the content of users’ personal files has been more “popular” than the actual leakage of data.
  • A mixed case that sometimes occurs is where the data leakage includes data from third party organisations. Then the leakage relating to one company A has a negative impact on others (B, C, D, etc.) which again leads to serious problems of the two previous types.
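The automated analysis of a leak mentioned in the list above can be sketched as a first-pass triage that walks the downloaded files and counts, per file, how many strings look like personal or financial data. This is a rough illustration only: the `triage` function, the pattern names and the regular expressions are assumptions, the patterns are deliberately simplified, and real tooling would also need to handle archives, office formats and databases.

```python
import re
from pathlib import Path

# Simplified, illustrative patterns; they will produce false positives.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "iban_like": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "card_like": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def triage(root: Path, max_bytes: int = 1_000_000) -> dict:
    """Return {relative file path: {pattern name: hit count}} for files with hits."""
    report = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Read only the first max_bytes so huge files do not stall the scan.
        with path.open("rb") as fh:
            raw = fh.read(max_bytes)
        text = raw.decode("utf-8", errors="ignore")
        hits = {}
        for name, rx in PATTERNS.items():
            count = len(rx.findall(text))
            if count:
                hits[name] = count
        if hits:
            report[path.relative_to(root).as_posix()] = hits
    return report
```

A report of this kind gives a quick view of which folders are likely to contain personal data, which is where legal and communication efforts would probably need to focus first.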

The summary of the article is clear: no organisation is free from the risks of such situations and therefore any organisation can be faced with a major data leak with press and TV coverage. Often the content of the leak is not fully known until it is shared by the actor and can be downloaded for analysis. Depending on the case, reputational or legal problems will be the most serious concerns.

A very complex situation in any case and a major risk that we all need to mitigate. We should not forget that.
