Differences between encryption, hashing, encoding and obfuscation

Cristina del Carmen Arroyo Siruela    1 June, 2022

There is currently a lot of confusion about the terms encryption, encoding, cryptography, hashing and obfuscation. These terms are related to computer security, specifically to the confidentiality and integrity of data or information, except in the case of encoding and obfuscation.

Given the high importance of data and information, which are considered key elements in information systems, it is useful to know which mechanisms are available to protect them and in which cases one or the other should be used.

Cryptography, a methodology for information systems security

Cryptography is part of the field of cryptology, a science that is composed of fields such as cryptanalysis and steganography. Cryptography focuses on the study of the methods used to ensure that a message or information cannot be read by an unauthorised third party, i.e., to guarantee the confidentiality of information.

It is also used to prevent unauthorised access to and use of network resources, information systems, etc.

Cryptography is a methodology whose objective is to provide security in information systems and telematic networks, including among many of its functions the identification of entities, authentication and access control mechanisms to resources, the confidentiality and integrity of transmitted messages and their non-repudiation.

Message encoding

Encoding is the process of transforming data into a format different from the original. It is done using a public method, available to anyone, and in most cases following a widely used standard format.

An example is the American Standard Code for Information Interchange, known as ASCII. In this standard, alphabetic and special characters are converted into numbers. These numbers are known as the “code”.

Encoding is not used for security purposes: it only transforms the presentation of data from one format to another, without using any key in the process, and the same public method or algorithm is used to encode and decode the data or information.

This process arose from the need to transmit information, over the Internet for example, using standards that allow the data to be interpreted by different environments, programs and other elements.

Examples of encoding are ASCII, Unicode, Morse code, Base64 and URL encoding.
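
As a minimal sketch of this idea, the following Python snippet applies three of the schemes listed above to sample strings. The sample text and variable names are assumptions chosen for illustration; the point is that decoding needs no key, only knowledge of the public scheme.

```python
import base64
from urllib.parse import quote, unquote

text = "Hola, señor!"  # illustrative sample text

# ASCII/Unicode: each character maps to a number (its code point).
code_points = [ord(ch) for ch in "Hi!"]

# Base64: the bytes are re-expressed using 64 printable characters.
b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")

# URL encoding: characters that are unsafe in URLs become %XX escapes.
url = quote(text)

print(code_points)  # [72, 105, 33]
print(b64)
print(url)

# Anyone can reverse the transformation: no key or secret is involved.
print(base64.b64decode(b64).decode("utf-8"))
print(unquote(url))
```

Because the reverse step is available to everyone, none of these formats offers any confidentiality.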

Using mathematical functions: hashing

Hashing is the cryptographic process by which a fixed-length string of characters, known as the digest, is obtained from the input data through a mathematical function. This mathematical function, or hash function, is at the core of the algorithm and is capable of transforming any arbitrary block of data into a character string with a fixed length.

The resulting string always has the same length, regardless of the length of the input data, as long as the same hash algorithm is used. Examples of hash functions are MD5, SHA-1 and SHA-256, although MD5 and SHA-1 are no longer considered collision-resistant.

Depending on the input and on the hash algorithm applied (SHA-1, for example), the digest or output will differ: even a small change in the input produces a completely different digest.

If we were to use SHA-256 instead, the output would always have a fixed length of 256 bits (64 hexadecimal characters), independently of the length of the input, although the digests themselves would be totally different.
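
The fixed-length property can be checked with a short Python sketch based on the standard hashlib module. The sample inputs and the helper name digest_lengths are assumptions made for illustration.

```python
import hashlib

def digest_lengths(data: str) -> tuple:
    """Return the hex digest lengths of SHA-1 and SHA-256 for the given text."""
    raw = data.encode("utf-8")
    return (len(hashlib.sha1(raw).hexdigest()),
            len(hashlib.sha256(raw).hexdigest()))

# Inputs of very different sizes (illustrative values).
for sample in ["a", "hello", "a much longer input " * 50]:
    # SHA-1 always yields 40 hex characters, SHA-256 always yields 64.
    print(len(sample), digest_lengths(sample))

# A one-character change in the input gives an unrelated digest.
print(hashlib.sha256(b"hello").hexdigest())
print(hashlib.sha256(b"hellp").hexdigest())
```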

To consider that a hash function is secure, it must meet these 3 properties:

  • Collision resistance: It must be infeasible to find any two different inputs that produce the same hash as output.
  • Pre-image resistance: It must be infeasible to “reverse” the hash function, i.e., to find an input from a given output.
  • Second pre-image resistance: Given a specific input, it must be infeasible to find a different input that produces the same hash.
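
Why the output length matters for collision resistance can be illustrated with a deliberately weakened hash. The sketch below keeps only the first 2 bytes of a SHA-256 digest, an assumption made purely for demonstration (short_hash and find_collision are hypothetical helper names), and finds two different inputs with the same truncated digest almost instantly. With the full 256-bit output, the same search is computationally infeasible.

```python
import hashlib
from itertools import count

def short_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Weakened hash for demonstration: first nbytes of SHA-256."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 2):
    """Try the inputs b"0", b"1", b"2", ... until two share a truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode("ascii")
        digest = short_hash(msg, nbytes)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg

first, second = find_collision()
print(first, second, short_hash(first).hex())
```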

Hash functions have many use cases, including the following:

  • Specific searches for information in large databases.
  • Analysis of large files and data management.
  • In message authentication, digital signatures and SSL/TLS certificates.
  • Generation of Bitcoin addresses from keys, and proof of work in the mining process.
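
As one hedged illustration of the message authentication use case, the sketch below uses HMAC from the Python standard library, a keyed construction built on top of a hash function. The key value and the helper names sign and verify are assumptions for this example only.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-secret-for-demo-only"  # hypothetical key, never hard-code real keys

def sign(message: bytes, key: bytes = SECRET_KEY) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), tag)

tag = sign(b"transfer 100 EUR to Alice")
print(verify(b"transfer 100 EUR to Alice", tag))  # True
print(verify(b"transfer 900 EUR to Alice", tag))  # False: message was altered
```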

What is data encryption?

Data encryption is the process of converting text or data in readable form (plaintext) into unreadable text or data, known as encrypted output or ciphertext.

Encryption is based on the application of an algorithm using a key or master key that allows the transformation of the structure and composition of the information to be protected, in such a way that, if this information is intercepted by a third party, it cannot be interpreted or understood, i.e., it is unreadable.

When data has been encrypted, only those who have the key that allows decryption will be able to carry out that action, allowing access to the data in a readable format.

This mechanism is therefore focused primarily on protecting confidentiality.

The use of complex cryptographic keys makes such encryption more secure, since it makes brute-force and other attacks against them harder to carry out.

The 2 most common encryption methods are symmetric encryption and asymmetric encryption. The names refer to whether or not the same key is used for encryption and decryption:

  • Symmetric encryption keys: Also known as single key encryption. Its main characteristic is the use of the same key for both encryption and decryption, making this process more convenient for users and closed systems.

On the other hand, the key must be available to all parties involved and distributed through secure mechanisms. This increases the risk of compromise if a third party such as a cybercriminal intercepts it, which is why the usual practice is to encrypt the symmetric key itself with an asymmetric key before exchanging it. Symmetric encryption is faster than asymmetric encryption.

  • Asymmetric encryption keys: in this type of encryption, 2 different keys (public and private) mathematically linked together are used. The keys are basically large numbers linked together, but they are not identical, hence the term “asymmetric”.

The owner keeps the private key secret, while the public key is shared with authorised senders or made available to the general public. Encryption is therefore carried out with the recipient’s public key, and decryption with the recipient’s private key.
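
The difference between the two methods can be sketched with toy code. Everything below is a simplified illustration under stated assumptions, not a secure implementation: the symmetric part is a home-made XOR stream cipher whose keystream is derived from SHA-256, and the asymmetric part is textbook RSA with tiny primes and no padding. All function names and values are hypothetical; real systems should rely on vetted algorithms and audited libraries.

```python
import hashlib

# --- Symmetric (toy): the same key encrypts and decrypts ---
def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (demo only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = b"demo key"  # hypothetical shared secret
ciphertext = xor_cipher(shared_key, b"card number 1234")
print(ciphertext.hex())
print(xor_cipher(shared_key, ciphertext))  # b'card number 1234'

# --- Asymmetric (toy textbook RSA): two mathematically linked keys ---
p, q = 61, 53                # tiny primes, for illustration only
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

def rsa_encrypt(m: int) -> int:
    """Encrypt with the public key (e, n)."""
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)

c = rsa_encrypt(65)
print(c, rsa_decrypt(c))     # the decrypted value is 65 again
```

In the symmetric half, both parties must already hold shared_key; in the asymmetric half, only (e, n) needs to be published, while d stays with the owner.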

Encryption is used in many cases, some of which include the following:

  • Encryption of voice communications.
  • Encryption of banking and credit card data.
  • Database encryption.
  • Digital signatures, for verification of the authenticity of the origin of the information.

Obfuscation

The purpose of obfuscation is to make something harder to understand, usually so that it is more difficult to attack or copy.

This mechanism is commonly used to obfuscate the source code of an application in order to make it more difficult to replicate a given product or function. It is not a strong security control, but it is an obstacle that makes the code less readable and reverse engineering more difficult.

It is often reversible, like encoding, using the same technique that was used to obfuscate. Other times, reversing it is simply a manual process that takes some time.

Some applications that help with this process, although doing it manually is always recommended, are JavaScript Obfuscator and ProGuard.
