A few weeks ago we presented DIARIO, the malware detector that respects users' privacy, and we keep improving it so that it detects more threats, more accurately. We recently added the ability to detect malware in Office documents whose macros use a technique known as VBA stomping. What is this technique and why does it matter?
We already know that email attachments are one of the most popular entry routes for malware, Office documents in particular. This is largely possible because code can be programmed into Office document macros. There are several reasons why this technique is still in use two decades after it first appeared:
- Macros are easy to hide.
- Macros are legitimate. Even if they are disabled by default, it is easy for the user to enable them.
- They are harder to emulate in a sandbox.
- They are sent by email, so usually they are only analysed statically.
- The user does not think that a document or spreadsheet can be dangerous.
- It is still a very lucrative route for cyberattackers.
And even though so much time has gone by, innovation in this technique continues, and VBA stomping is proof of that. First, let's see what a “recent” macro consists of. Documents are nowadays .zip files, at least in the most recent versions of Office, and inside them we will find a binary file with the .bin extension.
The first thing to bear in mind is that this .bin file does not contain macros as such, but a whole system ready to be compiled and executed by Office itself. It can be compared to a Visual Studio project, where we have the source code, the definitions, the compiled code… The Office application in use, such as Word or Excel, has an engine for compiling and executing this code.
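As a minimal sketch of this first step, the following Python snippet lists the .bin parts inside a document by treating it as a ZIP archive. The helper names `find_vba_parts` and `build_demo_document` are hypothetical, and the demo archive contains placeholder bytes rather than a real VBA project. In Word documents the macro container is usually stored as `word/vbaProject.bin`, but other .bin parts (embedded objects, for instance) may also appear, so a match is only a hint.

```python
import io
import zipfile


def find_vba_parts(document):
    """Return the names of the .bin parts inside an OOXML document.

    `document` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(document) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".bin")]


def build_demo_document():
    """Build an in-memory stand-in for a macro-enabled document.

    The contents are placeholders, not a real VBA project.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/vbaProject.bin", b"\xd0\xcf\x11\xe0placeholder")
    buffer.seek(0)
    return buffer


print(find_vba_parts(build_demo_document()))  # ['word/vbaProject.bin']
```

With a real file, the same function can be called with a path, for example `find_vba_parts("sample.docm")`, where the file name is an assumption.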
In fact, within this .bin file, we can find the following (if we analyse it with the appropriate tools):
- PROJECT: a stream (file) that works like the configuration file.
- _VBA_PROJECT: a stream with instructions for the VBA engine. Largely undocumented.
- dir: a compressed stream that holds the layout of the project.
- Module streams of the type VBA/ThisDocument/NewMacros/…/__SPR_1/Module1, which contain the code to be executed. Each module stream is in turn composed of the PerformanceCache (the compiled code) and the CompressedSourceCode, which is the compressed source code of the macro.
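To illustrate what "compressed" means here, the following is a minimal Python sketch of the decompression scheme that the MS-OVBA specification describes for the CompressedSourceCode. It assumes the caller has already extracted the module stream and sliced it at the text offset recorded in the dir stream, so that `data` starts at the container signature byte. The function name `decompress_container` and the hand-built sample bytes are illustrative, and error handling is kept to a minimum.

```python
import struct


def decompress_container(data: bytes) -> bytes:
    """Decompress an MS-OVBA compressed container (sketch, minimal checks)."""
    if not data or data[0] != 0x01:
        raise ValueError("missing container signature byte 0x01")
    out = bytearray()
    pos = 1
    while pos + 2 <= len(data):
        # Chunk header: bits 0-11 hold (chunk size - 3), bit 15 the compressed flag.
        (header,) = struct.unpack_from("<H", data, pos)
        chunk_end = min(pos + (header & 0x0FFF) + 3, len(data))
        pos += 2
        if not header & 0x8000:
            # Raw chunk: the bytes are stored without compression.
            out += data[pos:chunk_end]
            pos = chunk_end
            continue
        chunk_start = len(out)
        while pos < chunk_end:
            flags = data[pos]
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not (flags >> bit) & 1:
                    # Literal token: one byte copied as-is.
                    out.append(data[pos])
                    pos += 1
                    continue
                # Copy token: its offset/length split depends on how many
                # bytes of this chunk have been decompressed so far.
                (token,) = struct.unpack_from("<H", data, pos)
                pos += 2
                written = len(out) - chunk_start
                bit_count = max((written - 1).bit_length(), 4)
                length = (token & (0xFFFF >> bit_count)) + 3
                offset = (token >> (16 - bit_count)) + 1
                if offset > written:
                    raise ValueError("copy token points before the chunk start")
                for _ in range(length):
                    out.append(out[-offset])
        pos = chunk_end
    return bytes(out)


# Hand-built container: literals "abc", then a copy token (offset 3, length 6).
sample = bytes.fromhex("01 05 b0 08 61 62 63 03 20")
print(decompress_container(sample))  # b'abcabcabc'
```

In practice, analysts typically rely on an existing parser for this step rather than reimplementing it; the sketch only shows why the source code is recoverable from the stream with little effort.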
What is all this for?
All this serves Microsoft’s obsessive backward compatibility. Let’s imagine that we create a document with macros in a recent version of Office, for example Word 2016. We create the macro and it is compiled into the project, but the source code is also stored with it. The person who receives the document may have Office 2016, in which case the compiled macro is executed directly because it is faster. But what if they open the document with Word 2003? Then, for compatibility, that engine takes the VBA source code of the macro, compiles it and runs it. This is why we find the source code of the macros “in the clear” in the documents.
Historically, this has been an advantage for those who analyse this type of malware: they can access the code effortlessly and analyse it more easily. Antivirus engines have even relied on this source code to classify samples. However, someone realised that a document could still infect if the compiled code was kept but the source code was deleted. And indeed it could. This technique of deleting the source code is VBA stomping, and it allows malware to go unnoticed with little impact on its ability to infect. Only users with unsupported or very old versions of the VBA engine (that is, of Office) would be spared from infection.
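One naive way to picture how a mismatch between the two copies could be spotted is to compare readable strings found in the compiled part with the stored source code. The Python sketch below does only that; it is not DIARIO's method, the function names are hypothetical, and the sample bytes are invented for illustration. Real detectors usually disassemble the compiled code instead of scanning for printable strings, so this heuristic should be expected to produce both false positives and false negatives.

```python
import re


def printable_strings(blob: bytes, min_len: int = 6) -> set:
    """Collect runs of printable ASCII characters of at least `min_len` bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {match.group().decode("ascii")
            for match in re.finditer(pattern, blob)}


def strings_missing_from_source(source_code: str,
                                performance_cache: bytes,
                                min_len: int = 6) -> list:
    """Return strings seen in the compiled part but absent from the source.

    A non-empty result is only a hint that the source may have been
    replaced or removed; it is not proof.
    """
    haystack = source_code.lower()
    return sorted(s for s in printable_strings(performance_cache, min_len)
                  if s.lower() not in haystack)


# Invented bytes standing in for a PerformanceCache (not real compiled code).
cache = b"\x00\x1b\x00powershell.exe\x00\xb9\x00Shell\x00"
source_intact = 'Sub AutoOpen()\n    Shell "powershell.exe"\nEnd Sub'
source_stomped = 'Sub AutoOpen()\n    MsgBox "Hello"\nEnd Sub'

print(strings_missing_from_source(source_intact, cache))   # []
print(strings_missing_from_source(source_stomped, cache))  # ['powershell.exe']
```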
The Evil Clippy tool already exists; it makes VBA stomping easy and automates all the necessary steps.
As can be seen, DIARIO already detects this type of document and displays the code even when this technique has been used: