Will Rust save the world? (I)

David García    4 May, 2023

The issue

Programming bugs related to memory usage take a big slice of the CVE (Common Vulnerabilities and Exposures) pie every year.

It has been many years since Elias Levy (then known by his handle Aleph One) published his classic and historic article, Smashing the Stack for Fun and Profit, which set out in black and white how stack overflows could be exploited, in an era when available information was scarce and not exactly easy to access.

It was soon realised that memory management required attention and expertise that not all programmers possessed. When you create an object using dynamic memory (malloc), you have to keep track of its “lifetime” and free it when it is no longer useful so that the memory resources it owns can be reused.

Memory management requires attention and expertise that not all programmers possess.

This scheme is simple to describe, but devilishly complex to put into practice once a program goes beyond the adjective “simple”. The reason is easy to understand: when an object is created, it does not live in isolation from the rest of the program; it tends to accumulate numerous references (pointers) and establish relationships (sometimes multi-layered) that eventually lead us to ask:

  • If we release the object, will some reference elsewhere still need it?
  • What happens if we use an object that has already been released?
  • What happens if we mistakenly release the same object twice?

Garbage collectors: a solution with drawbacks

In the mid-1990s, a programming language set out, successfully, to solve the problems created by languages with manual memory management: mainly C and C++.

Unsurprisingly, it was Java. The language of the now defunct Sun Microsystems (a curious note: remember that there is an IDE for Java called… Eclipse) combined the incipient rise of object orientation with concepts that were not new, but were well executed: a virtual machine and a garbage collector, among others.


Java was a leap in quality and quantity. Programmers could (almost) abstract away memory management and became more productive. In addition, because Java was easier to learn (or rather, to master) than C and C++, it lowered the barrier to entry for new programmers, which made it more affordable for companies, which in turn were able to complete more complex projects in less time.

Go language, also known as Golang, born precisely out of frustration with the use of C++ and… Java.

Even today, other programming languages have emulated this initiative. The clearest example is Go, also known as Golang, born precisely out of frustration with C++ and… Java.

In both languages, the garbage collector erases the problems created by manual management, which keeps the language simple and, once again, lowers the barrier to entry for new programmers.

There is only one drawback, and it is not one that can be overlooked. Automatic memory management has a performance cost.

The biggest drawback

Although collection algorithms are increasingly advanced and ever more refined techniques shave off nanoseconds, garbage collectors exact a toll: while “garbage” (unreferenced or unused memory) is being collected, the world grinds to a halt.

Indeed, when it is time to reclaim memory for recycling, the program stops what it is doing, calls the garbage collection routine and frees memory. Whatever the program is doing at that moment, its world stops for a fraction of a second while it cleans up. Once done, it goes back to what it was doing as if nothing had happened.

This “stop the world”, for many companies operating at a large enough scale, translates into an added cost on the server bill.

There are dozens of articles on this, almost all listing the same complaint: the cost of the garbage collector.

Source: https://discord.com/blog/why-discord-is-switching-from-go-to-rust

It may seem minuscule, but the response-time spikes accumulated in large, high-demand applications translate into an economic bill at the end of the chain. No matter how well you program, no matter which algorithms and data structures you choose, the world stops to collect garbage and doesn’t come back until memory is clean as a whistle.

The dilemma

So,

  • On the one hand, we have languages with great performance and exceptional execution, but which require suitably qualified personnel (and even then…) and tooling to mitigate memory-management errors.
  • On the other hand, there are languages that increase productivity and drastically reduce memory-management errors (although let’s not forget NullPointerException), but which carry a handicap that makes them hungrier for computational resources.

It is a real dilemma in some cases, although it also becomes evident depending on the nature of the project (you are not going to write a Linux kernel module in Perl, nor are you going to implement a web application in C, as was done in the 90s).

Away from the extremes, the option left is to weigh up what we are willing to sacrifice and which advantages matter most for the project at hand.

The third way

However, what if there was a middle ground? What if we could find a way to get rid of rubbish collector pauses, but still not have to manage memory manually?

How do we do that? We will see in the next article.

Featured photo: Mark Fletcher Brown / Unsplash.