Safety
This post presents a lot of code as screenshots. The complete code is available for download at github.com/jeffs/nested/safety. Leave a comment if screenshots make you 😀 or 😡.
Memory safety is a hot topic these days. (For example, it’s one of Carbon’s primary goals.) While everyone seems to agree that memory safety is really important, what actually constitutes safety remains subjective. This week, we’ll peek at semantically equivalent Rust, C++, and Go code to see how their approaches to safety differ.
First, consider a simple append() function that concatenates two lists of integers:
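A minimal Rust sketch of such a function follows. The signature is an assumption (the list is taken by value as a `Vec<i32>` and the suffix is borrowed as a slice) and may differ from the code in the linked repository.

```rust
/// Concatenates `suffix` onto `items` and returns the combined list.
/// `items` is taken by value so its buffer can be reused;
/// `suffix` is only borrowed, so the caller keeps ownership of it.
fn append(mut items: Vec<i32>, suffix: &[i32]) -> Vec<i32> {
    items.extend_from_slice(suffix);
    items
}
```

Calling `append(vec![1, 2], &[3, 4])` yields a vector holding 1, 2, 3, 4.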
This is pretty straightforward, and doesn’t pose a challenge for any of our three languages. Things get more interesting if we define a function that takes only the suffix parameter, but returns a brand new function (a closure) that can append the suffix to any given list of items:
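Here is one way this could look in Rust, as a sketch rather than the post's exact listing. The `append` helper and both signatures are assumptions; the closure borrows the suffix instead of copying it.

```rust
/// Concatenates `suffix` onto `items` (assumed helper).
fn append(mut items: Vec<i32>, suffix: &[i32]) -> Vec<i32> {
    items.extend_from_slice(suffix);
    items
}

/// Returns a closure that appends `suffix` to any list it is given.
/// The closure stores only a reference to `suffix`; the `+ '_` in the
/// return type ties the closure's validity to that borrow.
fn make_appender(suffix: &[i32]) -> impl Fn(Vec<i32>) -> Vec<i32> + '_ {
    // `move` copies the reference (not the data) into the closure.
    move |items| append(items, suffix)
}
```

Given `let suffix = vec![3, 4];`, the call `make_appender(&suffix)` produces a closure that can be applied to any number of lists while `suffix` stays alive.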
The Rust make_appender’s return type includes the lifetime '_, expressing an important property of the function. It means, basically: “You passed me a pointer to something. The value I return is only good as long as that object is still alive.” It’s like an expiration date for in-memory objects: If the original argument is destroyed, then the function’s return value expires, and you can’t use it anymore. For example, let’s try calling the Rust closure after the original suffix object goes out of scope:
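The sketch below illustrates that scenario under the same assumed signatures as before. The offending lines are left commented out so the block still compiles; uncommenting them should make the borrow checker reject the program with a "does not live long enough" error. The function name `call_while_alive` is hypothetical.

```rust
/// Concatenates `suffix` onto `items` (assumed helper).
fn append(mut items: Vec<i32>, suffix: &[i32]) -> Vec<i32> {
    items.extend_from_slice(suffix);
    items
}

/// Returns a closure that borrows `suffix` (assumed signature).
fn make_appender(suffix: &[i32]) -> impl Fn(Vec<i32>) -> Vec<i32> + '_ {
    move |items| append(items, suffix)
}

/// Hypothetical demo: the closure is used only while `suffix` is alive.
fn call_while_alive() -> Vec<i32> {
    let suffix = vec![3, 4];
    let append34 = make_appender(&suffix);
    // Fine: `suffix` outlives every use of `append34`.
    let result = append34(vec![1, 2]);

    // Rejected at compile time if uncommented, because the closure
    // would outlive the vector it borrows:
    //
    // let expired = {
    //     let short_lived = vec![3, 4];
    //     make_appender(&short_lived)
    // }; // `short_lived` is dropped here while still borrowed
    // expired(vec![1, 2]);

    result
}
```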
Rust won’t let pointers dangle as C++ would, nor will it have a Garbage Collector (GC) keep the original object’s memory on life support. (It’s common for GCs to keep memory alive even after you destroy/close/dispose an object, even though the object is useless, just so extant pointers to it don’t dangle.)
C++ does not handle this situation well:
The suffix object {3, 4} in our C++ test is implicitly destroyed before the assertion even executes. Why doesn’t it survive to the end of the function? Because it is a temporary: C++ destroys temporaries at the end of the full expression that creates them, and binding one to a const reference parameter does not extend its lifetime beyond the call. Once the suffix object is destroyed, the closure (append34) is broken. It has a pointer to the memory where the suffix object used to live, but the object doesn’t live there anymore. The compiler doesn’t catch this, even with the warnings cranked up. Instead, we get undefined behavior (UB) at run time. We’re lucky the test happened to fail; sometimes, UB manages to corrupt memory (making a program do bad things) without failing any tests.
To make C++ do the right thing, we have to move the suffix into the closure:
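For consistency with the other sketches in this post, here is the same ownership-transfer idea expressed in Rust rather than C++. The closure takes ownership of the suffix, so there is no borrow left to dangle and no lifetime annotation on the return type. The names `make_owning_appender` and `build_appender` are hypothetical.

```rust
/// Concatenates `suffix` onto `items` (assumed helper).
fn append(mut items: Vec<i32>, suffix: &[i32]) -> Vec<i32> {
    items.extend_from_slice(suffix);
    items
}

/// Takes `suffix` by value and moves it into the returned closure.
/// The closure owns the data, so it stays valid for as long as it exists.
fn make_owning_appender(suffix: Vec<i32>) -> impl Fn(Vec<i32>) -> Vec<i32> {
    move |items| append(items, &suffix)
}

/// Hypothetical demo: the temporary vector is moved into the closure,
/// which can therefore safely outlive this function.
fn build_appender() -> impl Fn(Vec<i32>) -> Vec<i32> {
    make_owning_appender(vec![3, 4])
}
```

The trade-off is that the caller gives up the suffix (or must clone it) instead of lending it.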
Go fares much better. Not only does it compile, it actually works as intended:
Preventing dangling pointers and undefined behavior is what most people mean when they talk about memory safety, but there’s more to the story. As you may have heard, shared mutable state is evil. Rust’s major innovation is to guarantee that mutable state is never shared, and shared state is never mutated.
For example, suppose that after calling make_appender, we mutate the suffix object. Should subsequent calls to the returned closure use the original value, or the new one?
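A hedged Rust sketch of this scenario, using the same assumed signatures as earlier: the early mutation is commented out because the compiler is expected to refuse it while the closure still holds its shared borrow. Once the closure is explicitly dropped, the borrow ends and mutation is allowed. The function name `mutate_after_release` is hypothetical.

```rust
/// Concatenates `suffix` onto `items` (assumed helper).
fn append(mut items: Vec<i32>, suffix: &[i32]) -> Vec<i32> {
    items.extend_from_slice(suffix);
    items
}

/// Returns a closure that borrows `suffix` (assumed signature).
fn make_appender(suffix: &[i32]) -> impl Fn(Vec<i32>) -> Vec<i32> + '_ {
    move |items| append(items, suffix)
}

/// Hypothetical demo: mutation is only possible once the closure is gone.
fn mutate_after_release() -> (Vec<i32>, Vec<i32>) {
    let mut suffix = vec![3, 4];
    let append34 = make_appender(&suffix);

    // Rejected if uncommented: `suffix` cannot be borrowed as mutable
    // while `append34` still holds a shared borrow of it.
    // suffix.push(5);

    let before = append34(vec![1, 2]);
    drop(append34); // the shared borrow ends here
    suffix.push(5); // mutation is allowed again

    // A fresh closure sees the updated suffix.
    let after = make_appender(&suffix)(vec![1, 2]);
    (before, after)
}
```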
Rust does the only sane thing it can in this situation: It refuses to compile the code. Guess what C++ does?
You guessed it: undefined behavior! Again, the C++ compiler is of no help. Instead of a failed test, this time we managed a segfault.
Go compiles and runs, and lets the mutation proceed, such that modifying the suffix object becomes a backdoor way to change the behavior of the closure:
We could debate whether Go’s behavior makes sense, but allowing this kind of spooky action at a distance opens the door to a host of heinous bugs. This problem isn’t specific to Go; in fact, Go is a great example of a modern garbage-collected language. Ownership is rarely clear in GC languages, because really, everything is sort of part-owned by the garbage collector. You may think your object owns its constituent parts, but the GC has a lien against them. GC languages could, in principle, have move semantics and other niceties; but in practice, they do not.
We’ve covered a lot of ground in this post, but there’s a great deal more to discuss, such as how object lifetimes, shared state, and mutation interact with concurrency and parallelism. Please leave a comment if you feel we should (or should not!) dig deeper into this topic, or if Rust, C++, and Go aren’t the languages you’d most like to see in future posts.
"This post presents a lot of code as screenshots."
Does Substack not provide a way to include fixed-width code blocks at all? If the only limitation is not supporting syntax highlighting, I'd probably give it up to be more accessible.
Cppcheck already warns for this code:
test.cpp:16:45: error: Using object that is a temporary. [danglingTemporaryLifetime]
assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB
^
test.cpp:3:12: note: Return lambda.
return [&](std::vector<int>&& items) {
^
test.cpp:2:50: note: Passed to reference.
auto make_appender(std::vector<int> const& suffix) {
^
test.cpp:4:36: note: Lambda captures variable by reference here.
return append(move(items), suffix);
^
test.cpp:15:35: note: Passed to 'make_appender'.
auto append34 = make_appender({3, 4});
^
test.cpp:15:35: note: Temporary created here.
auto append34 = make_appender({3, 4});
^
test.cpp:16:45: note: Using object that is a temporary.
assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB