Memory safety is a hot topic these days. (For example, it’s one of Carbon’s primary goals.) While everyone seems to agree that memory safety is really important, what actually constitutes safety remains subjective. This week, we’ll peek at semantically equivalent Rust, C++, and Go code to see how their approaches to safety differ.
First, consider a simple append() function that concatenates two lists of integers:
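As a rough illustration, here's a minimal C++ sketch of such a function. The exact signature is an assumption (taking the first list by value is just one reasonable choice), and test_append is a hypothetical helper name used only to show a call.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Concatenate two lists of integers.
// `items` is taken by value, so callers can pass a temporary or
// std::move a list they no longer need; `suffix` is only read.
std::vector<int> append(std::vector<int> items,
                        std::vector<int> const& suffix) {
  items.insert(items.end(), suffix.begin(), suffix.end());
  return items;
}

// Hypothetical usage check (the name is illustrative only).
void test_append() {
  assert((append({1, 2}, {3, 4}) == std::vector<int>{1, 2, 3, 4}));
}
```

Nothing here outlives the call: both arguments are alive for the whole expression, so there's no lifetime question to answer yet.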
This is pretty straightforward, and doesn’t pose a challenge for any of our three languages. Things get more interesting if we define a function that takes only the suffix parameter, but returns a brand new function (a closure) that can append the suffix to any given list of items:
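In C++, a sketch of make_appender might look like the following. It mirrors the by-reference shape that appears in the Cppcheck output quoted in the comments; the append helper's signature and the test_make_appender name are assumptions made for illustration. The usage deliberately keeps the suffix in a named local, so the closure never outlives it.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed helper: concatenates two lists of integers.
std::vector<int> append(std::vector<int> items,
                        std::vector<int> const& suffix) {
  items.insert(items.end(), suffix.begin(), suffix.end());
  return items;
}

// Returns a closure that appends `suffix` to any list it is given.
// The lambda captures `suffix` by reference, so the closure is only
// valid while the caller's suffix object is still alive.
auto make_appender(std::vector<int> const& suffix) {
  return [&suffix](std::vector<int>&& items) {
    return append(std::move(items), suffix);
  };
}

// Hypothetical usage check (the name is illustrative only).
void test_make_appender() {
  std::vector<int> const suffix{3, 4};  // named local: outlives the closure
  auto append34 = make_appender(suffix);
  assert((append34({1, 2}) == std::vector<int>{1, 2, 3, 4}));
}
```

Note that nothing in the C++ signature records the fact that the returned closure depends on the argument staying alive. That contract lives only in the comment.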
The Rust make_appender’s return type includes the lifetime '_, expressing an important property of the function. It means, basically: “You passed me a pointer to something. The value I return is only good as long as that object is still alive.” It’s like an expiration date for in-memory objects: If the original argument is destroyed, then the function’s return value expires, and you can’t use it anymore. For example, let’s try calling the Rust closure after the original suffix object goes out of scope:
Rust won’t let pointers dangle as C++ would, nor will it have a Garbage Collector (GC) keep the original object’s memory on life support. (It’s common for GCs to keep memory alive even after you destroy/close/dispose an object, even though the object is useless, just so extant pointers to it don’t dangle.)
C++ does not handle this situation well:
The suffix object {3, 4} in our C++ test is implicitly destroyed before the assertion even executes. Why doesn’t it survive to the end of the function? Because it’s a temporary, and C++ destroys temporaries at the end of the full expression that creates them (here, the statement that calls make_appender). Once the suffix object is destroyed, the closure (append34) is broken. It has a pointer to the memory where the suffix object used to live, but the object doesn’t live there anymore. The compiler doesn’t catch this, even with the warnings cranked up. Instead, we get undefined behavior (UB) at run time. We’re lucky the test happened to fail; sometimes, UB manages to corrupt memory (making a program do bad things) without failing any tests.
To make C++ do the right thing, we have to move the suffix into the closure:
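One way to sketch that fix: take the suffix by value and move it into the lambda with an init-capture, so the closure owns its own copy. The append helper's signature and the test_owning_appender name are assumptions for illustration, not necessarily what the original listing used.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed helper: concatenates two lists of integers.
std::vector<int> append(std::vector<int> items,
                        std::vector<int> const& suffix) {
  items.insert(items.end(), suffix.begin(), suffix.end());
  return items;
}

// Takes ownership of `suffix` and moves it into the closure.
// The closure now carries its own vector, so it stays valid no matter
// what happens to the caller's argument.
auto make_appender(std::vector<int> suffix) {
  return [suffix = std::move(suffix)](std::vector<int>&& items) {
    return append(std::move(items), suffix);
  };
}

// Hypothetical usage check (the name is illustrative only).
void test_owning_appender() {
  auto append34 = make_appender({3, 4});  // temporary is moved into the closure
  assert((append34({1, 2}) == std::vector<int>{1, 2, 3, 4}));
}
```

The cost is that a caller who passes a named vector pays for a copy unless they std::move it in, which is the usual trade-off between borrowing and owning.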
Go fares much better. Not only does it compile, it actually works as intended:
Preventing dangling pointers and undefined behavior is what most people mean when they talk about memory safety, but there’s more to the story. As you may have heard, shared mutable state is evil. Rust’s major innovation is to guarantee that mutable state is never shared, and shared state is never mutated.
For example, suppose that after calling make_appender, we mutate the suffix object. Should subsequent calls to the returned closure use the original value, or the new one?
Rust does the only sane thing it can in this situation: It refuses to compile the code. Guess what C++ does?
You guessed it: undefined behavior! Again, the C++ compiler is of no help. Instead of a failed test, this time we managed a segfault.
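For contrast, here's a sketch of how the owning variant of make_appender (the one that moves the suffix into the closure) would behave under the same kind of mutation. This is a different technique from the by-reference version discussed above, and the helper signatures and the test name are assumptions for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed helper: concatenates two lists of integers.
std::vector<int> append(std::vector<int> items,
                        std::vector<int> const& suffix) {
  items.insert(items.end(), suffix.begin(), suffix.end());
  return items;
}

// Owning variant: the closure holds its own copy of the suffix.
auto make_appender(std::vector<int> suffix) {
  return [suffix = std::move(suffix)](std::vector<int>&& items) {
    return append(std::move(items), suffix);
  };
}

// Hypothetical usage check (the name is illustrative only).
void test_mutation_does_not_reach_closure() {
  std::vector<int> suffix{3, 4};
  auto append34 = make_appender(suffix);  // suffix is copied, then moved in
  suffix[0] = 5;                          // mutates only the caller's vector
  assert((append34({1, 2}) == std::vector<int>{1, 2, 3, 4}));
  assert((suffix == std::vector<int>{5, 4}));
}
```

Because the state is no longer shared, the mutation is harmless: the closure keeps using the value it was created with.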
Go compiles and runs, and lets the mutation proceed, such that modifying the suffix object becomes a backdoor way to change the behavior of the closure:
We could debate whether Go’s behavior makes sense, but allowing this kind of spooky action at a distance opens the door to a host of heinous bugs. This problem isn’t specific to Go in particular; in fact, Go is a great example of a modern garbage-collected language. Ownership is rarely clear in GC languages, because really, everything is sort of part-owned by the garbage collector. You may think your object owns its constituent parts, but the GC has a lien against them. GC languages could, in principle, have move semantics and other niceties; but in practice, they do not.
We’ve covered a lot of ground in this post, but there’s a great deal more to discuss, such as how object lifetimes, shared state, and mutation interact with concurrency and parallelism. Please leave a comment if you feel we should (or should not!) dig deeper into this topic, or if Rust, C++, and Go aren’t the languages you’d most like to see in future posts.
"This post presents a lot of code as screenshots."
Does Substack not provide a way to include fixed-width code blocks at all? If the only limitation is not supporting syntax highlighting, I'd probably give it up to be more accessible.
Cppcheck already warns for this code:
test.cpp:16:45: error: Using object that is a temporary. [danglingTemporaryLifetime]
assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB
^
test.cpp:3:12: note: Return lambda.
return [&](std::vector<int>&& items) {
^
test.cpp:2:50: note: Passed to reference.
auto make_appender(std::vector<int> const& suffix) {
^
test.cpp:4:36: note: Lambda captures variable by reference here.
return append(move(items), suffix);
^
test.cpp:15:35: note: Passed to 'make_appender'.
auto append34 = make_appender({3, 4});
^
test.cpp:15:35: note: Temporary created here.
auto append34 = make_appender({3, 4});
^
test.cpp:16:45: note: Using object that is a temporary.
assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB