Have you ever tried to compile a helloworld Rust program in --release mode? If yes, have you seen its binary size? Suffice to say, it’s not exactly small. Or at least it wasn’t small until recently. This post details how I found about the issue and my attempt to fix it in Cargo.

Binary size analysis

I’m a member of the (relatively recently established) #wg-binary-size working group, which is trying to find opportunities to reduce the binary size footprint of Rust programs and libraries. Since I also maintain the Rust benchmark suite, my main task within the working group is to improve our tooling that measures and tracks the binary size of Rust programs.

As part of that effort, I recently added a new command to the benchmark suite, which allows examining and comparing the sizes of individual sections and symbols of a Rust binary (or a library) between two versions of the compiler.

The output of the command looks like this:

Output of the binary analysis command

While testing the command, I noticed something peculiar. When compiling the test binary in release mode (using --release), the analysis command showed that there are (DWARF) debug symbols in the binary. My immediate reaction was that I have a bug in my command and that I have to be compiling in debug mode by accident. Surely Cargo wouldn’t add debug symbols to my binary in release mode by default, right? Right?

My reaction to Cargo's behavior

Anakin/Padmé meme about Cargo and debug symbols


I spent maybe 15 minutes looking for the bug before I realized that there is no bug in my code. There are debug symbols in each Rust binary compiled in release mode by default, and this has been true for a long time. In fact, there is an old Cargo issue (almost 7-year-old, to be precise) that mentions this exact problem.

Why does this happen?

This is consequence of how the Rust standard library is distributed. When you compile a Rust crate, you don’t also compile the standard library1. It comes precompiled, typically using Rustup, in the rust-std component. To reduce download bandwidth2, it does not come in two variants (with and without debug symbols), but only in the more general variant with debug symbols.

On Linux (and also other platforms), the debug symbols are embedded directly in the object files of the library itself by default (instead of being distributed via separate files). Therefore, when you link to the standard library, you get these debug symbols “for free” also in your final binary, which causes binary size bloat.

This actually contradicts Cargo’s own documentation, which claims that if you use debug = 0 (which is the default for release builds), the resulting binary will not contain any debug symbols. But this is clearly not what happens now.

EDIT: Just to clarify, Cargo was putting the debuginfo of the Rust standard library into your program by default. It was not including the debuginfo of your own crate in release mode by default.

Why is it a problem?

If you take a look at the binary size of a Rust helloworld program compiled in release mode on Linux, you’ll notice that it has about ~4.3 MiB3. While it’s true that we have a lot more disk space today than in the past, that’s still abhorrently much.

Now, you might think that this is a non-issue, because anyone who wants to have smaller binaries simply strips them. That is a good point - in fact, after stripping the debug symbols from the mentioned helloworld binary4, its size is reduced to merely 415 KiB, only about 10% of the original size. However, the devil is in the details defaults.

And defaults matter! Rust advertises itself as a language that produces highly efficient and optimal code, but this impression isn’t really supported by a helloworld application taking more than 4 megabytes of space on disk. I can imagine a situation where a seasoned C or C++ programmer wants to try Rust, compiles a small program in release mode, notices the resulting binary size, and then immediately gives up on the language and goes to make fun of it on the forums.

Even though the issue goes away with just a single strip invocation, it is still a problem in my view. Rust tries to appeal to programmers coming from many different backgrounds, and not everyone knows that something like stripping binaries even exists. So it is important that we do a better job here, by default.

Note that the size of the libstd debug symbols is around 4 megabytes on Linux, and this size is constant, so even though for helloworld it takes ~90% of the size of the resulting binary, for larger programs its effect will be smaller. But still, four megabytes is nothing to sneeze at, since it is included in every Rust binary built everywhere by default.

Proposing a change to Cargo

After I have realized that this is the default behavior of Cargo, I have actually remembered that I have just rediscovered this exact issue perhaps for the third time already. I have just never really acted upon it before and then always managed to forget about it.

This time, I was determined to do something about it. But where to start? Well, usually it’s not a bad idea to just ask around on the Rust Zulip, so I did exactly that. It turns out that I wasn’t the first person to ask that very same question, and that it came up multiple times over the years. The proposed solution was to simply strip debug symbols from Rust programs in release mode by default, which would remove the binary size bloat problem. In the past, this used to be blocked by the stabilization of strip support in Cargo, but that has actually already happened back at the beginning of 2022.

So, why wasn’t this proposal implemented yet? Were there any big concerns or blockers? Well, not really. When I have asked around on Zulip, pretty much everyone thought that it would be a good idea. And while there were some earlier attempts to do this, they haven’t been pushed through.

So, to sum up, it hasn’t been done yet because no one had done it yet :) So I set out to fix that. To test if stripping by default could work, I created a PR to the compiler and started a perf benchmark. The binary size results (for tiny crates) looked pretty good, so that gave me hope that the approach of stripping by default could indeed work.

Funnily enough, this change also made compilation time of tiny crates (like helloworld) up to 2x faster on Linux! How could that be, when we’re doing more work, by including stripping in the compilation process? Well, it turns out that the default Linux linker (bfd) is brutally slow5, so by removing the debug symbols from the final binary, we actually reduce the amount of work the linker needs to perform, which makes compilation faster. Sadly, this has an observable effect only for really tiny crates.

There is an ongoing effort to use a faster linker (lld) by default on Linux (again, defaults matter :smile:). Stay tuned!

After showing these results to the Cargo maintainers, they have asked me to write down a proposal on the original Cargo issue. In this mini-proposal, I have explained what change to the Cargo defaults I want to make, how it could be implemented, and what are other considerations of the change.

For example, one thing that was noted is that if we strip the debug symbols by default, then backtraces of release builds will… not contain any debug info, such as line numbers. That is indeed true, but my claim is that these have not been useful anyway. If you have a binary that only has debug symbols for the standard library, but not for your own code, then even though the backtrace will contain some line numbers from stdlib, it will not really give you any useful context (you can compare the difference here). There were also some implementation considerations, for example how to handle situations where only some of your target’s dependencies request debug symbols. You can find more details in the proposal.

After I wrote the proposal, it went through the FCP process. The Cargo team members voted on it, and once it was accepted after a 10-day waiting period designed for any last concerns (the FCP), I could implement the proposal, which was actually surprisingly straightforward.

The PR has been merged a week ago, and it is now in nightly! :tada:

The TLDR of the change is that Cargo will now by default use strip = "debuginfo" for the release profile (unless you explicitly request debuginfo for some dependency):

[profile.release]
# v This is now used by default, if not provided
strip = "debuginfo"

In fact, this new default will be used for any profile which does not enable debuginfo anywhere in its dependency chain, not just for the release profile.

There was one unresolved concern about using strip on macOS, because it seems that there can be some issues with it. The change has been in nightly for a week and I haven’t seen any problems, but if this will cause any issues, we can also perform the debug symbols stripping selectively, only on some platforms (e.g. Linux and Windows). Let us know if you find any issues with stripping using Cargo on macOS!

Conclusion

In the end, this was yet another case of “if you want it done, just do it”, which is common in open-source projects :) I’m glad the change went through, and I hope that we won’t find any major issues with it and that it will be stabilized in the upcoming months.

If you have any comments or questions, please let me know on Reddit.

  1. Unless you use build-std, which is however sadly still unstable. 

  2. While at the same time, ironically, increasing the size of Rust binaries on disk. 

  3. Tested with rustc 1.75.0

  4. For example with strip --strip-debug <binary>

  5. Although recently, there have been an effort to speed it up.