Skip to content

Conversation

@rjmansfield
Copy link
Contributor

Adds two features for better dSYM support on macOS:

  1. Add Mach-O specific --dsym option

    • Accepts both flat dSYM companion files and dSYM bundles
    • Handles multiple dSYM files for multiple input binaries
  2. Automatic dSYM discovery

    • Searches for .dSYM/Contents/Resources/DWARF/
    • Only loads dSYM if build IDs match to prevent mismatches

This improves the macOS user experience by eliminating the need to manually specify the binary file inside of the dSYM bundle.

@rjmansfield rjmansfield force-pushed the macos-dsym-support branch 2 times, most recently from 99d67fc to 5b43246 Compare October 30, 2025 16:54
@rjmansfield
Copy link
Contributor Author

First patch failed on Windows with the implicit conversion:

D:\a\bloaty\bloaty\bloaty\src\bloaty.cc(1637,9): error C2664: 'void bloaty::Bloaty::AddDebugFilename(const std::string &)': cannot convert argument 1 from 'std::filesystem::path' to 'const std::string &' [D:\a\bloaty\bloaty\build\bloaty\libbloaty.vcxproj]

Updated patch should resolve that issue.

Copy link
Collaborator

@EricRahm EricRahm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rjmansfield Took a quick look and decided I need to brush up on dSYM before going too deep, so consider this a very limited first pass.

I made some super minor comments around logging -- my general feedback is maybe remove some of it, particularly for expected failures where we're speculatively looking for debug info.

src/bloaty.cc Outdated
// Automatically discover and load dSYM file for a Mach-O binary
void Bloaty::TryAutoLoadDsym(const std::string& binary_path, const std::string& build_id) {
if (verbose_level > 0) {
fprintf(stderr, "TryAutoLoadDsym called for: %s (build_id: %s)\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we tend to use > 1 for info type messages, and don't print to stderr. Or we could remove this.

src/bloaty.cc Outdated
}
} catch (const std::exception& e) {
if (verbose_level > 0) {
fprintf(stderr, "Failed to read dSYM file %s: %s\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ideally we'd structure this so there wasn't an exception. Either way maybe we don't need a log. It seems reasonable and maybe expected that we don't always find a dsym file right?

src/bloaty.cc Outdated
}
}
if (!found_any) {
fprintf(stderr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We can use WARN here.

@EricRahm
Copy link
Collaborator

Just a heads up, I'll be away for a week or so but should be able to take a look next week!

@EricRahm
Copy link
Collaborator

EricRahm commented Dec 6, 2025

@haberman mind taking a look, this is improving dsym support. It's adding a new flag --dsym

doc/using.md Outdated
You can also explicitly specify a dSYM bundle:

```
$ ./bloaty -d symbols --dsym=bloaty.dSYM bloaty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example is a little confusing because, if I'm understanding right, it tells Bloaty to load a bundle that it would have auto-loaded even without the flag.

Should this example use a different dSYM filename? Would the main use case be loading a dsym from a different directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will change it to a non-autoloaded path in the example and further explain.

Ideally the high runner case is that user will use clang as linker driver with -g and then clang will spawn dsymutil and write the dSYM in the same directory as the binary and then debug information will be seamlessly loaded. The explicit option --dsym option would be only for when dsymutil gets invoked directly and written to different directory than the executable where autoloading doesn't find them.

src/bloaty.cc Outdated
std::filesystem::path dsym_filename(
dsym_filepath / std::filesystem::path(binary_filename).filename());
if (std::filesystem::exists(dsym_filename)) {
AddDebugFilename(dsym_filename.string());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be checking the build ID here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. Will add.

}

// Automatically discover and load dSYM file for a Mach-O binary
void Bloaty::TryAutoLoadDsym(const std::string& binary_path,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the user docs, I would expect that auto-loading of a dSYM performs exactly the same logic as explicitly specifying --dsym, but that doesn't seem to be the case.

Is there a reason to have two distinct functions for this? If so, can we add more explanation in the user docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have slightly different semantics and have different error handling.

For auto loading, I think lookup should be silent best effort but not throw/error if it fails. Where if a user specifies explicitly a --dsym and it's invalid or missing, that it should be reported to the user e.g. AddDsymFilename will throw if if file doesn't exist or if not a dSYM bundle directory.

I can explain in the documentation, but I think there's probably a little bit of refactoring that I can do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the diff which I hope explains the differences between the two functions in terms of behaviour. i.e. autoloading is silent best effort, whereas the explicit --dsym option loading is validated more strictly. Please let me know you have any feedback on it.

This is a bit of a sale pitch to add the option but I think this patch, specifically the autoloading part, will definitely improve the out of the box usability for users with mach-o binaries. I hit both of these scenarios when first using bloaty.

With patch:

$ clang hw.c -g
$ bloaty -d compileunits ~/a.out
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  48.7%  16.0Ki  33.3%  16.0Ki    [__DATA_CONST]
   2.6%     866  33.3%  16.0Ki    [__LINKEDIT]
  45.0%  14.8Ki  31.0%  14.9Ki    [__TEXT]
   3.2%  1.06Ki   2.0%     984    [Mach-O Headers]
   0.3%      88   0.2%      88    [__TEXT,__unwind_info]
   0.2%      82   0.2%      82    hw.c
   0.0%       8   0.0%       8    [__DATA_CONST,__got]
 100.0%  32.9Ki 100.0%  48.0Ki    TOTAL

vs

$ bloaty -d compileunits ~/a.out
bloaty: missing debug info

vs manually specifying the full path into the dSYM companion file

$ bloaty -d compileunits ~/a.out  --debug-file ~/a.out.dSYM/Contents/Resources/DWARF/a.out
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  48.7%  16.0Ki  33.3%  16.0Ki    [__DATA_CONST]
   2.6%     866  33.3%  16.0Ki    [__LINKEDIT]
  45.0%  14.8Ki  31.0%  14.9Ki    [__TEXT]
   3.2%  1.06Ki   2.0%     984    [Mach-O Headers]
   0.3%      88   0.2%      88    [__TEXT,__unwind_info]
   0.2%      82   0.2%      82    hw.c
   0.0%       8   0.0%       8    [__DATA_CONST,__got]
 100.0%  32.9Ki 100.0%  48.0Ki    TOTAL

Adds two features for better dSYM support on macOS:

1. Add Mach-O specific --dsym option
   - Accepts dSYM bundles
   - Handles multiple dSYM files for multiple input binaries

2. Automatic dSYM discovery
   - Searches for <binary>.dSYM/Contents/Resources/DWARF/<binary>
   - Only loads dSYM if build IDs match to prevent mismatches

This improves the macOS user experience by eliminating the need to
manually specify the binary file inside of the dSYM bundle.

CMakeLists.txt: Add bloaty.pb.h to protoc OUTPUT list. Protoc generates
both .cc and .h files, so both should be declared as outputs for proper
dependency tracking.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants