-
Notifications
You must be signed in to change notification settings - Fork 367
Improve dSYM loading for Mach-O binaries #425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
99d67fc to
5b43246
Compare
|
First patch failed on Windows with the implicit conversion:
Updated patch should resolve that issue. |
EricRahm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjmansfield Took a quick look and decided I need to brush up on dSYM before going too deep, so consider this a very limited first pass.
I made some super minor comments around logging -- my general feedback is maybe remove some of it, particularly for expected failures where we're speculatively looking for debug info.
src/bloaty.cc
Outdated
| // Automatically discover and load dSYM file for a Mach-O binary | ||
| void Bloaty::TryAutoLoadDsym(const std::string& binary_path, const std::string& build_id) { | ||
| if (verbose_level > 0) { | ||
| fprintf(stderr, "TryAutoLoadDsym called for: %s (build_id: %s)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we tend to use > 1 for info type messages, and don't print to stderr. Or we could remove this.
src/bloaty.cc
Outdated
| } | ||
| } catch (const std::exception& e) { | ||
| if (verbose_level > 0) { | ||
| fprintf(stderr, "Failed to read dSYM file %s: %s\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ideally we'd structure this so there wasn't an exception. Either way maybe we don't need a log. It seems reasonable and maybe expected that we don't always find a dsym file right?
src/bloaty.cc
Outdated
| } | ||
| } | ||
| if (!found_any) { | ||
| fprintf(stderr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We can use WARN here.
5b43246 to
dc4b1ab
Compare
dc4b1ab to
394d5dd
Compare
|
Just a heads up, I'll be away for a week or so but should be able to take a look next week! |
394d5dd to
e46fe86
Compare
|
@haberman mind taking a look, this is improving dsym support. It's adding a new flag |
doc/using.md
Outdated
| You can also explicitly specify a dSYM bundle: | ||
|
|
||
| ``` | ||
| $ ./bloaty -d symbols --dsym=bloaty.dSYM bloaty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example is a little confusing because, if I'm understanding right, it tells Bloaty to load a bundle that it would have auto-loaded even without the flag.
Should this example use a different dSYM filename? Would the main use case be loading a dsym from a different directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will change it to a non-autoloaded path in the example and further explain.
Ideally the high runner case is that user will use clang as linker driver with -g and then clang will spawn dsymutil and write the dSYM in the same directory as the binary and then debug information will be seamlessly loaded. The explicit option --dsym option would be only for when dsymutil gets invoked directly and written to different directory than the executable where autoloading doesn't find them.
src/bloaty.cc
Outdated
| std::filesystem::path dsym_filename( | ||
| dsym_filepath / std::filesystem::path(binary_filename).filename()); | ||
| if (std::filesystem::exists(dsym_filename)) { | ||
| AddDebugFilename(dsym_filename.string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be checking the build ID here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. Will add.
| } | ||
|
|
||
| // Automatically discover and load dSYM file for a Mach-O binary | ||
| void Bloaty::TryAutoLoadDsym(const std::string& binary_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the user docs, I would expect that auto-loading of a dSYM performs exactly the same logic as explicitly specifying --dsym, but that doesn't seem to be the case.
Is there a reason to have two distinct functions for this? If so, can we add more explanation in the user docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have slightly different semantics and have different error handling.
For auto loading, I think lookup should be silent best effort but not throw/error if it fails. Where if a user specifies explicitly a --dsym and it's invalid or missing, that it should be reported to the user e.g. AddDsymFilename will throw if if file doesn't exist or if not a dSYM bundle directory.
I can explain in the documentation, but I think there's probably a little bit of refactoring that I can do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the diff which I hope explains the differences between the two functions in terms of behaviour. i.e. autoloading is silent best effort, whereas the explicit --dsym option loading is validated more strictly. Please let me know you have any feedback on it.
This is a bit of a sale pitch to add the option but I think this patch, specifically the autoloading part, will definitely improve the out of the box usability for users with mach-o binaries. I hit both of these scenarios when first using bloaty.
With patch:
$ clang hw.c -g
$ bloaty -d compileunits ~/a.out
FILE SIZE VM SIZE
-------------- --------------
48.7% 16.0Ki 33.3% 16.0Ki [__DATA_CONST]
2.6% 866 33.3% 16.0Ki [__LINKEDIT]
45.0% 14.8Ki 31.0% 14.9Ki [__TEXT]
3.2% 1.06Ki 2.0% 984 [Mach-O Headers]
0.3% 88 0.2% 88 [__TEXT,__unwind_info]
0.2% 82 0.2% 82 hw.c
0.0% 8 0.0% 8 [__DATA_CONST,__got]
100.0% 32.9Ki 100.0% 48.0Ki TOTAL
vs
$ bloaty -d compileunits ~/a.out
bloaty: missing debug info
vs manually specifying the full path into the dSYM companion file
$ bloaty -d compileunits ~/a.out --debug-file ~/a.out.dSYM/Contents/Resources/DWARF/a.out
FILE SIZE VM SIZE
-------------- --------------
48.7% 16.0Ki 33.3% 16.0Ki [__DATA_CONST]
2.6% 866 33.3% 16.0Ki [__LINKEDIT]
45.0% 14.8Ki 31.0% 14.9Ki [__TEXT]
3.2% 1.06Ki 2.0% 984 [Mach-O Headers]
0.3% 88 0.2% 88 [__TEXT,__unwind_info]
0.2% 82 0.2% 82 hw.c
0.0% 8 0.0% 8 [__DATA_CONST,__got]
100.0% 32.9Ki 100.0% 48.0Ki TOTAL
Adds two features for better dSYM support on macOS: 1. Add Mach-O specific --dsym option - Accepts dSYM bundles - Handles multiple dSYM files for multiple input binaries 2. Automatic dSYM discovery - Searches for <binary>.dSYM/Contents/Resources/DWARF/<binary> - Only loads dSYM if build IDs match to prevent mismatches This improves the macOS user experience by eliminating the need to manually specify the binary file inside of the dSYM bundle. CMakeLists.txt: Add bloaty.pb.h to protoc OUTPUT list. Protoc generates both .cc and .h files, so both should be declared as outputs for proper dependency tracking.
e46fe86 to
f10207f
Compare
Adds two features for better dSYM support on macOS:
Add Mach-O specific --dsym option
Automatic dSYM discovery
This improves the macOS user experience by eliminating the need to manually specify the binary file inside of the dSYM bundle.