RFC: add Phase Debug Script #330
Conversation
Maintainers, as you review this RFC please queue up issues to be created using the following commands: Issues (none)
I love this idea. This would greatly help in debugging all kinds of issues.
> - What is the impact of not doing this?
> - harder to understand builds, harder to debug permission issues... etc

> # Prior Art
`pack build --interactive` aims to shed some light on the inner workings of the build, but it is not as flexible as the current proposal.
TIL
```
$ pack build my-app-name --interactive
ERROR: Interactive mode is currently experimental.
Tip: To enable experimental features, run `pack config experimental true` to add experimental = true to /Users/rschneeman/.pack/config.toml.
```
A thing I've always wanted: when the build fails, let me drop into a shell at that point. I.e. if the build fails while running `bundle install` and I've built with `pack build myimage --onfail=myimage-failed`, then I could `docker run ... myimage-failed bash` and use `ls -la`, `gem env`, etc. to try to understand why it failed. I would want it to give me the build image and tools as if I was in the next buildpack to be executed.
That's interesting, but substantially beyond the scope of what I'm proposing for this RFC. You could possibly do something evil with a debug script that checks the rc of the lifecycle phase and initiates a reverse ssh connection to a development IP address on failure, but you'd be in the context of the failing lifecycle phase, not the failing buildpack, which may not give you what you seek.
> [motivation]: #motivation
>
> - Why should we do this?
> - Debugging state within a builder can be tricky, it can be beneficial during buildpack and platform development to understand the content of buildpack toml files, and filesystem permissions, and even know sha hashes for builder packaged content. While it is possible to test some things via a custom debug buildpack, buildpacks are not involved in every phase of execution, and some changes occur outside of their involvement. Being able to trace those state changes and verify assumptions is beneficial to buildpack and platform developers alike.
> buildpacks are not involved in every phase of execution, and some changes occur outside of their involvement. Being able to trace those state changes and verify assumptions is beneficial to buildpack and platform developers alike.
Can you give some more examples here? You mention this was helpful when debugging. Can you point to specific problems you hit that were introduced by non-buildpack components? I can think of things like the base image and buildpack order groups possibly affecting the outcome; what else am I missing, and how can they affect/impact the build?
When implementing the platform, one of the most common issues was volumes ending up mounted as the wrong user/group, which inevitably leads to odd behavior when the lifecycle discovers it has no permission to write, delete, or create files (but can modify existing ones). I had to debug that across docker/podman, on multiple operating systems and runtime versions, before I was sure I had it right.

Later, investigating multi-arch image issues, I found it also handy to validate the content of `analyzed.toml` before/after each phase, to identify which step was rewriting it with an incorrect image reference.

I've also used it to validate build plan updates, and to verify permissions/ownership for buildpacks/extensions under development, even as a hacky way to tweak a bad dev image into a workable one by having the script fix up minor issues.
It's a gigantic Swiss Army knife of possibilities. Used responsibly it's pretty handy as a platform author =)

Having pack offer the same sort of ability would give people the opportunity to see what the reference platform implementation does, when asking "why does pack work when my platform doesn't?". That's doable today, but via messier custom build images: I've used Dockerfiles to extend a builder image, moving the lifecycle to a subdir, adding in a script to do what I want, and invoking the lifecycle in much the same way this proposes. But then I have to do that for every builder I want to test with, and redo it when I update to newer builders. Having the platform take care of that made it all a lot simpler.
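The `analyzed.toml` use case above can be sketched as a tiny wrapper. This is hypothetical: it assumes the platform passes the original lifecycle phase command as the script's arguments, and `ANALYZED_TOML` is an invented override (real builds would use `/layers/analyzed.toml`):

```shell
#!/bin/sh
# Hypothetical phase debug script: snapshot analyzed.toml around a phase
# to see which step rewrites it. "$@" stands in for the original phase
# command; ANALYZED_TOML is an assumed override for illustration/testing.
toml="${ANALYZED_TOML:-/layers/analyzed.toml}"
cp "$toml" /tmp/analyzed.before 2>/dev/null || true
"$@"                                  # invoke the original phase command
rc=$?
# Show whether (and how) this phase rewrote the file
diff -u /tmp/analyzed.before "$toml" 2>/dev/null || true
exit $rc
```

Note the script preserves the phase's exit code, so a failing phase still fails the build.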
> # Prior Art
> [prior-art]: #prior-art
>
> Suggesting here, as I implemented this in the snowdrop platform implementation https://github.com/snowdrop/java-buildpack-client
I searched for "debug" on that page and it's mostly logging. Is there a better url/reference to read? Any docs for the feature?
The snowdrop platform is a Java library, and the debug script is supplied as part of the platform config. Probably easiest to see an example: https://github.com/snowdrop/java-buildpack-client/blob/main/samples/hello-quarkus/pack.java

In my case I'm accepting a string holding the script, mostly to avoid issues with reading a file from disk to pass in. It's an implementation detail that's relatively unimportant, and pack could easily implement it whichever way is easiest.
The idea was useful enough I thought it worth suggesting for pack too.
> # Drawbacks
> [drawbacks]: #drawbacks
>
> - Adds clutter to an already complex set of commandline options for pack
Would an env variable be less clutter? You could set something like `CNB_DEBUG_SCRIPT=...`, and if set, it would invoke this behavior. You still have to document it, but then `pack build` doesn't need another arg.
An env var works too, although I'm not sure how many env vars are used to control pack behavior. Off the top of my head, maybe `DOCKER_HOST`?
No strong feelings about one over the other. I agree that arguments would be more consistent with everything else in pack build.
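For illustration, the two spellings under discussion might look like the following. Both are hypothetical: neither a `CNB_DEBUG_SCRIPT` variable nor a `--debug-script` flag exists in pack today; the names are placeholders for whatever the RFC settles on.

```shell
# Hypothetical placeholders -- neither form is implemented in pack today.
# Env var form: no new CLI argument needed.
CNB_DEBUG_SCRIPT=./debug.sh pack build my-app --builder some/builder

# Flag form: more consistent with the rest of `pack build`.
pack build my-app --builder some/builder --debug-script ./debug.sh
```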
> `ls -lar /layers`

> This dumps the content of the /layers dir, invokes the original command and then dumps it again.
Technically, this could modify things in the env too, right? Not within the spirit of what's being proposed, but does that create any issues for anyone or security concerns?
Maybe the lifecycle should annotate or label the produced image that debug was enabled? That way someone could audit or implement a policy where no images with "debug" enabled are allowed in prod, or something like that.
A very good point. I'm thinking of this as a debugging/development aid, but it would make a lot of sense to ensure built images carry a marker of some kind, just for audit.

Yes, it could modify the env, but thinking bigger, it could just bypass the phase entirely and do "something else". There are no real guarantees here.

Nitpick though: it wouldn't be the lifecycle doing the annotate/label, but the platform implementation. It's outside the lifecycle's control, as the debug script determines whether the lifecycle even runs. (This RFC is to enhance pack.)
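To make that last point concrete, here's a minimal sketch. All of it is hypothetical: `SKIP_PHASE` and the calling convention (original phase command passed as `"$@"`) are invented for illustration. The script sits in front of the phase and can decline to run it at all, which is why any audit label has to come from the platform wrapping it:

```shell
#!/bin/sh
# Hypothetical debug wrapper: the script, not the lifecycle, decides what runs.
# The platform is assumed to pass the original phase command as "$@";
# SKIP_PHASE is an invented variable purely for illustration.
if [ "${SKIP_PHASE:-}" = "true" ]; then
    echo "debug: skipping lifecycle phase: $*"
    exit 0                # the real lifecycle binary never executes
fi
"$@"
rc=$?
echo "debug: phase exited with rc=$rc"
exit $rc
```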
Suggestion to add the ability to drive a debug script instead of the lifecycle binaries during a build, for debug purposes.

Link for reading: https://github.com/BarDweller/buildpacks-rfcs/blob/main/text/0000-pack-phase-debug-script.md