context: adjust the file write logic to avoid corrupt context meta.json files #4042
Haven't compared the implementations, but wondering if using cli/vendor/github.com/docker/docker/pkg/ioutils/fswriters.go Lines 29 to 31 in dfb36ea
@thaJeztah oh, that's perfect! ya, that code uses a similar approach. will switch to using that.
Write to a tempfile then move, so that if the process dies mid-write it doesn't corrupt the store.
Also improve error messaging so that if a file does get corrupted, the user has some hope of figuring out which file is broken.
For background, see:
docker/for-win#13180
docker/for-win#12561
For a repro case, see: https://github.com/nicks/contextstore-sandbox
Signed-off-by: Nick Santos <nick.santos@docker.com>
79a968f to c2487c2
worked like a charm!
Ah, great! I don't think that implementation is the "end-all-be-all" canonical writer, but at least it could help with finding locations where we consider an "atomic write" to be important, in case we ever decide to improve it (or replace it with a more complete "atomic" writer; I know there are some projects, e.g. https://github.com/google/renameio). We should probably still create a tracking ticket to think about concurrent access, as none of this code was written with that in mind (and it could be relevant if used as library code as well).
thaJeztah
left a comment
LGTM
perhaps others could have a quick look as well on the (non) error wrapping and if there's any concerns about that.
- return Metadata{}, err
+ return Metadata{}, fmt.Errorf("parsing %s: %v", fileName, err)
This change is masking the original error (%v). I don't THINK we use any of these errors as sentinel errors (had a cursory glance at call-sites for this code, and I think we only use the "not found" case as special), so we probably should be fine, but I'd 🤦 if I didn't comment, and regret it later.
I guess there's not much we can do in these cases, other than report "something was wrong; here's some info", so I guess we should be fine.
The only alternative would be to make this an errdefs.InvalidParam() (if we consider this "input is invalid", and want to consistently return one of the errdefs types), but that's probably something we could still consider in a follow-up if we see a need.
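To make the masking concern concrete, here is a minimal sketch of the difference between `%v` and `%w` in `fmt.Errorf`. The `parseMeta` and `isSyntaxError` helpers and the `wrap` flag are hypothetical and exist only for this illustration; they are not the code in this PR. With `%v` the original error is flattened into text, so `errors.As` can no longer find it, while `%w` keeps it in the chain.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseMeta is a hypothetical stand-in for the metadata parser.
// The wrap flag only exists so both formatting verbs can be compared.
func parseMeta(fileName string, data []byte, wrap bool) error {
	var meta struct{ Name string }
	err := json.Unmarshal(data, &meta)
	if err == nil {
		return nil
	}
	if wrap {
		// %w keeps the original error reachable via errors.Is / errors.As.
		return fmt.Errorf("parsing %s: %w", fileName, err)
	}
	// %v only keeps the message text; the original error value is lost.
	return fmt.Errorf("parsing %s: %v", fileName, err)
}

// isSyntaxError reports whether a *json.SyntaxError can be recovered from err.
func isSyntaxError(err error) bool {
	var syntaxErr *json.SyntaxError
	return errors.As(err, &syntaxErr)
}

func main() {
	bad := []byte(`{"Name": }`)
	masked := parseMeta("meta.json", bad, false)
	wrapped := parseMeta("meta.json", bad, true)

	// Both messages name the file; only the wrapped one exposes the cause.
	fmt.Println(masked)
	fmt.Println("masked exposes syntax error: ", isSyntaxError(masked))
	fmt.Println("wrapped exposes syntax error:", isSyntaxError(wrapped))
}
```

Either verb gives the user the file name in the message; the choice only matters if some caller needs to inspect the underlying error.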
re: atomic writes - i'm more worried about "this process got killed mid-write" than i'm worried about "two processes try to write at the same time". I know there's also been discussion about making Go's internal file locking library public - golang/go#33974
re: masking the original error - ya, I'm open to better patterns for this. mainly wanted to avoid users seeing a context-free json parse error.
Agreed. There still may be other paths that write files (
Yes, having something like that would be good. I had to LOL on the pretty standard "why don't you use
I think it's fine as-is. Just (lessons learned) thought it was better to leave a (possibly fine to ignore) comment than to regret not commenting in hindsight.
@vvoland @cpuguy83 @neersighted ptal (just a couple more eyes never hurts)
neersighted
left a comment
LGTM, I also have no concerns about concurrent access. Clobbering will happen, but in a way that is much more likely to not break things than today.
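As a rough illustration of the clobbering point, the sketch below races several writers against one file, each using temp-file-plus-rename. The helper names (`writeViaRename`, `raceWriters`) are hypothetical, and the sketch assumes POSIX-style rename semantics; behaviour on Windows may differ. Which writer wins is not deterministic, but the surviving file should always be one writer's complete document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// writeViaRename is a hypothetical helper: write to a temp file in the
// target directory, then rename it over the destination.
func writeViaRename(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmp.Name())
		return err
	}
	if err := os.Rename(tmp.Name(), path); err != nil {
		os.Remove(tmp.Name())
		return err
	}
	return nil
}

// raceWriters starts n concurrent writers against one file and returns
// the writer ID found in the file once they have all finished.
func raceWriters(n int) (int, error) {
	dir, err := os.MkdirTemp("", "clobber-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "meta.json")

	var wg sync.WaitGroup
	errs := make([]error, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			data, _ := json.Marshal(map[string]int{"Writer": id})
			errs[id] = writeViaRename(path, data)
		}(i)
	}
	wg.Wait()
	for _, e := range errs {
		if e != nil {
			return 0, e
		}
	}

	raw, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	var got struct{ Writer int }
	if err := json.Unmarshal(raw, &got); err != nil {
		return 0, fmt.Errorf("parsing %s: %w", path, err)
	}
	return got.Writer, nil
}

func main() {
	winner, err := raceWriters(8)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("file holds a complete document from writer %d\n", winner)
}
```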
Thanks all for the extra set of eyes! I'll bring this one in: I also marked it for cherry-picking, as I think this change should be fairly safe, and it may help resolve some of these issues.
- What I did
context: adjust the file write logic to avoid corrupt context meta.json files
- How I did it
Write to a tempfile then move, so that if the process dies mid-write it doesn't corrupt the store.
Also improve error messaging so that if a file does get corrupted, the user has some hope of figuring out which file is broken.
For background, see:
docker/for-win#13180
docker/for-win#12561
I have other fixes in DD 4.17 that will help with these issues as well.
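For readers unfamiliar with the pattern, the tempfile-then-move approach can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the code merged here: `writeFileAtomic` and `roundTrip` are hypothetical names, and the sketch relies on the temp file living in the same directory as the target so that the rename stays on one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper (not the code from this PR).
// It writes data to a temp file next to path, then renames it over path,
// so a process killed mid-write leaves the previous file untouched.
func writeFileAtomic(path string, data []byte, perm os.FileMode) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-"+filepath.Base(path)+"-*")
	if err != nil {
		return err
	}
	// Clean up the temp file if any later step fails.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Chmod(perm); err != nil {
		return err
	}
	// Flush to disk before the rename so the new name never points at
	// content that has not been written out yet.
	if err = tmp.Sync(); err != nil {
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes a file twice and returns what is on disk afterwards.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "ctxstore-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "meta.json")
	if err := writeFileAtomic(path, []byte(`{"Name":"old"}`), 0o644); err != nil {
		return "", err
	}
	if err := writeFileAtomic(path, []byte(`{"Name":"new"}`), 0o644); err != nil {
		return "", err
	}
	raw, err := os.ReadFile(path)
	return string(raw), err
}

func main() {
	got, err := roundTrip()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```

The rename is the only step that touches the destination path, which is why an interrupted write can at worst leave a stray temp file behind.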
- How to verify it
I've been using this sandbox to repro the issue:
https://github.com/nicks/contextstore-sandbox
I verified that after these changes, the test.sh script no longer leaves files corrupt.
- Description for the changelog
context: adjust the file write logic to avoid corrupt context meta.json files
- A picture of a cute animal (not mandatory but encouraged)
