Implement proposed k8s-stream-file format
#265
base: main
See also #262.
This pull request introduces 1 alert when merging c0bc805 into 3161452 - view on LGTM.com new alerts:
c0bc805 to 0bd2c0a
haircommander
left a comment
Thanks for the implementation @portante! I have some first-pass review comments, but if this is just a POC, feel free to ignore them until the concept is proven.
You'll need to run `make fmt` and commit the changes for the majority of CI checks to run.
I am betting this will perpetually fail the Kubernetes e2e tests, but I am curious to see what actually ends up happening.
src/ctr_logging.c
Outdated
/*
 * PROPOSED: CRI Stream Format, variable length file format
 */
static int set_k8s_stream_timestamp(char *buf, ssize_t bufsiz, ssize_t *tsbuflen, const char *pipename, uint64_t offset, ssize_t buflen, ssize_t *btbw)
There seem to be some similarities between this and set_k8s_timestamp. Can we either factor the shared functionality into a function or document why that's not possible?
I did not want to disturb any existing code with this, to avoid introducing unintended bugs.
If we decide that it is worth merging this, then we should consider combining those two functions into one.
0bd2c0a to 3257f6a
Fixed. Thanks for the review!
@giuseppe PTAL
off = -off;
}

len = snprintf(buf, bufsiz, "%d-%02d-%02dT%02d:%02d:%02d.%09ld%c%02d:%02d %s %lud %ld ", current_tm.tm_year + 1900,
(A drive-by comment with little context:)
If the goal is to offload the CPU processing to consumers, shouldn’t this just be a timestamp instead of all the timezone lookups and formatting?
Either way, UTC would be better than ambiguous local time — does this one need custom parser code to get a struct timespec back?
The problem here is establishing a reference point for what the timestamp "means".
First, for efficiency, a simple monotonically increasing timestamp is all that is needed, along with a periodic mapping of that timestamp to a real clock. This format could be modified to periodically emit (once a second? once every 5 seconds?) an entry for that mapping: <monotonic stamp, realtime stamp>
That realtime stamp would be emitted in UTC instead of the local timezone. All subsequent log entries would calculate their real timestamp from that offset.
If that makes sense, we can implement this instead.
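To make the idea concrete, here is a minimal sketch of the <monotonic stamp, realtime stamp> mapping, assuming POSIX `clock_gettime`. The names `struct clock_mapping`, `sample_clock_mapping`, and `mono_to_real_ns` are hypothetical and are not part of conmon or of this PR; the on-disk encoding of the mapping entry is deliberately left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical mapping entry: a monotonic reading paired with the
 * realtime (UTC, seconds since the epoch) reading taken right after it. */
struct clock_mapping {
	int64_t mono_ns; /* CLOCK_MONOTONIC, in nanoseconds */
	int64_t real_ns; /* CLOCK_REALTIME, in nanoseconds since the epoch */
};

static int64_t timespec_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + (int64_t)ts->tv_nsec;
}

/* Writer side: sample both clocks back to back. This would run only
 * periodically (for example once a second), not for every log entry. */
static int sample_clock_mapping(struct clock_mapping *m)
{
	struct timespec mono, real;

	if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
		return -1;
	if (clock_gettime(CLOCK_REALTIME, &real) != 0)
		return -1;
	m->mono_ns = timespec_to_ns(&mono);
	m->real_ns = timespec_to_ns(&real);
	return 0;
}

/* Reader side: turn an entry's monotonic stamp into a realtime stamp
 * using the most recent mapping entry seen in the stream. */
static int64_t mono_to_real_ns(const struct clock_mapping *m, int64_t entry_mono_ns)
{
	return m->real_ns + (entry_mono_ns - m->mono_ns);
}
```

With this split, the writer never formats a date or consults a timezone for ordinary entries; the reader does the arithmetic and any formatting it wants.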
I should add that the goal is to push the details of interpreting the log stream onto the reader rather than onto conmon, the writer.
bb9b2f3 to 5fb760d
Ephemeral COPR build failed. @containers/packit-build please check.
4e4c88c to 5e306e7
Instead of parsing the contents of the data read from the `stdout` and `stderr` pipes, this commit adds support for a "stream" format, named `k8s-stream-file`, which just records what is read from a pipe to disk. It significantly saves on CPU spend processing the buffer read, uses only 2 I/O vectors, and never touches the memory read from the pipe. This is an updated implementation of cri-o/cri-o#1605. Signed-off-by: Peter Portante <peter.portante@redhat.com>
5e306e7 to bfc55e3
Instead of parsing the contents of the data read from the `stdout` and `stderr` pipes, this commit adds support for a "stream" format, named `k8s-stream-file`, which just records what is read from a pipe to disk.
It significantly saves on CPU spend processing the buffer read, uses only 2 I/O vectors, and never touches the memory read from the pipe.
This is a conceptual PR, meant to illustrate a proposed stream log file format that removes the byte level interpretation of `stdout`/`stderr` in favor of simply recording what data was read on each system call. It defers the interpretation of the byte stream to the consumer, allowing this writer to operate with as little overhead as possible (avoiding bad containers that write only newlines, or small numbers of bytes between newlines). The reader of the byte stream is then tasked with reassembling the stream according to whatever interpretation it sees fit to use.
The goal of this work is to provide a simple format that will stream well into an object store, such that, given enough metadata stored with the stream, the consumer can reconstruct the I/O stream at the time it is read.
This is an updated implementation of cri-o/cri-o#1605.
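For readers unfamiliar with the "2 I/O vectors" approach, the following is a rough sketch of the general technique, assuming POSIX `writev`. It is not the code in this PR: the function `write_stream_record` and its header layout (`<pipe name> <offset> <length>\n`, with no timestamp) are hypothetical and chosen only to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Write one record as two I/O vectors: a small text header followed by
 * the bytes exactly as they were read from the pipe. The payload is
 * never scanned, copied, or modified.
 *
 * Returns the number of bytes written, or -1 on error. writev() may
 * write fewer bytes than requested; a real writer would need to detect
 * and resume a short write. */
static ssize_t write_stream_record(int fd, const char *pipename, uint64_t offset,
				   const void *payload, size_t payload_len)
{
	char hdr[64];
	int hdr_len = snprintf(hdr, sizeof(hdr), "%s %" PRIu64 " %zu\n",
			       pipename, offset, payload_len);

	if (hdr_len < 0 || (size_t)hdr_len >= sizeof(hdr))
		return -1; /* formatting failed or header was truncated */

	struct iovec iov[2] = {
		{ .iov_base = hdr, .iov_len = (size_t)hdr_len },
		/* iov_base is not const-qualified; writev() only reads from it */
		{ .iov_base = (void *)payload, .iov_len = payload_len },
	};

	return writev(fd, iov, 2);
}
```

Because the header carries the payload length, a reader can skip from record to record without scanning for newlines, and can apply whatever line or partial-line interpretation it wants afterwards.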