
Output log #64

Open
MofX wants to merge 2 commits into main from output_log

Conversation

@MofX (Contributor) commented Mar 10, 2025

Changes

Please see this as an RFC, not something I absolutely have to push into the code.

This implements two major changes:

  1. Drop the capture_output parameter of fake.run_cmd:
The main goal here is to stop writing output directly to stdout and stderr.

The capture_output parameter does not make a lot of sense.
With capture_output=False the output was written directly to stdout, and
with capture_output=True the output was captured, written to the log,
and returned from the function.

Now the output (stdout and stderr) is always captured, returned, and
written to the log, i.e. it is no longer written to stdout.

When an out stream is passed (stdout parameter != None), the output
is not written to the log because it could be binary (e.g. it is used for cpio output).
This is no change from the previous implementation.

Additionally, stdin is set to subprocess.DEVNULL. This prevents executed
processes from changing the terminal settings (e.g. changing the behavior of
\n so that it produces only a line feed instead of a line feed plus carriage return).
Changing the terminal settings can mess up the output during parallel builds.
(A sketch of the resulting behavior is shown after this list.)
  2. Allow sending the log output directly to a file:
This allows redirecting the output of the tool to a logfile without
relying on an external process (e.g. bash) to do the redirection.

This also moves exception and banner printing to the log.

It is required to enable clean parallel builds.
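
For illustration, a minimal sketch of the run_cmd behavior described in point 1, assuming a simplified signature; the parameter and return types are assumptions, not the actual fake.run_cmd API:

import logging
import subprocess
from typing import IO, Optional, Tuple

def run_cmd(cmd: list[str], stdout: Optional[IO[bytes]] = None) -> Tuple[Optional[bytes], bytes, int]:
    """Run cmd, always capturing output instead of inheriting the terminal."""
    out = stdout if stdout is not None else subprocess.PIPE
    proc = subprocess.run(
        cmd,
        stdout=out,
        stderr=subprocess.PIPE,
        stdin=subprocess.DEVNULL,  # keep child processes away from the terminal settings
        check=False,
    )
    if stdout is None:
        # Output was captured: write it to the log, not to stdout.
        logging.info('%s stdout:\n%s', cmd[0], proc.stdout.decode(errors='replace'))
    # else: the (possibly binary) data went straight into the passed stream and is not logged.
    logging.info('%s stderr:\n%s', cmd[0], proc.stderr.decode(errors='replace'))
    return proc.stdout, proc.stderr, proc.returncode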

Test results

==============================================================================
Robot Tests                                                                   
==============================================================================
Robot Tests.Boot                                                              
==============================================================================
Kernel exists                                                         | PASS |
------------------------------------------------------------------------------
Config is OK                                                          | PASS |
------------------------------------------------------------------------------
Script was executed                                                   | PASS |
------------------------------------------------------------------------------
Robot Tests.Boot                                                      | PASS |
3 tests, 3 passed, 0 failed
==============================================================================
Robot Tests.Initrd                                                            
==============================================================================
Root Device is Mounted                                                | PASS |
------------------------------------------------------------------------------
Devices are Created                                                   | PASS |
------------------------------------------------------------------------------
Rootfs should be set up                                               | PASS |
------------------------------------------------------------------------------
File dummy.txt should be OK                                           | PASS |
------------------------------------------------------------------------------
File other.txt should be OK                                           | PASS |
------------------------------------------------------------------------------
Modules are installed                                                 | PASS |
------------------------------------------------------------------------------
Check modules loaded by init                                          | PASS |
------------------------------------------------------------------------------
Robot Tests.Initrd                                                    | PASS |
7 tests, 7 passed, 0 failed
==============================================================================
Robot Tests.Root                                                              
==============================================================================
Systemd should exist                                                  | PASS |
------------------------------------------------------------------------------
Fakeoot config executed                                               | PASS |
------------------------------------------------------------------------------
Fakechroot config was executed                                        | PASS |
------------------------------------------------------------------------------
Chroot config was executed                                            | PASS |
------------------------------------------------------------------------------
Resolv.conf symlink should exist                                      | PASS |
------------------------------------------------------------------------------
Robot Tests.Root                                                      | PASS |
5 tests, 5 passed, 0 failed
==============================================================================
Robot Tests                                                           | PASS |
15 tests, 15 passed, 0 failed
==============================================================================

Developer Checklist:

  • Test: Changes are tested
  • Git: Informative git commit message(s)

Reviewer checklist:

  • Review: Changes are reviewed
  • Review: Tested by the reviewer

MofX added 2 commits March 10, 2025 11:47
@MofX requested review from simone-weiss and thir820 March 10, 2025 11:50
@simone-weiss

Not sure about 1. yet, but 2. makes a lot of sense to me.

@thir820 (Contributor) commented Mar 20, 2025

I would move the Fake part to a library to reduce the code size and complexity of the build tools and make it reusable; I have prepared this as https://github.com/eLiSCHe/libexec. The first topic is already addressed there.


if release_version:
print(f'Thanks for using EB corbos Linux workspace {release_version}!')
logging.info('Thanks for using EB corbos Linux workspace %s!', release_version)

Not using logging is intentional, so that this is printed at any log level.
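
For illustration, a minimal sketch of the difference this comment refers to; the configured level is an arbitrary assumption:

import logging

logging.basicConfig(level=logging.ERROR)

# Suppressed: INFO is below the configured ERROR level.
logging.info('Thanks for using EB corbos Linux workspace %s!', '1.0')

# Always visible, regardless of the configured log level.
print('Thanks for using EB corbos Linux workspace 1.0!')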



def init_logging(level: str = 'INFO') -> None:
def add_loging_arguments(parser: ArgumentParser) -> None:

I would not add this and would instead use the bash redirection. I don't see an advantage of this parameter.
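
For context, a minimal sketch of what the discussed parameter could look like; only the two function names come from the diff above, while the --log-file flag and the handler setup are assumptions:

import logging
from argparse import ArgumentParser

def add_loging_arguments(parser: ArgumentParser) -> None:
    # Hypothetical flag name; the actual option is not shown in the diff.
    parser.add_argument('--log-file', default=None,
                        help='Write the tool log to this file instead of stdout.')

def init_logging(level: str = 'INFO', log_file: str | None = None) -> None:
    handler: logging.Handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
    logging.basicConfig(level=level, handlers=[handler], force=True)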

],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
stdin=subprocess.DEVNULL,

I would propose using a dataclass with reasonable defaults, since fake already has far too many parameters.
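
A minimal sketch of the kind of dataclass being proposed here; the field names and defaults are assumptions, not the actual fake interface:

import subprocess
from dataclasses import dataclass
from io import BufferedWriter
from typing import Optional

@dataclass
class RunOptions:
    """Bundled defaults for a run_cmd-style call (hypothetical)."""
    stdout: Optional[BufferedWriter] = None  # None -> capture via PIPE
    check: bool = True                       # raise on a non-zero exit code
    cwd: Optional[str] = None

def run(cmd: list[str], opts: Optional[RunOptions] = None) -> subprocess.CompletedProcess:
    opts = opts or RunOptions()
    return subprocess.run(
        cmd,
        stdout=opts.stdout if opts.stdout is not None else subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.DEVNULL,
        cwd=opts.cwd,
        check=opts.check,
    )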

else:
err = None
out = None
out: int | BufferedWriter = PIPE

PIPE will capture the output; None will print it to stdout and make it instantly visible, which is a good idea for long-running steps to give the user instant feedback. Always capturing the output is a bad idea from my point of view.
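
For illustration, a minimal sketch of the two behaviors being compared; the command is an arbitrary placeholder:

import subprocess

# stdout=None: the child inherits the parent's stdout, so output
# appears on the terminal live, while the command is still running.
subprocess.run(['echo', 'live output'], stdout=None, stdin=subprocess.DEVNULL)

# stdout=subprocess.PIPE: the output is captured and only becomes
# available to the caller after the command has finished.
result = subprocess.run(['echo', 'captured output'],
                        stdout=subprocess.PIPE, stdin=subprocess.DEVNULL)
print(result.stdout.decode())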

@thir820 (Contributor) commented Mar 20, 2025

@MofX I'm against support for parallel builds, since they only make the build more complex and a user can no longer follow it step by step via the live output. The build time advantage is also minimal for local development work, since you only build from scratch once and then only rebuild the one changed artifact and its dependencies. I also think all steps should print to stdout directly and not capture the logs, if possible, to provide instant feedback; we can then additionally use tee to pipe the logs to a log file.

@matthiasb85 (Collaborator) commented Mar 20, 2025

  1. Parallel builds may provide a massive improvement in build times. Just take a hypervisor-based image with multiple VMs. Additionally, it would also make sense to run multiple independent image builds in parallel at the same time.
  2. The logs on stdout are not always easy to read. I've had multiple support requests where people showed me the completely wrong part of the stdout because the error did not happen at the end of the output. Having only logfiles dedicated to the different build steps, with a message on stdout like "Hey, something went wrong while configuring the rootfs, please look at /a/fancy/path/to/a/logfile/in/the/build/folder for more information", would be much easier to understand and maintain during support requests. Additionally, it could also make sense to only print the logs to stdout after the build has failed.

@thir820 (Contributor) commented Mar 20, 2025

To 1.) You can already run multiple builds in parallel. The improvement only helps for the first build, where you need to execute all build steps, and only if you have good enough hardware and build steps that are not already using multiple cores. In the end it's a workaround that makes the build tools more complex and the build process less transparent. A real improvement would be to speed up the single steps, but even that is not mandatory right now.

To 2.) You would get the same thing only in a log file, not additionally as today, and it will not make the output more readable. The problem is not that the user is sending you the wrong logs, but that the user has no idea what happens in the background and needs your help as soon as something goes wrong.
Adding an explicit hint to the logfile as the last output of a build, especially on an exit code != 0, is indeed a good idea.
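
A minimal sketch of the kind of hint agreed on here, pointing the user to a per-step logfile on a non-zero exit code; the path is the placeholder from the discussion and the wording is an assumption:

import subprocess
import sys

LOG_FILE = '/a/fancy/path/to/a/logfile/in/the/build/folder'  # placeholder path from the discussion

def run_build_step(cmd: list[str]) -> None:
    with open(LOG_FILE, 'ab') as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT,
                                stdin=subprocess.DEVNULL)
    if result.returncode != 0:
        # Last output of a failed build: point the user to the step's logfile.
        print(f'Build step {cmd[0]} failed (exit code {result.returncode}), '
              f'see {LOG_FILE} for details.', file=sys.stderr)
        sys.exit(result.returncode)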

@matthiasb85 (Collaborator)

  1. This still does not improve the build times for multi-VM based setups. Especially GHA-based merge verification or release jobs would see the improvement, since you always start from scratch in those builds.
  2. Pointing the customer directly to a build-step-related logfile would make handling the requests much easier, as they would most likely provide the logfile instead of only a fraction of the stdout. In general the output is only relevant when something goes wrong or when verbose output for debugging is requested (which could be the case in a GHA). Other than that I don't see the need to spam stdout with log messages for every build job when the logs can be accessed on file level.

@thir820 (Contributor) commented Mar 20, 2025

  1. If the VMs have the same packages, the root_generator caching will kick in and the build will be faster if you do it sequentially. Whether CI builds take a few minutes more or less is not relevant from my point of view.
  2. We can achieve the same by pointing to the logfile as the last output in case of build errors. The live output is relevant for all steps which take a bit longer, to get some feedback that something is going on and to inspect the log output live. I use this a lot, and I have experienced little that was more annoying than waiting for a 15-minute build before getting any feedback.

@matthiasb85 (Collaborator)

  1. Downloading has never been a bottleneck so far, neither in CI nor locally. Extracting the packages and running the post-install scripts had a much bigger impact.
  2. Sorry, but this is highly subjective.

@thir820 (Contributor) commented Mar 20, 2025

  1. The root generator caches the result of debootstrap as a tar archive and makes use of such an archive if one with fitting content is available. If you build a root filesystem for the same repos and containing the same packages twice, debootstrap will only run once, and the second build will just use the cached tarball.
  2. Yes, but at the moment we have both the live logs and the log file, and there is no reason to give up the live logs.

But there are very good reasons not to have parallel build steps:

  • More complex build tooling.
  • More documentation needed, and more onboarding effort.
  • All dependencies need to be modeled correctly, which makes image development harder.
  • Risk of race conditions and hard-to-debug build issues if there are flaws in the dependency modelling.

And on the pro side, for parallel builds, we only have:

  • Maybe a few minutes' faster build time for local clean builds (and no impact on the build time if the cache is hot).
  • A few minutes' faster build times in the CI (which is not relevant from my point of view).

So overall, from my point of view, we have quite a few negative aspects for a minimal improvement.

@thir820 (Contributor) commented Mar 20, 2025

By the way, you can run steps in parallel using Taskfile; we just make no use of it in our examples because we don't want this complexity. The log files will be OK, only the live output will mix up the logs. The redirecting and tee-ing of the live output is done in the tasks, so you can use parallel build steps for your image if you write proper tasks, you get the output for each build step separated into different logs, and you can write the log file hint to the console as the last step of the build step. There is no need at all to do this in the Python code of the build tools. If you want, you can set this up as you like on a per-image basis.

https://yairdar.github.io/base-tutorials/a-taskfile/c06_deps/

@matthiasb85 (Collaborator)

I know that you can run the build in parallel; it even works with make. The only annoying part is the log output, as it gets even more unreadable when mixed up. What is needed to use parallel builds efficiently is proper log handling, and even for sequential builds this would be desirable. No one is asking for parallel builds from inside the root_generator. It is totally fine to run multiple instances in parallel (orchestrated by some kind of build tooling).
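
For illustration, a minimal sketch of orchestration-level log handling for parallel steps; the step names, commands, and log paths are placeholders, not the project's actual tooling:

import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical build steps; in practice these would come from the task runner or make.
STEPS = {
    'root': ['root_generator', 'image.yaml'],
    'initrd': ['initrd_generator', 'initrd.yaml'],
}

def run_step(name: str, cmd: list[str]) -> int:
    # Each step writes to its own log, so parallel output does not get mixed up.
    with open(f'{name}.log', 'wb') as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT,
                                stdin=subprocess.DEVNULL)
    if result.returncode != 0:
        print(f'Step {name} failed, see {name}.log for details.')
    return result.returncode

with ThreadPoolExecutor() as pool:
    exit_codes = list(pool.map(lambda item: run_step(*item), STEPS.items()))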

@thir820 (Contributor) commented Mar 20, 2025

But at the moment, log handling is done in the tasks or in make, not in the build tools, and there is no reason to change this. If you don't want to, you don't have to tee the output, but you can. If you don't want the live output, you can just remove the tee from the CLI of the task, and then you only get the log file. At the moment, log handling is out of scope of the build tools where possible, and from my point of view it shall stay this way.
If a command currently captures output where that is not mandatory, this is an implementation bug which we should fix, rather than restricting the freedom of the user without need.
