Skip to content

Pass progress message to cli_progress_bar#897

Open
pepijn-devries wants to merge 2 commits intofirelab:mainfrom
pepijn-devries:progressbar
Open

Pass progress message to cli_progress_bar#897
pepijn-devries wants to merge 2 commits intofirelab:mainfrom
pepijn-devries:progressbar

Conversation

@pepijn-devries
Copy link
Contributor

@pepijn-devries pepijn-devries commented Feb 4, 2026

Now includes message to progress bar. As everything has to fit on 1 line, the message is abbreviated to 23 characters.

I also added CLI_SHOULD_TICK to make sure the progress bar isn't updated unnecessarily. Don't think it was really necessary, but just in case...

@ctoney
Copy link
Member

ctoney commented Feb 4, 2026

Thanks. I had a couple of concerns with how this might be implemented but I think you've handled it already. This is mainly to check my understanding and confirm what the usage patterns should be.

GDALTermProgressR is used in two ways. In the majority of cases, a function pointer is passed in a GDAL API call. Those are cases where we wrap API functionality that supports progress reporting via callback. GDAL will not pass a message string in those cases. This form of usage does not necessarily need to be in scope for the current PR (more below).

In other cases we do processing that manages the progress bar directly, updating dfComplete in the loop. GDALRaster::pixel_extract() is one example with the loop beginning at

for (R_xlen_t band_idx = 0; band_idx < num_bands; ++band_idx) {

We don't want to pass the message string in each call to GDALTermProgressR in the loop (i.e., not create the same const char * in each iteration). So we would initialize the progress bar with an initial call before the loop begins, setting dfComplete to 0 and passing the message string which names the progress bar (i.e., in the example, we would add the message string in the call on line 793). Then each subsequent call to the progress function for updating within the loop would pass NULL for the message string (where dfComplete would always be > 0). Correct? (I can go through and add message strings for those cases once this is merged.)

In that first usage mentioned above where we pass a function pointer to GDAL for callback, we could potentially do the same thing and initialize the progress bar (pfnProgress(0, pszMessage, nullptr), where GDAL would then call back to that pfnProgess we give it. I'm guessing that will not work though, since the first call back from GDAL likely has dfComplete = 0 which would reset the progress bar name to "". I'm fine with ignoring that case for now, and only naming the progress bars that we manage directly.

One other question about the code in progress_r.cpp... the global_pb gets released and reset to an R null object when dfComplete gets to 1. If the process is interrupted by an error, or the user interrupts with ctrl-c, I believe the progress bar remains displayed in the terminal at the point it stopped. Is that a problem, e.g., is it a potential memory leak? Or does the object have its own cleanup mechanism and we don't have to worry about that?

@ctoney
Copy link
Member

ctoney commented Feb 4, 2026

I made an edit above, it should be

(where dfComplete would always be > 0). Correct?

instead of > 1, sorry for the confusion

@pepijn-devries
Copy link
Contributor Author

Thanks. I had a couple of concerns with how this might be implemented but I think you've handled it already. This is mainly to check my understanding and confirm what the usage patterns should be.

GDALTermProgressR is used in two ways. In the majority of cases, a function pointer is passed in a GDAL API call. Those are cases where we wrap API functionality that supports progress reporting via callback. GDAL will not pass a message string in those cases. This form of usage does not necessarily need to be in scope for the current PR (more below).

The message argument will be ignored if it is set to nullptr. There are cases were there is a message is passed to the progress bar: gdal_run("vsi copy", c(vsi_f, target)). I'm not entirely sure were the message originates from. Since you are more familiar with the code, you probably do. If the message is a nullptr, it will just send an empty string to initiate the progress bar.

In other cases we do processing that manages the progress bar directly, updating dfComplete in the loop. GDALRaster::pixel_extract() is one example with the loop beginning at

for (R_xlen_t band_idx = 0; band_idx < num_bands; ++band_idx) {

We don't want to pass the message string in each call to GDALTermProgressR in the loop (i.e., not create the same const char * in each iteration). So we would initialize the progress bar with an initial call before the loop begins, setting dfComplete to 0 and passing the message string which names the progress bar (i.e., in the example, we would add the message string in the call on line 793). Then each subsequent call to the progress function for updating within the loop would pass NULL for the message string (where dfComplete would always be > 0). Correct? (I can go through and add message strings for those cases once this is merged.)

The message is only set once, for each progress bar (when it is R_NilValue, or when the dfComplete is still 0). In addition, you are not sending a copy of the message with each call, just a pointer to the message. But when dfComplete > 0, the message is ignored anyway, so it is fine to just pass the nullptr.

In that first usage mentioned above where we pass a function pointer to GDAL for callback, we could potentially do the same thing and initialize the progress bar (pfnProgress(0, pszMessage, nullptr), where GDAL would then call back to that pfnProgess we give it. I'm guessing that will not work though, since the first call back from GDAL likely has dfComplete = 0 which would reset the progress bar name to "". I'm fine with ignoring that case for now, and only naming the progress bars that we manage directly.

Do you have example of where GDAL uses the callback and sets the message itself? Then we can test (and possibly fix) the behaviour.

One other question about the code in progress_r.cpp... the global_pb gets released and reset to an R null object when dfComplete gets to 1. If the process is interrupted by an error, or the user interrupts with ctrl-c, I believe the progress bar remains displayed in the terminal at the point it stopped. Is that a problem, e.g., is it a potential memory leak? Or does the object have its own cleanup mechanism and we don't have to worry about that?

I don't think this setup is causing memory leaks, and I'll explain below why I think this. But it's always better to test if the expected behaviour is also the observed behaviour. I think this can be tested by initiating a progress bar twice (and letting it complete in both cases). If valgrind isn't reporting memory leaks for this test, we are all set. I'm on a Windows machine, so it is really a dread to run those tests. Do you have a CI workflow setup for this package that uses valgrind?

So this is what I expect happens: progress_pb is a global variable and initially set to R_NilValue. If GDALTermProgressR (or a wrapper) is called the first time (or with dfComplete=0) it is initiated with cli_progress_bar. I'm protecting it from the GC with R_PreserveObject. After completing a process and cli_progress_done is called, I'm setting pb_global back to R_NilValue. The c++ object created by the cli package will go out of scope and its reserved memory will eventually be released. If you terminate a process and cli_progress_done is not called, nothing really happens. The c++ object created by cli will just sit there (which is fine). That is, until a new progress bar is initiated with GDALTermProgressR. Then the object is overwritten and the previous object will go out of scope.

But again, this is what I think happens. But to be sure you can test with valgrind.

@ctoney
Copy link
Member

ctoney commented Feb 4, 2026

There are cases were there is a message is passed to the progress bar: gdal_run("vsi copy", c(vsi_f, target))

You're right, I think GDAL will pass back non-null pointer in some or all cases, but it should be empty string. I'll try to confirm whether it may be either/or.

But again, this is what I think happens. But to be sure you can test with valgrind.

We have tests already that create two or more progress bars, one after the other (like a series of tests for a given function, within one test block). Those passed valgrind checks on the first PR, and I can run them again on this one.

By default, cli does not show progress bars that are terminated within two seconds after their creation. So I don't think any of the progress bars in our tests end up being displayed. That should not matter for testing since the object is created anyway? It's also user-configurable with the cli.progress_show_after global option, but it's possibly a good thing that these do not display for the tests on CRAN.

Do you have a CI workflow setup for this package that uses valgrind?

I've been running valgrind only manually using the rhub workflow. I'll be glad to run the tests for this. There is a workflow file in the repo, in case you ever want to run in your fork. If you're in R and on the branch you want to test, then rhub::rhub_check() will display a list of platform options (may require GH authentication/PAT, and rhub::rhub_doctor() can be used first to check). From the list of platforms, 8,9,31 would run clang asan/ubsan and valgrind (currently at least, sometimes the numbers change when new ones are added). M1san is also an option, I think that is 2.

@ctoney
Copy link
Member

ctoney commented Feb 5, 2026

You're right, I think GDAL will pass back non-null pointer in some or all cases, but it should be empty string. I'll try to confirm whether it may be either/or.

It looks like GDAL will pass back either nullptr or empty string "" for the message. Empty string may be standard in the newer code for the CLI algorithms (apps/gdalalg_*.cpp). But both are present in the rest of the code base. E.g., empty string

https://github.com/OSGeo/gdal/blob/7acac4265a1d17998e89664d479443bd4e3ea960/apps/ogr2ogr_lib.cpp#L932

https://github.com/OSGeo/gdal/blob/7acac4265a1d17998e89664d479443bd4e3ea960/gcore/rawdataset.cpp#L1198

or nullptr

https://github.com/OSGeo/gdal/blob/7acac4265a1d17998e89664d479443bd4e3ea960/apps/gdalbuildvrt_lib.cpp#L1777

https://github.com/OSGeo/gdal/blob/7acac4265a1d17998e89664d479443bd4e3ea960/apps/gdaldem_lib.cpp#L707

https://github.com/OSGeo/gdal/blob/7acac4265a1d17998e89664d479443bd4e3ea960/apps/nearblack_lib.cpp#L460

sometimes both in the same file, e.g.,

https://github.com/OSGeo/gdal/blob/master/gcore/rasterio.cpp

In addition, you are not sending a copy of the message with each call, just a pointer to the message.

Right my bad, we would not incur overhead even if the message were passed inside the loop as pointer to the string. When I update the existing code to add message strings, I think it should follow a consistent pattern of initializing dfComplete = 0 and setting the message string before the loop, and then the update call inside the loop will stay as-is passing nullptr for the message. I see that existing code is inconsistent with respect to initialization before the loop, which is not ideal.

@ctoney
Copy link
Member

ctoney commented Feb 5, 2026

asan, ubsan and valgrind pass on the existing test set
https://github.com/ctoney/gdalraster/actions/runs/21697470380

@pepijn-devries
Copy link
Contributor Author

You are correct. This behaviour is problematic when a progress bar is terminated prematurely, and a next progress bar is not initialised with dfComplete = 0. If dfComplete = 0 is not guaranteed for new progress bars, this is a problem because in those cases, the message of a previously terminated process may be displayed.

The root cause is that the GDAL API uses 1 callback function for handling the progress bar, and not dedicated functions for initialising and terminating them. At least, I don't think there are.

Maybe, close this PR for now until we can come up with a better sollution.

@ctoney
Copy link
Member

ctoney commented Feb 5, 2026

Sure no problem, thanks for investigating it this far. I'll test with it some more before deciding to close if that's alright.

I named the cli progress bar when I added it for calc(), which is R code that previously used utils::txtProgressBar (#895). It's definitely in the category of nice to have but not essential. If we can use the new progress bar reliably even without naming that's a big plus IMO.

At the same time it's tempting to learn how something new works and whether there are implications for the existing code or tests. For example as I mentioned above, it revealed a couple of places where the existing progress bar is managed directly but dfComplete is not initialized to 0 before the loop. I'll fix that at minimum.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants