Add automatic Scratch.jl fallback for storage #80
asinghvi17 wants to merge 2 commits into EcoJulia:master
- Update README.md to explain automatic storage using Scratch.jl
- Make RASTERDATASOURCES_PATH optional in installation instructions
- Add comprehensive docstring to rasterpath() function
- Update examples in ALWB and MODIS documentation to be more generic
- Remove references to get_raster_storage_path() function per plan changes

This addresses task 7 from the default-scratch-storage spec.
Hey, we should merge this. It just looks like a Project.toml issue?
Closing in favour of #83
"""
function rasterpath()
    # Priority 1: Use environment variable if set and valid
    if haskey(ENV, "RASTERDATASOURCES_PATH") && isdir(ENV["RASTERDATASOURCES_PATH"])
I think we might want to special-case when RASTERDATASOURCES_PATH is set but is not a directory. As it stands, this will create a scratch directory and potentially start downloading a bunch of data. I think it should just error in that case (as before), since it means the user actively set a path; they might just have made a typo.
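A minimal sketch of the behaviour suggested above, for illustration only. The helper name `resolve_storage_path` and its `env` keyword are hypothetical and not part of the PR; the fallback is passed in as a function so the sketch runs without Scratch.jl installed.

```julia
# Hypothetical helper, not the PR's implementation.
# `fallback` is any zero-argument function returning a directory path,
# e.g. `() -> @get_scratch!("raster_data")` inside the package (assumption).
function resolve_storage_path(fallback::Function;
                              env = ENV,
                              key::AbstractString = "RASTERDATASOURCES_PATH")
    if haskey(env, key)
        path = env[key]
        # The user set a path explicitly: a non-directory is most likely a typo,
        # so fail loudly instead of silently falling back to scratch space.
        isdir(path) || error("$key is set to \"$path\", which is not an existing directory. " *
                             "Fix the path or unset the variable.")
        return path
    end
    # Variable not set at all: only now use the fallback location.
    return fallback()
end
```

With this shape, the fallback is reached only when the variable is absent, never when it is set to a bad value.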
    scratch_dir = @get_scratch!("raster_data")
    @debug "Using scratch directory for raster data storage: $scratch_dir"
Does @get_scratch! make a new directory by default or just return a path? Can we get this to print an @info if scratch_dir doesn't exist yet, so first-time users are aware of this hidden directory where potentially many GB will end up?
Yeah, this is a concern and why I never replaced the path, like at least you know what you are doing when you set it manually.
It depends on package and julia version.
In general we can have it print if the directory is empty though, which is a good indication that nothing has been downloaded yet.
But I really don't think the files RDS downloads are large enough that most users these days will care. Laptops have lots of storage!
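The "print if the directory is empty" idea could look roughly like the sketch below. The function name `announce_if_empty` and the message wording are assumptions made for illustration; nothing here is taken from the PR.

```julia
# Hypothetical helper: emit a one-time style notice when the storage
# directory exists but holds nothing yet, which suggests a first download.
function announce_if_empty(dir::AbstractString)
    fresh = isdir(dir) && isempty(readdir(dir))
    if fresh
        @info "Raster data will be stored in $dir. " *
              "Set RASTERDATASOURCES_PATH to choose another location."
    end
    return fresh  # true when the notice was emitted
end
```

Checking for an empty directory rather than a missing one sidesteps the question of whether the scratch directory has already been created by the time the check runs.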
Ah so on a new Julia version the scratch space would change and it would all start downloading again?? That's not great
> But I really don't think the files RDS downloads are large enough that most users these days will care. Laptops have lots of storage!
I disagree. Just checked and my CHELSA folder alone has 120 GB. That's a lot. And the thing is it keeps accumulating
Huh wow, yeah that's substantial. I somehow never downloaded more than 3GB to my folder
You can still set your path manually, we can warn that you haven't, but IMO not having a default is just user-unfriendly and has caused confusion in the past
There are multiple 100GB+ datasets here, and some of the weather data must be many terabytes.
It's very much all in the kill-your-home-directory range, so we have to be a bit careful.
That's the reason there is no default; silently borking someone's system is also very user-unfriendly.
Add Scratch.jl fallback if RASTERDATASOURCES_PATH is not present. This removes the most common user footgun.
AI generated.