cue/load: add support for loading from io/fs.FS#4222
cue/load: add support for loading from io/fs.FS#4222pskry wants to merge 1 commit intocue-lang:masterfrom
Conversation
|
Wow, I am sorry to see this failing on windows. I do not have access to a windows machine, so can't test it locally. I hope someone will review this PR anyways, maybe even give some (much needed) windows insight into pathing there. |
0ddff42 to
ef1dbb0
Compare
|
The Windows CI issues turned out to be path canonicalization problems in the overlay/virtual FS boundary and are now resolved. All tests are green across platforms. This PR should now be ready for review. The change is intended to enable a fairly common use case (loading CUE modules from embedded or virtual filesystems) while keeping existing behavior unchanged when Config.FS is nil. I hope this turns out to be useful for others embedding or sandboxing CUE. Feedback very welcome — especially around the overall approach and any edge cases I may have missed. |
|
Hi! Just wanted to gently bump this in case it got lost in the queue. Thanks for taking a look 🙂 |
|
Hi @pskry, thanks for your patch. We're currently quite busy preparing for upcoming conferences, especially https://cfgmgmtcamp.org/ghent2026/, so it's not likely we will get to this until February. My only recollection is that when @rogpeppe attempted this a couple of years back, he was really struggling with file positions on Windows; Did you encounter a similar issue, and if so, how did you approach it? |
|
I'm not sure what you are referring to. I'm using CUE exclusively as embedded library, not as an external CLI tool. If I would highly appreciate if someone with access to a Windows machine to do a bit of testing/snooping around, since I sadly cannot. |
Allow the loader to operate on a virtual filesystem supplied via Config.FS. When FS is set, all filesystem access is routed through io/fs instead of the host OS filesystem, enabling loading from embed.FS and similar sources. The default behavior is unchanged when FS is nil. Signed-off-by: pskry <peter@skrypalle.dk>
a765c7a to
5de5418
Compare
rogpeppe
left a comment
There was a problem hiding this comment.
Thanks very much for having a bash at this!
I've done a first pass through this, and my main issue is that I'm not sure it's "right" to expose the @fs convention to the user, although I'm not sure what a better solution might look like. Clearly fs.FS doesn't have a notion of absolute paths, but I'm wondering if we should just go with the much-more-conventional / prefix to indicate an absolute path within the fs.FS. That said, then the namespace within the FS overlaps with the namespace in the cache files, so perhaps we do need some unconventional representation for files within the FS. If we do, then it should definitely be a well documented thing, because users and Go callers will see those filenames.
I'm still thinking :)
| return strings.TrimPrefix(p, `/`) | ||
| } | ||
|
|
||
| for strings.Contains(path, loadFSRoot) { |
There was a problem hiding this comment.
I don't understand this logic - shouldn't the @fs prefix only appear at the very start of the path, and always be succeeded by a separator (or end of string) ?
As is, this can go into an infinite loop (try running stripFSRoot on the string foo@fs, for example)
I'm not entirely convinced we should be translating backslashes either.
Couldn't this function just be something like:
func stripFSRoot(path string) string {
prefix, rest, _ := strings.Cut(path, "/")
if prefix == loadFSRoot {
if rest == "" {
return "."
}
return rest
}
return path
}
?
| } | ||
|
|
||
| strip := stripFSRoot(name) | ||
| if strip == "" { |
There was a problem hiding this comment.
Why is this logic different from the "no prefix -> use dot" logic of the rest of the methods?
| } | ||
|
|
||
| func (l loadFS) Stat(name string) (iofs.FileInfo, error) { | ||
| if l.fs == nil { |
There was a problem hiding this comment.
I'm wondering whether, rather than having an if statement in every method, it might be nicer to define loadFS as an interface and have two implementations, one for FS and one for OS. Then we'd have something like:
func (l fsLoadFS) Stat(name string) (iofs.FileInfo, error) {
return iofs.Stat(l.fs, stripFSRoot(name))
}
func (l osLoadFS) Stat(name string) (iofs.FileInfo, error) {
return os.Stat(name)
}
|
|
||
| // Trap host FS access | ||
| old := source.OsOpen | ||
| source.OsOpen = func(name string) (fs.File, error) { |
There was a problem hiding this comment.
It seems like this is the sole reason why we're passing around an open function and defining source.OsOpen as a variable. In general I'd prefer to avoid that kind of pollution of test concerns into the main logic, and the confidence this gives isn't hugely great anyway (the code could potentially just be invoking os.Open directly).
Rather than fake-out OsOpen, how about just using a name that we have good confidence won't be in the local OS filesystem?
|
|
||
| func (fsys assertNoSyntheticFS) Open(name string) (fs.File, error) { | ||
| if strings.HasPrefix(name, loadFSRoot) { | ||
| fsys.t.Fatalf("fs.FS.Open called with synthetic path %q", name) |
There was a problem hiding this comment.
Rather than invoking Fatal in the middle of arbitrary cue/load logic (which might even be in a goroutine in future), I'd suggest calling Errorf instead.
| // and that that all of its parent directories exist too. | ||
| // | ||
| // When FS is set, overlay paths must be absolute loadFS paths | ||
| // (that is, rooted at loadFsRoot). |
There was a problem hiding this comment.
loadFsRoot is a private constant, so this wouldn't make much sense to the reader of the API docs. Also, in general, I'm not sure that it's right that the @fs prefix is exposed to the user. For example, if there are errors evaluating the CUE, the @fs/ prefix will be exposed in the filenames in the errors, so if some tooling wants to do something with those files, it will have to first strip the prefix before it can use them in the filesystem.
Also, I wonder if Overlay is actually useful/required if we've got FS. Perhaps we should forbid supplying both at once.
| return "", fmt.Errorf("invalid io/fs path %q", name) | ||
| } | ||
| return filepath.Join(fs.root, name), nil | ||
|
|
There was a problem hiding this comment.
I suspect this logic isn't needed. This function (absPathFromFSPath) is used by ioFS to translate from fs path to the OS path used by the underlying fileSystem type, but ISTM that ioFS itself is mostly an unnecessary layer when the underlying files are provided by an fs.FS.
But... maybe it's not that easy! The various layers of caching and ReadCUEFile etc cause my brain to melt when I get to this point :)
This PR adds support for loading CUE packages and modules from a virtual
filesystem provided via
load.Config.FS.This enables embedding CUE modules in Go binaries and similar use cases where
access to the host filesystem is not desired or not possible.
When
Config.FSis set, all filesystem operations performed by the loader(stat, read, open, directory walking, and encoding reads) are routed through
io/fs.FSinstead of the host operating system filesystem. This enablesloading CUE code from sources such as
embed.FS,fstest.MapFS, or othervirtual filesystems, without accessing the host filesystem.
The existing behavior is preserved when
Config.FSis nil.Design notes
loadFS) centralizes filesystem access andnormalizes host and virtual filesystem behavior.
loader paths rooted at a synthetic root (
/@fs), avoiding accidental hostfilesystem access.
FSis set.the loader filesystem.
Tests
New tests cover:
fstest.MapFSFSis setConfig.Diros.OpenNo public APIs are removed, and existing behavior is unchanged unless
Config.FSis explicitly set.This change is intended to support embedding CUE modules in Go binaries and similar use cases.