Evolving the dataset formats#26

Merged
jungerm2 merged 41 commits into main from datasets
Feb 23, 2026

Conversation

@jungerm2 (Member) commented Feb 5, 2026

Dataset format changes:
Metadata from the simulation is no longer saved or returned as a transforms.json file. Instead, metadata is written directly to a SQLite database when ground truths are rendered. This new format, composed of 3 tables defined in code in simulate/schema.py, allows many workers to write metadata to shared databases without contention. These databases are defined per data type: one for depths, another for normals, etc. This disentangles the datasets, with each having its own independent metadata.

Previously, ground-truth intensity frames and composited frames were conflated: if a Blender scene had an existing compositor tree set up, for example to add glare or change the image contrast, its output was saved as the ground-truth frames instead of the unadulterated render. Composited output now corresponds to a new "composites" output type, which comes with its own include_composites() method.
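Since the schema itself isn't shown here, the following is only a minimal sketch of how contention-free multi-worker writes to a shared SQLite database can work (WAL journaling plus short transactions); the table name and columns below are placeholders, not the real tables from simulate/schema.py:

```python
import os
import sqlite3
import tempfile

# Placeholder database; the real one is the per-data-type transforms.db.
db_path = os.path.join(tempfile.mkdtemp(), "transforms.db")
con = sqlite3.connect(db_path, timeout=30)

# WAL mode lets readers and a writer proceed concurrently, so many render
# workers can append metadata rows with minimal lock contention.
con.execute("PRAGMA journal_mode=WAL")
con.execute(
    "CREATE TABLE IF NOT EXISTS frames "
    "(id INTEGER PRIMARY KEY, path TEXT, pose TEXT)"
)

# `with con:` wraps the insert in a transaction, keeping each write atomic.
with con:
    con.execute(
        "INSERT INTO frames (path, pose) VALUES (?, ?)",
        ("0000/000.png", "[...]"),
    )

rows = con.execute("SELECT path FROM frames").fetchall()
print(rows)  # [('0000/000.png',)]
con.close()
```

The `timeout` argument makes a worker that hits a momentary write lock retry for up to 30 seconds instead of failing immediately.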

The directory structure has also changed, all data is now saved in numbered subfolders, with previews (previously dubbed "debug" views) saved in a separate folder (only shown for composites here for brevity):

└── <SCENE-NAME>
    ├── composites
    │   ├── 0000
    │   │   ├── 000.png
    │   │   ├── 001.png
    │   │   └── ...
    │   ├── 0001/
    │   ├── ...
    │   └── transforms.db
    ├── depths/
    ├── flows/
    ├── frames/
    ├── normals/
    ├── segmentations/
    └── previews
        ├── depths/
        ├── flows/
        │   └── forward/
        ├── normals/
        └── segmentations/

A Metadata class and its related classes (Camera, Data, Frame) have been added in dataset/models.py as the main way to interact with the metadata files, both the .db and .json variants. These classes are Pydantic models that mirror the data schema used by the simulation code, provide data validation, extend and supersede the previous Nerfstudio-esque JSON schema, and offer utilities to convert between the formats. Users should use these classes instead of interacting with the schema variants directly.
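As a rough illustration of the Pydantic approach (the field names here are hypothetical; the real definitions live in dataset/models.py), validation happens at construction time and the models round-trip cleanly through JSON:

```python
from pydantic import BaseModel

# Hypothetical fields for illustration only; see dataset/models.py for
# the actual Camera/Data/Frame/Metadata definitions.
class Camera(BaseModel):
    model: str = "PINHOLE"
    fx: float
    fy: float
    cx: float
    cy: float

class Frame(BaseModel):
    path: str
    transform: list[list[float]]  # 4x4 camera-to-world matrix

cam = Camera(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
frame = Frame(
    path="0000/000.png",
    transform=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
)

# Pydantic v2 serializes to JSON and validates back, which is the kind of
# round-trip that converting between .db and .json variants relies on.
payload = frame.model_dump_json()
restored = Frame.model_validate_json(payload)
assert restored == frame
```

A malformed record (say, a string where fx is expected) raises a ValidationError instead of silently propagating bad metadata.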

The dataset loaders have been rewritten to accommodate these changes too, with NpyDataset and ImgDataset replaced by a combined Dataset class that can load either a plain set of images/EXRs or a properly formed .db or .json dataset, supporting image, EXR, and NumPy formats (bitpacked too!).
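A minimal sketch of the kind of format detection such a combined loader needs; this detect_format helper is hypothetical and not part of the actual Dataset API:

```python
from pathlib import Path

def detect_format(root: Path) -> str:
    """Guess which dataset variant lives under `root` (sketch only)."""
    # Prefer the structured metadata variants when present.
    if (root / "transforms.db").exists():
        return "db"
    if (root / "transforms.json").exists():
        return "json"
    # Otherwise fall back to raw arrays or image files.
    if any(root.glob("**/*.npy")):
        return "npy"
    if any(root.glob("**/*.png")) or any(root.glob("**/*.exr")):
        return "images"
    raise ValueError(f"Unrecognized dataset layout: {root}")
```

The real Dataset class presumably also inspects the metadata to decide how to decode each file (including unpacking bitpacked arrays), but the dispatch idea is the same.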

CLI Changes:

  • Remove the dataset CLI imgs-to-npy and npy-to-imgs commands
  • Add CLI for dataset.convert to go from a .db to a .json dataset
  • Emulate CLI:
    • Force emulate.spad to save bitpacked npy's
    • All emulate tasks now work on a dataset or directory of frames, using the --pattern switch
    • All emulate tasks now respect the --force flag
  • Rename interpolate.frames to interpolate.dataset, make it work with frames and datasets, using the --pattern switch
  • Transforms CLI:
    • Remove complicated dataloading schemes with in-collate processing in favor of simpler code (potentially slower)
    • Change all default patterns to account for nested data (e.g.: "flow_*.exr" -> "**/*.exr")
    • Rename transforms.tonemap-exrs to transforms.tonemap-frames to make it clear this is for image data, not depths (also works with composites)
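For context on the "bitpacked npy's" item above: binary SPAD frames hold one bit per pixel, so NumPy's packbits stores 8 pixels per byte, an 8x size reduction. A sketch of the round trip:

```python
import numpy as np

# Synthetic binary SPAD-like frames: 4 frames of 16x16 pixels, values 0/1.
frames = (np.random.default_rng(0).random((4, 16, 16)) > 0.5).astype(np.uint8)

# Pack along the last axis: 16 one-bit pixels become 2 bytes per row.
packed = np.packbits(frames, axis=-1)
print(packed.shape)  # (4, 16, 2)

# `count` trims any padding bits so the round trip is exact.
unpacked = np.unpackbits(packed, axis=-1, count=frames.shape[-1])
assert np.array_equal(unpacked, frames)
```

How the packing axis and any offset are recorded in the metadata is up to the dataset schema; this only shows the NumPy mechanics.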

MISC:

  • Use ElapsedProgress instead of Progress where possible.
  • Use Path as inputs for CLI tasks directly, offloading conversions to tyro
  • Remove opencv (only used by rife for image loading, replaced with imageio)
  • Remove torchvision (unused)
  • Remove jsonschema (replaced by pydantic)
  • Clean up typing/typing_extensions/collections.abc imports
  • Move interpolate_poses into pose.py
  • Clean up unused rife code and enable it to work with a list of frames or with input-dir and pattern, which allows saving interpolated frames to subfolders
  • Add py.typed to enable end-users to use vsim types

TODO:

  • Add docstrings to all new methods, ensure they render properly in Sphinx docs
  • Refine schema/model interplay:
    • We can/should add extra info in the schema (fps, key-frame multiplier, arclen?, etc), especially since dataset.info and blender.sequence_info have been removed.
  • Add dataset.merge CLI that can merge different (but similar!) transforms files, renaming the path parameters as needed.
    • Note: This will break if there are any per-frame attributes such as bitpack_dim or offset. These only relate to NPY dataformats, so perhaps this is an OK limitation for now.
  • Add additional tests as needed
  • Validate that all new code passes typing checks, add types where needed, and add type checking to CI.
  • Update Data Format and Loading docs, merge with data conventions PR

Will punt on the following:

  • Emulate SPAD needs to handle Alpha + binomial frames (see this PR)
  • Only the PINHOLE camera is supported at the moment. What info should we save for other camera types? How should they convert to the nerfstudio-esque JSON format? The distortion parameters in the schema are currently unused. Further, Blender's fisheye model is different from OpenCV's.

Questions:

  • How future-proof is this new data format? Can it be applied to all sensor modalities we wish to emulate?
  • What additional attributes should we extract/capture in the .db vs the .json formats?
  • Is the naming confusing around schema.Metadata vs models.Metadata (same for the other classes)? They mirror one another, but I don't want users accidentally importing the wrong one. These have been marked as private.

📚 Documentation preview 📚: https://visionsim--26.org.readthedocs.build/en/26/

@jungerm2 (Member, Author) commented Feb 9, 2026

@shantanu-gupta I'm still working my way through the TODOs above, but since we chatted about these features, let me know if you have any high-level feedback on the implementation.

@shantanu-gupta (Contributor) commented:

@jungerm2 Thanks for looping me in. I'll take a look.

Just to start:

> Force emulate.spad to save bitpacked npy's

This should first confirm that the SPAD data is being generated with 1-bit precision; I take care of this in the other PR. There is sometimes a practical case for not generating binary SPAD data, instead using 3-bit or 5-bit data at a correspondingly lower frame rate.

@jungerm2 (Member, Author) commented Feb 10, 2026 via email

@jungerm2 (Member, Author) commented:

I think this PR's just about done. The plan is to merge it with main, fast-forward the sensor sim branch and merge it there, then update the quickstart docs as needed. Thoughts @shantanu-gupta ?

@jungerm2 (Member, Author) commented:

I think I've addressed most of the issues here, so I'll go ahead and merge this and go work on the other branch. Going forward, it might be a good idea to add some database migrations when we modify the schema. I added the structure to support this with example code in the comments.

jungerm2 merged commit bd50b1c into main Feb 23, 2026
13 checks passed
jungerm2 deleted the datasets branch February 23, 2026 16:53