H5 and mat files visualisable, read path from workflow.yaml #990

Merged
milesAraya merged 2 commits into develop-feature from feature/hdf5-mat-visualise
Mar 17, 2026
@milesAraya
Collaborator

Content

Summary

  • HDF5 and MAT input files can now be visualised directly on the Visualise page by selecting them from the data dropdown, without needing algorithm node outputs.
  • A new backend endpoint (GET /outputs/structured/{workspace_id}/{unique_id}/{node_id}) reads the dataset path (hdf5Path/matPath) from the saved workflow.yaml and returns the data with automatic type detection (1D -> bar, 2D -> timeseries, 3D -> images).
  • 3D image data uses pagination (start_index/end_index) matching the existing TIFF pattern, loading 10 frames at a time to avoid multi-megabyte JSON responses.
  • A new StructuredFilePlot frontend component renders the data using Plotly (line chart for timeseries, heatmap with slider for images, bar chart for 1D).
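The automatic type detection described above can be pictured with a minimal sketch. It assumes the plot type depends only on the number of dimensions and uses plain nested lists in place of the arrays the backend reads; `nested_ndim` and `detect_data_type` are hypothetical names, not functions from this PR.

```python
from typing import Any

# Mapping taken from the summary: 1D -> bar, 2D -> timeseries, 3D -> images.
DATA_TYPE_BY_NDIM = {1: "bar", 2: "timeseries", 3: "images"}


def nested_ndim(data: Any) -> int:
    """Count the nesting depth of a list of lists (a stand-in for an array's ndim)."""
    depth = 0
    while isinstance(data, list):
        depth += 1
        if not data:
            break
        data = data[0]
    return depth


def detect_data_type(data: Any) -> str:
    """Return the plot type for the data, or raise for unsupported shapes."""
    ndim = nested_ndim(data)
    try:
        return DATA_TYPE_BY_NDIM[ndim]
    except KeyError:
        raise ValueError(f"unsupported dimensionality: {ndim}") from None
```

How the real endpoint handles scalars or arrays with more than three dimensions is not stated here; the sketch simply rejects them.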

Design Decisions

  • Reads from workflow.yaml instead of re-uploading data -- The workflow config already stores the file path and dataset path (hdf5Path/matPath) for each input node. The endpoint resolves the data at request time from these saved references, avoiding duplication.
  • Pagination for 3D data only -- 2D and 1D data are small enough to return in full. 3D image stacks (e.g. 500x128x128) would produce ~50MB JSON responses, so the same start_index/end_index pattern used by the TIFF image endpoint is applied. Default window is 10 frames.
  • ndim determined before slicing -- The original array dimensionality is captured before any pagination slicing so the response data_type is always correct. total_frames is returned for 3D data so the frontend knows the full dataset size.
  • Keyed by itemId not filePath -- Unlike other display data types that use file paths as Redux keys, structured data uses the visualise item ID. This avoids collisions when the same input file appears in multiple visualise items.
  • StructuredFilePlot follows ImagePlot patterns -- Uses LinearProgress for loading, useEffect for data fetching with the same guard pattern (workspaceId && uid && nodeId), and PlotlyChart for rendering.
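The "ndim determined before slicing" decision can be illustrated with a rough sketch. It assumes zero-based, half-open `start_index`/`end_index` semantics (the PR does not spell these out) and uses nested lists instead of real arrays; `build_structured_response` and `ndim_of` are hypothetical names.

```python
from typing import Any, Dict

DEFAULT_START_INDEX = 0
DEFAULT_END_INDEX = 10
DATA_TYPE_BY_NDIM = {1: "bar", 2: "timeseries", 3: "images"}


def ndim_of(data: Any) -> int:
    """Nesting depth of a list of lists (stand-in for an array's ndim)."""
    depth = 0
    while isinstance(data, list):
        depth += 1
        if not data:
            break
        data = data[0]
    return depth


def build_structured_response(
    data: Any,
    start_index: int = DEFAULT_START_INDEX,
    end_index: int = DEFAULT_END_INDEX,
) -> Dict[str, Any]:
    # Capture dimensionality BEFORE slicing so data_type reflects the full dataset.
    ndim = ndim_of(data)
    response: Dict[str, Any] = {"data_type": DATA_TYPE_BY_NDIM[ndim]}
    if ndim == 3:
        # Only 3D stacks are paginated; report the full size for the frontend.
        response["total_frames"] = len(data)
        data = data[start_index:end_index]  # assumed half-open window
    response["data"] = data
    return response
```

In this sketch a window past the end of the stack yields an empty list, which would look one-dimensional if the type were inferred after slicing; determining it first keeps `data_type` as `images`.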

Evidence

  • Backend endpoint tested manually against real tutorial4 data (HDF5: 500x128x128 images, MAT: 500x8 timeseries)
  • Paginated HDF5 response: ~1MB (10 frames) vs ~50MB (all 500 frames)
  • MAT timeseries response: ~32KB
  • All 11 backend tests and 17 frontend test suites pass
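The response sizes quoted above can be sanity-checked with back-of-the-envelope arithmetic. The per-value byte counts below are assumptions, not measurements: roughly 6 characters per uint16 value (up to five digits plus a separator) and roughly 8 per MAT value, the latter back-derived from the ~32KB figure.

```python
from math import prod
from typing import Sequence


def estimate_json_bytes(shape: Sequence[int], bytes_per_value: int) -> int:
    """Approximate JSON payload size as element count times average text width."""
    return prod(shape) * bytes_per_value


UINT16_TEXT_BYTES = 6  # assumption: up to 5 digits plus one separator
MAT_TEXT_BYTES = 8     # assumption: back-derived from ~32KB / 4000 values

full_stack = estimate_json_bytes((500, 128, 128), UINT16_TEXT_BYTES)  # 49_152_000, ~50MB
one_page = estimate_json_bytes((10, 128, 128), UINT16_TEXT_BYTES)     # 983_040, ~1MB
mat_series = estimate_json_bytes((500, 8), MAT_TEXT_BYTES)            # 32_000, ~32KB
```

Under these assumptions the 10-frame window is about one fiftieth of the full response, which matches the ~1MB versus ~50MB figures.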

References

  • Tutorial 4 workflow (suite2p + CCA with HDF5 image input and MAT behavior input)
  • Existing TIFF pagination: GET /outputs/image/{filepath}?start_index=1&end_index=10
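For comparison with the TIFF URL above, a request URL for the new endpoint might be assembled as follows. This is only a sketch: it assumes `start_index` and `end_index` are passed as query parameters in the same way as for the TIFF endpoint, and `structured_url` is a hypothetical helper rather than code from this PR.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def structured_url(
    workspace_id: object,
    unique_id: str,
    node_id: str,
    start_index: Optional[int] = None,
    end_index: Optional[int] = None,
) -> str:
    """Build the path for GET /outputs/structured/{workspace_id}/{unique_id}/{node_id}."""
    path = "/outputs/structured/{}/{}/{}".format(
        quote(str(workspace_id), safe=""),
        quote(unique_id, safe=""),
        quote(node_id, safe=""),
    )
    # Only include pagination parameters that were explicitly supplied.
    params = {
        key: value
        for key, value in (("start_index", start_index), ("end_index", end_index))
        if value is not None
    }
    return f"{path}?{urlencode(params)}" if params else path
```

Omitting both indices leaves the choice of window to the backend defaults.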

Files changed

Backend
  • studio/app/common/routers/outputs.py -- New GET /outputs/structured/ endpoint with pagination support for 3D data; reads hdf5Path/matPath from workflow config
Frontend
  • frontend/src/api/outputs/Outputs.ts -- New StructuredData type and getStructuredDataApi function with startIndex/endIndex params
  • frontend/src/store/slice/DisplayData/DisplayDataActions.ts -- New getStructuredData async thunk
  • frontend/src/store/slice/DisplayData/DisplayDataSlice.ts -- Reducers for structured data (pending/fulfilled/rejected) and cleanup handling for HDF5/MATLAB types
  • frontend/src/store/slice/DisplayData/DisplayDataType.ts -- New StructuredDisplayData interface and structured state field
  • frontend/src/components/Workspace/Visualize/Plot/StructuredFilePlot.tsx -- New component: fetches structured data and renders as timeseries (line), images (heatmap + slider), or bar chart
  • frontend/src/components/Workspace/Visualize/DisplayDataItem.tsx -- Route HDF5/MATLAB data types to StructuredFilePlot
  • frontend/src/components/Workspace/Visualize/VisualizeItemEditor.tsx -- Add HDF5/MATLAB to SaveFig editor cases
Tests
  • studio/tests/app/common/routers/test_structured_outputs.py -- New: 11 tests covering HDF5 (1D/2D/3D), MAT files, pagination, default params, and error cases (missing workflow/node/file, bad dataset path, no hdf5Path/matPath)
  • frontend/src/components/Workspace/__tests__/Experiment/ExperimentTable.test.tsx -- Added structured: {} to mock DisplayData state

Manual Testcases

  • Run tutorial4 workflow with HDF5 and MAT input files
  • Navigate to the Visualise page
  • Click "+" to add a new visualise item
  • From the "Select Item" dropdown, select sample_hdf5.h5 -- verify a heatmap image renders with a frame slider (10 frames loaded)
  • Add another item, select sample_matlab.mat -- verify a timeseries line chart renders (500 rows x 8 columns)
  • Verify algorithm outputs (suite2p images, CCA scatter/bar) still render correctly from the dropdown
  • make test_backend -- all backend tests pass (including 11 structured output tests)
  • make test_frontend -- all 17 frontend test suites pass (79 tests)
Screenshot 2026-03-16 at 15 29 24 Screenshot 2026-03-16 at 15 29 30

Unit, Integration, Contract Test Coverage

| Area | Coverage |
| --- | --- |
| HDF5 1D/2D/3D data types | Unit tests via TestClient |
| MAT file via matPath | Unit test via TestClient |
| 3D pagination (custom range) | Unit test |
| 3D pagination (default params) | Unit test |
| Error: missing workflow | Unit test (404) |
| Error: missing node | Unit test (404) |
| Error: missing file | Unit test (404) |
| Error: bad dataset path | Unit test (404) |
| Error: no hdf5Path/matPath | Unit test (400) |
| Frontend Redux state shape | ExperimentTable test (structured field) |
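The status codes in the coverage list can be read as a lookup sequence. The sketch below covers only the workflow, node, and missing-path cases; the file and dataset checks need real files and are left out. The layout of the parsed workflow.yaml (`nodeDict`, `data`, `path`) is an assumption made for illustration, and `StructuredLookupError` and `resolve_dataset_reference` are hypothetical names.

```python
from typing import Any, Optional, Tuple


class StructuredLookupError(Exception):
    """Carries the HTTP status the endpoint would be expected to return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def resolve_dataset_reference(workflow: Optional[dict], node_id: str) -> Tuple[Any, str]:
    """Return (file path, dataset path) for an input node of a parsed workflow.yaml."""
    if workflow is None:
        raise StructuredLookupError(404, "workflow not found")
    # Assumed layout: {"nodeDict": {node_id: {"data": {...}}}}
    node = workflow.get("nodeDict", {}).get(node_id)
    if node is None:
        raise StructuredLookupError(404, f"node {node_id} not found")
    node_data = node.get("data", {})
    dataset_path = node_data.get("hdf5Path") or node_data.get("matPath")
    if not dataset_path:
        # The node exists but carries no structured-data reference.
        raise StructuredLookupError(400, "node has no hdf5Path/matPath")
    return node_data.get("path"), dataset_path
```

Separating "not found" (404) from "node is not a structured input" (400) mirrors the distinction the tests draw.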

Others

Difficulties

  • The frontend dev server (.env.development) was configured for ports 3001/8001, but docker-compose.dev.multiuser.yml mapped port 3000, so the dev server was unreachable. As a result, code changes appeared not to take effect: the app was actually being accessed through the backend's production build (port 8000).
  • The HDF5 image dataset (500x128x128 uint16) serialised to ~50MB of JSON, which would hang the browser. Resolved by adding pagination matching the existing TIFF pattern.

Risk Assessment

| Area | Risk | Notes |
| --- | --- | --- |
| Structured data endpoint | Low | New endpoint, no changes to existing endpoints; isolated code path |
| 3D pagination | Low | Follows established TIFF pattern; default 10 frames matches existing behaviour |
| Redux structured state | Low | New state field with no interaction with existing display data types |
| StructuredFilePlot component | Low | New component, only rendered for HDF5/MATLAB types; no impact on existing plots |
| MAT file reading via MatGetter | Low | Reuses existing MatGetter.data() which is already used by the /mat/ endpoint |

@milesAraya milesAraya self-assigned this Mar 16, 2026
<PlotlyChart data={plotData} layout={layout} />
{data.length > 1 && (
<Box px={2}>
<Slider
Collaborator

The Play slider's style is slightly different from existing image plots, etc. Is that okay?

*While it may not be extremely important, I think keeping the UI somewhat similar will make it easier for users to understand.

Collaborator Author

Replaced the simple <Slider> with the full PlayBack pattern from ImagePlot: Play/Pause buttons, msec/frame TextField, and the same <Slider> with marks, valueLabelDisplay="auto", and step={1}.

Collaborator

The content seems fine.

I made some minor code adjustments here.

  • Componentizing the player control
  • Style adjustments (similar to ImagePlot)
image

const DEFAULT_START_INDEX = 0
const DEFAULT_END_INDEX = 10

export const StructuredFilePlot = memo(function StructuredFilePlot() {
Collaborator

The plotting of various images is good.

Also, it might be useful to display the internal paths of the structured data along with the plot.
*Please skip this suggestion if it's not particularly useful.

Collaborator Author

Backend now returns dataset_path (e.g. data/image, data/behavior) in every response. The component shows it as a caption above the plot.

Collaborator

The content seems fine.
I only made a slight adjustment to the label's display style.

image

@milesAraya milesAraya merged commit 45cb0e2 into develop-feature Mar 17, 2026
5 checks passed
@milesAraya milesAraya deleted the feature/hdf5-mat-visualise branch March 17, 2026 04:56