@gadorlhiac gadorlhiac commented Sep 30, 2025

Description

This PR creates SFX workflows using Cheetah. It also allows Cheetah to be configured to run compression.

The latter feature requires first merging omdevteam/om#17 in OM.

Checklist

  • Update Cheetah templates to support compression.
  • Add SFX workflows with Cheetah, including a variant that forces conversion to XTC2 to support compression, since libpressio requires the psana2 environment for some operation modes.
  • Update parameter models as needed.
  • Switch the default database specification to v2.
  • Prepare smalldata_tools integrations for compression and fix various bugs.
  • Update the setup script to account for the change to Maestro.

PR Type:

  • New feature/Enhancement

Address issues:

Testing

XPP Smalldata compression (workflow via Maestro)

Running it:

> launch_slurm -c config/xpp_compression.yaml -W workflows/common/xpp_compression.dag -e xppx1003621 -r 197 --account=lcls:data --partition=milano --ntasks=5 --nodes=1
[2025-11-24 10:18:09.242] [LWM:Manager] [info] Running workflows with SlurmLauncher.
[2025-11-24 10:18:09.242] [HTTP:Server] [info] Starting server on 0.0.0.0:41239 with 5 threads, using 64 shards, with a backlog size of 1000 and 10000 maximum events.
[2025-11-24 10:18:09.243] [LWM:Manager] [info] Beginning workflow.
[2025-11-24 10:18:09.243] [LWM:SlurmLauncher] [info] Will launch Xtc1to2Converter with: /sdf/scratch/users/d/dorlhiac/work/lute_smdconv/install/bin/submit_slurm.sh --taskname Xtc1to2Converter --config config/xpp_compression.yaml --account=lcls:data --partition=milano --ntasks=5 --nodes=1
# ...
INFO:lute.execution.executor:TaskStatus.COMPLETED
ERROR:lute.io.elog:eLog Update Failed! JID_UPDATE_COUNTERS is not defined!
INFO:lute.execution.executor:Exiting after Task completion.
TASK_LOG -- INFO:lute.tasks.xtc: Conversion completed.

Time Xtc1to2Converter spent: 
- Pending: 14 s
- Running: 14 s
# ...

Look at HDF5 files

> h5ls /sdf/data/lcls/ds/xpp/xppx1003621/hdf5/with_compression/xppx1003621_Run0197.h5/jungfrau1M_alcove
ROI_111peak_area         Dataset {10, 193, 256}
ROI_111peak_com          Dataset {10, 2}
ROI_111peak_max          Dataset {10}
ROI_111peak_mean         Dataset {10}
ROI_111peak_sum          Dataset {10}
ROI_211peak_area         Dataset {10, 193, 256}
ROI_211peak_com          Dataset {10, 2}
ROI_211peak_max          Dataset {10}
ROI_211peak_mean         Dataset {10}
ROI_211peak_sum          Dataset {10}
ROI_224peak_area         Dataset {10, 175, 256}
ROI_224peak_com          Dataset {10, 2}
ROI_224peak_max          Dataset {10}
ROI_224peak_mean         Dataset {10}
ROI_224peak_sum          Dataset {10}
ROI_232peak_area         Dataset {10, 295, 459}
ROI_232peak_com          Dataset {10, 2}
ROI_232peak_max          Dataset {10}
ROI_232peak_mean         Dataset {10}
ROI_232peak_sum          Dataset {10}
ROI_air_scatter_bottom_area Dataset {10, 463, 350}
ROI_air_scatter_bottom_com Dataset {10, 2}
ROI_air_scatter_bottom_max Dataset {10}
ROI_air_scatter_bottom_mean Dataset {10}
ROI_air_scatter_bottom_sum Dataset {10}
ROI_air_scatter_larger_area Dataset {10, 177, 567}
ROI_air_scatter_larger_com Dataset {10, 2}
ROI_air_scatter_larger_max Dataset {10}
ROI_air_scatter_larger_mean Dataset {10}
ROI_air_scatter_larger_sum Dataset {10}
ROI_large_scatter_area   Dataset {10, 220, 232}
ROI_large_scatter_com    Dataset {10, 2}
ROI_large_scatter_max    Dataset {10}
ROI_large_scatter_mean   Dataset {10}
ROI_large_scatter_sum    Dataset {10}
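The `h5ls` listing above can also be checked programmatically with h5py, including whether an HDF5 filter is attached to a dataset. The snippet below is a self-contained sketch: it writes a small file mimicking one of the datasets above and reads back its shape and filter. gzip stands in for the compressor here, since the PR's sz3 compression goes through libpressio rather than a built-in h5py filter.

```python
import h5py
import numpy as np

# Write a toy file shaped like the smalldata output above, with a filter.
with h5py.File("demo_compression.h5", "w") as f:
    grp = f.create_group("jungfrau1M_alcove")  # detector group, as in the listing
    grp.create_dataset(
        "ROI_111peak_area",
        data=np.zeros((10, 193, 256), dtype=np.float32),
        compression="gzip",  # stand-in; the PR uses sz3 via libpressio
    )

# Read back shape and filter, the programmatic analogue of `h5ls`.
with h5py.File("demo_compression.h5", "r") as f:
    dset = f["jungfrau1M_alcove/ROI_111peak_area"]
    print(dset.shape, dset.compression)  # (10, 193, 256) gzip
```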

Workflow definition

!LUTE_DAG
task_name: "ConvertXtc1to2"
slurm_params: ""
next:
- task_name: "SmallDataProducerSpack"
  slurm_params: ""
  next: []
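The `!LUTE_DAG` document above is a nested structure of tasks, each with an optional `next` list of successors. A minimal sketch of how such a definition can be flattened into a launch order (plain dicts stand in for the parsed YAML; the traversal is illustrative, not LUTE's actual scheduler):

```python
# Hypothetical parsed form of the !LUTE_DAG document above.
dag = {
    "task_name": "ConvertXtc1to2",
    "slurm_params": "",
    "next": [
        {"task_name": "SmallDataProducerSpack", "slurm_params": "", "next": []},
    ],
}

def launch_order(node):
    """Depth-first flattening: a task runs before everything in its `next` list."""
    order = [node["task_name"]]
    for child in node.get("next", []):
        order.extend(launch_order(child))
    return order

print(launch_order(dag))  # ['ConvertXtc1to2', 'SmallDataProducerSpack']
```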

YAML Configuration

%YAML 1.3
---
title: "Config to run smalldata_tools on converted XTC1 files."
experiment: "xppx1003621"
run: "{{ $RUN_NUM }}"
date: "2025/11/14"
lute_version: 0.1      # Do not change unless you need to force an older version
task_timeout: 6000
work_dir: "/sdf/data/lcls/ds/xpp/xppx1003621/results/lute_output"
...
---
# We will define some convenience keys for substitution in other parameters
EXPERIMENT_DIR: "/sdf/data/lcls/ds/xpp/{{ experiment }}"
FAKE_PSDM_SUBDIR: "xpp/{{ experiment }}/xtc"
XTC2_FILE_PATH: "{{ EXPERIMENT_DIR }}/scratch/conversion/{{ FAKE_PSDM_SUBDIR }}"
XTC2_FILE_NAME: "{{ experiment }}-r{{ run:04d }}-s000-c000.xtc2"
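The convenience keys above are expanded by string substitution, and `{{ run:04d }}` zero-pads the run number to four digits. A rough Python equivalent of that expansion (the substitution engine itself is LUTE's; this only illustrates the resulting paths and formatting):

```python
experiment = "xppx1003621"
run = 197

# {{ run:04d }} zero-pads the run number, like str.format's 04d spec.
experiment_dir = f"/sdf/data/lcls/ds/xpp/{experiment}"
fake_psdm_subdir = f"xpp/{experiment}/xtc"
xtc2_file_path = f"{experiment_dir}/scratch/conversion/{fake_psdm_subdir}"
xtc2_file_name = f"{experiment}-r{run:04d}-s000-c000.xtc2"

print(xtc2_file_name)  # xppx1003621-r0197-s000-c000.xtc2
```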

ConvertXtc1to2:               # All variables are given as strings
  node_id: "1"                # Node ID for the detector
  #eventfile: ""
  nevents: 10
  output_file: "{{ XTC2_FILE_PATH }}/{{ XTC2_FILE_NAME }}"
  xtc1_access_pattern:
    jungfrau1M_alcove: # Name of the detector in the converted XTC2
    # You can have a list of attributes you will convert that will be stored in
    # this detector
      - xtc2_attr_name: "calib"          # Name of this attribute in xtc2
        object_name: "jungfrau1M_alcove" # Name of the detector in psana1
        object_type: "psana.Detector"    # Name of the object type in psana1
        object_field_name: "calib"       # Name of the per-event method to use in psana1
    #EBeam:
    #  - xtc2_attr_name: "photon_energy"
    #    object_name: "EBeam"
    #    object_type: "psana.Detector"
    #    object_field_name: ["get","ebeamPhotonEnergy"]
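Each entry in `xtc1_access_pattern` names a psana1 object and the per-event field to call, and `object_field_name` can be either a single method name or a list of chained calls (as in the commented `EBeam` example). A toy resolver showing how such an entry could be applied; `FakeDetector` is a stand-in, since psana is not assumed here:

```python
# Toy stand-in for a psana1 Detector-like interface; psana itself is not imported.
class FakeDetector:
    def calib(self, evt):
        return [[1.0, 2.0], [3.0, 4.0]]  # pretend calibrated image

    def get(self, evt):
        class Beam:
            def ebeamPhotonEnergy(self):
                return 9400.0
        return Beam()

def resolve_field(obj, field_name, evt):
    """Apply an access pattern: first call takes the event, chained calls do not."""
    if isinstance(field_name, str):
        field_name = [field_name]
    value = obj
    for name in field_name:
        value = getattr(value, name)(evt) if value is obj else getattr(value, name)()
    return value

det = FakeDetector()
evt = object()
print(resolve_field(det, "calib", evt))                        # the image
print(resolve_field(det, ["get", "ebeamPhotonEnergy"], evt))   # 9400.0
```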

SubmitSMD:
  # Command line arguments
  #map_by: "core"   # MPI resource mapping - take care with changing unless familiar
  #bind_to: "core"  # MPI resource binding - take care with changing unless familiar
  #np: 5
  producer: "/sdf/data/lcls/ds/xpp/xppx1003621/scratch/smalldata_tools/lcls2_producers/smd_producer.py"
  run: "{{ run }}"
  experiment: "{{ experiment }}"
  #stn: 0
  #directory: "/sdf/data/lcls/ds/xpp/xppx1003621/hdf5/no_compression"
  directory: "/sdf/data/lcls/ds/xpp/xppx1003621/hdf5/with_compression"
  psdm_dir: "{{ XTC2_FILE_PATH }}"
  #config: "mfx_cctbx"
  #gather_interval: 25
  #norecorder: False
  #url: "https://pswww.slac.stanford.edu"
  #epicsAll: False
  #full: False
  #fullSum: False
  #default: true
  #image: False
  #tiff: False
  #centerpix: False
  #postRuntable: False
  #wait: False
  #xtcav: False
  #noarch: False
  # Producer variables. These are substituted into the producer to run specific
  # data reduction algorithms. Uncomment and modify as needed.
  # If you prefer to modify the producer file directly, leave commented.
  # Beginning with `getROIs`, you will need to modify the first entry to be a
  # detector. This detector MUST MATCH one of the detectors in `detnames`.
  # In the future this will be automated. If you have multiple detectors you can
  # add them with their own set of parameters.

  detnames: ["jungfrau1M_alcove"]
  # Detector sum images - per detector
  #detSumAlgos:
  #  jungfrau1M_alcove:
  #    - "calib"
  #    - "calib_max"
  # Setup the ROIs
  getROIs:
    jungfrau1M_alcove:   # Change to detector name
      - ROI: [[[1,2], [37, 230], [448, 704]]]
        name: "ROI_111peak"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[1,2], [37, 230], [448, 704]]]
        name: "ROI_211peak"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[1,2], [175,470], [26,  485]]]
        name: "ROI_232peak"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[0,1], [328,503], [370, 626]]]
        name: "ROI_224peak"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[0,1], [76, 296], [778,1010]]]
        name: "ROI_large_scatter"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[1,2], [23, 486], [639, 989]]]
        name: "ROI_air_scatter_bottom"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
      - ROI: [[[1,2], [7,  184], [289, 856]]]
        name: "ROI_air_scatter_larger"
        writeArea: True   # Whether to save ROI, if False, save sum but not img.
        thresADU: 8
        calcPars: True
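Each `ROI` above is a list of `[start, stop]` bounds, one per axis of the detector array, which is where the dataset shapes in the `h5ls` listing come from (e.g. `ROI_111peak_area` is `{10, 193, 256}`: 10 events, 230-37=193 rows, 704-448=256 columns). A sketch of the selection, assuming a Jungfrau-like `(panels, rows, cols)` array; the bounds are the real config's, the frame is synthetic:

```python
import numpy as np

# Synthetic per-event frame shaped like a 1M Jungfrau: 2 panels of 512x1024.
frame = np.arange(2 * 512 * 1024, dtype=np.float32).reshape(2, 512, 1024)

roi_bounds = [[1, 2], [37, 230], [448, 704]]  # ROI_111peak from the config
sel = tuple(slice(lo, hi) for lo, hi in roi_bounds)
roi = frame[sel]

print(roi.shape)  # (1, 193, 256) -> stored per event as {nevents, 193, 256}
# The *_sum / *_max / *_mean datasets are these reductions over the ROI:
print(float(roi.sum()), float(roi.max()), float(roi.mean()))
```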
  getAzIntParams:
    jungfrau1M_alcove:
      eBeam: 9.4
      center: [26167.58, -30407.6] # um
      dis_to_sam: 45.0
      tx: 0
      ty: 0
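`getAzIntParams` supplies the azimuthal-integration geometry: `eBeam` is the photon energy in keV, `center` the beam center in µm, and `dis_to_sam` the sample-detector distance. As a back-of-envelope check (not smalldata_tools code), the wavelength implied by `eBeam: 9.4` follows from λ[Å] ≈ 12.3984 / E[keV]:

```python
# hc in keV·Angstrom; e_beam_kev is the photon energy from the config above.
HC_KEV_ANGSTROM = 12.3984
e_beam_kev = 9.4

wavelength_angstrom = HC_KEV_ANGSTROM / e_beam_kev
print(round(wavelength_angstrom, 3))  # ~1.319 Angstrom
```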
  # Compression arguments. Comment this entire block if no compression wanted
  getPressioCompression:
    jungfrau1M_alcove:
      compressor_id: "sz3"
      # Specific arguments vary depending on compressor_id
      compressor_args:
        abs_error_bound: 10
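`abs_error_bound: 10` asks sz3 to guarantee that every reconstructed value differs from the original by at most 10 (here, ADU). The libpressio/sz3 internals are out of scope, but the guarantee itself can be illustrated with a uniform scalar quantizer; this is purely didactic, not the PR's compressor:

```python
import numpy as np

def quantize(data, abs_error_bound):
    """Uniform quantization: bin width 2*bound guarantees |x - x'| <= bound."""
    step = 2.0 * abs_error_bound
    return np.round(data / step) * step

rng = np.random.default_rng(0)
data = rng.uniform(0, 1000, size=1000)
recon = quantize(data, abs_error_bound=10)

print(float(np.abs(data - recon).max()) <= 10.0)  # True: bound is respected
```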

Screenshots

gadorlhiac and others added 30 commits September 29, 2025 18:39
…x missing closing {% endif %} in cheetah template.
… cheetah output automatically. Bump CrystFEL to 0.12.0
…d to reliably find the installed Python version even when a different Python is currently active via the user environment....
…odel. Also fix update_env/shell_source conflict.
@gadorlhiac gadorlhiac mentioned this pull request Nov 15, 2025
@gadorlhiac gadorlhiac changed the title from "ENH Cheetah-based SFX workflows and compression in Cheetah" to "ENH Cheetah-based SFX workflows and compression in Cheetah and XPP preparation" Nov 15, 2025
@gadorlhiac gadorlhiac marked this pull request as ready for review November 24, 2025 18:22
@gadorlhiac gadorlhiac merged commit a7bea20 into slac-lcls:dev Nov 24, 2025
@gadorlhiac gadorlhiac deleted the ENH/new_sfx branch November 24, 2025 19:24


Development

Successfully merging this pull request may close these issues.

  • DOC Task related documentation for smalldata_tools and the XTC1-XTC2 conversion
  • DOC Provide documentation on Maestro
