
Support Dynamic Workflow Generation and Execution #558

@brifordwylie

Description

Dynamic Step Functions Workflow Generation for ML Pipelines

Summary

Add an optional Step Functions execution backend to the ML pipeline system. The DAG would be generated dynamically from the WORKBENCH_BATCH configs in the scripts, so there are no static workflow definitions to maintain.

Background

Currently, ML pipelines are executed via an SQS FIFO queue plus AWS Batch `dependsOn`. This works well for dev/sandbox but lacks:

  • Visual execution graph and history
  • Approval gates for production deployments
  • Built-in retry/error handling UI

The dependency graph logic already exists in ml_pipelines_dt/handler.py via build_dependency_graph(). This ticket adds an alternative execution backend that uses the same DAG.

Proposed Changes

1. Lambda handler changes (ml_pipelines_dt/handler.py)

  • Add EXECUTION_MODE environment variable (sqs | step_functions)
  • New function generate_step_functions_definition(scripts_with_config, root_map) that builds state machine JSON from the DAG
  • New function submit_via_step_functions() that creates/updates and executes the state machine
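A minimal sketch of what `generate_step_functions_definition()` could emit. This assumes `root_map` maps each root script to its ordered chain of downstream scripts, and that `scripts_with_config` carries per-script Batch settings under hypothetical `job_queue` / `job_definition` keys; the actual shape produced by `build_dependency_graph()` may differ:

```python
def generate_step_functions_definition(scripts_with_config, root_map):
    """Build an Amazon States Language definition from the DAG.

    Sketch only: assumes `root_map` maps each root script to its ordered
    chain of dependent scripts. Each chain becomes a branch of a single
    top-level Parallel state.
    """
    branches = []
    for root, chain in root_map.items():
        scripts = [root] + list(chain)
        states = {}
        for i, script in enumerate(scripts):
            cfg = scripts_with_config.get(script, {})
            state = {
                "Type": "Task",
                # Step Functions' managed Batch integration; .sync waits
                # for the job to finish before moving to the next state.
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": script,
                    "JobQueue": cfg.get("job_queue", "default-queue"),
                    "JobDefinition": cfg.get("job_definition", script),
                },
            }
            if i == len(scripts) - 1:
                state["End"] = True
            else:
                state["Next"] = scripts[i + 1]
            states[script] = state
        branches.append({"StartAt": scripts[0], "States": states})

    return {
        "Comment": "Dynamically generated ML pipeline workflow",
        "StartAt": "RunPipelines",
        "States": {
            "RunPipelines": {"Type": "Parallel", "Branches": branches, "End": True}
        },
    }
```

A failed Task fails its branch, which fails the Parallel state and the execution, giving the "failed step stops downstream" behavior for free.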

2. Infrastructure (CDK)

  • IAM role for Step Functions to invoke Batch
  • IAM permissions for Lambda to create/execute state machines
  • Optional: S3 bucket for state machine definition storage
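The concrete IAM actions involved are an assumption based on how the Step Functions ↔ Batch `.sync` integration works; shown here as plain policy documents (the CDK stack would express these via `iam.Role` / `iam.PolicyStatement` constructs):

```python
# Permissions the Step Functions execution role needs to run Batch jobs.
# Action lists are assumptions; tighten Resource ARNs in practice.
STATES_BATCH_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["batch:SubmitJob", "batch:DescribeJobs", "batch:TerminateJob"],
            "Resource": "*",
        },
        {
            # The batch:submitJob.sync pattern uses EventBridge managed rules
            # to observe job completion.
            "Effect": "Allow",
            "Action": ["events:PutTargets", "events:PutRule", "events:DescribeRule"],
            "Resource": "*",
        },
    ],
}

# Permissions the Lambda handler needs to manage and run state machines.
LAMBDA_SFN_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "states:CreateStateMachine",
                "states:UpdateStateMachine",
                "states:DescribeStateMachine",
                "states:StartExecution",
            ],
            "Resource": "*",
        },
        {
            # Lambda must be able to pass the execution role to Step Functions.
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*",  # scope to the Step Functions role ARN in practice
        },
    ],
}
```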

3. State machine structure

Parallel
├── Chain 1: script_a → script_b → script_c
├── Chain 2: script_d → script_e
└── Chain 3: script_f (standalone)

Each step submits an AWS Batch job and waits for it to complete.
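The create/update/execute flow for `submit_via_step_functions()` could look like the sketch below. The `name` and `role_arn` arguments are assumed inputs, and the boto3 client is injectable so the logic can be unit-tested with a stub:

```python
def submit_via_step_functions(definition, name, role_arn, sfn=None):
    """Create or update the state machine for this DAG, then execute it.

    Sketch only: `name` and `role_arn` are assumed inputs; `sfn` lets
    tests pass a stub in place of the real boto3 client.
    """
    import json

    if sfn is None:
        import boto3  # deferred so the function stays stub-testable
        sfn = boto3.client("stepfunctions")

    body = json.dumps(definition)

    # Look for an existing state machine with this name.
    existing = {
        sm["name"]: sm["stateMachineArn"]
        for page in sfn.get_paginator("list_state_machines").paginate()
        for sm in page["stateMachines"]
    }

    if name in existing:
        arn = existing[name]
        sfn.update_state_machine(stateMachineArn=arn, definition=body)
    else:
        arn = sfn.create_state_machine(
            name=name, definition=body, roleArn=role_arn
        )["stateMachineArn"]

    return sfn.start_execution(stateMachineArn=arn)["executionArn"]
```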

Non-goals (for now)

  • Approval gates (future enhancement)
  • Conditional execution logic
  • Cross-account workflows

Trade-offs

| Current (SQS) | Step Functions |
| --- | --- |
| Simpler, fewer AWS resources | More infrastructure to manage |
| ~Free | ~$25/million state transitions |
| Logs scattered in CloudWatch | Unified execution history |
| No visual debugging | Nice graph UI |

Testing

  • Unit tests for generate_step_functions_definition()
  • Integration test: run same pipeline set via both backends, verify identical results
  • Test error handling: verify failed step stops downstream

Estimate

Half a day to a full day, including CDK changes and testing.

Future enhancements

  • Approval gates before production model promotion
  • Slack/email notifications on completion/failure
  • Cost tracking per workflow execution

Metadata

Labels: Python CDK (AWS CDK Python), aws (AWS Specific Work), aws_iam (AWS Identity/IAM/SSO/Roles/etc), ci_cd (Testing and Deployment Tasks)

Status: Backlog
