
Developer documentation #263

@bch0w

Description

I had a burst of inspiration to write some dev documentation with the help of AI tools. This will be put in the official docs but quickly putting it here to solicit feedback.

Also, none of the links work because they are readthedocs references.


SeisFlows Developer Guide

A concise guide to the SeisFlows architecture for developers extending the package with new methods and modules.

DISCLAIMER: The original version of this guide was written by GPT-5 mini through VSCode with the prompt "Can you please read through this entire package and write a succint readme file that explains the code architecture to a new developer that is interested in developing new methods for this package." I have subsequently edited the document to fix errors, remove unnecessary information, add in more relevant information, and have used additional prompts to get more detailed explanations for certain parts. I have reviewed all of the text in here personally. - Bryant

Design Principle

SeisFlows is a Python-based, modular framework for seismic inversion workflows on high performance computers. The core design follows these principles:

  • Command Line Driven: The seisflows command-line tool (in seisflows.py) is the main entry point that dispatches to a SeisFlows class with subcommands. Any user-facing interaction should go through this class.
  • Dynamic module loading: To keep things flexible, all user interaction is through a dynamically created YAML parameter file. A runtime registry system loads the correct implementations based on the YAML configuration.
  • Inheritance-based design: Each major component (workflow, system, solver, preprocess, optimize) provides a standardized interface in a base class.
    • Inheritance allows us to use generic calls throughout the codebase, like solver.forward_simulation, where solver can point to any number of specialized classes that have their own definition of forward_simulation. This allows us to provide a general framework that has plug-and-play extendability.
    • Inheritance also means that if higher-level classes do not explicitly override required functions, then these functions are defined by the base class by default. This allows for powerfully quick extension of the framework, at the cost of a slightly more obscured codebase.
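
As a minimal illustration of this plug-and-play pattern (class and method names here are simplified stand-ins, not the actual SeisFlows API):

```python
class Solver:
    """Base class: defines the required interface and a default behavior."""
    def forward_simulation(self):
        raise NotImplementedError("subclasses must implement this")


class Specfem2D(Solver):
    """Concrete implementation: overrides the required method."""
    def forward_simulation(self):
        return "running a SPECFEM2D forward simulation"


def run_forward(solver):
    """Generic call site: works unchanged with any Solver subclass."""
    return solver.forward_simulation()
```

The rest of the codebase only ever needs the generic `run_forward`-style call; swapping solvers is a one-line change in the parameter file.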

Core Dependencies

seisflows/
├── seisflows.py          # CLI entry point + SeisFlows class with subcommands
├── workflow/             # Main workflow orchestration (forward, migration, inversion, etc.)
├── system/               # HPC interaction (workstation, cluster, SLURM, etc.)
├── solver/               # External solver interface (SPECFEM2D/3D/3D_GLOBE)
├── preprocess/           # Seismic data I/O, misfit calculations, adjoint sources
├── optimize/             # Optimization algorithms (LBFGS, NLCG, Gradient)
├── plugins/              # Sub-modules to support main modules (line search, misfit, adjoint)
└── tools/                # Utility functions (config, I/O, signal processing, etc.)

Key Concepts

1. Module Registry

The NAMES list in seisflows/__init__.py defines the order-sensitive module registry:

NAMES = ["workflow", "system", "solver", "preprocess", "optimize"]

Each module name maps to a directory within the package. Within each directory, non-private .py files define concrete classes that users can instantiate.

Each module has a base class which defines required behavior. All other classes in that module inherit and build off the base class.
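
Conceptually, the registry-plus-YAML lookup resembles the sketch below. The naming convention shown (snake_case file to CamelCase class) is an assumption for illustration; the real resolution logic lives in seisflows.tools.config.custom_import() and differs in detail.

```python
import importlib

# Order-sensitive registry of module categories (from seisflows/__init__.py)
NAMES = ["workflow", "system", "solver", "preprocess", "optimize"]


def snake_to_camel(name):
    """'new_algorithm' -> 'NewAlgorithm' (assumed convention; the real
    package does not follow this for every class, e.g. LBFGS)."""
    return "".join(part.capitalize() for part in name.split("_"))


def custom_import_sketch(category, choice):
    """Map a YAML entry like 'optimize: new_algorithm' to a class object
    by importing seisflows.<category>.<choice> and grabbing the class."""
    if category not in NAMES:
        raise KeyError(f"'{category}' is not a registered module type")
    module = importlib.import_module(f"seisflows.{category}.{choice}")
    return getattr(module, snake_to_camel(choice))
```

Because the class is looked up by string at runtime, adding a new module choice never requires touching the dispatch machinery.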

2. Configuration via YAML parameter file + Dynamic Imports

Users create a template parameters.yaml file using the seisflows init command, and specify module choices:

workflow: inversion
system: workstation
solver: specfem2d
preprocess: default
optimize: LBFGS

The seisflows configure command will load the chosen classes and also read module docstrings to auto-generate a full parameter file. That means that parameter files will change dynamically based on the main module choices (see: seisflows.tools.config.custom_import()).

The chosen classes are then loaded at runtime. This keeps SeisFlows flexible: users select module implementations via the parameter file, knowing that every variable in the resulting file is relevant to their given problem.

NOTE: Docstring convention: module docstrings must end with *** to delimit the "Parameters" section (above) from the "Paths" section (below).

3. Job Execution on Different Systems

SeisFlows is designed to run on both personal workstations and large HPC clusters with the same interface. The execution model abstracts away system-specific details through the system/ module. The idea being that you can prototype toy problems on your laptop, and then shift onto your HPC with a very similar interface:

Main Job and System Submission:

  • When you run seisflows submit, the main Python job (i.e., main job, master job) is submitted to the chosen system via the system's job scheduler. The main job acts like a daemon, running in the background orchestrating tasks for other cores/nodes and therefore must be running for the entire workflow. If the main job fails, or is killed, the workflow cannot proceed.

    NOTE: The main job should not have any compute intensive processes (just submitting jobs, moving files, simple math). At the moment I am not 100% sure if that is the case, some compute processes may be run on the main job for convenience. We should strive to separate those into compute jobs.

  • For workstations (workstation.py), the main job runs directly on your local machine.

  • For HPC clusters (cluster.py, slurm.py, lsf.py, etc.), the main job can either:

    • Run on a compute node (default): The main job is submitted through the job scheduler as a regular job, then orchestrates work from there. Be sure to understand whether system walltimes restrict how long your job can be running.
    • Run on the login node (with -d/--direct flag): Useful for quick testing or when the main job's overhead is minimal, though not recommended for production runs due to login node load concerns. I take no responsibility for how much your sys admin yells at you for running on the login node.

Delegating Work Through Job Scheduler:

  • Once the main Workflow job is running, it does not directly run expensive simulations or processing routines. Instead, it delegates work to compute nodes via the system module's run() method.
  • The system.run() method submits array jobs or parallel tasks through the job scheduler to execute things like forward and adjoint simulations, preprocessing tasks, or any other parallelizable work across multiple events or sources.
  • Each system implementation (SLURM, LSF, Fujitsu, etc.) provides job scheduler-specific submit directives via the submit_call_header and run_call_header properties, ensuring jobs are submitted with correct account names, partitions, walltime, etc. This emulates users running job submission scripts (e.g., sbatch scripts) manually.
  • Systems require specific commands or modifiers for running through the job scheduler, these can all be modified through the parameters.yaml file System section, or through the actual system sub-class.
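
As a rough sketch of what a run_call_header-style property assembles (directive values and the function signature here are illustrative, not the actual SeisFlows attribute names):

```python
def slurm_array_header(account, partition, walltime, ntasks, job_name="sfrun"):
    """Build SLURM #SBATCH directives the way a system sub-class might,
    mirroring what a user would put at the top of an sbatch script.
    One array index per event/source gives ntask parallel tasks."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --time={walltime}",
        f"#SBATCH --array=0-{ntasks - 1}",
    ])
```

An LSF or Fujitsu sub-class would emit the same logical information with its own scheduler's directive syntax, which is exactly why these headers are isolated behind properties.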

Communication and State Sharing:

  • The main job and worker tasks communicate asynchronously through shared filesystem state in the working directory (scratch/, output/, checkpoint files).
  • The main job monitors job status through queue commands (e.g., squeue), just like a user would. Jobs may be batched (e.g., only 5 concurrently running jobs at one time) by the main job at user request. Individual compute job failures will result in the main job exiting.
  • No direct inter-process communication (IPC) is needed; instead, tasks write results to disk and the main job reads them upon completion.
  • This design enables robustness: if a worker task fails, the main job can detect it and retry or gracefully handle the error.

4. State Management & Checkpointing

SeisFlows establishes a known working directory structure: scratch/, output/, logs/

  • scratch/ stores temporary working files associated with the workflow. Most of the heavy lifting occurs here, however scratch/ is subject to deletion during resets.
  • output/ stores results that should not be altered or deleted, such as updated models.
  • logs/ stores log files both from the workflow and the job-scheduler.

Each module can maintain state via checkpoint files (e.g., optimize writes path._checkpoint as .npz files). This allows SeisFlows to recover from job or system crashes.

Workflow State Tracking via sfstate File:

  • The main workflow state is tracked in the sfstate file (typically in the working directory root), which stores information about the current iteration, step count, and other workflow-level metadata. This file is updated as the workflow progresses through its task list.
  • Upon job restart or recovery, SeisFlows reads the sfstate file to determine where to resume execution, allowing seamless recovery from interruptions without re-running completed tasks.

Stopping and Resuming Workflows with stop_after:

  • The stop_after parameter allows users to halt the workflow after a specific task completes. This is useful for:
    • Testing workflows incrementally (run just the forward simulations, inspect results, then continue).
    • Debugging intermediate results without committing to a full inversion.
    • Gracefully pausing long-running inversions at natural checkpoints.
  • Users can specify stop_after via the command line: seisflows submit -s <TASK_NAME> or in parameters.yaml.
  • When a workflow stops at stop_after, the sfstate file is updated to record the stopping point. Users can later resume with seisflows restart or re-run seisflows submit, and the workflow will pick up from where it left off.
  • Advanced users may manually edit the sfstate file in order to control specific behavior of the workflow.
  • This design enables iterative debugging and development: make a change, run a subset of tasks, review results, then incrementally build toward a full solution.

Interactive Debugging with seisflows debug:

  • The seisflows debug command starts an interactive Python environment with the entire workflow, all modules, and their state already loaded. This is invaluable for troubleshooting and manual workflow control.
  • In debug mode, users have full programmatic access to:
    • The workflow object and its current state
    • All sub-module objects (system, solver, preprocess, optimize) with their configurations and internal state
    • Model files, kernels, and intermediate results via the Model class
    • Logging and utility functions
  • Use cases include:
    • Manually inspecting intermediate results or state variables
    • Testing custom code snippets before integrating them into modules
    • Recovering from a failed workflow by manually calling specific methods
    • Validating that parameter choices are correct before running a full workflow
    • Modifying workflow behavior on-the-fly for experimentation
  • Example debug session:
    >>> from seisflows import SeisFlows
    >>> sf = SeisFlows()  # loads parameters.yaml and initializes all modules
    >>> sf.workflow.iteration
    2
    >>> sf.solver.materials
    ['vp', 'vs']
    >>> sf.optimize.step_count
    3
    >>> # manually call a method
    >>> sf.solver.generate_synthetic()
  • Debug mode gives developers and advanced users full control to inspect, validate, and even manually drive the workflow when automatic execution is not feasible or when deep debugging is needed.

5. Logging

SeisFlows has an extensive logging system. Each of the modules and submodules has access to the main logger, while individual compute jobs, and even individual Python processes, may create their own specific logs. Logs are created in multiple locations depending on the process that creates them.

  • sflog.txt: Main log file controlled by the Workflow. High-level log messages, such as workflow status and inversion information, are printed here.
  • logs/*: Stores individual job logs, e.g., from the job scheduler. These are used to debug and troubleshoot when individual processes fail, which may or may not lead to main job failure.
  • scratch/*: Each of the modules may store their own logs for individual processes. For example SPECFEM logs are stored within
    their respective directories in scratch/solver/<event_id>/*.txt, whereas preprocessing logs are stored in scratch/preprocess/logs. These logs are much more granular but may provide important information for debugging.

Module Deep Dives

Overview: Each module class defines a set of required parameters and paths that feed into the parameters.yaml configuration file. When users run seisflows configure, these docstrings are introspected to auto-generate the full parameter file with type hints and defaults.

Every module requires a check() and setup() function; these are called during runtime.

  • check() function ensures that parameters match expected values from the class, and that paths exist and point to appropriate files like data or models. This method uses private variables (attributes prefixed with _) to enforce internal requirements and invariants. These private checks provide guardrails that prevent invalid configurations from causing downstream failures.
  • setup() function runs any logistical tasks required for the module to operate. This may include creating and populating output directories, or establishing and/or loading checkpoint files. These are not included in the __init__() function because other tasks like seisflows debug will instantiate the classes but often do not need to run setup functions.
  • finalize(): some modules have a finalize method that is run at the end of an inversion iteration. These are tear-down tasks used to reset for the following iteration (e.g., deleting scratch files, saving output files).

Workflow (workflow/)

Orchestrates the overall execution loop. Start-to-finish workflows. Whenever you see a flowchart diagram of full waveform inversion, that's what this is:

  • Forward: Run forward simulations and misfit calculation (optional) only. Useful for en-masse forward simulations.
  • Migration: Backproject kernels from adjoint simulations to create sensitivity kernels.
  • Inversion: Iterative optimization loop (forward → misfit → adjoint → gradient → line search → repeat).
  • Noise Inversion: Specialized inversion for ambient noise adjoint tomography (work in progress)

Key interface:

  • task_list property: returns the ordered list of methods that comprises the workflow.
  • run(): executed by the submit command; orchestrates the task loop and runs through the task_list with checkpointing.
  • The workflow has access to all other modules through internal attributes, self.system, self.solver, self.preprocess, self.optimize allowing for cross-module calls.
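
Conceptually, run() walks the task_list with checkpointing, roughly like this sketch (a simplification, not the actual implementation):

```python
def run_tasks(task_list, state, stop_after=None):
    """Execute tasks in order, skipping those already marked completed in
    a previous run, and honoring a stop_after-style early exit."""
    for task in task_list:
        name = task.__name__
        if state.get(name) == "completed":
            continue  # resume support: do not redo finished work
        task()
        state[name] = "completed"  # checkpoint after each task
        if name == stop_after:
            break
    return state
```

Checkpointing after every task, rather than every iteration, is what makes fine-grained resume and stop_after possible.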

System (system/)

Abstracts job submission and execution on diverse HPC environments.

Inheritance chain: Workstation (base, serial/MPI on login node) → Cluster (generic HPC) → Slurm, Lsf, etc.

Key interface:

  • submit(workdir, parameter_file): start the main job on the system of choice, may be as simple as running workflow.run() or as complicated as requesting a compute node and then automagically running SeisFlows on the compute node.
  • submit_call_header (property): defines scheduler-specific submit directives (e.g., #SBATCH lines in SLURM). These are defined in the system sub-modules and the parameter file.
  • run_call_header (property): same as submit_call_header but for compute jobs. These may be different as they will be asking for different numbers of processors/nodes etc.
  • run(classname, method, hosts=None): execute a class method (i.e., workflow.run_forward_simulations()) ntask times on compute node(s). Machinery to monitor the job queue and wait until all ntask processes are finished before releasing control back to main job.

Solver (solver/)

Interfaces with external numerical solvers (SPECFEM2D/3D/3D_GLOBE).

NOTE: In the future, interfacing with other numerical solvers will require creating new base classes and generalizing the way the solver module interacts with the rest of the package. The abstraction should make it easier but given that SPECFEM is the only solver used with SeisFlows, there may be some inbuilt paradigms focused around SPECFEM that make this more difficult.

Inheritance chain: Specfem (base, generalized SPECFEM interface) → Specfem2d, Specfem3d, Specfem3dGlobe.

Key methods:

  • forward_simulation(): run a forward simulation.
  • adjoint_simulation(): run an adjoint simulation.
  • There are many other functions within the solver module used to interact with data, models, executables and parameter files of SPECFEM. Please look at the base class for more information.
  • The Model class handles parallelized manipulation of SPECFEM FORTRAN binary models.

Preprocess (preprocess/)

Handles seismic data (observed & synthetic) and adjoint source generation.

Inheritance chain: Default (base, general-purpose preprocessing) → Pyaflowa (interfaces with the external Pyatoa package, which provides windowing and more advanced interaction with misfit results).

Key methods:

  • quantify_misfit(...): compute misfit between obs/syn, write adjoint sources.
  • Preprocess reads/writes via ObsPy (SU, ASCII, SAC formats).

Plugin system: Misfit and adjoint source functions are loaded from seisflows/plugins/preprocess/.

Optimize (optimize/)

Nonlinear optimization algorithms for model updates.

Inheritance chain: Gradient (base, steepest descent) → LBFGS (limited-memory BFGS, forced Backtrack line search) → other variants.

Key interface:

  • compute_direction(): compute search direction (gradient or quasi-Newton approximation).
  • initialize_line_search(), evaluate_line_search_*(), update_line_search(): coordinate the line search.
  • Has its own internal checkpointing system for restarting failed line searches.
  • Reads/writes model and gradient vectors via the Model class.
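
For reference, a compute_direction() implementation in the LBFGS class is built around the textbook two-loop recursion shown below. This is the standard algorithm; SeisFlows stores the step/gradient-change history in its checkpoint files and the exact variable names differ.

```python
import numpy as np


def lbfgs_direction(g, s_hist, y_hist):
    """Two-loop recursion: approximate -H^{-1} g from stored model-step
    (s) and gradient-change (y) pairs. With no history this reduces to
    steepest descent, -g, which is what the Gradient base class uses."""
    q = g.astype(float).copy()
    stack = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):  # newest first
        rho = 1.0 / y.dot(s)
        alpha = rho * s.dot(q)
        q -= alpha * y
        stack.append((alpha, rho, s, y))
    if y_hist:  # scale by an initial Hessian guess gamma * I
        s, y = s_hist[-1], y_hist[-1]
        q *= s.dot(y) / y.dot(y)
    for alpha, rho, s, y in reversed(stack):  # oldest first
        beta = rho * y.dot(q)
        q += (alpha - beta) * s
    return -q
```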

How It All Connects

  • Workflow drives the top-level loop and orchestrates module calls.
  • System executes remote tasks via system.run() on compute nodes.
  • Solver manages interaction with numerical solver.
  • Preprocess manages time series data, quantifies misfit.
  • Optimize manipulates kernels, gradients and manages the line search.

Typical Inversion Workflow

  1. User creates parameters.yaml with seisflows init and selects module implementations.
  2. seisflows configure scans docstrings, fills defaults, expands paths, creates full parameter file.
  3. User modifies parameter file as required.

    NOTE: Development of a specific system sub-class may be required here for systems encountered for the first time.

  4. seisflows submit calls SeisFlows.submit():
    • Validates parameters, creates working directory structure.
    • Calls system.submit() → submits workflow.main() to the chosen system.
    • Main job is established on the system which takes over control.
  5. workflow.main() loops over task_list (each iteration).
    One example of this is an inversion workflow, imagine starting with model M_i.
    1. generate_synthetic_data: Optional, if synthetic-synthetic inversion, creates "data" from target model M_true
    2. evaluate_initial_misfit: Run forward simulation through model M_i to generate synthetics, quantify misfit to create adjoint sources
    3. run_adjoint_simulations: Run adjoint simulations to create misfit kernels
    4. postprocess_event_kernels: Perform any kernel processing like preconditioning, masking
    5. evaluate_gradient_from_kernels: Generate gradient G_i used to update starting model M_i
    6. initialize_line_search: Generate M_trial by updating M_i with G_i
    7. evaluate_line_search_misfit: Run forward simulations through M_trial and quantify misfit
    8. update_line_search: Determine if misfit reduces, continue line search until it does
    9. finalize_iteration: Select final model M_i+1, restart from (2)
  6. Output: Throughout the workflow, outputs like updated models, kernels, gradients etc. written to output/.
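
The loop above, stripped down to a scalar toy problem for intuition (the forward model d = m**2 is chosen purely for illustration and has nothing to do with SPECFEM):

```python
def toy_inversion(m, m_true, niter=200):
    """Mirror the task list on a one-parameter 'model': forward -> misfit
    -> gradient -> backtracking line search -> model update."""
    d_obs = m_true ** 2                          # generate_synthetic_data
    for _ in range(niter):
        d_syn = m ** 2                           # forward simulation
        misfit = (d_syn - d_obs) ** 2            # evaluate_initial_misfit
        grad = 2.0 * (d_syn - d_obs) * 2.0 * m   # adjoint/gradient analogue
        step = 0.01                              # initialize_line_search
        while ((m - step * grad) ** 2 - d_obs) ** 2 >= misfit and step > 1e-12:
            step /= 2.0                          # shrink until misfit reduces
        m -= step * grad                         # finalize_iteration: M_i+1
    return m
```

Every real inversion workflow is this loop with each scalar operation replaced by thousands of core-hours of simulation, which is why checkpointing between steps matters so much.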

Implementing New Methods

New methods must follow the template structures and should usually build on an existing base class (or a class higher in the inheritance chain). Developers should search throughout the repository to see how these modules are called by the rest of the code. For example, if you want to develop a new optimization algorithm, it is suggested you inherit from the base class Gradient.

Example: Adding a New Optimization Algorithm

  1. Create seisflows/optimize/new_algorithm.py:

    from seisflows.optimize.gradient import Gradient
    
    class NewAlgorithm(Gradient):
        """NewAlgorithm [Optimize]
        ----------------------
        My custom optimization method.
    
        Parameters
        ----------
        :type my_param: float
        :param my_param: A custom parameter
    
        Paths
        -----
        ***
        """
        def __init__(self, my_param=1.0, **kwargs):
            super().__init__(**kwargs)
            self.my_param = my_param
    
        def compute_direction(self):
            # Implement your direction computation
            pass
  2. User sets the following in parameters.yaml

    optimize: new_algorithm
  3. seisflows configure auto-discovers your docstring and parameters from __init__ and populates parameters.yaml

  4. seisflows submit starts the main job and Workflow instantiates and calls your methods through optimize.compute_direction()

Adding a New Misfit Function

  1. Create seisflows/plugins/preprocess/misfit/my_misfit.py:

    def my_misfit(obs, syn):
        """Custom misfit function."""
        return ((obs - syn) ** 2).sum()
  2. User sets misfit: my_misfit in parameters.yaml.

  3. Preprocess loads and calls your function dynamically.
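
Misfit functions are typically paired with a matching adjoint-source function. A companion plugin for the L2 misfit above might look like this (the exact directory layout and call signature are assumptions; check the existing plugins under seisflows/plugins/preprocess/ for the real convention):

```python
import numpy as np


def my_adjoint_source(obs, syn):
    """Derivative of the misfit sum((obs - syn)**2) with respect to the
    synthetics: 2 * (syn - obs), i.e., the classic waveform-difference
    adjoint source (some conventions drop the constant factor of 2)."""
    return 2.0 * (np.asarray(syn) - np.asarray(obs))
```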

Adding a New Base Solver Interface

  1. Create seisflows/solver/my_solver.py:

    from seisflows.solver.specfem import Specfem
    
    class MySolver(Specfem):
        """MySolver [Solver]
        ------------------
        Interface to my custom numerical solver.
    
        Parameters
        ----------
        :type solver_param: str
        :param solver_param: Solver-specific parameter
    
        Paths
        -----
        ***
        """
        def generate_synthetic(self):
            # Your solver-specific forward simulation logic
            pass
    
        def forward_simulation(self):
            # Calls to the external solver to run a forward simulation
            pass
    
        def adjoint_simulation(self):
            # Calls to the external solver to run an adjoint simulation
            pass
  2. Ensure your solver writes output in formats recognized by preprocess and optimize (model vectors, kernels, traces).

  3. Check through the code to see where other modules call solver.<function>() and ensure that your solver can reproduce the necessary outputs.

Important Conventions

  • Parameter names in YAML: Use UPPER_CASE for seisflows par lookups. Paths are prefixed with path_ (e.g., path_output).
  • Absolute vs. relative paths: seisflows configure -a sets absolute paths; default is relative to cwd.
  • Docstring format: Terminate docstrings with *** after "Parameters" section to separate "Paths".
  • Logging: Use from seisflows import logger; logger.info(...) for consistent output.

Testing

Tests are in seisflows/tests/. Run them with:

cd seisflows/tests && pytest

Questions? Open a GitHub Issue or start a discussion.
