Skip to content

Conversation

@robfalck
Copy link
Contributor

Purpose

The MDO community at NASA use XDSMs pretty extensively to convey ideas, but we need a better way to turn OpenMDAO models in to XDSM diagrams. The intent of this update to pyXDSM is to allow it to ingest XDSM information in a declarative format to then produce the XDSM in PDF format. This means that tools like OpenMDAO would only need to be able to generate a compatible JSON file rather than interacting directly via the API.

This PR does the following:

  • Changes setup.py to pyproject.toml for PEP518 compatibility

  • Objects in the API are now built on top of Pydantic BaseModel

In particular, this leverages Pydantic for serialization/deserialization and validation.
This means, in addition to the existing imperative API, an XDSM model can now be created with a declarative syntax via a dictionary or JSON (or any other serialization that can be represented as a dictionary, such as yaml.)

The key methods here are .to_json() and .from_json().

The MDF and kitchen sink examples now have corresponding .json files. Tests have been added to validate the serialization and deserialization of XDSM objects.

  • A __main__.py has been added to provide a command-line interface.

This allows the user to quickly build a tikz, pdf, or json output file from a given json file as input. The JSON output should be the equivalent of a copy but it seemed appropriate to include it, in case we ever support any other type of file format (yaml, toml, etc).

This utility allows the user to specify the JSON file to be converted. If no output is specified, it will assume that a PDF is to be generated. It supports a --cleanup and --quiet options of the build method as arguments.

  • The latex generation of XDSM was moved to a separate xdsm_latex_writer.py file.

  • Similar changes are not yet made to matrix equations, though I plan to work on that as a follow-up.

  • Docs have been updated.

Expected time until merged

This is not urgent. While I'm going to base some future work on this change, its a somewhat large PR and I understand if it takes some time to fully review it. I can work to my fork in the mean time.

Type of change

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (non-backwards-compatible fix or feature)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Documentation update
  • Maintenance update
  • Other (please describe)

Testing

I've added tests to test_xdsm.py that test that a deserialized XDSM is equivalent.
The example XDSMs have been added in JSON format and tests of the CLI have been implemented.

Checklist

  • I have run ruff check and ruff format to make sure the Python code adheres to PEP-8 and is consistently formatted
  • I have formatted the Fortran code with fprettify or C/C++ code with clang-format as applicable
  • I have run unit and regression tests which pass locally with my changes
  • I have added new tests that prove my fix is effective or that my feature works
  • I have added necessary documentation

@robfalck robfalck requested a review from a team as a code owner October 24, 2025 12:17
@robfalck robfalck requested a review from A-CGray October 24, 2025 12:17
@ewu63
Copy link
Collaborator

ewu63 commented Oct 24, 2025

This is pretty great, thanks for putting it together Rob! I think it has been a longstanding goal to have a stable declarative syntax for XDSM diagrams, and it's a major shortcoming of the current Python API. It will take me some time to review this PR, but I do want to pose a question initially -- how do we feel about a Pydantic-based JSON representation vs. something a bit more custom? I understand that from a ease of development perspective Pydantic is great and very stable, but this effectively locks in the Pydantic class definitions --- any future changes such as adding/removing/renaming fields will mean that the previously-generated JSON files are invalid, and shims (likely in the form of model_validators) have to be added to maintain compatibility.

On the other hand, a custom format may be more compact/readable, and we could have a bit more flexibility in serialization/deserialization, though we lose out on all the great Pydantic features. And I suppose some compat layer has to be created no matter what we do.. just curious for other's thoughts here.

Lastly, I want to mention that there exists XDSMjs as a javascript library, which of course has its own existing JSON representation of the XDSM diagram. Maybe there are opportunities to standardize and define a community-driven JSON schema that could be used by various tools/backends. CC @relf @eirikurj and others, if there are lots of discussions we can move this to a dedicated thread.

@ewu63 ewu63 self-requested a review October 24, 2025 18:07
@robfalck
Copy link
Contributor Author

I've been diving into pydantic during this government shutdown. It seems like it has fairly broad adoption so I'm not concerned with it losing support.

I think the benefits it provides as far as providing serialization and validation are really worthwhile vs a custom format. As far as merging with XDSMjs, that would be a discussion worth having.

@ewu63
Copy link
Collaborator

ewu63 commented Nov 13, 2025

Sorry, the concern I had is not about pydantic as a package, but the particular schema in this repo gets locked in via any serialized JSON files, and changes to those pydantic class definitions will not be backwards compatible. It's something we can get around with, via validators and such, but it is cumbersome.

I'll try to review the actual code, and I don't have any fundamental objections to using the class definitions as-is but I was just wondering if there are any improvements we want to do---best to do them now before they are fixed by the JSON.

import subprocess
from collections import namedtuple
import numpy as np
from typing import Dict, List, Optional, Tuple, Union
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dict, List, and Tuple are deprecated since python 3.9 in favour of builtins dict etc. Unions can also be written via | starting with 3.10.

class TotalJacobian(BaseModel):
"""Total Jacobian matrix representation."""

variables: Dict[str, Variable] = Field(default_factory=dict)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_factory is not needed for pydantic models, it will automatically do a deep copy on initialization. Simply writing variables: dict[str, Variable] = {} will work


variables: Dict[str, Variable] = Field(default_factory=dict)
j_inputs: Dict[int, Variable] = Field(default_factory=dict)
n_inputs: int = Field(default=0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 99% of the cases the Field constructor is not needed, you can simply put n_inputs: int = 0. The description (if you'd like) can go as a normal Python docstring with triple quotes just below the attribute definition.

setup: bool = Field(default=False)

self._setup = False
model_config = ConfigDict(arbitrary_types_allowed=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Collaborator

@ewu63 ewu63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a bunch of comments. I'd also be happy to help clean up / push to this PR if you'd like


self._connections = {}
self._ij_connections = {}
setup: bool = Field(default=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that this is an internal variable not meant to be read/modified by the user or saved to the JSON. If so, perhaps we can keep it as _setup so 1) it does not get serialized, and 2) indicates to the user that they should not interact with this private variable

terms: List[str] = Field(default_factory=list)

self._terms = []
model_config = ConfigDict(arbitrary_types_allowed=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly do we need this?

total_size: int = Field(default=0)

self._total_size = 0
setup: bool = Field(default=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on setup

pyxdsm/XDSM.py Outdated
model_config = ConfigDict(arbitrary_types_allowed=True)

@field_validator("side")
@classmethod
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed since you specified side must be of type Side which is a literal with two options

pyxdsm/XDSM.py Outdated
AutoFadeOption = Literal["all", "connected", "none", "incoming", "outgoing"]

# Valid TikZ node styles (from diagram_styles.tikzstyles)
VALID_NODE_STYLES = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a similar Literal definition? That way it can be used in the type definition below

pyxdsm/XDSM.py Outdated

model_config = ConfigDict(arbitrary_types_allowed=True)

def __init__(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using __init__ here feels a little odd. I think implementing a model_validator is a more appropriate pattern if the goal is to just patch in data which is not stored in a JSON?

pyxdsm/XDSM.py Outdated
label_width: Optional[int] = None,
spec_name: Optional[str] = None,
) -> None:
"""Add a system block on the diagonal."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add back the appropriate Parameters section of the docstrings here and below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've cleaned this up a bit.

I removed any changes involving MatrixEquation/Jacobian because I wasn't really interested in having that be serializable with this PR.

I've added the original docstrings back in place.

The write_sys_specs is the original specs method, I left it in place with the to_json and from_json methods being the pydantic interface. Maybe there's not longer a reason for write_sys_specs but I didn't want to break any existing implementations.

json_str = json.dumps(spec, indent=2)
f.write(json_str)

def to_json(self, filename: Optional[str] = None) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the older write_sys_specs is now deprecated? Do we want to make that clear to users? The docstrings here say "specification" which may be a little confusing with the other function name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think write_sys_specs will get much use if its mostly/completely duplicated by to_json/from_json.

I can add the deprecation warning if you like.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I diffed this file with what you removed in XDSM.py, and saw that a lot of comments were removed for some reason. Can you add those back? There were lots of helpful comments previously

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, was there any particular reason for moving this? I understand that logically it could probably be a separate class, but the diffs are really hard to verify given the scope of this PR, so I wonder if we can do that in a subsequent PR to make review easier.

@robfalck
Copy link
Contributor Author

robfalck commented Nov 13, 2025

Since there's interest in this, I'll clean this up when I get a chance. With the furlough ending things will be a bit busy for a few days :)

@ewu63 ewu63 mentioned this pull request Nov 16, 2025
13 tasks
@robfalck
Copy link
Contributor Author

robfalck commented Dec 12, 2025

I think I've addressed all of the comments, XDSM is considerably simpler now.

Is the failing doc build a permissions issue that's not expected to pass?

@robfalck robfalck requested a review from ewu63 December 12, 2025 14:32
@A-CGray
Copy link
Member

A-CGray commented Dec 12, 2025

Is the failing doc build a permissions issue that's not expected to pass?

Looks like your changes add a new dependency to the docs build?

    Traceback (most recent call last):
      File "/home/docs/checkouts/readthedocs.org/user_builds/mdolab-pyxdsm/envs/61/lib/python3.11/site-packages/sphinx/registry.py", line 541, in load_extension
        mod = import_module(extname)
              ^^^^^^^^^^^^^^^^^^^^^^
      File "/home/docs/.asdf/installs/python/3.11.12/lib/python3.11/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'sphinxcontrib.autodoc_pydantic'

There's a requirements.txt in the doc directory you should be able to add this to

@robfalck
Copy link
Contributor Author

I'm adding the autodoc_pydantic requirement, but the preference for terse pydantic specifications (no Fields with descriptions) might make this less useful.

I've debated whether documentation of the pydantic models is necessary, but I'm erring on the side of documenting more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants