Added pydantic backend for serialization, and other updates. #61

robfalck · 2025-10-24T12:17:50Z

Purpose

The MDO community at NASA uses XDSMs extensively to convey ideas, but we need a better way to turn OpenMDAO models into XDSM diagrams. The intent of this update to pyXDSM is to let it ingest XDSM information in a declarative format and then produce the XDSM as a PDF. This means that tools like OpenMDAO would only need to generate a compatible JSON file rather than interact directly with the API.

This PR does the following:

Changes setup.py to pyproject.toml for PEP 518 compatibility
Objects in the API are now built on top of Pydantic BaseModel

In particular, this leverages Pydantic for serialization/deserialization and validation.
This means, in addition to the existing imperative API, an XDSM model can now be created with a declarative syntax via a dictionary or JSON (or any other serialization that can be represented as a dictionary, such as yaml.)

The key methods here are .to_json() and .from_json().

The MDF and kitchen sink examples now have corresponding .json files. Tests have been added to validate the serialization and deserialization of XDSM objects.
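As a rough illustration of the declarative round trip described above, the sketch below uses plain Pydantic v2 calls (`model_validate`, `model_dump_json`, `model_validate_json`). The `System` and `Diagram` classes and their fields are hypothetical stand-ins, not pyXDSM's actual schema, and the PR's own `.to_json()` and `.from_json()` methods may be implemented differently.

```python
from pydantic import BaseModel


class System(BaseModel):
    """Hypothetical diagonal block; not pyXDSM's real class."""

    node_name: str
    style: str
    label: str


class Diagram(BaseModel):
    """Hypothetical container standing in for an XDSM model."""

    systems: list[System] = []


# Declarative input: a plain dictionary (it could equally come from JSON or YAML).
spec = {
    "systems": [
        {"node_name": "opt", "style": "Optimization", "label": "Optimizer"},
        {"node_name": "D1", "style": "Function", "label": "Discipline 1"},
    ]
}

diagram = Diagram.model_validate(spec)  # validate the dictionary
json_text = diagram.model_dump_json(indent=2)  # serialize to a JSON string
restored = Diagram.model_validate_json(json_text)  # parse the string back
```

Pydantic models compare by field values, so `restored == diagram` is the kind of equivalence check a serialization test can rely on.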

A __main__.py has been added to provide a command-line interface.

This allows the user to quickly build a TikZ, PDF, or JSON output file from a given JSON input file. The JSON output should be equivalent to a copy of the input, but it seemed appropriate to include it in case we ever support any other file format (YAML, TOML, etc.).

The utility lets the user specify the JSON file to be converted. If no output is specified, it assumes that a PDF is to be generated. It also accepts the build method's --cleanup and --quiet options as arguments.
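A minimal sketch of how such a command-line interface could be laid out with `argparse` follows. Only `--cleanup`, `--quiet`, and the PDF default come from the description above; the `-o/--output` option name, the help strings, and the `resolve_output` helper are assumptions made for illustration and may not match the PR's `__main__.py`.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyxdsm", description="Convert an XDSM JSON file.")
    parser.add_argument("input", type=Path, help="JSON file describing the XDSM")
    # Hypothetical option name; the source only says PDF is the default output.
    parser.add_argument("-o", "--output", type=Path, default=None, help="tikz, pdf, or json output file")
    # Flags named in the PR description; the help text is an assumption.
    parser.add_argument("--cleanup", action="store_true", help="passed through to the build step")
    parser.add_argument("--quiet", action="store_true", help="passed through to the build step")
    return parser


def resolve_output(args: argparse.Namespace) -> Path:
    """Fall back to a PDF next to the input when no output is given."""
    if args.output is not None:
        return args.output
    return args.input.with_suffix(".pdf")


args = build_parser().parse_args(["mdf.json", "--quiet"])
output = resolve_output(args)
```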

The LaTeX generation for XDSM was moved to a separate xdsm_latex_writer.py file.
Similar changes have not yet been made to matrix equations, though I plan to work on that as a follow-up.
Docs have been updated.

Expected time until merged

This is not urgent. While I'm going to base some future work on this change, it's a somewhat large PR and I understand if it takes some time to fully review it. I can work from my fork in the meantime.

Type of change

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (non-backwards-compatible fix or feature)
Code style update (formatting, renaming)
Refactoring (no functional changes, no API changes)
Documentation update
Maintenance update
Other (please describe)

Testing

I've added tests to test_xdsm.py that check that a deserialized XDSM is equivalent to the original.
The example XDSMs have been added in JSON format and tests of the CLI have been implemented.

Checklist

I have run ruff check and ruff format to make sure the Python code adheres to PEP-8 and is consistently formatted
I have formatted the Fortran code with fprettify or C/C++ code with clang-format as applicable
I have run unit and regression tests which pass locally with my changes
I have added new tests that prove my fix is effective or that my feature works
I have added necessary documentation

ewu63 · 2025-10-24T18:06:54Z

This is pretty great, thanks for putting it together Rob! I think it has been a longstanding goal to have a stable declarative syntax for XDSM diagrams, and the lack of one is a major shortcoming of the current Python API. It will take me some time to review this PR, but I do want to pose a question initially: how do we feel about a Pydantic-based JSON representation vs. something a bit more custom? I understand that from an ease-of-development perspective Pydantic is great and very stable, but this effectively locks in the Pydantic class definitions. Any future changes such as adding, removing, or renaming fields will mean that previously generated JSON files are invalid, and shims (likely in the form of model_validators) have to be added to maintain compatibility.

On the other hand, a custom format may be more compact and readable, and we could have a bit more flexibility in serialization/deserialization, though we would lose out on all the great Pydantic features. And I suppose some compatibility layer has to be created no matter what we do. Just curious about others' thoughts here.

Lastly, I want to mention that XDSMjs exists as a JavaScript library, which of course has its own JSON representation of the XDSM diagram. Maybe there are opportunities to standardize and define a community-driven JSON schema that could be used by various tools/backends. CC @relf @eirikurj and others; if there is a lot of discussion, we can move this to a dedicated thread.

robfalck · 2025-10-24T18:12:35Z

I've been diving into pydantic during this government shutdown. It seems like it has fairly broad adoption so I'm not concerned with it losing support.

I think the benefits it provides as far as providing serialization and validation are really worthwhile vs a custom format. As far as merging with XDSMjs, that would be a discussion worth having.

ewu63 · 2025-11-13T06:30:14Z

Sorry, the concern I had is not about pydantic as a package. It is that the particular schema in this repo gets locked in by any serialized JSON files, so changes to those pydantic class definitions will not be backwards compatible. We can work around that with validators and such, but it is cumbersome.

I'll try to review the actual code, and I don't have any fundamental objections to using the class definitions as-is, but I was wondering whether there are any improvements we want to make. It is best to make them now, before they are locked in by the JSON.
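To make the compatibility concern concrete, here is a sketch of the kind of shim being discussed: a Pydantic v2 `model_validator(mode="before")` that accepts JSON written under an older field name. The `System` class and the `name` to `node_name` rename are invented for illustration and do not describe any actual change in pyXDSM.

```python
from typing import Any

from pydantic import BaseModel, model_validator


class System(BaseModel):
    """Hypothetical model whose field `name` was later renamed to `node_name`."""

    node_name: str
    label: str = ""

    @model_validator(mode="before")
    @classmethod
    def _accept_legacy_name(cls, data: Any) -> Any:
        # Older JSON files used "name"; map it onto the new field.
        if isinstance(data, dict) and "name" in data and "node_name" not in data:
            data = dict(data)  # avoid mutating the caller's dictionary
            data["node_name"] = data.pop("name")
        return data


old_file = '{"name": "opt", "label": "Optimizer"}'
new_file = '{"node_name": "opt", "label": "Optimizer"}'
from_old = System.model_validate_json(old_file)
from_new = System.model_validate_json(new_file)
```

Each schema change of this kind needs its own shim, which is the maintenance cost raised in the comment above.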

ewu63 · 2025-11-13T06:33:01Z

pyxdsm/matrix_eqn.py

 import subprocess
-from collections import namedtuple
-import numpy as np
+from typing import Dict, List, Optional, Tuple, Union


Dict, List, and Tuple have been deprecated since Python 3.9 in favour of the builtins dict, list, and tuple. Unions can also be written with | starting with Python 3.10.

ewu63 · 2025-11-13T06:34:34Z

pyxdsm/matrix_eqn.py

+class TotalJacobian(BaseModel):
+    """Total Jacobian matrix representation."""
+
+    variables: Dict[str, Variable] = Field(default_factory=dict)


default_factory is not needed for pydantic models, because pydantic automatically makes a deep copy of the default on initialization. Simply writing variables: dict[str, Variable] = {} will work.
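A small sketch of the behaviour described in this comment, assuming Pydantic v2: a bare mutable default is copied per instance, so instances do not share state. `TotalJacobianSketch` is a hypothetical stand-in, with `int` replacing the `Variable` type to keep the example self-contained.

```python
from pydantic import BaseModel


class TotalJacobianSketch(BaseModel):
    """Hypothetical stand-in using a bare mutable default."""

    variables: dict[str, int] = {}


a = TotalJacobianSketch()
b = TotalJacobianSketch()
a.variables["x"] = 3  # mutate one instance only
```

This differs from ordinary Python classes and function defaults, where a shared `{}` default would be visible from every instance.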

ewu63 · 2025-11-13T06:35:22Z

pyxdsm/matrix_eqn.py

+
+    variables: Dict[str, Variable] = Field(default_factory=dict)
+    j_inputs: Dict[int, Variable] = Field(default_factory=dict)
+    n_inputs: int = Field(default=0)


In 99% of the cases the Field constructor is not needed, you can simply put n_inputs: int = 0. The description (if you'd like) can go as a normal Python docstring with triple quotes just below the attribute definition.
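The terse style suggested here could look like the sketch below, with descriptions written as attribute docstrings rather than `Field(description=...)`. The class is hypothetical; only the `n_inputs` name echoes the diff. Recent Pydantic v2 releases also have a `use_attribute_docstrings` config option that copies such docstrings into field descriptions, which is not enabled in this sketch.

```python
from pydantic import BaseModel


class CountsSketch(BaseModel):
    """Hypothetical model; the `n_inputs` name echoes the diff above."""

    n_inputs: int = 0
    """Number of inputs registered so far."""

    names: list[str] = []
    """Input names, in insertion order."""


default_counts = CountsSketch()
explicit_counts = CountsSketch(n_inputs=3, names=["x", "y", "z"])
```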

ewu63 · 2025-11-13T06:35:47Z

pyxdsm/matrix_eqn.py

+    setup: bool = Field(default=False)

-        self._setup = False
+    model_config = ConfigDict(arbitrary_types_allowed=True)


Why do we need this?

ewu63

Left a bunch of comments. I'd also be happy to help clean up / push to this PR if you'd like

ewu63 · 2025-11-13T06:39:27Z

pyxdsm/matrix_eqn.py


-        self._connections = {}
-        self._ij_connections = {}
+    setup: bool = Field(default=False)


It seems that this is an internal variable not meant to be read or modified by the user, or saved to the JSON. If so, perhaps we can keep it as _setup so that 1) it does not get serialized, and 2) it indicates to the user that they should not interact with this private variable.
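One way to get both effects in Pydantic v2 is a private attribute, sketched below with `PrivateAttr`. The class and the `run_setup` method are hypothetical; only the `_setup` and `total_size` names come from the diff.

```python
from pydantic import BaseModel, PrivateAttr


class MatrixEquationSketch(BaseModel):
    """Hypothetical stand-in; only the `_setup` idea comes from the review."""

    total_size: int = 0
    _setup: bool = PrivateAttr(default=False)  # internal state, not a field

    def run_setup(self) -> None:
        """Placeholder for real setup work; just flips the private flag."""
        self._setup = True


eq = MatrixEquationSketch(total_size=4)
eq.run_setup()
dumped = eq.model_dump()  # private attributes are left out of the dump
```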

ewu63 · 2025-11-13T06:40:26Z

pyxdsm/matrix_eqn.py

+    terms: List[str] = Field(default_factory=list)

-        self._terms = []
+    model_config = ConfigDict(arbitrary_types_allowed=True)


Similarly do we need this?

ewu63 · 2025-11-13T06:40:40Z

pyxdsm/matrix_eqn.py

+    total_size: int = Field(default=0)

-        self._total_size = 0
+    setup: bool = Field(default=False)


Same comment on setup

ewu63 · 2025-11-13T06:44:59Z

pyxdsm/XDSM.py

+    model_config = ConfigDict(arbitrary_types_allowed=True)
+
+    @field_validator("side")
+    @classmethod


Not needed since you specified side must be of type Side which is a literal with two options
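A sketch of why the extra validator is redundant: a `Literal`-typed field is already checked by Pydantic. The two values `"left"` and `"right"` are assumed here, since the thread only says `Side` has two options, and `OutputSketch` is a hypothetical class.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

Side = Literal["left", "right"]  # assumed values; the source only says "two options"


class OutputSketch(BaseModel):
    """Hypothetical model with a Literal-typed field and no custom validator."""

    side: Side


ok = OutputSketch(side="left")

try:
    OutputSketch(side="top")  # not one of the allowed literals
    rejected = False
except ValidationError:
    rejected = True
```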

ewu63 · 2025-11-13T06:46:16Z

pyxdsm/XDSM.py

+AutoFadeOption = Literal["all", "connected", "none", "incoming", "outgoing"]
+
+# Valid TikZ node styles (from diagram_styles.tikzstyles)
+VALID_NODE_STYLES = {


Should this be a similar Literal definition? That way it can be used in the type definition below
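If a runtime set is still wanted alongside the type, it can be derived from the `Literal` with `typing.get_args`, as in this sketch. The three style names and the `check_style` helper are placeholders, not the actual contents of `diagram_styles.tikzstyles`.

```python
from typing import Literal, get_args

# Hypothetical style names; the real list lives in diagram_styles.tikzstyles.
NodeStyle = Literal["Optimization", "Function", "Group"]

# Derive the runtime set from the type so the two cannot drift apart.
VALID_NODE_STYLES: frozenset[str] = frozenset(get_args(NodeStyle))


def check_style(style: str) -> str:
    """Return the style unchanged if it is one of the allowed literals."""
    if style not in VALID_NODE_STYLES:
        raise ValueError(f"unknown node style: {style!r}")
    return style
```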

ewu63 · 2025-11-13T06:51:45Z

pyxdsm/XDSM.py

+
+    model_config = ConfigDict(arbitrary_types_allowed=True)
+
+    def __init__(


Using __init__ here feels a little odd. I think implementing a model_validator is a more appropriate pattern if the goal is to just patch in data which is not stored in a JSON?

ewu63 · 2025-11-13T06:54:09Z

pyxdsm/XDSM.py

+        label_width: Optional[int] = None,
+        spec_name: Optional[str] = None,
+    ) -> None:
+        """Add a system block on the diagonal."""


Add back the appropriate Parameters section of the docstrings here and below

I've cleaned this up a bit.

I removed any changes involving MatrixEquation/Jacobian because I wasn't really interested in having that be serializable with this PR.

I've added the original docstrings back in place.

write_sys_specs is the original specs method. I left it in place, with the to_json and from_json methods being the pydantic interface. Maybe there's no longer a reason for write_sys_specs, but I didn't want to break any existing implementations.

ewu63 · 2025-11-13T06:59:13Z

pyxdsm/XDSM.py

                    json_str = json.dumps(spec, indent=2)
                    f.write(json_str)
+
+    def to_json(self, filename: Optional[str] = None) -> str:


I guess the older write_sys_specs is now deprecated? Do we want to make that clear to users? The docstrings here say "specification" which may be a little confusing with the other function name

I don't think write_sys_specs will get much use if it's mostly or completely duplicated by to_json/from_json.

I can add the deprecation warning if you like.
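A deprecation warning of the kind offered here is usually emitted with the standard `warnings` module. The sketch below uses a hypothetical class with placeholder method bodies and signatures; whether the real `write_sys_specs` would delegate to `to_json` is an assumption.

```python
import warnings


class XDSMSketch:
    """Hypothetical stand-in; method bodies are placeholders."""

    def to_json(self) -> str:
        return "{}"

    def write_sys_specs(self) -> str:
        warnings.warn(
            "write_sys_specs is deprecated; use to_json instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return self.to_json()


# Capture the warning to show that it is raised exactly once per call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    XDSMSketch().write_sys_specs()
```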

ewu63 · 2025-11-13T07:04:30Z

pyxdsm/xdsm_latex_writer.py

I diffed this file with what you removed in XDSM.py, and saw that a lot of comments were removed for some reason. Can you add those back? There were lots of helpful comments previously

Actually, was there any particular reason for moving this? I understand that logically it could probably be a separate class, but the diffs are really hard to verify given the scope of this PR, so I wonder if we can do that in a subsequent PR to make review easier.

robfalck · 2025-11-13T14:24:54Z

Since there's interest in this, I'll clean this up when I get a chance. With the furlough ending things will be a bit busy for a few days :)

robfalck · 2025-12-12T14:32:16Z

I think I've addressed all of the comments; XDSM is considerably simpler now.

Is the failing doc build a permissions issue that's not expected to pass?

A-CGray · 2025-12-12T21:03:46Z

Is the failing doc build a permissions issue that's not expected to pass?

Looks like your changes add a new dependency to the docs build?

    Traceback (most recent call last):
      File "/home/docs/checkouts/readthedocs.org/user_builds/mdolab-pyxdsm/envs/61/lib/python3.11/site-packages/sphinx/registry.py", line 541, in load_extension
        mod = import_module(extname)
              ^^^^^^^^^^^^^^^^^^^^^^
      File "/home/docs/.asdf/installs/python/3.11.12/lib/python3.11/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'sphinxcontrib.autodoc_pydantic'

There's a requirements.txt in the doc directory you should be able to add this to
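For context, the traceback indicates that the Sphinx configuration loads the extension module `sphinxcontrib.autodoc_pydantic`, which is distributed on PyPI as `autodoc_pydantic`. A sketch of the relevant pieces follows; the `doc/` paths and the other extension listed are assumptions about the repository layout.

```python
# doc/conf.py (sketch): only the second extension name is taken from the traceback.
extensions = [
    "sphinx.ext.autodoc",
    "sphinxcontrib.autodoc_pydantic",
]

# doc/requirements.txt would then need a matching line, for example:
#   autodoc_pydantic
```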

robfalck · 2025-12-23T15:49:25Z

I'm adding the autodoc_pydantic requirement, but the preference for terse pydantic specifications (no Fields with descriptions) might make this less useful.

I've debated whether documentation of the pydantic models is necessary, but I'm erring on the side of documenting more.

robfalck added 12 commits October 2, 2025 19:56

converting XDSM to a Pydantic BaseModel.

f0128a2

Separated latex writer.

37c58f0

util.py

6bdb751

added __main__ with exporting of json to other formats.

eb2a89e

removed notional html output

41b2b7a

removed deprecated numpy distutils.

f8021da

Docs and pyproject.toml

5b1f950

More documentation, tests, and some cleanup.

1208d39

ruff passes with a few ignores for string formatting.

0db435a

test of example json files. cli cleanup.

eca0d66

ruff check passing

795f6f4

ruff format

3bf2caf

robfalck requested a review from a team as a code owner October 24, 2025 12:17

robfalck requested a review from A-CGray October 24, 2025 12:17

ewu63 self-requested a review October 24, 2025 18:07

ewu63 reviewed Nov 13, 2025

cleanup of MatrixEqn and Jacobian schema

24d183e

ewu63 mentioned this pull request Nov 16, 2025

Add stubs for XDSM #59

Open

13 tasks

robfalck added 4 commits December 1, 2025 06:51

Update JSON output to add terminal line ending.

fa4c408

Revert matrix_eqn.py to non-pydantic version for now.

Restored docstrings/comments in XDSM.

de0b50a

ruff fixes for matrix_eqn

dbc5e18

Better docstrings for to_json/from_json. from_json can now handle strings or filenames.

1a64455

robfalck added 4 commits December 2, 2025 09:54

lint

004f009

cleanup based on feedback from ewu63

6d6e098

Removed unneeded __init__. Removed unused NodeType.

9186a34

ruff check/format

ad4b6c3

robfalck requested a review from ewu63 December 12, 2025 14:32

Ruff format on test

059437a

Add autodoc_pydantic to doc requirements.txt.

aee422e

robfalck added 2 commits December 23, 2025 10:50

eof fix

359441f

docs build without warning

8c6cd13

