-
Notifications
You must be signed in to change notification settings - Fork 46
Added pydantic backend for serialization, and other updates. #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This is pretty great, thanks for putting it together Rob! I think it has been a longstanding goal to have a stable declarative syntax for XDSM diagrams, and it's a major shortcoming of the current Python API. It will take me some time to review this PR, but I do want to pose a question initially -- how do we feel about a Pydantic-based JSON representation vs. something a bit more custom? I understand that from a ease of development perspective Pydantic is great and very stable, but this effectively locks in the Pydantic class definitions --- any future changes such as adding/removing/renaming fields will mean that the previously-generated JSON files are invalid, and shims (likely in the form of model_validators) have to be added to maintain compatibility. On the other hand, a custom format may be more compact/readable, and we could have a bit more flexibility in serialization/deserialization, though we lose out on all the great Pydantic features. And I suppose some compat layer has to be created no matter what we do.. just curious for other's thoughts here. Lastly, I want to mention that there exists XDSMjs as a javascript library, which of course has its own existing JSON representation of the XDSM diagram. Maybe there are opportunities to standardize and define a community-driven JSON schema that could be used by various tools/backends. CC @relf @eirikurj and others, if there are lots of discussions we can move this to a dedicated thread. |
|
I've been diving into pydantic during this government shutdown. It seems like it has fairly broad adoption so I'm not concerned with it losing support. I think the benefits it provides as far as providing serialization and validation are really worthwhile vs a custom format. As far as merging with XDSMjs, that would be a discussion worth having. |
|
Sorry, the concern I had is not about pydantic as a package, but the particular schema in this repo gets locked in via any serialized JSON files, and changes to those pydantic class definitions will not be backwards compatible. It's something we can get around with, via validators and such, but it is cumbersome. I'll try to review the actual code, and I don't have any fundamental objections to using the class definitions as-is but I was just wondering if there are any improvements we want to do---best to do them now before they are fixed by the JSON. |
pyxdsm/matrix_eqn.py
Outdated
| import subprocess | ||
| from collections import namedtuple | ||
| import numpy as np | ||
| from typing import Dict, List, Optional, Tuple, Union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dict, List, and Tuple are deprecated since python 3.9 in favour of builtins dict etc. Unions can also be written via | starting with 3.10.
pyxdsm/matrix_eqn.py
Outdated
| class TotalJacobian(BaseModel): | ||
| """Total Jacobian matrix representation.""" | ||
|
|
||
| variables: Dict[str, Variable] = Field(default_factory=dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_factory is not needed for pydantic models, it will automatically do a deep copy on initialization. Simply writing variables: dict[str, Variable] = {} will work
pyxdsm/matrix_eqn.py
Outdated
|
|
||
| variables: Dict[str, Variable] = Field(default_factory=dict) | ||
| j_inputs: Dict[int, Variable] = Field(default_factory=dict) | ||
| n_inputs: int = Field(default=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 99% of the cases the Field constructor is not needed, you can simply put n_inputs: int = 0. The description (if you'd like) can go as a normal Python docstring with triple quotes just below the attribute definition.
pyxdsm/matrix_eqn.py
Outdated
| setup: bool = Field(default=False) | ||
|
|
||
| self._setup = False | ||
| model_config = ConfigDict(arbitrary_types_allowed=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
ewu63
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bunch of comments. I'd also be happy to help clean up / push to this PR if you'd like
pyxdsm/matrix_eqn.py
Outdated
|
|
||
| self._connections = {} | ||
| self._ij_connections = {} | ||
| setup: bool = Field(default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that this is an internal variable not meant to be read/modified by the user or saved to the JSON. If so, perhaps we can keep it as _setup so 1) it does not get serialized, and 2) indicates to the user that they should not interact with this private variable
pyxdsm/matrix_eqn.py
Outdated
| terms: List[str] = Field(default_factory=list) | ||
|
|
||
| self._terms = [] | ||
| model_config = ConfigDict(arbitrary_types_allowed=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly do we need this?
pyxdsm/matrix_eqn.py
Outdated
| total_size: int = Field(default=0) | ||
|
|
||
| self._total_size = 0 | ||
| setup: bool = Field(default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment on setup
pyxdsm/XDSM.py
Outdated
| model_config = ConfigDict(arbitrary_types_allowed=True) | ||
|
|
||
| @field_validator("side") | ||
| @classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed since you specified side must be of type Side which is a literal with two options
pyxdsm/XDSM.py
Outdated
| AutoFadeOption = Literal["all", "connected", "none", "incoming", "outgoing"] | ||
|
|
||
| # Valid TikZ node styles (from diagram_styles.tikzstyles) | ||
| VALID_NODE_STYLES = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a similar Literal definition? That way it can be used in the type definition below
pyxdsm/XDSM.py
Outdated
|
|
||
| model_config = ConfigDict(arbitrary_types_allowed=True) | ||
|
|
||
| def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using __init__ here feels a little odd. I think implementing a model_validator is a more appropriate pattern if the goal is to just patch in data which is not stored in a JSON?
pyxdsm/XDSM.py
Outdated
| label_width: Optional[int] = None, | ||
| spec_name: Optional[str] = None, | ||
| ) -> None: | ||
| """Add a system block on the diagonal.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add back the appropriate Parameters section of the docstrings here and below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've cleaned this up a bit.
I removed any changes involving MatrixEquation/Jacobian because I wasn't really interested in having that be serializable with this PR.
I've added the original docstrings back in place.
The write_sys_specs is the original specs method, I left it in place with the to_json and from_json methods being the pydantic interface. Maybe there's not longer a reason for write_sys_specs but I didn't want to break any existing implementations.
| json_str = json.dumps(spec, indent=2) | ||
| f.write(json_str) | ||
|
|
||
| def to_json(self, filename: Optional[str] = None) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the older write_sys_specs is now deprecated? Do we want to make that clear to users? The docstrings here say "specification" which may be a little confusing with the other function name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think write_sys_specs will get much use if its mostly/completely duplicated by to_json/from_json.
I can add the deprecation warning if you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I diffed this file with what you removed in XDSM.py, and saw that a lot of comments were removed for some reason. Can you add those back? There were lots of helpful comments previously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, was there any particular reason for moving this? I understand that logically it could probably be a separate class, but the diffs are really hard to verify given the scope of this PR, so I wonder if we can do that in a subsequent PR to make review easier.
|
Since there's interest in this, I'll clean this up when I get a chance. With the furlough ending things will be a bit busy for a few days :) |
Revert matrix_eqn.py to non-pydantic version for now.
…ings or filenames.
|
I think I've addressed all of the comments, XDSM is considerably simpler now. Is the failing doc build a permissions issue that's not expected to pass? |
Looks like your changes add a new dependency to the docs build? There's a requirements.txt in the doc directory you should be able to add this to |
|
I'm adding the autodoc_pydantic requirement, but the preference for terse pydantic specifications (no Fields with descriptions) might make this less useful. I've debated whether documentation of the pydantic models is necessary, but I'm erring on the side of documenting more. |
Purpose
The MDO community at NASA use XDSMs pretty extensively to convey ideas, but we need a better way to turn OpenMDAO models in to XDSM diagrams. The intent of this update to pyXDSM is to allow it to ingest XDSM information in a declarative format to then produce the XDSM in PDF format. This means that tools like OpenMDAO would only need to be able to generate a compatible JSON file rather than interacting directly via the API.
This PR does the following:
Changes setup.py to pyproject.toml for PEP518 compatibility
Objects in the API are now built on top of Pydantic BaseModel
In particular, this leverages Pydantic for serialization/deserialization and validation.
This means, in addition to the existing imperative API, an XDSM model can now be created with a declarative syntax via a dictionary or JSON (or any other serialization that can be represented as a dictionary, such as yaml.)
The key methods here are
.to_json()and.from_json().The MDF and kitchen sink examples now have corresponding .json files. Tests have been added to validate the serialization and deserialization of XDSM objects.
__main__.pyhas been added to provide a command-line interface.This allows the user to quickly build a tikz, pdf, or json output file from a given json file as input. The JSON output should be the equivalent of a copy but it seemed appropriate to include it, in case we ever support any other type of file format (yaml, toml, etc).
This utility allows the user to specify the JSON file to be converted. If no output is specified, it will assume that a PDF is to be generated. It supports a
--cleanupand--quietoptions of the build method as arguments.The latex generation of XDSM was moved to a separate
xdsm_latex_writer.pyfile.Similar changes are not yet made to matrix equations, though I plan to work on that as a follow-up.
Docs have been updated.
Expected time until merged
This is not urgent. While I'm going to base some future work on this change, its a somewhat large PR and I understand if it takes some time to fully review it. I can work to my fork in the mean time.
Type of change
Testing
I've added tests to
test_xdsm.pythat test that a deserialized XDSM is equivalent.The example XDSMs have been added in JSON format and tests of the CLI have been implemented.
Checklist
ruff checkandruff formatto make sure the Python code adheres to PEP-8 and is consistently formattedfprettifyor C/C++ code withclang-formatas applicable