Optionally accept a regular expression pattern for exclude_paths#90
Optionally accept a regular expression pattern for exclude_paths#90foxmulder900 wants to merge 4 commits intoxlwings:masterfrom
Conversation
|
Ah looks like I'll need to update that type-hint to use a union to support older versions of python. Commit incoming.... |
jsondiff/__init__.py
Outdated
| """ | ||
| try: | ||
| # Assume a list is provided to avoid any performance degradation from type-checking. | ||
| return path in exclude_paths |
There was a problem hiding this comment.
Sorry, I didn't mean for you to optimize right away but I appreciate your attention to detail here :) We don't have official perf stats to meet, my concern was more of a "let's not pay for this feature when not using Pattern". That being said, I think this unfairly makes the Pattern path slower which might not be great for your use case.
What if instead of paying the try/except tax on each run we introduce a new type in the constructor, something like PathExcluder ? Then the right instance (list[str], Pattern, $future_thing) is always called if not None.
class PathExcluder:
def __init__(self, exclude_paths: typing.Union[list, Pattern]):
if isinstance(exclude_paths, Pattern):
self._check = exclude_paths.match
else:
# Could leave this as a list[str] or maybe make it a set while we're here. O(1) lookup.
exclude_set = set(exclude_paths)
self._check = exclude_set.__contains__
def should_exclude(self, path: str) -> bool:
return bool(self._check(path))
| @@ -320,14 +321,23 @@ def test_yaml_dump_string_fp(self): | |||
| self.assertEqual(expected, buffer.getvalue()) | |||
|
|
|||
| def test_exclude_paths(self): | |||
There was a problem hiding this comment.
I really like this pattern you have suggested, however unfortunately I believe it may be a breaking change. The exclude_paths option was not previously part of the constructor, instead it was a parameter to the JsonDIffer.diff method. So moving it to the constructor to support our new feature will cause problems if anyone was using the method directly (like this test was doing).
This also made me realize that the documentation is slightly incorrect, it shows an example of passing exclude_paths to the exposed diff function here but those kwargs are only passed into the JsonDiffer constructor, not JsonDiffer.diff as was previously expected before this PR.
So long story short, unless I'm misunderstanding something, this PR will make things work as the documentation states, but anyone that is currently using the exclude_paths feature has to be doing it in a roundabout way by instantiating the JsonDiffer class themselves and calling its diff method directly, which will no longer work.
There was a problem hiding this comment.
I suppose we could continue to accept the exclude_paths keyword arg in the JsonDiffer.diff method. When provided, log a deprecation warning and re-instantiate the PathExcluder with the new value. That would make the performance take a hit for anyone using it, but provide an easy path to the preferred/performant way. Then maybe it could go away entirely in a future 3.0 release.
There was a problem hiding this comment.
I like the solution you're proposing where we keep the exclude_paths param to avoid a breaking change.
@payam54 any concerns with this change?
Implements the following feature request:
#89