Add benchmarks #327
base: main
Codecov Report: ✅ All modified and coverable lines are covered by tests.
    def parse(self, input_file: IO[bytes]) -> None:
        from pyjelly.integrations.generic.parse import parse_jelly_to_graph
        from pyjelly.integrations.generic.parse import (
Please don't make unrelated changes in PRs. Please revert this and any other changes like it.
    "mypy>=1.8; platform_python_implementation == 'CPython'",
    "hatchling>=1.24",
    "hatch-mypyc; platform_python_implementation == 'CPython'",
    "mypy>=1.8; platform_python_implementation == 'CPython'",
same
    # version 3.1 required for python 3.14 support
    ci = ["cibuildwheel>=3.1.0,<4 ; python_version >= '3.11'"]
    # version 3.11 required for python 3.14 support
what?
    # version 3.11 required for python 3.14 support
    ci = ['cibuildwheel>=3.1.0,<4 ; python_version >= "3.11"']
    bench = ["pytest-benchmark>=5.2.1", "rdflib>=7.1.4"]
Please rename to "benchmark" to make it clearer what this is.
maybe just "benchmarks" instead of "benchmark_tests"?
    "--in-jelly-quads",
    type=str,
    default=None,
    help="optional Jelly quads file; if none, generated in-memory from nq slice.",
What is an "nq slice"?
    g.addoption("--iterations", type=int, default=1, help="iterations per round.")

    def _slice_lines_to_bytes(path: Path, limit: int) -> bytes:
There are no comments here or anywhere else, again. This makes the code rather hard to review.
Please make this code readable, and then I will review it again.
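As a rough illustration of the level of documentation being asked for, a line-slicing helper could look something like the sketch below. The body of `_slice_lines_to_bytes` is not shown in this thread, so the behavior here (return the first `limit` lines of the file as bytes) is an assumption based only on its name and signature, and `slice_lines_to_bytes` is a hypothetical stand-in, not the PR's code.

```python
from itertools import islice
from pathlib import Path


def slice_lines_to_bytes(path: Path, limit: int) -> bytes:
    """Return the first `limit` lines of `path` as a single bytes object.

    Assumed intent: N-Triples and N-Quads are line-based formats, so in a
    file without blank or comment lines the first N lines hold N statements.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    with path.open("rb") as f:
        # islice stops after `limit` lines, so the rest of the file is not read.
        return b"".join(islice(f, limit))
```

A docstring that states what a "slice" is would also answer the question about the `--in-jelly-quads` help text.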
    @pytest.fixture(scope="session")
    def nt_graph(nt_bytes_sliced: bytes) -> Graph:
        g = Graph()
Why do you even use Graph? For buffering in-memory you must use an array of statements, otherwise you will get nonsensical results. Same with Dataset, of course.
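One possible reading of this point, sketched with plain Python containers rather than rdflib: a graph behaves like a set of statements, so it collapses duplicates and gives no promise about input order, while a list hands the serializer exactly the statements that were read. The tuples and `ex:` names below are made up for illustration.

```python
# Plain tuples stand in for RDF triples; no rdflib types are used here.
Triple = tuple[str, str, str]

statements: list[Triple] = [
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:p", "ex:b"),  # duplicate statement in the input stream
    ("ex:c", "ex:p", "ex:d"),
]

# A list buffer keeps every statement, in input order.
as_list = list(statements)

# A set-like container collapses duplicates and does not promise
# to preserve input order.
as_set = set(statements)

print(len(as_list), len(as_set))  # 3 2
```

With a list buffer, the serialization benchmark processes the same number of statements in the same order on every run.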
        pedantic_cfg: dict[str, int],
        limit_statements: int,
    ) -> None:
        benchmark.pedantic(parse_nt_bytes, args=(nt_bytes_sliced,), **pedantic_cfg)
What you are measuring here is not parsing speed but the speed at which rdflib can insert statements into the Graph. This is meaningless. You must only iterate over the resulting triples/quads, nothing else.
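A minimal sketch of the "only iterate" approach, using the standard library only. `toy_parse` is a hypothetical stand-in for a streaming parser (it is not a real N-Triples parser and not the pyjelly or rdflib API), and `drain` and `time_parse` are likewise illustrative names. The point is that the timed region consumes the statement iterator and stores nothing.

```python
import time
from collections import deque
from collections.abc import Iterable, Iterator


def toy_parse(data: bytes) -> Iterator[tuple[str, ...]]:
    """Hypothetical stand-in for a streaming parser: one statement per line."""
    for line in data.decode("utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Naive whitespace split; a real N-Triples parser does much more.
            yield tuple(line.rstrip(" .").split(" ", 2))


def drain(statements: Iterable[object]) -> None:
    """Consume an iterable without storing any of its items."""
    deque(statements, maxlen=0)


def time_parse(data: bytes) -> float:
    """Seconds spent producing every statement, with no container insertion."""
    start = time.perf_counter()
    drain(toy_parse(data))
    return time.perf_counter() - start
```

In a pytest-benchmark test, the callable handed to `benchmark.pedantic` would then be one that drains the parser's output in the same way, instead of one that fills a `Graph`.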
Added benchmarks to the tests for measuring the performance of flat serialization/deserialization.