Skip to content

Comments

Improve performance of n-quads parsing#95

Open
michaeladler wants to merge 4 commits intopiprate:masterfrom
michaeladler:perf/misc2
Open

Improve performance of n-quads parsing#95
michaeladler wants to merge 4 commits intopiprate:masterfrom
michaeladler:perf/misc2

Conversation

@michaeladler
Copy link
Contributor

This removes the use of regular expressions for parsing n-quads. As expected, this approach is significantly faster (over 30%) and results in cleaner code, thanks to the great work by the Cayley authors.

Motivation

Improve the performance of n-quads parsing.

Checks

  • Passes make test

Signed-off-by: Michael Adler <michael.adler@siemens.com>
cpu: AMD Ryzen 7 PRO 8840U
BenchmarkParseNQuadsFrom-16    	   68916	     16787 ns/op	    9857 B/op	      96 allocs/op
BenchmarkLoadNQuads-16         	 3774422	       320.8 ns/op

Signed-off-by: Michael Adler <michael.adler@siemens.com>
Benchmarks show this is about 70-130 times faster.

Signed-off-by: Michael Adler <michael.adler@siemens.com>
This is ~32% faster than the previous regex-based approach:

cpu: AMD Ryzen 7 PRO 8840U
BenchmarkParseNQuadsFrom-16    	  105422	     11395 ns/op	   11576 B/op	     150 allocs/op
BenchmarkLoadNQuads-16         	 3807912	       315.1 ns/op

Signed-off-by: Michael Adler <michael.adler@siemens.com>
@kazarena
Copy link
Member

@michaeladler, thank you for the PR. Will review asap.

@kazarena
Copy link
Member

Hi @michaeladler, the proposed change makes sense but I'm a bit concerned it creates a circular dependency between github.com/cayleygraph/quad and github.com/piprate/json-gold through JSON-LD encoder/decoder. One alternative would be to bring in the necessary from into json-gold codebase (with attribution), given its static nature, but it's not ideal either.

What are your thoughts?

@michaeladler
Copy link
Contributor Author

Hi @kazarena,

I noticed that dependency as well but I don't think it's a problem. In Golang, cyclic dependencies happen at the package-level, not at the repository-level.

json-gold:

$ go list -json -deps ./ld  | jq '.ImportPath' | grep cayley
"github.com/cayleygraph/quad/voc"
"github.com/cayleygraph/quad/voc/schema"
"github.com/cayleygraph/quad/voc/xsd"
"github.com/cayleygraph/quad"
"github.com/cayleygraph/quad/nquads"

cayleygraph:

$ go list -json -deps ./jsonld  | jq '.ImportPath' | grep json-gold
"github.com/piprate/json-gold/ld/internal/jsoncanonicalizer"
"github.com/piprate/json-gold/ld"

Also: if it actually was a problem, we would get a compile error. I just ran the tests of cayleygraph/quad with the go.mod pointing to my fork of json-gold and didn't get one :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants