Implement Lazy Directory Parsing for Improved Deserialization Performance #13
Implement Lazy Directory Parsing for Improved Deserialization Performance #13expanded-for-real merged 40 commits intomainfrom
Conversation
…mance tracking; add comprehensive String benchmark
…to remove a bunch of stuff
…ing deserialization a bit faster
Try to enhance string deserialization
A full list of enhancements can be found here - #3
…ome micro-optimizations added that were found along the way
* Full comprehensive comparison tests with a lot of other libraries + some micro-optimizations added that were found along the way * replace deprecated gradle methods with latest --------- Co-authored-by: expand3d <>
# Conflicts: # src/jmh/java/com/imprint/benchmark/ComparisonBenchmark.java # src/main/java/com/imprint/core/ImprintRecord.java # src/main/java/com/imprint/types/TypeHandler.java # src/main/java/com/imprint/types/Value.java
…with nested objects
…es in gradle file. Also fix permission issue
…Comparison tests iterations manually
# Conflicts: # .github/workflows/ci.yml # build.gradle # src/jmh/java/com/imprint/benchmark/ComparisonBenchmark.java # src/main/java/com/imprint/core/ImprintRecord.java # src/main/java/com/imprint/types/TypeHandler.java # src/main/java/com/imprint/types/Value.java
There was a problem hiding this comment.
Massively reduced the ci compared to last time - probably won't have to change this much for a while
# Conflicts: # .github/workflows/ci.yml # build.gradle # src/jmh/java/com/imprint/benchmark/ComparisonBenchmark.java # src/main/java/com/imprint/core/ImprintRecord.java
There was a problem hiding this comment.
Massively reduced gradle from last time
There was a problem hiding this comment.
Cache VarInt values. Minor performance tweak that I came up with when running comparisons and was low hanging fruit.
There was a problem hiding this comment.
I put all the buffer handling logic in a new class to leave the ImprintRecord a bit cleaner (especially since I'm about to put merge and project in there too). I tried to comment it as best I could since the buffer handling logic can be confusing a took me a few tries before tests passed
agavra
left a comment
There was a problem hiding this comment.
skimmed the ImprintRecord changes but otherwise lgtm
| if (directoryCount >= 0) | ||
| return directoryCount; |
There was a problem hiding this comment.
minor, but it seems like overkill to cache the directory count - reading a single varint is pretty speedy
There was a problem hiding this comment.
Yeah I over-engineered that for the sake of avoiding directory hits but it ends up just being unnecessary branching here. Will remove in the next PR
| private final ByteBuffer payload; // Read-only payload view | ||
|
|
||
| // Lazy-loaded directory state | ||
| private List<DirectoryEntry> parsedDirectory; |
There was a problem hiding this comment.
why did you prefer a list to a map (idx -> entry) for efficient lookup when parsed? that way we can also lazily fill it in if we wanted to on every lookup (in addition to a single call to eagerly parse)
There was a problem hiding this comment.
Good question! The reason is because I didn't think of that lol. I think I mostly wanted to preserve insertion order on these.... but they're already sorted coming in anyways so that's a good idea. I'll implement the map on next PR
| return -1; | ||
| int low = 0; | ||
| int high = parsedDirectory.size() - 1; | ||
| while (low <= high) { |
There was a problem hiding this comment.
if we use a map this is just an O(1) lookup into the parsed dir
| if (fieldId > currentFieldId) | ||
| return offset; |
There was a problem hiding this comment.
shouldn't we binsearch here instead of go through field by field?
#11
Implemented lazy directory de-serializing. Times were a bit scattered on the runs but consistently saw strong results, and I saw as low as 250 ns/ops at one point for Imprint. Either way we were consistently at least twice as fast as before so I think the additional complexity is worth it.
Before
After
We should be in a good spot now to implement Merge and Project