Skip to content

Conversation

@RagnarGrootKoerkamp
Copy link
Contributor

@RagnarGrootKoerkamp RagnarGrootKoerkamp commented Jan 3, 2026

Fix #19

@imartayan could you have a quick look to see if you're OK with how I added this to the builder API?

still todo to add some tests (will do that now).

cc @rchikhi

@RagnarGrootKoerkamp RagnarGrootKoerkamp force-pushed the syncmers branch 3 times, most recently from 1418ee6 to aef54af Compare January 3, 2026 11:50
@imartayan
Copy link
Collaborator

I think we could make a version of append_filtered_vals that directly takes a shuffle key, this way we could also use it in append_unique_vals_2 and have less duplicated code. Are you okay if I add this?
I'm also improving the NEON version a little bit.

@RagnarGrootKoerkamp
Copy link
Contributor Author

Sure, go ahead

@imartayan
Copy link
Collaborator

The CI tests are getting quite slow (4-5 min), maybe we should run them in release mode, or make minimizers_fwd and syncmers_simd_fwd a bit shorter.

@RagnarGrootKoerkamp
Copy link
Contributor Author

We could run the current ones in release, and then also run in dbg mode with fewer/smaller inputs (via some cfg(debug-assertions)).

But yeah just making those 2 smaller is also fine

@imartayan
Copy link
Collaborator

Okay, I'll see if I can make these two heavy tests release-only.

@imartayan
Copy link
Collaborator

Btw, do we also want to support open syncmers?

@RagnarGrootKoerkamp
Copy link
Contributor Author

Oh yes good idea. But even more API surface... 🤔

@imartayan
Copy link
Collaborator

We could make a collect_open_syncmer either with t as an additional parameter, or assume it's the middle position (which is the the most common I think).

@RagnarGrootKoerkamp
Copy link
Contributor Author

Yeah just assume it's the middle and assert that the right thing is odd.

But some code in the new syncmers.rs file probably needs duplicating. Or maybe collect_syncmersOPEN:boil (at least for internal use?)

@rob-p
Copy link

rob-p commented Jan 5, 2026

Regarding API surface area, I wonder if one could consider these many convenience functions to be primarily the external / FFI interface (e.g. for calling from "C"), and to have the Rust interface migrate toward a more minimal API with a configuration object? In general, when the number of different configurations starts to grow large, having a configuration object (usually constructed via the builder pattern with proper typestate management) seems to be a cleaner solution than letting the number of distinct named functions grow very large.

@RagnarGrootKoerkamp
Copy link
Contributor Author

RagnarGrootKoerkamp commented Jan 5, 2026

Yeah, we already have a builder, but I decided to add syncmers and canonical_syncmers as plain functions at the root level for discoverability, and since they change the meaning of the output. But note that both then return the same builder that other methods do.

On builders for synmers, we could add .open() and .closed() on the builder and set a generic based on that.

(The benefit of having these things as generics is that we avoid a bunch of codegen for all different variants. Otherwise we'd have to rely on inlining to optimize the other variants away, but surely that's slower? The drawback of course it that generics do make the API slightly more intense, but users won't see most of that.)

Would be perfect to have a enum Mode { Minimizer, OpenSyncmer, ClosedSyncmer }, but I don't think that's supported so SYNCMER: bool, OPEN_SYNCMER: bool maybe then? or CLOSED_SYNCMER: bool, OPEN_SYNCMER: bool?

@imartayan
Copy link
Collaborator

I prefer CLOSED_SYNCMER over just SYNCMER, more explicit :)

Another option is to use const bitflags, like I did in helicase. This reduces the number of generics (down to a hasher and a u64) but might be confusing for some users.

@RagnarGrootKoerkamp
Copy link
Contributor Author

I'd say bit flags is a bit too much of a change for now

@rob-p
Copy link

rob-p commented Jan 5, 2026

Wait, what do you mean an enum is not supported?

@RagnarGrootKoerkamp RagnarGrootKoerkamp merged commit 8f1d0d3 into master Jan 9, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support syncmers

4 participants