
Prevent event reindexing (on same HS) by only allowing a cursor to change incrementally #662

Open
ok300 wants to merge 16 commits into pubky:main from ok300:ok300-prevent-hs-reindexing

Contributor

@ok300 ok300 commented Jan 2, 2026

This PR brings in ok300#12.

@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 41fa484 to 0691214 January 2, 2026 14:40
@ok300 ok300 marked this pull request as ready for review January 2, 2026 15:50
@ok300 ok300 requested review from SHAcollision and tipogi January 2, 2026 15:50
Collaborator

@SHAcollision SHAcollision left a comment


Hey @ok300, nice work!

Two concerns:

  1. Backwards compatibility: I am not sure about this, but maybe the Homeserver cursors were previously stored in Redis as strings because of their Crockford encoding (e.g., "0000000000000")? With the field now defined as u32 and deserialized directly from Redis, any existing cursor will fail to deserialize, causing get_from_index/validate_cursor_change to error before processing. A data migration, or a custom deserializer that accepts both strings and numbers, may be needed to avoid runtime failures during rollout. Have you tried running this fix over an existing homeserver?

  2. Cursor range may be too small: The new u32 type limits cursor values to ~4.29B. If homeserver cursors can exceed this (the former 13-digit cursors likely did), .parse::<u32>() will reject valid cursors and prevent further updates. Is the cursor on pubky-core/pubky-homeserver also u32? We could consider u64 (or string+numeric parsing) if we might sometimes expect larger cursors.
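
To make the range concern concrete, here is a minimal sketch of how decimal parsing behaves at the u32 boundary. The helper names are hypothetical and not taken from the PR; only `str::parse` from the standard library is used.

```rust
/// Hypothetical helper: parse a decimal cursor as u32.
/// Returns None on overflow or on non-digit input.
fn parse_as_u32(raw: &str) -> Option<u32> {
    raw.parse::<u32>().ok()
}

/// Hypothetical helper: parse a decimal cursor as u64.
/// Returns None on overflow or on non-digit input.
fn parse_as_u64(raw: &str) -> Option<u64> {
    raw.parse::<u64>().ok()
}
```

With these, "4294967295" (u32::MAX) parses in both widths, while "4294967296" and any 13-digit decimal value only parse as u64.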

claude and others added 3 commits January 12, 2026 10:54
Add custom serde deserializer for the cursor field to handle backwards
compatibility with old cached data where cursor was stored as a string
(e.g., "0000000000000"). This allows deserialization from Redis to work
with both legacy string format and the new u32 numeric format.
…g-ko5Qm

Support deserializing Homeserver cursor from both string and number
Contributor Author

ok300 commented Jan 12, 2026

To your 1st point:

As part of the Postgres change, the HS also migrated its cursor logic such that the /events endpoint accepts both timestamp- and incremental-event-ID-based cursors, but returns the new cursor in the latter, incremental ID format.

High-level view of the flow:

  1. Nexus calls ?cursor={timestamp}
  2. The HS recognizes the timestamp and converts it into the corresponding event ID
  3. The HS processes the request as usual
  4. HS returns the new event ID cursor
  5. Nexus persists it and uses it as starting cursor in subsequent requests

In other words, when the HS version with the Postgres change went live, cursors were seamlessly migrated to the integer format. So no timestamp-based cursors remain in Nexus.

As a live example, see https://homeserver.pubky.app/events/?cursor=0032W24W13DBW&limit=10 (it returns an integer as next cursor).
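
The Nexus side of that flow could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `SyncState`, `events_url` and `record_next_cursor` are hypothetical names, and only the URL shape (`/events/?cursor=...&limit=...`) is taken from the live example above.

```rust
/// Hypothetical Nexus-side state for one homeserver.
struct SyncState {
    /// Base URL of the homeserver, e.g. "https://homeserver.pubky.app".
    base_url: String,
    /// Cursor to send on the next request. May still be a legacy
    /// timestamp string before the first response comes back.
    cursor: String,
}

impl SyncState {
    /// Step 1: build the /events request URL from the stored cursor.
    fn events_url(&self, limit: u32) -> String {
        format!(
            "{}/events/?cursor={}&limit={}",
            self.base_url.trim_end_matches('/'),
            self.cursor,
            limit
        )
    }

    /// Steps 4-5: persist the integer event-ID cursor the HS returned,
    /// so that subsequent requests no longer use the timestamp form.
    fn record_next_cursor(&mut self, next: u64) {
        self.cursor = next.to_string();
    }
}
```

After one round trip the stored cursor is an integer string, which is why no timestamp-based cursor would be expected to survive on the Nexus side.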

Have you tried running this fix over an existing homeserver?

Yes, I ran it against the Prod HS and it works.

I also tested it against a HS with no events, where the Nexus Redis already stored its cursor as the string "0000000000000". With the original PR it failed, but it's fixed in the latest commit da5e471.

It's important to note this won't work with old, pre-Postgres homeservers, as they're not using incremental event IDs.

Contributor Author

ok300 commented Jan 12, 2026

To your second point:

Is the cursor on pubky-core/pubky-homeserver also u32?

It's i64 (link), probably due to the Postgres schema. This means the positive range goes up to 2^63 - 1, which fits in u64 but not in u32.

If the HS gets a larger range for the event ID, we can easily adopt it.
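
For reference, the non-negative values of an i64 go up to 2^63 - 1, which fits in a u64 but not in a u32. A minimal sketch of a checked conversion, assuming the event ID arrives as an i64 (the function name is hypothetical):

```rust
use std::convert::TryFrom;

/// Hypothetical helper: convert an i64 event ID into an unsigned cursor.
/// Negative IDs are rejected rather than wrapped.
fn event_id_to_cursor(id: i64) -> Option<u64> {
    u64::try_from(id).ok()
}
```

The same `try_from` pattern with u32 as the target would start returning errors once an ID passes 4,294,967,295.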

@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 09f19f7 to 0a7c417 January 12, 2026 12:51
@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 0a7c417 to 58724ea January 12, 2026 13:32
re-indexing

A homeserver's cursor is not allowed to decrease, which is what would be
needed for this re-indexing test portion.
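
The rule that a cursor may not decrease can be sketched as a small check. This is an assumption-laden illustration: `check_cursor_advance` and `CursorRegression` are hypothetical names, and treating an unchanged cursor as acceptable is an assumption, not something this thread confirms.

```rust
/// Hypothetical error describing a rejected cursor update.
#[derive(Debug, PartialEq)]
struct CursorRegression {
    current: u64,
    proposed: u64,
}

/// Accepts the proposed cursor only if it does not move backwards.
/// An equal cursor is treated as a no-op and accepted (assumption).
fn check_cursor_advance(current: u64, proposed: u64) -> Result<u64, CursorRegression> {
    if proposed < current {
        Err(CursorRegression { current, proposed })
    } else {
        Ok(proposed)
    }
}
```

Under such a rule, a test that resets the cursor to an earlier value in order to re-index would be rejected by design.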
@tipogi tipogi added the 👀 watcher Nexus indexer related operations label Feb 12, 2026
@tipogi tipogi added this to the 2026-Q1 milestone Feb 12, 2026
@ok300 ok300 marked this pull request as draft March 11, 2026 10:15
@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from a9541fb to d36b923 March 11, 2026 10:34
@ok300 ok300 marked this pull request as ready for review March 11, 2026 11:19
///
/// This handles backwards compatibility with old data where cursor was stored as a string
/// (e.g., `"0000000000000"`), while also supporting the new numeric format.
fn deserialize_cursor<'de, D>(deserializer: D) -> Result<u64, D::Error>
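
The body of this function is not shown in the thread. As a rough sketch of the dual-format idea, kept free of serde so it stands alone, the two stored shapes can be modelled as an enum and folded into a u64. `StoredCursor` and `cursor_to_u64` are hypothetical names; with serde, a similar shape is often expressed with an untagged enum or a custom visitor.

```rust
use std::num::ParseIntError;

/// Hypothetical stand-in for the two shapes a cached cursor may take.
#[derive(Debug)]
enum StoredCursor {
    /// Legacy format: a zero-padded string such as "0000000000000".
    Text(String),
    /// New format: a plain number.
    Number(u64),
}

/// Folds either stored shape into a numeric cursor.
/// A legacy string is accepted only if it consists of decimal digits;
/// a Crockford-encoded timestamp containing letters is reported as an error.
fn cursor_to_u64(stored: StoredCursor) -> Result<u64, ParseIntError> {
    match stored {
        StoredCursor::Text(s) => s.trim().parse::<u64>(),
        StoredCursor::Number(n) => Ok(n),
    }
}
```

Leading zeros are fine for `parse::<u64>()`, so the empty-homeserver value "0000000000000" becomes 0.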
Contributor


Have you considered migrating the data in Redis instead of supporting both formats?

- match Homeserver::try_from_cursor(id, cursor) {
+ match Homeserver::try_from_cursor(id, cursor).await {
      Ok(hs) => hs.put_to_index().await?,
      Err(e) => warn!("{e}"),
Contributor


Now this is going to fail each time, unless the homeserver somehow returns a cursor that is higher than the currently stored one.

Should we have a metric and get notified that syncing is stalled because of it?

Contributor Author


If the HS returns event lines, the last one will have a cursor that is higher than the given cursor. For example: https://homeserver.pubky.app/events/?cursor=150000&limit=1

If the given cursor doesn't exist, the HS returns no lines: https://homeserver.pubky.app/events/?cursor=250000&limit=1

this is going to fail each time, unless homeserver somehow returns cursor that is higher than currently stored

So it seems to me the condition you describe won't happen, because the HS always returns a higher cursor than the current one (arg), else returns no lines at all.

Or am I missing something?
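
The argument can be sketched as follows, under the assumption that each returned event line carries its own cursor and that an empty response leaves the stored cursor unchanged. `next_cursor` is a hypothetical helper, and the values in the usage note are made up for illustration.

```rust
/// Hypothetical helper: derive the cursor to persist after one /events call.
/// `line_cursors` holds the cursor of each returned event line, in order.
/// With no lines, the current cursor is kept as-is (assumption).
fn next_cursor(current: u64, line_cursors: &[u64]) -> u64 {
    line_cursors.last().copied().unwrap_or(current)
}
```

For instance, `next_cursor(150000, &[150001])` yields 150001 and `next_cursor(250000, &[])` yields 250000, so in neither case does the result fall below the cursor that was sent.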


5 participants