Skip to content

Conversation

@GregorShear
Copy link
Contributor

@GregorShear GregorShear commented Jan 9, 2026

Description: Adds updateStorageMapping GraphQL mutation for modifying existing storage mappings. Refactors shared validation and health check logic into reusable helper functions. The update response includes a republish field indicating whether the primary storage bucket changed, which signals that affected specs will need republishing.

Workflow steps: Call updateStorageMapping with the same input shape as createStorageMapping:

mutation {
  updateStorageMapping(
    catalogPrefix: "gregCo/"
    dryRun: false
    storage: {
      stores: [
        {provider: "GCS", bucket: "your-bucket", prefix: "collection-data/"}
      ]
      data_planes: ["ops/dp/public/local-cluster"]
    }
  ) {
    updated
    catalogPrefix
    republish
  }
}

Use dryRun: true to validate input and check if a republish would be required without persisting changes.

Notes for reviewers:

  • The republish field compares the incoming list of stores with the existing
  • The TODO comment about actually triggering republish remains - this PR only reports whether it's needed

@GregorShear GregorShear requested a review from psFried January 9, 2026 21:43
Copy link
Member

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great so far. I left one minor comment, but apart from that this should be good to go once it's rebased on top of the changes to the other PR

@GregorShear GregorShear force-pushed the greg/mutation/updateMapping branch from ca9af4c to ac89dcd Compare January 13, 2026 19:28
@GregorShear GregorShear marked this pull request as ready for review January 13, 2026 19:55
@GregorShear
Copy link
Contributor Author

republish logic updated and branch rebased on top of the other PR

@GregorShear GregorShear requested a review from psFried January 13, 2026 19:59
Copy link
Member

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a bit more feedback inline, but should be good after that

"updated storage mapping"
);

// TODO: if the primary (first) fragmentStore changed, republish the entire prefix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems like the plan is for the client to handle the republication, so this can be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this gql operation need to be aware of the republishing process, or is that just handled elsewhere when changes are detected?

Base automatically changed from greg/mapping/mutation to master January 15, 2026 16:24
@GregorShear GregorShear force-pushed the greg/mutation/updateMapping branch from 8323a70 to 50cd8ec Compare January 15, 2026 16:45
…mutations

Split the single upsertStorageMapping GraphQL mutation into separate createStorageMapping and updateStorageMapping mutations with distinct behavior:

- createStorageMapping: fails if a mapping already exists, checks for existing specs that would be affected
- updateStorageMapping: fails if no mapping exists, returns whether a republish is needed due to store changes
@GregorShear GregorShear force-pushed the greg/mutation/updateMapping branch from 46431bd to cbb12a9 Compare January 20, 2026 15:40
Comment on lines +202 to +213
let sampled_specs = sqlx::query_scalar!(
r#"
SELECT catalog_name
FROM live_specs
WHERE starts_with(catalog_name, $1)
AND spec IS NOT NULL
LIMIT 5
"#,
&catalog_prefix,
)
.fetch_all(&mut *txn)
.await?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psFried
okay to leave these inline or would you rather keep all queries in crates/control-plane-api/src/directives/storage_mappings.rs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think inline is generally fine, as long as it's not something we're trying to re-use elsewhere. For this particular query, inline seems fine. But we have some existing functions in crates/control-plane-api/src/directives/storage_mappings.rs that seem like they'd work for inserts and updates to storage mappings. I think it's worth using those, unless there's a reason not to.

sqlx::query!(
r#"
UPDATE storage_mappings
SET spec = $2, detail = $3, updated_at = now()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be manually updating the updated_at value? i would expect this to be handled with an UPDATE trigger...

Copy link
Member

@psFried psFried Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we just handle those manually instead of with triggers. It's not something I feel especially strongly about, and we do still use triggers here and there for other things. Though I think my general position is to try to avoid triggers when we can, to make things more explicit.

That said, I think the existing upsert function is probably what we should use here instead of inline sql, and that already takes care of setting updated_at.

Comment on lines +425 to +436
sqlx::query!(
r#"
UPDATE storage_mappings
SET spec = $2, detail = $3, updated_at = now()
WHERE catalog_prefix = 'recovery/' || $1
"#,
&catalog_prefix as &str,
crate::TextJson(&recovery_storage) as crate::TextJson<&models::StorageDef>,
detail,
)
.execute(&mut *txn)
.await?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this recovery/ logic is right but would love some extra scrutiny here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comments about factoring out a function that returns both the storage defs from the original input, and about reusing the existing functions for updating the database. I'm thinking we should use upsert_storage_mapping, and then this inline query would not be necessary.

@GregorShear GregorShear requested a review from psFried January 20, 2026 15:44
Copy link
Member

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start. I left some comments inline, mostly about trying to re-use the existing logic from directives::storage_mappings. LMK if you'd like to talk through any of that

Comment on lines +202 to +213
let sampled_specs = sqlx::query_scalar!(
r#"
SELECT catalog_name
FROM live_specs
WHERE starts_with(catalog_name, $1)
AND spec IS NOT NULL
LIMIT 5
"#,
&catalog_prefix,
)
.fetch_all(&mut *txn)
.await?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think inline is generally fine, as long as it's not something we're trying to re-use elsewhere. For this particular query, inline seems fine. But we have some existing functions in crates/control-plane-api/src/directives/storage_mappings.rs that seem like they'd work for inserts and updates to storage mappings. I think it's worth using those, unless there's a reason not to.

.cloned()
.map(|mut store| {
let prefix = store.prefix_mut();
*prefix = models::Prefix::new(format!("{prefix}collection-data/"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I foresee potential issues with this logic in update scenarios. The UI will fetch the existing storage mappings, which already have the collection-data/ prefix, and then pass those storage mappings to update, which would add the prefix again. So I'm thinking at a minimum we'd want to change the corresponding logic that's used during the update to check whether the prefix is there already. But also, this feels like it ought to be in a function that's shared between insert and update operations. Ideally, we'd have a pure function that accepts a single models::StorageDef, and returns a tuple of both the collection and recovery storage defs. Then we could continue to use that same function, even after we no longer persist the recovery mappings in postgres.

Comment on lines +425 to +436
sqlx::query!(
r#"
UPDATE storage_mappings
SET spec = $2, detail = $3, updated_at = now()
WHERE catalog_prefix = 'recovery/' || $1
"#,
&catalog_prefix as &str,
crate::TextJson(&recovery_storage) as crate::TextJson<&models::StorageDef>,
detail,
)
.execute(&mut *txn)
.await?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comments about factoring out a function that returns both the storage defs from the original input, and about reusing the existing functions for updating the database. I'm thinking we should use upsert_storage_mapping, and then this inline query would not be necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants