remove `CachingStoreManager` from `factor-key-value` by dicej · Pull Request #2995 · spinframework/spin

dicej · 2025-01-27T21:39:47Z

The semantic (non-)guarantees for wasi-keyvalue are still [under discussion](WebAssembly/wasi-keyvalue#56), but in the meantime the behavior of Spin's write-behind cache has caused [some headaches](spinframework#2952), so I'm removing it until we have more clarity on what the proposed standard allows and disallows.

The original motivation behind `CachingStoreManager` was to reflect the anticipated behavior of an eventually consistent, low-latency, cloud-based distributed store and, per [Hyrum's Law](https://www.hyrumslaw.com/), to help app developers avoid depending on the behavior of a local, centralized store, which would not match that of a distributed store. However, the write-behind caching approach interacts poorly with the lazy connection establishment that some `StoreManager` implementations use: writes appear to succeed even when the connection fails.

Subsequent discussion of that issue arrived at a consensus that we should not consider a write to have succeeded until we have connected to, and received a write confirmation from, at least one replica in a distributed system. That is, rather than the replication factor (RF) of 0 that we have effectively been providing up to this point, we should provide RF=1. RF=1 still provides low-latency performance when the nearest replica is reasonably close, but improves on RF=0 in that it shifts responsibility for the write from Spin to the backing store before "success" is returned to the application.

Note that RF=1 (and indeed anything less than RF=ALL) cannot guarantee that the write will be seen immediately (or, in the extreme case of an unrecoverable failure, at all) by readers connected to other replicas. Applications requiring a stronger consistency model should use an ACID-style backing store rather than an eventually consistent one.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
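To make the failure mode concrete, here is a minimal synchronous sketch of how a write-behind layer can report success while a lazily connected backend is unreachable. It is not Spin's actual `CachingStoreManager` code: `LazyStore`, `WriteBehind`, and `flush` are hypothetical names chosen for illustration, and a real implementation would run the flush asynchronously in the background.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a backing store that connects lazily:
/// the connection is only attempted when an operation is performed.
struct LazyStore {
    reachable: bool,
    data: HashMap<String, Vec<u8>>,
}

impl LazyStore {
    fn set(&mut self, key: &str, value: &[u8]) -> Result<(), String> {
        if !self.reachable {
            return Err("connection failed".to_string());
        }
        self.data.insert(key.to_string(), value.to_vec());
        Ok(())
    }
}

/// Write-behind layer (illustrative only): `set` just queues the write,
/// so it has no way to report a backend failure to its caller.
struct WriteBehind {
    store: LazyStore,
    pending: Vec<(String, Vec<u8>)>,
}

impl WriteBehind {
    fn set(&mut self, key: &str, value: &[u8]) -> Result<(), String> {
        self.pending.push((key.to_string(), value.to_vec()));
        Ok(()) // reported before the store has been contacted at all
    }

    /// Runs some time after `set` has returned. Errors collected here
    /// never reach the code that called `set`.
    fn flush(&mut self) -> Vec<String> {
        let mut errors = Vec::new();
        for (key, value) in std::mem::take(&mut self.pending) {
            if let Err(e) = self.store.set(&key, &value) {
                errors.push(e);
            }
        }
        errors
    }
}
```

With an unreachable `LazyStore`, `WriteBehind::set` returns `Ok(())` and the error only surfaces inside `flush`, whereas calling `LazyStore::set` directly returns the error to the caller, which is the RF=1 behavior described above.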

lann · 2025-01-27T21:43:06Z

How hard would it be to change the cache from write-behind to write-through (instead of removing it)?

dicej · 2025-01-27T21:50:43Z

> How hard would it be to change the cache from write-behind to write-through?

Probably not hard. The question is: do we want caching at all, and if so, how much?

We can always bring it back if/when there's a clear need, but for now the simplest thing seems to be to just remove it. I also wonder if specific key-value implementations might want to enforce their own caching rules.
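For reference, a write-through variant could look roughly like the sketch below. It is only an illustration of the technique, not a proposal for Spin's API: the `Backend` trait, `WriteThrough`, and `FlakyBackend` are hypothetical names, and the code is synchronous for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical backend trait; not Spin's actual `Store` trait.
trait Backend {
    fn get(&mut self, key: &str) -> Result<Option<Vec<u8>>, String>;
    fn set(&mut self, key: &str, value: &[u8]) -> Result<(), String>;
}

/// Write-through: the backend must confirm a write before the cache
/// is updated and before success is returned.
struct WriteThrough<B: Backend> {
    backend: B,
    cache: HashMap<String, Vec<u8>>,
}

impl<B: Backend> WriteThrough<B> {
    fn set(&mut self, key: &str, value: &[u8]) -> Result<(), String> {
        self.backend.set(key, value)?; // a failure propagates to the caller
        self.cache.insert(key.to_string(), value.to_vec());
        Ok(())
    }

    fn get(&mut self, key: &str) -> Result<Option<Vec<u8>>, String> {
        if let Some(v) = self.cache.get(key) {
            return Ok(Some(v.clone())); // cache hit, backend not contacted
        }
        let fetched = self.backend.get(key)?;
        if let Some(v) = &fetched {
            self.cache.insert(key.to_string(), v.clone());
        }
        Ok(fetched)
    }
}

/// In-memory backend that can be switched off, for illustration only.
struct FlakyBackend {
    up: bool,
    data: HashMap<String, Vec<u8>>,
}

impl Backend for FlakyBackend {
    fn get(&mut self, key: &str) -> Result<Option<Vec<u8>>, String> {
        if !self.up {
            return Err("backend unavailable".to_string());
        }
        Ok(self.data.get(key).cloned())
    }

    fn set(&mut self, key: &str, value: &[u8]) -> Result<(), String> {
        if !self.up {
            return Err("backend unavailable".to_string());
        }
        self.data.insert(key.to_string(), value.to_vec());
        Ok(())
    }
}
```

In this sketch a failed write never populates the cache, but cached reads can still be served while the backend is down, which is one of the caching-policy decisions a real implementation would have to make deliberately.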

lann

Fair enough. I guess if you're performance-sensitive enough for this to matter then you probably ought to avoid immediately reading your writes anyway 🤷
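The read-your-writes caveat ties back to the RF=1 note in the PR description. The toy model below illustrates it under stated assumptions: `Cluster`, `write`, `read`, and `replicate` are hypothetical names, replication is modeled as an explicit later step, and nothing here reflects a specific backing store.

```rust
use std::collections::HashMap;

/// Toy model of an eventually consistent store (illustrative only).
struct Cluster {
    replicas: Vec<HashMap<String, String>>,
    /// Writes acknowledged by one replica but not yet copied to the others.
    backlog: Vec<(usize, String, String)>,
}

impl Cluster {
    fn new(n: usize) -> Self {
        Cluster {
            replicas: vec![HashMap::new(); n],
            backlog: Vec::new(),
        }
    }

    /// RF=1: when this returns, exactly one replica has stored the value.
    fn write(&mut self, replica: usize, key: &str, value: &str) {
        self.replicas[replica].insert(key.to_string(), value.to_string());
        self.backlog
            .push((replica, key.to_string(), value.to_string()));
    }

    /// Reads see only what the chosen replica currently holds.
    fn read(&self, replica: usize, key: &str) -> Option<&String> {
        self.replicas[replica].get(key)
    }

    /// Asynchronous replication, modeled as a step that runs later.
    fn replicate(&mut self) {
        for (origin, key, value) in std::mem::take(&mut self.backlog) {
            for (i, r) in self.replicas.iter_mut().enumerate() {
                if i != origin {
                    r.insert(key.clone(), value.clone());
                }
            }
        }
    }
}
```

After `write(0, "k", "v")`, a `read(1, "k")` returns `None` until `replicate` has run, so an application that immediately reads back its own write through a different replica may not see it.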

dicej mentioned this pull request Jan 27, 2025

Key-Value Swallows Write Errors When Backing Impl Fails #2952

Closed

lann approved these changes Jan 27, 2025

itowlson approved these changes Jan 27, 2025

dicej merged commit abba902 into spinframework:main Jan 27, 2025
17 checks passed

rancherbot mentioned this pull request Mar 26, 2025

rddepman: bump spinCLI from 3.1.2 to 3.2.0 rancher-sandbox/rancher-desktop#8432

Merged

dicej merged 1 commit into spinframework:main from dicej:remove-caching-store-manager
