Hello, I would like to deploy mosparo on Kubernetes/OpenShift with at least two pods. The multi-node setup documentation comes in handy and indicates various considerations:
Am I missing anything? Is the replicated shared cache a necessity for a multi-node setup? Thanks
Hi @gnieser

Thank you very much for your question. Your setup sounds perfect! You didn't mention the database, but other than that, you didn't miss anything. I would always recommend synchronizing the cache between the nodes.

**Shared cache**

That's a bit of a complicated topic. No, a shared cache is not necessary as long as you synchronize the cache between all nodes (the best is to synchronize the whole cache directory).

The problem in a multi-node setup without a synchronized cache is that each node has a different database cleanup timestamp and will execute that cleanup anyway. The reason is that the frontend API triggers the cleanup automatically, even if you use the cron job to run it at a specific time. If you synchronize the file cache between the nodes, you should have no problem, since the timestamp is synchronized.

The next issue can be that the frontend API controllers execute the cleanup at the same time, since the locking of the cleanup is also done in the cache and may not have been synchronized in time. For this problem, we have a dedicated environment variable.

Since file synchronization is complex (if you do it yourself, rather than letting Kubernetes handle it), especially fast synchronization (how quickly a file is propagated after it is created or updated), a shared cache can be an excellent alternative. Memcached is pretty easy since it is distributed without a master/replica setup. On the other hand, the cache data is not synchronized between the Memcached nodes. For this, you could use mcrouter, or accept that one of the Memcached servers can forget the data; in that case, the data will be recreated. You could even use only one Memcached server since it is not critical for running mosparo (see the note below).

**Future**

In the future, we will have other use cases where we can/should use the shared cache. One use case is storing the uploaded file for the import functionality. If you have multiple nodes and no sticky sessions in the load balancer, the file may not be synchronized in time to the other node that processes the import, so the import will fail. If the file is stored in the shared cache, all servers can access it.

**Summary**

So, long story short: you do not need a shared cache, but you should ensure you synchronize the cache between all nodes. If you don't do that, the most significant negative effect is that the cleanup logic will be executed on both nodes (so it's not a dramatic problem, honestly). But in this scenario, you should execute the cleanup cron job on both nodes independently.

**Note**

While writing this response, I found a problem in the cleanup logic: it passes through the exception if the Memcached/Redis server is unavailable, leading to an error 500 in the frontend API. I will add a fix for that in v1.3.4. Additionally, I saw that Memcached offers an option to create automatic replicas of your data; I will investigate that, too. I also just discovered that we clear the cache after updating mosparo, so we also remove the timestamp at which the next cleanup should be executed. We may have to think about storing that information in the database instead.

I hope my answer helps you. Let me know if you have more questions.

Kind regards,
zepich
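To make the Memcached option a bit more concrete, here is a minimal sketch of what a single Memcached instance could look like on Kubernetes. All names, image tags, and resource values are illustrative, and how mosparo is pointed at this service depends on mosparo's own cache configuration, which is not shown here:

```yaml
# Hypothetical single-instance Memcached acting as a shared cache backend.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mosparo-memcached
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mosparo-memcached
  template:
    metadata:
      labels:
        app: mosparo-memcached
    spec:
      containers:
        - name: memcached
          image: memcached:1.6
          args: ["-m", "64"]          # cap the cache at 64 MB; losing cached data is acceptable here
          ports:
            - containerPort: 11211
---
# Cluster-internal service the mosparo pods would connect to on port 11211.
apiVersion: v1
kind: Service
metadata:
  name: mosparo-memcached
spec:
  selector:
    app: mosparo-memcached
  ports:
    - port: 11211
      targetPort: 11211
```

A single instance is enough for this purpose because, as noted above, losing the cached data is not critical; mosparo will simply recreate it.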
Hi @gnieser
The new version, v1.3.4, is released.
Now you can specify a path in the `FILESYSTEM_CACHE_PATH` environment variable, which mosparo will use for the shared file system cache. Synchronize that directory between the nodes, and it should work as expected.
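As an illustration only, a sketch of how this could look on Kubernetes/OpenShift: a ReadWriteMany volume mounted into every mosparo pod, with `FILESYSTEM_CACHE_PATH` pointing at the mount path. The claim name, mount path, image reference, and storage size are assumptions, and the cluster needs a storage class that supports RWX access:

```yaml
# Shared cache volume (hypothetical names/sizes); requires an RWX-capable
# storage class such as NFS or CephFS.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mosparo-shared-cache
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 1Gi
---
# Fragment of the mosparo Deployment's pod template: every replica mounts the
# same claim and points FILESYSTEM_CACHE_PATH at the mounted directory.
spec:
  containers:
    - name: mosparo
      image: mosparo/mosparo            # illustrative image reference
      env:
        - name: FILESYSTEM_CACHE_PATH
          value: /shared-cache          # assumed mount path
      volumeMounts:
        - name: shared-cache
          mountPath: /shared-cache
  volumes:
    - name: shared-cache
      persistentVolumeClaim:
        claimName: mosparo-shared-cache
```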
Once again, sorry for the misinformation. Looking forward to your feedback.
Kind regards,
zepich