
Autoscaling / Downscaling results in Bad Gateway 502 Errors #888

Description

@tiran133

Hi,

I'm running the latest chart version 0.7.0 with oCIS 7.1.1 and the following values (see below), using external NATS and user management via Keycloak and LDAP.

I deployed NATS from this example: https://github.com/owncloud/ocis-charts/tree/main/deployments/ocis-nats

Every time the autoscaler scales down (or I manually kill all storageusers pods) while a file is being uploaded, the upload fails with a 502 Bad Gateway error.

Edit:
The same happens with the proxy deployment/pods. owncloud/ocis#11170

[Uppy] [12:58:40] tus: unexpected response while uploading chunk, originated from request (method: PATCH, url: https://cloud.kube.domain.io/data/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdWQiOlsicmV2YSJdLCJleHAiOjE3NDMyMDg4NjIsImlhdCI6MTc0MzEyMjQ2MiwidGFyZ2V0IjoiaHR0cDovL3N0b3JhZ2V1c2Vyczo5MTU4L2RhdGEvdHVzLzgyMjU4Y2U3LTNjNGEtNGQxMS1iNzcxLWRlZWJjMmM4M2NhNiJ9.GXxihlhBuRTT4T6iD6fsVpj5i7l5P0f0XLLW9nef9bQ, response code: 500, response text: , request id: 72ccffe2-a8fc-42fe-870d-f0e6b55fe723)
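For context, the long token in that /data/ URL is a standard JWT; its middle (payload) segment is just unpadded base64url-encoded JSON naming the internal upload target, which matches the storageusers endpoint in the service logs below. A quick Python sketch to decode it:

```python
import base64
import json

# Payload segment (between the two dots) of the signed /data/ upload URL above.
payload = "eyJhdWQiOlsicmV2YSJdLCJleHAiOjE3NDMyMDg4NjIsImlhdCI6MTc0MzEyMjQ2MiwidGFyZ2V0IjoiaHR0cDovL3N0b3JhZ2V1c2Vyczo5MTU4L2RhdGEvdHVzLzgyMjU4Y2U3LTNjNGEtNGQxMS1iNzcxLWRlZWJjMmM4M2NhNiJ9"

# JWTs use unpadded base64url, so restore the padding before decoding.
claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
print(claims["target"])
# -> http://storageusers:9158/data/tus/82258ce7-3c4a-4d11-b771-deebc2c83ca6
```

So the 502 happens while the data gateway proxies the PATCH to exactly that internal storageusers endpoint.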

The frontend service logs the following error:

2025-03-28T02:58:34Z DBG sending request to internal data server line=github.com/cs3org/reva/v2@v2.27.7/internal/http/services/datagateway/datagateway.go:187 pkg=rhttp request-id=72ccffe2-a8fc-42fe-870d-f0e6b55fe723 service=frontend target=http://storageusers:9158/data/tus/82258ce7-3c4a-4d11-b771-deebc2c83ca6 traceid=c7dd50c8ec14f38d84cf13ee15e466c3
2025-03-28T02:58:40Z ERR error doing PATCH request to data service error="Patch \"http://storageusers:9158/data/tus/82258ce7-3c4a-4d11-b771-deebc2c83ca6\": read tcp 10.42.7.158:58142->10.43.32.181:9158: read: connection reset by peer" line=github.com/cs3org/reva/v2@v2.27.7/internal/http/services/datagateway/datagateway.go:200 pkg=rhttp request-id=72ccffe2-a8fc-42fe-870d-f0e6b55fe723 service=frontend traceid=c7dd50c8ec14f38d84cf13ee15e466c3

The proxy logs:

2025-03-28T02:58:40Z INF access-log bytes=0 duration=5909.739159 line=github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/accesslog.go:34 method=PATCH path=/data/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdWQiOlsicmV2YSJdLCJleHAiOjE3NDMyMDg4NjIsImlhdCI6MTc0MzEyMjQ2MiwidGFyZ2V0IjoiaHR0cDovL3N0b3JhZ2V1c2Vyczo5MTU4L2RhdGEvdHVzLzgyMjU4Y2U3LTNjNGEtNGQxMS1iNzcxLWRlZWJjMmM4M2NhNiJ9.GXxihlhBuRTT4T6iD6fsVpj5i7l5P0f0XLLW9nef9bQ proto=HTTP/1.1 remote-addr=115.70.78.44 request-id=72ccffe2-a8fc-42fe-870d-f0e6b55fe723 service=proxy status=500 traceid=2a7294961371d75f889a9f4f07833c4a

It looks like the pod still receives traffic even while it is in the Terminating state.
I'm using Traefik as my ingress controller; could that be the problem? Should I use nginx instead?

Any ideas what can be done about it?
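One mitigation I'm considering (an untested sketch in plain Kubernetes Deployment YAML, not chart values — the fields are standard Kubernetes, but whether and how the chart exposes them is an open question): keep the terminating pod serving for a short window so endpoint and ingress updates can propagate before it stops accepting connections.

```yaml
# Sketch only: delay container shutdown so kube-proxy/Traefik can remove the
# terminating pod from endpoints while in-flight uploads finish.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: storageusers
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]
```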

values.yaml

  values:
    tracing:
      enabled: false
      type: jaeger
      endpoint: jaeger.jaeger.svc.cluster.local:6831
      collector: jaeger.jaeger.svc.cluster.local:14268/api/traces
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
    replicas: 1
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 10
    logging:
      level: "debug"
      pretty: true
      color: true
    externalDomain: cloud.kube.domain.io
    ingress:
      enabled: true
      ingressClassName: traefik
      annotations:
        traefik.ingress.kubernetes.io/router.tls: "true"
        traefik.ingress.kubernetes.io/router.entrypoints: websecure
      tls:
        - secretName: tls-ocis-ingress
          hosts:
            - cloud.kube.domain.io
    insecure:
      oidcIdpInsecure: true
      ocisHttpApiInsecure: true
    secretRefs:
      ldapSecretRef: ldap-bind-secrets
      s3CredentialsSecretRef: ocis-s3secret
    messagingSystem:
      external:
        enabled: true
        cluster: "ocis-cluster"
        endpoint: nats.ocis-nats.svc.cluster.local:4222
        tls:
          insecure: true
          enabled: false

    registry:
      type: nats-js-kv
      nodes:
        - nats.ocis-nats.svc.cluster.local:4222

    store:
      type: nats-js-kv
      nodes:
        - nats.ocis-nats.svc.cluster.local:4222

    cache:
      type: nats-js-kv
      nodes:
        - nats.ocis-nats.svc.cluster.local:4222

    features:
      externalUserManagement:
        enabled: true
        adminUUID: "ddc2004c-0977-11eb-9d3f-a793888cd0f8"
        autoprovisionAccounts:
          enabled: true
          claimEmail: email
          claimDisplayname: name
          claimGroups: groups
          claimUserName: preferred_username
        oidc:
          issuerURI: https://keycloak.kube.domain.io/realms/oCIS
          userIDClaim: sub
          userIDClaimAttributeMapping: userid
          roleAssignment:
            enabled: true
            claim: roles
        ldap:
          writeable: true
          uri: ldap://openldap.openldap.svc.cluster.local:389
          insecure: true
          bindDN: cn=admin,dc=domain,dc=io
          user:
            schema:
              id: ownCloudUUID
            baseDN: ou=users,dc=domain,dc=io
          group:
            schema:
              id: ownCloudUUID
            baseDN: ou=groups,dc=domain,dc=io
    services:
      search:
        persistence:
          chownInitContainer: true
          enabled: true
      storagesystem:
        persistence:
          enabled: true
          size: 5Gi
          accessModes:
            - ReadWriteMany
          storageClassName: nfs-client
      storageusers:
        persistence:
          enabled: true
          size: 20Gi
          accessModes:
            - ReadWriteMany
          storageClassName: nfs-client
        storageBackend:
          driver: s3ng
          driverConfig:
            s3ng:
              endpoint: https://s3
              bucket: ocis-kube
              putObject:
                disableMultipart: true
      thumbnails:
        resources:
          limits:
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 1Gi
        persistence:
          enabled: true
          size: 5Gi
          accessModes:
            - ReadWriteMany
          storageClassName: nfs-client
      web:
        persistence:
          enabled: true
          size: 1Gi
          accessModes:
            - ReadWriteMany
          storageClassName: nfs-client
        config:
          oidc:
            webClientID: web
