
theo-exp#6011

Draft
theosanderson wants to merge 19 commits into main from seq-exp


theosanderson (Member) commented Feb 21, 2026

🚀 Preview: Add preview label to enable

theosanderson added the `preview` label (Triggers a deployment to argocd) Feb 21, 2026
claude bot added the `deployment` label (Code changes targeting the deployment infrastructure) Feb 21, 2026
theosanderson added and removed the `preview` label (Triggers a deployment to argocd) Feb 21, 2026
theosanderson and others added 15 commits February 27, 2026 10:43
…lus toggle

Add Dockerfile, CI workflow, and Helm chart changes to deploy sequlus
(Rust-based LAPIS+SILO replacement) as a single deployment serving all
organisms. Gated behind `useSequlus: false` so it can be enabled gradually.

- Dockerfile: multi-stage Rust 1.93 build for lapis-rs binary
- CI: build and push ghcr.io/loculus-project/sequlus image
- Helm: sequlus Deployment with initContainers for reference genomes,
  Postgres creds from database secret, probes on /{organism}/sample/info
- Helm: single ClusterIP Service on port 8090
- Ingress: route to sequlus without prefix stripping when toggle on
- Guard LAPIS+SILO templates with `if not .Values.useSequlus`
- Internal LAPIS URLs point to sequlus when toggle on
- lapis-silo-database-config ConfigMaps remain unconditional

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename the directory from lapis-rs/ to sequlus/ and update the package
name, binary name, CLI command name, version strings, Dockerfile, and
CI workflow paths to use the sequlus name consistently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… CI image wait

- Use debian:trixie-slim runtime to match rust:1.93-slim builder (fixes GLIBC_2.38
  and CXXABI_1.3.15 missing errors that caused silent container crash)
- Add libssl3t64 to runtime image for Postgres TLS
- Add startupProbe (10 min timeout) so sequlus has time to complete ETL before
  the readiness/liveness probes kick in
- Add step in integration tests to wait for sequlus Docker image before deploying
  (sequlus build takes ~15min, was racing with test deployment)
- Add sequlus/** to integration test path triggers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add /health endpoint that returns 200 regardless of ETL state
- startupProbe uses /health (passes quickly once server starts)
- readinessProbe uses /{organism}/sample/info (ready only after ETL)
  with high failureThreshold (120) to allow time for data loading
- livenessProbe uses /health (won't kill pod during ETL)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
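
The probe split described in this commit can be illustrated with a minimal sketch. It is written in Python for brevity (sequlus itself is Rust), and `ServerState` and `handle` are hypothetical names rather than anything from the sequlus code base. It only models the status codes: `/health` answers 200 as soon as the process is up, while `/{organism}/sample/info` answers 200 only after that organism's ETL has completed.

```python
from dataclasses import dataclass, field


@dataclass
class ServerState:
    # Organisms whose ETL has completed (hypothetical in-memory registry).
    loaded_organisms: set[str] = field(default_factory=set)


def handle(state: ServerState, path: str) -> int:
    """Return the HTTP status code a probe request would receive."""
    if path == "/health":
        # Startup/liveness: the process is up, whatever the ETL state.
        return 200
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[1:] == ["sample", "info"]:
        # Readiness: 200 only once this organism's data is loaded.
        return 200 if parts[0] in state.loaded_organisms else 404
    return 404
```

With this split, a liveness probe on `/health` cannot restart the pod during a long ETL, and only the readiness probe reflects whether data is actually being served.
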
- Integration tests now wait up to 20 minutes for sequlus image
  (build takes ~15 min, previous 10 min timeout was insufficient)
- Add main branch cache fallback for PR builds
- Add Rust dependency pre-build layer in Dockerfile for faster
  incremental builds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The readiness probe was using /{organism}/sample/info which returns
404 when ETL hasn't completed. Since sequlus retries ETL every 5
minutes, the pod would be Not Ready for up to 5 minutes after startup
if the backend wasn't available during initial ETL. Using /health
makes the pod Ready as soon as the HTTP server starts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When sequlus starts before the backend is ready, all initial ETL
requests fail. Since failed organisms are never retried (the refresh
loop only checks already-loaded organisms), sequlus ends up serving
404s permanently until restarted.

Adding an initContainer that waits for the backend health endpoint
ensures sequlus always starts with a working backend connection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
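
What such a wait step does can be sketched as a poll-until-healthy loop. This is an illustration in Python, not the actual initContainer script (which the commit message does not show). `wait_for` and its parameters are hypothetical, and the health check is passed in as a callable so the sketch stays independent of any particular URL.

```python
import time
from typing import Callable


def wait_for(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            # Give up; the caller decides whether to exit non-zero.
            return False
        sleep(interval_s)
```

In an init container, the caller would exit with a non-zero status when this returns `False`, so that the main container does not start without a reachable backend.
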
The default 300-second interval is too slow for integration tests
that submit sequences and expect them to appear in search within
60 seconds. A 10-second interval ensures sequlus picks up new data
quickly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The website requests /sample/lineageDefinition/:column but sequlus
didn't serve this endpoint, causing 404 console errors that fail
Chromium integration tests (which treat any console error as a
test failure). Returns an empty object for now.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle groupId, versionStatus, isRevocation, and other system fields that
are stored as database columns rather than in processed_data metadata JSON.
Previously these filters always returned 0 results because the fields don't
exist in sepd.processed_data->'metadata'. Now they map to proper SQL
conditions on sequence_entries and groups_table columns.

Also enriches the details response with system fields (accessionVersion,
groupId, groupName, submitter, versionStatus, etc.) by building a complete
JSON that merges system columns with preprocessing metadata.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
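
The routing of filters described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the sequlus implementation: the column expressions in `SYSTEM_FIELDS`, the `se` table alias, and the psycopg-style `%s` placeholders are all hypothetical. Only the idea comes from the commit message: system fields map to real columns, and everything else is looked up in `sepd.processed_data->'metadata'`.

```python
# Hypothetical mapping from API field names to SQL column expressions.
SYSTEM_FIELDS = {
    "groupId": "se.group_id",
    "isRevocation": "se.is_revocation",
}


def build_condition(field: str, value) -> tuple[str, list]:
    """Return (sql_fragment, params) for an equality filter on one field."""
    values = value if isinstance(value, list) else [value]
    if not values:
        raise ValueError(f"no values given for filter {field!r}")
    placeholders = ", ".join(["%s"] * len(values))
    if field in SYSTEM_FIELDS:
        # System fields are real columns, so compare against the column.
        return f"{SYSTEM_FIELDS[field]} IN ({placeholders})", list(values)
    # Other fields live in the preprocessing metadata JSON; the field name
    # is passed as a parameter rather than spliced into the SQL text.
    return (
        f"sepd.processed_data->'metadata' ->> %s IN ({placeholders})",
        [field, *(str(v) for v in values)],
    )
```

Keeping values (and the metadata key) as bound parameters avoids building SQL from user input by string concatenation.
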
…ge definitions

- Handle array-valued accession/accessionVersion params in system field filters
- Add Content-Disposition header for FASTA downloads when downloadAsFile=true
- Implement dataUseTerms filter via LATERAL JOIN to data_use_terms_table
- Include dataUseTerms in METADATA_SELECT response
- Add lineage definition endpoint: download/parse YAML from configured URLs
- Pass LINEAGE_CONFIG env var from Helm template to sequlus

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Switch details endpoint from Postgres to DuckDB for metadata retrieval,
  fixing missing file references in search results (file-sharing tests)
- Handle both JSON and form-urlencoded POST bodies across all endpoints,
  fixing browser form submissions for large download queries
- Case-insensitive dataFormat parameter matching (TSV/tsv both work)
- Remove unused Postgres metadata query code (METADATA_SELECT, get_metadata_details)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
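
Accepting both body encodings and matching `dataFormat` case-insensitively could look roughly like the sketch below. It is Python for illustration only; `parse_body` and `normalize_data_format` are hypothetical names, and the `JSON` default is an assumption that the commit message does not state.

```python
import json
from urllib.parse import parse_qs


def parse_body(content_type: str, body: bytes) -> dict:
    """Decode a POST body that is either JSON or form-urlencoded."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        parsed = parse_qs(body.decode("utf-8"), keep_blank_values=True)
        # Single values are unwrapped; repeated keys stay as lists.
        return {k: v[0] if len(v) == 1 else v for k, v in parsed.items()}
    # Anything else is treated as JSON; an empty body means no parameters.
    return json.loads(body or b"{}")


def normalize_data_format(params: dict, default: str = "JSON") -> str:
    """Upper-case dataFormat so that 'tsv' and 'TSV' are equivalent."""
    return str(params.get("dataFormat", default)).upper()
```

Browser form submissions arrive as `application/x-www-form-urlencoded`, which is why a JSON-only parser fails for them.
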
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
submissionId is stored as a top-level column (submission_id) in Postgres,
not inside processed_data->'metadata'. Add it to SYSTEM_FIELDS and
handle_system_filter. Also fix regex handling to use system field column
expressions instead of skipping them entirely.

Remove debug logging from previous commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
theosanderson and others added 4 commits February 27, 2026 11:12
The startup probe was timing out (5 min limit) when loading large datasets
like west-nile (164MB). Fix: pre-create OrganismStores with empty in-memory
DuckDB databases (or reuse existing files), start serving /health immediately,
and run the initial ETL in a background task. The 10-second refresh loop
picks up data changes once the backend is ready.

Server now starts listening in under 1 second regardless of data volume.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
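
The startup order described here (empty stores first, ETL in the background) can be sketched with a thread. This is a Python illustration only: the real service is Rust and backed by DuckDB, and `OrganismStore`, `start_server`, and `run_etl` are hypothetical stand-ins.

```python
import threading
from typing import Callable


class OrganismStore:
    """Placeholder for a per-organism store that starts out empty."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: list = []             # empty until the ETL finishes
        self.loaded = threading.Event()  # set once data is available


def start_server(organisms: list[str], run_etl: Callable[[str], list]):
    # Pre-create every store so request handlers never see a missing one.
    stores = {name: OrganismStore(name) for name in organisms}

    def background() -> None:
        for store in stores.values():
            store.rows = run_etl(store.name)
            store.loaded.set()

    # The ETL runs off the main thread, so the caller can begin serving
    # /health immediately instead of blocking on data loading.
    worker = threading.Thread(target=background, daemon=True)
    worker.start()
    return stores, worker
```

Because `start_server` returns before any ETL has finished, startup time no longer depends on dataset size.
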
… deployment

The sequlus init containers were doing a raw `cp` of reference_genomes.json
from the configMap, which contains unresolved `[[URL:...]]` placeholders.
This caused mutations to be computed against the literal URL strings instead
of actual sequences, producing garbled mutation data (e.g. `GPC:[1M` instead
of `GPC:M1K`).

Now uses the config-processor init container (same as SILO deployment) to
download and inline the actual sequences before copying to the reference
genomes volume.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
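
To illustrate why the raw `cp` went wrong, here is a minimal Python sketch of placeholder resolution. It is not the config-processor itself, whose code this PR does not show. `resolve_placeholders` and the injected `fetch` callable are hypothetical, and `fetch` is assumed to return the bare sequence for a URL.

```python
import re
from typing import Callable

# Matches placeholders of the form [[URL:https://...]].
PLACEHOLDER = re.compile(r"\[\[URL:([^\]]+)\]\]")


def resolve_placeholders(text: str, fetch: Callable[[str], str]) -> str:
    """Replace every [[URL:...]] placeholder with the fetched content."""
    return PLACEHOLDER.sub(lambda match: fetch(match.group(1)), text)
```

Without this step, the literal string `[[URL:...]]` is treated as the reference sequence, which is consistent with the `[` appearing in the garbled mutation `GPC:[1M`.
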
…upport

- Add gzip and zstd compression for all response types (FASTA, JSON, CSV, TSV)
- Add fastaHeaderTemplate support using {fieldName} placeholders for custom FASTA headers
- Add CSV_WITHOUT_HEADERS and TSV_WITHOUT_HEADERS dataFormat variants
- Set appropriate Content-Encoding and Content-Disposition headers for compressed downloads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
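
A `{fieldName}` header template can be expanded along these lines. This is a Python sketch with hypothetical function names, not the sequlus code. Rendering missing or null fields as an empty string is an assumption; the commit message does not say how they are handled.

```python
import re

# Matches {fieldName} placeholders in a header template.
FIELD = re.compile(r"\{([^{}]+)\}")


def render_fasta_header(template: str, record: dict) -> str:
    """Substitute each {fieldName} with the record's value for that field."""

    def lookup(match: re.Match) -> str:
        value = record.get(match.group(1))
        # Assumption: missing or null fields render as an empty string.
        return "" if value is None else str(value)

    return FIELD.sub(lookup, template)


def to_fasta(template: str, record: dict, sequence: str) -> str:
    """Format one FASTA entry using the rendered header."""
    return f">{render_fasta_header(template, record)}\n{sequence}\n"
```

For example, the template `{accessionVersion}|{country}` applied to a record with those two fields yields a header such as `>LOC_000001.1|Kenya`.
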
…ile boolean handling

- Add missing /referenceGenome GET endpoint that returns nucleotideSequences and genes
- Add Serialize derive to ReferenceGenomes and NamedSequence types
- Fix FASTA Content-Type to include charset=UTF-8, matching LAPIS behavior
- Fix downloadAsFile to handle both JSON boolean true and string "true"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
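
The `downloadAsFile` fix amounts to accepting two representations of the same flag. A minimal Python sketch, with a hypothetical function name, might look like this. Treating the string case-insensitively is an assumption beyond what the commit states.

```python
def parse_download_as_file(value) -> bool:
    """Interpret downloadAsFile given as a JSON boolean or as a string."""
    if isinstance(value, bool):
        # JSON bodies carry a real boolean.
        return value
    if isinstance(value, str):
        # Query strings and form bodies carry the text "true".
        return value.strip().lower() == "true"
    # Absent or any other type: do not force a download.
    return False
```

JSON bodies produce `True`, whereas query strings and form submissions produce `"true"`, so a check against only one of the two silently ignores the other.
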
theosanderson added and removed the `preview` label (Triggers a deployment to argocd) Mar 1, 2026
theosanderson removed the `preview` label (Triggers a deployment to argocd) Mar 11, 2026