Releases: cybergis/rs-embed

v0.1.3

13 Apr 04:00

Added

  • wildsat now supports variant selection via model_config={"variant": "..."} (or the variant= keyword). Available variants are vitb16 (default), resnet50, and swint, each backed by its own ImageNet-pretrained checkpoint that auto-downloads from Google Drive. The previous vitl16 arch has been removed as no upstream checkpoint exists. Users who need other pre-training initializations (CLIP, Prithvi, SatCLIP, etc.) can still point RS_EMBED_WILDSAT_CKPT at a local checkpoint. The RS_EMBED_WILDSAT_ARCH environment variable is still respected as a fallback but is now overridden by variant when both are set.
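
The precedence described above (an explicit variant wins over RS_EMBED_WILDSAT_ARCH, which in turn wins over the vitb16 default) can be sketched as follows. This is a minimal illustration, not rs-embed's actual code: resolve_wildsat_variant is a hypothetical helper, and only the variant names, the default, and the environment variable name come from the note above.

```python
import os

# Variant names and the default are taken from the release note above.
WILDSAT_VARIANTS = ("vitb16", "resnet50", "swint")
DEFAULT_VARIANT = "vitb16"


def resolve_wildsat_variant(variant=None, env=None):
    """Hypothetical helper: choose the WildSAT variant.

    Precedence: explicit ``variant`` > RS_EMBED_WILDSAT_ARCH > default.
    """
    env = os.environ if env is None else env
    chosen = variant or env.get("RS_EMBED_WILDSAT_ARCH") or DEFAULT_VARIANT
    if chosen not in WILDSAT_VARIANTS:
        # vitl16 lands here, since it is no longer an available variant.
        raise ValueError(
            f"unknown wildsat variant {chosen!r}; expected one of {WILDSAT_VARIANTS}"
        )
    return chosen
```

Passing env as a plain dict keeps the sketch testable without touching the real process environment.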

Changed

  • GEEProvider initialization now prefers an explicit Google Cloud project when one is provided via project=..., EE_PROJECT, or GOOGLE_CLOUD_PROJECT, but no longer hard-requires rs-embed callers to pass one explicitly. When no project is supplied, rs-embed now lets ee.Initialize() and geemap.ee_initialize() resolve Earth Engine's configured default project first, while still surfacing a clear error message when authentication is missing or no usable Cloud/quota project is configured.
  • thor now defaults RS_EMBED_THOR_PATCH_SIZE to 8 instead of 16, increasing the default token-grid density while keeping RS_EMBED_THOR_IMG=288. THOR also now defaults to a bounded native_snap preprocessing policy for ordinary resize-mode inputs: near-square inputs can keep a snapped native side when they stay within configured side/token limits, while tiled inputs still force fixed per-tile resize so stitched grids remain geometrically stable. The THOR model page now documents the patch-size/image-size coupling, native-snap limits, and concrete environment-variable examples for common tuning patterns.
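
The effect of the new THOR patch-size default on token-grid density is plain arithmetic. The sketch below assumes a square input split into non-overlapping square patches, with the image side an exact multiple of the patch side; that is the usual ViT-style layout and an assumption here, not something the note states. token_grid_side is a hypothetical helper.

```python
def token_grid_side(img_size: int, patch_size: int) -> int:
    """Tokens per side for a square image cut into non-overlapping patches."""
    if patch_size <= 0 or img_size % patch_size != 0:
        raise ValueError(
            f"img_size={img_size} must be a positive multiple of patch_size={patch_size}"
        )
    return img_size // patch_size


# RS_EMBED_THOR_IMG=288 with the old and new RS_EMBED_THOR_PATCH_SIZE defaults.
old_side = token_grid_side(288, 16)   # 18 tokens per side
new_side = token_grid_side(288, 8)    # 36 tokens per side
old_tokens = old_side * old_side      # 324 tokens in total
new_tokens = new_side * new_side      # 1296 tokens in total
```

Under these assumptions, halving the patch size doubles the grid side and quadruples the token count, which is worth keeping in mind for memory and runtime.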

Fixed

  • anysat and satmaepp_s2_10b now validate the variant keyword in a single place. Previously, _normalize_*_variant accepted a wider alias set and a second ModelError was raised deeper in the runtime resolver: passing variant="tiny" / "small" to anysat, or variant="base" to satmaepp_s2_10b, was first silently normalized and then rejected with a confusingly located "currently exposes only variant=..." error. The normalize helpers now accept only the variants that actually map to a wired checkpoint (base for anysat, large for satmaepp_s2_10b) and raise an immediate, descriptive ModelError for anything else; the duplicate runtime guards have been removed. The satmaepp_s2_10b env-var path (RS_EMBED_SATMAEPP_S2_MODEL_FN) now also raises a clear error for unknown model_fn values instead of silently producing a variant=None runtime config. The describe() output for both adapters already advertised choices: ["base"] / choices: ["large"], so the schema side was already correct; this fix makes the validation code match it.
  • BBox.validate() now enforces geographic bounds: longitudes must be in [-180, 180] and latitudes in [-90, 90]. Out-of-range coordinates previously passed validation and caused confusing downstream errors from the GEE provider.
  • describe_model() now returns a cached copy of the embedder's describe() output instead of instantiating a new embedder class on every call. The cache is keyed by canonical model name and is cleared by reset_runtime(). The returned dict is always a shallow copy so callers cannot mutate the cached entry.
  • fetch_api_side_inputs() now wraps per-spatial fetch errors in a ModelError that includes the spatial index and the original exception, making it easier to pinpoint which location caused a failure when running get_embeddings_batch() with input_prep="tile" or "auto".
  • run_embedding_request() now uses strict=True in the zip of spatials and prefetched inputs. A length mismatch between the two lists now raises immediately instead of silently truncating the result.
  • Loading checkpoint arrays during combined-export resume now emits a warnings.warn instead of silently swallowing the exception. Users will see a clear message indicating that array loading failed and that all inputs will be re-fetched.
  • _write_per_item_chunk no longer accesses the private _shutdown attribute of ThreadPoolExecutor to guard against double-shutdown. The outer finally block now relies on the documented idempotency of ThreadPoolExecutor.shutdown() instead, removing a fragile dependency on CPython internals that could break on future Python versions.
  • sensor_key() no longer applies int() truncation to scale_m and cloudy_pct when building the embedder instance cache key. Previously, float values such as 10.1 and 10.9 were both mapped to 10, allowing two sensors with different resolutions to share a cached embedder instance incorrectly. The raw field values are now used directly.
  • _run_per_item now closes all progress bars (main and per-model) inside the finally block of the chunk-pipeline loop. Previously the cleanup ran after the try/finally, so an unhandled exception (e.g. continue_on_error=False) would leave progress bars open and leak display resources in notebook environments.

v0.1.2

03 Apr 06:09

This release rolls up upstream-alignment work and correctness fixes that may change default embedding behavior for some model adapters compared with 0.1.1. Users who need strict reproducibility across versions should review the model-specific changes below and pin explicit options where needed.

Changed

  • Standardised NumPy docstrings across all public functions and classes in export.py, inspect.py, writers.py, and the tools/ and providers/ layers. No behaviour changes.

Added

  • Versioned documentation with a version selector powered by mike and MkDocs Material. Each release tag deploys a pinned version; pushes to main update a dev alias. The mike, mkdocs-material, and pymdown-extensions packages are now included in the [dev] optional group.

  • load_export(path) reader API that loads any export produced by export_batch(...) — both combined (single file) and per-item (directory) layouts — and returns a structured ExportResult. Failed points are NaN-filled rather than dropped, partial model runs are surfaced via status="partial", and ExportResult.embedding(model) provides a typed shortcut to the embedding array.
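
Because failed points are NaN-filled rather than dropped, row positions stay aligned with the original point list, and callers can mask failures after loading. The sketch below shows that masking step only. It uses nested Python lists as a stand-in for the embedding array, whose concrete type is not stated in the note above, and valid_row_indices is a hypothetical helper rather than part of the rs-embed API.

```python
import math


def valid_row_indices(rows):
    """Return indices of rows that contain no NaN, i.e. points that succeeded."""
    return [
        i for i, row in enumerate(rows)
        if not any(math.isnan(value) for value in row)
    ]


# Stand-in data: the second point failed and was NaN-filled.
embeddings = [
    [0.1, 0.2],
    [float("nan"), float("nan")],
    [0.3, 0.4],
]
kept = valid_row_indices(embeddings)
```

Keeping the indices, instead of filtering the rows directly, lets the surviving embeddings be joined back to the original points.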

Changed

  • anysat now defaults grid output to native dense features while keeping pooled output on patch-grid pooling by default, with new AnySat-specific switches for grid_feature_mode (dense/patch) and pooled_source (patch/tile).
  • galileo now aligns more closely with the upstream NASA Harvest runtime: grid output prefers Galileo's own patch-level token averaging path, automatic NDVI derivation has been removed, and the default normalization mode is now none with an official_stats option for upstream pretraining statistics.
  • satvision_toa now uses the vendored official SatVision runtime as its only model path and narrows provider-side preprocessing to the default MODIS proxy route (MOD09GA reflectance + MOD21A1D thermal proxy). Custom collections are no longer treated as implicit GEE fallbacks; callers should pass calibrated input_chw directly for non-default inputs.

Fixed

  • device="auto" now correctly selects MPS on Apple Silicon instead of silently falling back to CPU. Follows the PyTorch-recommended priority (cuda > mps > cpu), giving an approximately 4x speedup on Apple M-series hardware for all API calls that use the default device.
  • galileo month overrides now use the official zero-based month indexing expected by Galileo embeddings, fixing the previous one-month offset in RS_EMBED_GALILEO_MONTH.
  • satvision_toa grid output now consistently extracts spatial features from the official-style SwinV2 path instead of misinterpreting pooled vectors as token grids, and the tightened fetch path records explicit proxy provenance in metadata.
  • scalemae now follows the official feature-extraction path more closely: the adapter unwraps common rshf wrappers to call the nested ScaleMAE backbone's forward_features(...) instead of falling back to wrapper forward(), uses ImageNet eval preprocessing (Resize(short side) + CenterCrop + Normalize) with effective post-preprocess input_res_m, and fixes the declarative input metadata to reflect raw Sentinel-2 SR inputs plus adapter-managed preprocessing.
  • satmaepp and satmaepp_s2_10b now align more closely with the official SatMAE++ preprocessing paths. The RGB adapter now defaults to rgb channel order for the published fMoW-RGB checkpoint and no longer pre-resizes provider/input overrides before the official eval transform, while the Sentinel-2 10-band adapter no longer sanitizes inputs with adapter-side clip / nan_to_num before the source-style SentinelNormalize -> ToTensor -> Resize(short side) -> CenterCrop pipeline.
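
The device="auto" priority from the first fix above (cuda > mps > cpu) can be sketched as a pure function. pick_device is a hypothetical helper, not rs-embed's implementation. The availability flags are passed in explicitly so the sketch runs without PyTorch installed; in a real setup they would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Resolve an "auto" device request using the cuda > mps > cpu priority."""
    if cuda_available:
        return "cuda"
    if mps_available:
        # Apple Silicon: previously this case fell through to "cpu".
        return "mps"
    return "cpu"
```

On an Apple M-series machine without CUDA, pick_device(False, True) returns "mps", which is the case the fix addresses.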

v0.1.1

01 Apr 18:34

Added

  • Automated pull request changelog enforcement with a skip-changelog escape hatch for docs, tests, CI, and other internal-only changes.
  • Tag-driven GitHub Release publishing that uses the matching CHANGELOG.md section as the release notes.
  • Trusted Publishing release automation for PyPI and TestPyPI, including a manual TestPyPI dry run and install smoke test.

Changed

  • The contribution and release workflow now treats CHANGELOG.md as the canonical source for user-visible release notes.
  • The tag-triggered release flow now validates src/rs_embed/_version.py, publishes to PyPI, and only then creates the GitHub Release.
  • The tag-triggered release flow now validates the matching CHANGELOG.md entry before publishing to PyPI, so a missing release-notes section fails early instead of after package upload.
  • The base package installation now includes the Copernicus GeoTIFF runtime (tifffile and imagecodecs) instead of requiring a separate extra.
  • The public docs now default to pip install rs-embed for published releases while keeping editable installs documented for repository development.
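
The "validate the matching CHANGELOG.md entry before publishing" step above can be sketched as a small extraction function that fails early when the section is missing or empty. This is an illustration only: extract_release_notes is a hypothetical helper, and the "## [X.Y.Z]" heading format is an assumption (the Keep a Changelog convention), since the notes above do not show the file's actual layout.

```python
import re


def extract_release_notes(changelog: str, version: str) -> str:
    """Return the body of the section for ``version`` or raise LookupError."""
    # Assumed heading shapes: "## [0.1.1]", "## 0.1.1", "## [0.1.1] - <date>".
    heading = re.compile(
        r"^## \[?v?" + re.escape(version) + r"\]?(?:[ \t].*)?$",
        re.MULTILINE,
    )
    match = heading.search(changelog)
    if match is None:
        raise LookupError(f"no CHANGELOG section found for {version}")
    rest = changelog[match.end():]
    # The section ends at the next second-level heading, if there is one.
    next_heading = re.search(r"^## ", rest, re.MULTILINE)
    body = rest[: next_heading.start()] if next_heading else rest
    body = body.strip()
    if not body:
        raise LookupError(f"CHANGELOG section for {version} is empty")
    return body
```

Running a check like this before the upload step means a missing release-notes section stops the pipeline before anything reaches PyPI.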

Deprecated

Removed

Fixed

  • The TestPyPI smoke test now verifies package importability and the rs-embed CLI entry point, not just installability and version metadata.