Releases: cybergis/rs-embed
v0.1.3
Added
- `wildsat` now supports `variant` selection via `model_config={"variant": "..."}` (or the `variant=` keyword). Available variants are `vitb16` (default), `resnet50`, and `swint`, each backed by its own ImageNet-pretrained checkpoint that auto-downloads from Google Drive. The previous `vitl16` arch has been removed as no upstream checkpoint exists. Users who need other pre-training initializations (CLIP, Prithvi, SatCLIP, etc.) can still point `RS_EMBED_WILDSAT_CKPT` at a local checkpoint. The `RS_EMBED_WILDSAT_ARCH` environment variable is still respected as a fallback but is now overridden by `variant` when both are set.
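The precedence described above (explicit `variant`, then `RS_EMBED_WILDSAT_ARCH`, then the `vitb16` default) can be sketched as follows. This is a minimal illustration, not the rs-embed implementation: `resolve_wildsat_variant` is a hypothetical helper, and `ValueError` stands in for whatever error type the library actually raises.

```python
import os

# Variant names taken from the release note above.
WILDSAT_VARIANTS = ("vitb16", "resnet50", "swint")


def resolve_wildsat_variant(variant=None, env=None):
    """Hypothetical helper: pick a variant using the documented precedence."""
    env = os.environ if env is None else env
    # An explicit variant wins; the env var is only a fallback.
    chosen = variant or env.get("RS_EMBED_WILDSAT_ARCH") or "vitb16"
    if chosen not in WILDSAT_VARIANTS:
        # Stand-in error type; the real adapter may raise something else.
        raise ValueError(
            f"unknown wildsat variant {chosen!r}; expected one of {WILDSAT_VARIANTS}"
        )
    return chosen


both_set = resolve_wildsat_variant("swint", {"RS_EMBED_WILDSAT_ARCH": "resnet50"})
env_only = resolve_wildsat_variant(None, {"RS_EMBED_WILDSAT_ARCH": "resnet50"})
default = resolve_wildsat_variant(None, {})
```

Under these assumptions, a request for the removed `vitl16` arch fails immediately instead of falling through to a missing checkpoint.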
Changed
- `GEEProvider` initialization now prefers an explicit Google Cloud project when one is provided via `project=...`, `EE_PROJECT`, or `GOOGLE_CLOUD_PROJECT`, but no longer hard-requires rs-embed callers to pass one explicitly. When no project is supplied, rs-embed now lets `ee.Initialize()` and `geemap.ee_initialize()` resolve Earth Engine's configured default project first, while still surfacing a clear error message when authentication is missing or no usable Cloud/quota project is configured.
- `thor` now defaults `RS_EMBED_THOR_PATCH_SIZE` to `8` instead of `16`, increasing the default token-grid density while keeping `RS_EMBED_THOR_IMG=288`. THOR also now defaults to a bounded `native_snap` preprocessing policy for ordinary resize-mode inputs: near-square inputs can keep a snapped native side when they stay within configured side/token limits, while tiled inputs still force fixed per-tile resize so stitched grids remain geometrically stable. The THOR model page now documents the patch-size/image-size coupling, native-snap limits, and concrete environment-variable examples for common tuning patterns.
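To make the THOR patch-size change concrete, the arithmetic below shows how the token grid grows when the patch size drops from 16 to 8 at a fixed image size of 288. It assumes a plain ViT-style patchify where the image side must be divisible by the patch size; `token_grid` is a hypothetical helper, and THOR's actual tokenization may differ in detail.

```python
def token_grid(img_size, patch_size):
    """Return (tokens per side, total tokens) for a square image.

    Assumes non-overlapping square patches, as in a standard ViT.
    """
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    side = img_size // patch_size
    return side, side * side


old_side, old_tokens = token_grid(288, 16)  # previous default patch size
new_side, new_tokens = token_grid(288, 8)   # new default patch size
ratio = new_tokens // old_tokens
```

Halving the patch size doubles the grid side (18 to 36) and quadruples the token count (324 to 1296), which is worth keeping in mind for memory and runtime when relying on the new default.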
Fixed
- `anysat` and `satmaepp_s2_10b` now validate the `variant` keyword in a single place instead of accepting a wider alias set in `_normalize_*_variant` and then raising a second `ModelError` deeper in the runtime resolver. Previously, passing `variant="tiny"`/`"small"` to `anysat` or `variant="base"` to `satmaepp_s2_10b` would first be silently normalized and then rejected with a confusingly-located "currently exposes only variant=..." error. The normalize helpers now only accept the variants that actually map to a wired checkpoint (`anysat` → `base`, `satmaepp_s2_10b` → `large`), raise an immediate and descriptive `ModelError` for anything else, and the duplicate runtime guards have been removed. The `satmaepp_s2_10b` env-var path (`RS_EMBED_SATMAEPP_S2_MODEL_FN`) now also raises a clear error for unknown `model_fn` values instead of silently producing a `variant=None` runtime config. The `describe()` output for both adapters already advertises `choices: ["base"]` / `choices: ["large"]`, so the schema side was already correct; this fix just makes the validation code match it.
- `BBox.validate()` now enforces geographic bounds: longitudes must be in `[-180, 180]` and latitudes in `[-90, 90]`. Out-of-range coordinates previously passed validation and caused confusing downstream errors from the GEE provider.
- `describe_model()` now returns a cached copy of the embedder's `describe()` output instead of instantiating a new embedder class on every call. The cache is keyed by canonical model name and is cleared by `reset_runtime()`. The returned dict is always a shallow copy so callers cannot mutate the cached entry.
- `fetch_api_side_inputs()` now wraps per-spatial fetch errors in a `ModelError` that includes the spatial index and the original exception, making it easier to pinpoint which location caused a failure when running `get_embeddings_batch()` with `input_prep="tile"` or `"auto"`.
- `run_embedding_request()` now uses `strict=True` in the `zip` of spatials and prefetched inputs. A length mismatch between the two lists now raises immediately instead of silently truncating the result.
- Loading checkpoint arrays during combined-export resume now emits a `warnings.warn` instead of silently swallowing the exception. Users will see a clear message indicating that array loading failed and that all inputs will be re-fetched.
- `_write_per_item_chunk` no longer accesses the private `_shutdown` attribute of `ThreadPoolExecutor` to guard against double-shutdown. The outer `finally` block now relies on the documented idempotency of `ThreadPoolExecutor.shutdown()` instead, removing a fragile dependency on CPython internals that could break on future Python versions.
- `sensor_key()` no longer applies `int()` truncation to `scale_m` and `cloudy_pct` when building the embedder instance cache key. Previously, float values such as `10.1` and `10.9` were both mapped to `10`, allowing two sensors with different resolutions to share a cached embedder instance incorrectly. The raw field values are now used directly.
- `_run_per_item` now closes all progress bars (main and per-model) inside the `finally` block of the chunk-pipeline loop. Previously the cleanup ran after the `try/finally`, so an unhandled exception (e.g. `continue_on_error=False`) would leave progress bars open and leak display resources in notebook environments.
v0.1.2
This release rolls up upstream-alignment work and correctness fixes that may change default embedding behavior for some model adapters compared with 0.1.1. Users who need strict reproducibility across versions should review the model-specific changes below and pin explicit options where needed.
Changed
- Standardised NumPy docstrings across all public functions and classes in `export.py`, `inspect.py`, `writers.py`, and the `tools/` and `providers/` layers. No behaviour changes.
Added
- Versioned documentation with a version selector powered by `mike` and MkDocs Material. Each release tag deploys a pinned version; pushes to `main` update a `dev` alias. The `mike`, `mkdocs-material`, and `pymdown-extensions` packages are now included in the `[dev]` optional group.
- `load_export(path)` reader API that loads any export produced by `export_batch(...)`, in either the combined (single file) or per-item (directory) layout, and returns a structured `ExportResult`. Failed points are NaN-filled rather than dropped, partial model runs are surfaced via `status="partial"`, and `ExportResult.embedding(model)` provides a typed shortcut to the embedding array.
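Because failed points are NaN-filled rather than dropped, downstream code usually needs to separate failed rows from valid ones while keeping indices aligned with the original request. The sketch below illustrates that step on a stand-in nested list; it assumes one row per requested point, which is an assumption about the layout and not something the release note specifies. `failed_rows` and `valid_rows` are hypothetical helpers.

```python
import math

nan = float("nan")

# Stand-in for an embedding array: one row per requested point (assumed).
# The second point failed, so its row is NaN-filled instead of missing.
embeddings = [
    [0.12, 0.40, 0.33],
    [nan, nan, nan],
    [0.05, 0.91, 0.27],
]


def failed_rows(rows):
    """Return the indices of rows that contain at least one NaN."""
    return [i for i, row in enumerate(rows) if any(math.isnan(v) for v in row)]


def valid_rows(rows):
    """Return only the rows without NaN values, preserving order."""
    bad = set(failed_rows(rows))
    return [row for i, row in enumerate(rows) if i not in bad]


failed = failed_rows(embeddings)
kept = valid_rows(embeddings)
```

With a 2-D NumPy array, the equivalent mask would typically be `np.isnan(arr).any(axis=1)`. Keeping failed rows in place means row `i` always corresponds to point `i`, at the cost of having to mask before computing statistics.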
Changed
- `anysat` now defaults `grid` output to native `dense` features while keeping pooled output on patch-grid pooling by default, with new AnySat-specific switches for `grid_feature_mode` (`dense`/`patch`) and `pooled_source` (`patch`/`tile`).
- `galileo` now aligns more closely with the upstream NASA Harvest runtime: `grid` output prefers Galileo's own patch-level token averaging path, automatic NDVI derivation has been removed, and the default normalization mode is now `none` with an `official_stats` option for upstream pretraining statistics.
- `satvision_toa` now uses the vendored official SatVision runtime as its only model path and narrows provider-side preprocessing to the default MODIS proxy route (`MOD09GA` reflectance + `MOD21A1D` thermal proxy). Custom collections are no longer treated as implicit GEE fallbacks; callers should pass calibrated `input_chw` directly for non-default inputs.
Fixed
- `device="auto"` now correctly selects MPS on Apple Silicon instead of silently falling back to CPU. Follows the PyTorch-recommended priority (`cuda > mps > cpu`), giving an approximately 4x speedup on Apple M-series hardware for all API calls that use the default device.
- `galileo` month overrides now use the official zero-based month indexing expected by Galileo embeddings, fixing the previous one-month offset in `RS_EMBED_GALILEO_MONTH`.
- `satvision_toa` `grid` output now consistently extracts spatial features from the official-style SwinV2 path instead of misinterpreting pooled vectors as token grids, and the tightened fetch path records explicit proxy provenance in metadata.
- `scalemae` now follows the official feature-extraction path more closely: the adapter unwraps common `rshf` wrappers to call the nested ScaleMAE backbone's `forward_features(...)` instead of falling back to wrapper `forward()`, uses ImageNet eval preprocessing (`Resize(short side) + CenterCrop + Normalize`) with effective post-preprocess `input_res_m`, and fixes the declarative input metadata to reflect raw Sentinel-2 SR inputs plus adapter-managed preprocessing.
- `satmaepp` and `satmaepp_s2_10b` now align more closely with the official SatMAE++ preprocessing paths. The RGB adapter now defaults to `rgb` channel order for the published fMoW-RGB checkpoint and no longer pre-resizes provider/input overrides before the official eval transform, while the Sentinel-2 10-band adapter no longer sanitizes inputs with adapter-side `clip`/`nan_to_num` before the source-style `SentinelNormalize -> ToTensor -> Resize(short side) -> CenterCrop` pipeline.
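The `cuda > mps > cpu` priority for `device="auto"` can be sketched without importing PyTorch by passing the availability flags in explicitly. `pick_device` is a hypothetical helper, not the rs-embed function; in a PyTorch environment the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(requested, cuda_available, mps_available):
    """Hypothetical helper: resolve "auto" using the cuda > mps > cpu priority."""
    if requested != "auto":
        # An explicit device string is passed through unchanged.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


apple_silicon = pick_device("auto", cuda_available=False, mps_available=True)
nvidia_box = pick_device("auto", cuda_available=True, mps_available=False)
no_accelerator = pick_device("auto", cuda_available=False, mps_available=False)
explicit = pick_device("cpu", cuda_available=True, mps_available=True)
```

Taking the flags as arguments keeps the priority logic testable on machines that have neither accelerator.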
v0.1.1
Added
- Automated pull request changelog enforcement with a `skip-changelog` escape hatch for docs, tests, CI, and other internal-only changes.
- Tag-driven GitHub Release publishing that uses the matching `CHANGELOG.md` section as the release notes.
- Trusted Publishing release automation for PyPI and TestPyPI, including a manual TestPyPI dry run and install smoke test.
Changed
- The contribution and release workflow now treats `CHANGELOG.md` as the canonical source for user-visible release notes.
- The tag-triggered release flow now validates `src/rs_embed/_version.py`, publishes to PyPI, and only then creates the GitHub Release.
- The tag-triggered release flow now validates the matching `CHANGELOG.md` entry before publishing to PyPI, so a missing release-notes section fails early instead of after package upload.
- The base package installation now includes the Copernicus GeoTIFF runtime (`tifffile` and `imagecodecs`) instead of requiring a separate extra.
- The public docs now default to `pip install rs-embed` for published releases while keeping editable installs documented for repository development.
Deprecated
Removed
Fixed
- The TestPyPI smoke test now verifies package importability and the `rs-embed` CLI entry point, not just installability and version metadata.