39 changes: 39 additions & 0 deletions .bob/rules-advance/AGENTS.md
@@ -0,0 +1,39 @@
# Project Advance Coding Rules (Non-Obvious Only)

**Copyright notice MANDATORY:** All Python files, including empty ones (`__init__.py`), must start with the same copyright notice (copy it from any existing file).

## Import Requirements

**Future annotations MANDATORY:** Every Python file MUST have `from __future__ import annotations` on line 15 (the first import after the copyright header). This is enforced project-wide for Python 3.8+ compatibility.

## Custom Utilities (Must Use)

**Unset sentinel:** In most cases, use `_UNSET` from `astrapy.utils.unset` instead of `None` as the default for optional parameters. This singleton distinguishes "not provided" from "explicitly None", which is critical for API parameter handling because the two cases have different semantics.

**Decimal handling:** JSON serialization and deserialization of API payloads and responses is customized in `astrapy/utils/api_commander.py` so that Decimal values keep all their significant digits. See the "decimal-aware" parse/encode methods of the `APICommander` class. This interacts non-trivially with the data type the user actually sees when reading (e.g. floats), which is configured through the "SerdesOptions".
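To see why decimal-aware parsing matters, here is a small standard-library sketch. It only uses the `parse_float` hook of `json.loads` and is not the `APICommander` code; the `RAW` payload is made up for the example:

```python
from __future__ import annotations

import json
from decimal import Decimal

# Made-up response body with more digits than a float can hold.
RAW = '{"price": 0.12345678901234567890123}'

# Default parsing goes through float and silently drops digits.
as_float = json.loads(RAW)

# Passing parse_float=Decimal hands the literal text to Decimal,
# so every significant digit survives.
as_decimal = json.loads(RAW, parse_float=Decimal)

digits_preserved = str(as_decimal["price"]) == "0.12345678901234567890123"
float_is_lossy = Decimal(str(as_float["price"])) != as_decimal["price"]
```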

**String enums:** Use the custom `StrEnum` from `astrapy/utils/str_enum.py` instead of the standard-library `Enum`. It has a special `_name_lookup` method for case-insensitive lookups.
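The idea of a case-insensitive string enum can be sketched with the standard `_missing_` hook. This is an assumption-laden illustration: `CaseInsensitiveStrEnum` and `SortMode` are hypothetical names, and the real `StrEnum` in `astrapy/utils/str_enum.py` may be implemented quite differently:

```python
from __future__ import annotations

from enum import Enum


class CaseInsensitiveStrEnum(str, Enum):
    """Hypothetical sketch, not the class shipped in astrapy."""

    @classmethod
    def _missing_(cls, value: object) -> CaseInsensitiveStrEnum | None:
        # Enum calls this hook only after the exact value lookup has failed.
        if isinstance(value, str):
            lowered = value.lower()
            for member in cls:
                if lowered in (member.value.lower(), member.name.lower()):
                    return member
        # Returning None lets Enum raise its usual ValueError.
        return None


class SortMode(CaseInsensitiveStrEnum):
    ASCENDING = "ascending"
    DESCENDING = "descending"
```

Here `SortMode("Descending")` resolves to `SortMode.DESCENDING`, and members still compare equal to their plain string values.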

## Type Hints (Strict)

**Mypy strict mode:** All functions require complete type hints. No implicit reexports, no untyped calls, no untyped decorators. This is enforced by the mypy settings in `pyproject.toml`.

**Forward references:** Use `from __future__ import annotations` to enable forward references without quotes.
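A short sketch of what the future import buys. The `Node` class is a hypothetical example, not project code:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    value: int
    # `Node` is still being defined here; the annotation is stored as a
    # string and not evaluated eagerly, so no quotes are needed.
    next_node: Node | None = None

    def append(self, value: int) -> Node:
        """Link a new node after this one and return it."""
        new_node = Node(value)
        self.next_node = new_node
        return new_node
```

Without the import, the `Node | None` annotation would be evaluated while the class body runs and fail with a `NameError`.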

## Testing Patterns

**Async/sync naming:** Test files MUST end with an `_async` or `_sync` suffix. This is a strict organizational convention, not optional.

**Blockbuster fixture:** All tests automatically use the `blockbuster` fixture (autouse=True) to detect blocking I/O in async code. Specific exceptions are allowed in `tests/conftest.py` lines 86-89.

**Environment variables required:** Even unit tests require environment variables from `tests/env_templates/*.base.template`. Tests will fail without them.

## Non-Standard Patterns

**Circular import workaround:** `astrapy/__init__.py` imports Database/Table at the END of the module (lines 62-63) to avoid circular import issues. Don't move these imports.

**Docker Compose cleanup:** If `DOCKER_COMPOSE_LOCAL_DATA_API="yes"`, tests auto-start Docker Compose via `tests/preprocess_env.py` but do NOT clean it up automatically.

## MCP and Browser Access

This mode has access to MCP tools and browser capabilities for enhanced development workflows.
31 changes: 31 additions & 0 deletions .bob/rules-ask/AGENTS.md
@@ -0,0 +1,31 @@
# Project Documentation Rules (Non-Obvious Only)

## Project Structure

**Test organization:** Tests are split into three groups - "base" (general functionality, unit/integration split), "vectorize" (extensive provider testing), and "admin" (admin operations). CI only runs "base" tests.

**Test targets:** Tests can run against three different Data API targets: Astra (cloud), Local (user-supplied HCD/DSE), or DockerCompose (auto-started by tests). Each requires different environment variables from `tests/env_templates/`.

**Python version caveat:** A few integration tests (those that access the database through a direct CQL connection) don't work on Python 3.13+ due to cassandra-driver dependency issues (libev/asyncore) and are therefore skipped on newer Python versions. The package itself supports Python 3.13+.

## Module Organization

**Circular import pattern:** `astrapy/__init__.py` imports Database/Table at the END to avoid circular dependencies - this is intentional and documented in comments.

**Data types location:** Custom data types (DataAPIVector, DataAPIMap, DataAPISet, DataAPIDate, etc.) are in `astrapy/data_types/`. They wrap, augment or replace standard Python types: they add special serialization for the Data API, lift range limitations (e.g. for `datetime`) and provide exactly the behaviour the Data API requires.

**Utils organization:** `astrapy/utils/` contains critical utilities such as the `_UNSET` sentinel, the custom `StrEnum`, the Decimal encoders and the API commander. `astrapy/data/utils/` has data-specific converters.

## Testing Environment

**Environment variables mandatory:** Even unit tests require environment variables from `tests/env_templates/*.base.template` - tests will fail without them, not skip.

**Docker Compose behavior:** If `DOCKER_COMPOSE_LOCAL_DATA_API="yes"`, tests auto-start Docker Compose but do NOT clean it up automatically - manual cleanup required.

**Blockbuster fixture:** All tests use the `blockbuster` fixture (autouse=True) to detect blocking I/O in async code - specific exceptions are allowed in `tests/conftest.py`.

## Build System

**Package manager:** Uses `uv` for dependency management and virtual environments, not pip or poetry.

**Coverage tracking:** Unit and integration tests write separate coverage files (`.coverage.unit`, `.coverage.integration`) - combine with `make coverage`.
33 changes: 33 additions & 0 deletions .bob/rules-code/AGENTS.md
@@ -0,0 +1,33 @@
# Project Coding Rules (Non-Obvious Only)

**Copyright notice MANDATORY:** All Python files, including empty ones (`__init__.py`), must start with the same copyright notice (copy it from any existing file).

## Import Requirements

**Future annotations MANDATORY:** Every Python file MUST have `from __future__ import annotations` on line 15 (the first import after the copyright header). This is enforced project-wide for Python 3.8+ compatibility. It also enables forward references without quotes in type hints.

## Custom Utilities (Must Use)

**Unset sentinel:** In most cases, use `_UNSET` from `astrapy.utils.unset` instead of `None` as the default for optional parameters. This singleton distinguishes "not provided" from "explicitly None", which is critical for API parameter handling.

**Decimal handling:** JSON serialization and deserialization of API payloads and responses is customized in `astrapy/utils/api_commander.py` so that Decimal values keep all their significant digits. See the "decimal-aware" parse/encode methods of the `APICommander` class. This interacts non-trivially with the data type the user actually sees when reading (e.g. floats), which is configured through the "SerdesOptions".

**String enums:** Use the custom `StrEnum` from `astrapy/utils/str_enum.py` instead of the standard-library `Enum`. It has a special `_name_lookup` method for case-insensitive lookups.

## Type Hints (Strict)

**Mypy strict mode:** All functions require complete type hints. No implicit reexports, no untyped calls, no untyped decorators. Check pyproject.toml for details.

## Testing Patterns

**Async/sync naming:** Test files MUST end with an `_async` or `_sync` suffix. This is a strict organizational convention, not optional.

**Blockbuster fixture:** All tests automatically use the `blockbuster` fixture (autouse=True) to detect blocking I/O in async code. Specific exceptions are allowed in `tests/conftest.py` lines 86-89.

**Environment variables required:** Even unit tests require environment variables from `tests/env_templates/*.base.template`. Tests will fail without them.

## Non-Standard Patterns

**Circular import workaround:** `astrapy/__init__.py` imports Database/Table at the END of the module (lines 62-63) to avoid circular import issues. Don't move these imports.

**Docker Compose cleanup:** If `DOCKER_COMPOSE_LOCAL_DATA_API="yes"`, tests auto-start Docker Compose via `tests/preprocess_env.py` but do NOT clean it up automatically.
33 changes: 33 additions & 0 deletions .bob/rules-plan/AGENTS.md
@@ -0,0 +1,33 @@
# Project Architecture Rules (Non-Obvious Only)

## Core Architecture

**Dual API pattern:** Project provides both sync and async versions of all major classes (Collection/AsyncCollection, Database/AsyncDatabase, Table/AsyncTable) - they share similar interfaces but are separate implementations (i.e. no thread-wrapping for faux async, ever).

**Circular dependency resolution:** `astrapy/__init__.py` imports Database/Table at the END of the module to break circular dependencies between client, database, and collection/table classes.

## Data Flow

**Unset vs None distinction:** The `_UNSET` sentinel from `astrapy.utils.unset` is used throughout to distinguish "parameter not provided" from "explicitly set to None" - critical for API calls where omitting a parameter has different semantics than passing None.

**Decimal handling:** JSON serialization and deserialization of API payloads and responses is customized in `astrapy/utils/api_commander.py` so that Decimal values keep all their significant digits. See the "decimal-aware" parse/encode methods of the `APICommander` class. This interacts non-trivially with the data type the user actually sees when reading (e.g. floats), which is configured through the "SerdesOptions".

**Type conversion layers:** `astrapy/data/utils/table_converters.py` contains complex type conversion logic for Table operations - handles Python types to/from Data API representations including special handling for vectors, maps, sets, UDTs, dates, times, timestamps.

**Return type classes:** Classes that represent (part of) a Data API response have the same shape as the response itself. Their standardized `_from_dict` method MUST emit a warning when it encounters unexpected additional fields, but still succeed (there are utilities for that). In addition, the original input dict is generally kept in a side field (`raw_response`, `raw_input` or similar).
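The contract above can be sketched as follows. `ExampleDescriptor` and its fields are hypothetical and written only for illustration; the real classes rely on the project's own warning utilities rather than on this inline check:

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass
from typing import Any


@dataclass
class ExampleDescriptor:
    """Hypothetical response class mirroring the shape of an API payload."""

    name: str
    column: str
    # Side field holding the untouched input dict.
    raw_input: dict[str, Any] | None = None

    @classmethod
    def _from_dict(cls, raw_dict: dict[str, Any]) -> ExampleDescriptor:
        known_keys = {"name", "column"}
        unexpected = sorted(raw_dict.keys() - known_keys)
        if unexpected:
            # Warn, but keep going: unknown fields must not break parsing.
            warnings.warn(
                f"Unexpected fields in ExampleDescriptor payload: {unexpected}"
            )
        return cls(
            name=raw_dict["name"],
            column=raw_dict["column"],
            raw_input=raw_dict,
        )
```

Parsing a payload with an extra key still returns a usable object, and the extra key remains reachable through `raw_input`.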

## Testing Architecture

**Three test groups:** Tests organized as "base" (CI-tested general functionality), "vectorize" (manual provider testing), and "admin" (manual admin operations). Only "base" runs in CI.

**Environment-dependent behavior:** Tests adapt to three different targets (Astra/Local/DockerCompose) based on environment variables - different features available on each.

**Blockbuster integration:** All tests use the `blockbuster` fixture (autouse=True) to detect blocking I/O in async code - this prevents accidental blocking calls in async contexts.

## Non-Standard Patterns

**StrEnum with lookup:** The custom `StrEnum` class in `astrapy/utils/str_enum.py` has a special `_name_lookup` method for case-insensitive enum lookups - not available in the standard-library `Enum`.

**Docker Compose lifecycle:** If `DOCKER_COMPOSE_LOCAL_DATA_API="yes"`, tests auto-start Docker Compose via `tests/preprocess_env.py` but do NOT clean it up - intentional for debugging.

**Python version caveat:** A few integration tests (those that access the database through a direct CQL connection) don't work on Python 3.13+ due to cassandra-driver dependency issues (libev/asyncore) and are therefore skipped on newer Python versions. The package itself supports Python 3.13+.
1 change: 0 additions & 1 deletion .github/workflows/codecov_aggregator.yml
@@ -33,7 +33,6 @@ jobs:
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
HEADER_EMBEDDING_API_KEY_OPENAI: ${{ secrets.HEADER_EMBEDDING_API_KEY_OPENAI }}
-LEGACY_INSERTMANY_BEHAVIOUR_PRE2193: "yes"

local_it:
uses: ./.github/workflows/local.yml
2 changes: 0 additions & 2 deletions .github/workflows/main.yml
@@ -18,8 +18,6 @@ on:
required: true
HEADER_EMBEDDING_API_KEY_OPENAI:
required: true
-LEGACY_INSERTMANY_BEHAVIOUR_PRE2193:
-required: true

jobs:
test:
59 changes: 59 additions & 0 deletions AGENTS.md
@@ -0,0 +1,59 @@
# AGENTS.md

This file provides guidance to agents when working with code in this repository.

## Build/Test Commands

**Run single test file:**
```bash
uv run pytest tests/base/unit/test_exceptions.py -v
```

**Run single test by name pattern:**
```bash
uv run pytest tests/base/integration -k test_collection_count_documents_sync -v
```

**Run tests in specific directory:**
```bash
uv run pytest tests/base/unit -v # Unit tests only
uv run pytest tests/base/integration -v # Integration tests only
```

**Environment setup required:** Tests require environment variables from `tests/env_templates/*.base.template` even for unit tests. Integration tests also need `env.vectorize-minimal.template` variables.

**Regularly-run vs. manual tests:** Tests in `tests/base` are part of the "always-run" process, whereas the admin and vectorize tests (long and cumbersome) are run manually, outside of GitHub CI/CD, only when changes call for it or new API versions are deployed.

## Code Style (Non-Obvious)

**Import order:** Always use `from __future__ import annotations` as the FIRST import after copyright header (line 15 in all files).

**Unset sentinel:** Use `_UNSET` from `astrapy.utils.unset` instead of `None` for optional parameters that need to distinguish "not provided" from "explicitly None".

**Type hints:** Project uses Python 3.8+ with `from __future__ import annotations` for forward references. All functions must have complete type hints (enforced by mypy strict mode).

**Decimal handling:** JSON serialization and deserialization of API payloads and responses is customized in `astrapy/utils/api_commander.py` so that Decimal values keep all their significant digits. See the "decimal-aware" parse/encode methods of the `APICommander` class. This interacts non-trivially with the data type the user actually sees when reading (e.g. floats), which is configured through the "SerdesOptions".

**StrEnum pattern:** Custom `StrEnum` class in `astrapy/utils/str_enum.py` with special `_name_lookup` - use this instead of standard Enum for string enums.

**Test environment detection:** Tests use `tests/preprocess_env.py` which auto-starts Docker Compose if `DOCKER_COMPOSE_LOCAL_DATA_API="yes"` - this is NOT cleaned up automatically.

**Python 3.13+ limitation:** The integration tests that access the database through a direct CQL connection don't work (and are hence skipped) on Python 3.13+ due to cassandra-driver dependency issues (libev/asyncore), but the package itself supports 3.13.

## Formatting

**Linter:** Uses `ruff` with specific rules as per pyproject.toml.

**Auto-fix:** Run `make format-fix` to auto-fix imports and style issues before committing.

**Type checking:** Strict mypy configuration - all functions require type hints, no implicit reexports, no untyped calls/decorators. Refer to `make check` and settings in pyproject.toml for the whole story.

**Checks before opening a PR:** Before submitting a PR, verify manually that `make format` passes. It checks style, lint rules and typing in one go, for both the library and the tests.

## Testing Gotchas

**Blockbuster fixture:** All tests use the `blockbuster` fixture (autouse=True) to detect blocking I/O - specific exceptions are allowed in `tests/conftest.py`.

**Coverage files:** Unit and integration tests write separate coverage files (`.coverage.unit`, `.coverage.integration`) - combine with `make coverage`.

**Test naming:** Async tests end with `_async`, sync tests end with `_sync` - this is a strict convention for test file organization.
3 changes: 2 additions & 1 deletion CHANGES
@@ -20,10 +20,11 @@ SSL bugfix due to older patches of Python 3.12:
Removed the unused 'keyspace' parameter from `[async_]fetch_database_info` admin utility function
Removed the beta marker from find_and_rerank methods
Removed the legacy no-indexType fallback for parsing the response of list_indexes
+Added AGENTS.md (and the .bob per-mode equivalent files)
maintenance: measure test coverage for PRs and merges to main; calculate delta for PRs
maintenance: bump urllib3 to >= 2.6.3 to avoid CVE-2026-21441
maintenance: add flag `TEST_EXTENDED_VECTORIZE` for test runnability in all cases
-maintenance: introduce `LEGACY_INSERTMANY_BEHAVIOUR_PRE2193` flag to control related tests
+maintenance: introduce `LEGACY_INSERTMANY_BEHAVIOUR_PRE2193` flag to control related tests (former ordered-insertion-for-tables behaviour)
maintenance: enabled text-index integration tests on Astra
maintenance: refactor `constants.py` for better management; add `DatabaseStatus` enum
maintenance: introduced the publish-and-release workflow machinery
6 changes: 3 additions & 3 deletions tests/base/integration/tables/test_table_dml_async.py
@@ -401,7 +401,7 @@ def _assert_tim_exceptions(exps: Sequence[Exception], count: int) -> None:
await async_table_simple.insert_many(
SIMPLE_SEVEN_ROWS_F2, ordered=True, chunk_size=2
)
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
await _assert_consistency(["p1"], exc.value)
else:
await _assert_consistency([], exc.value)
@@ -413,7 +413,7 @@ def _assert_tim_exceptions(exps: Sequence[Exception], count: int) -> None:
await async_table_simple.insert_many(
SIMPLE_SEVEN_ROWS_F4, ordered=True, chunk_size=2
)
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
await _assert_consistency(["p1", "p2", "p3"], exc.value)
else:
await _assert_consistency(["p1", "p2"], exc.value)
@@ -557,7 +557,7 @@ async def test_table_insert_many_failures_async(
assert len(err4.exceptions) == 1
assert isinstance(err4.exceptions[0], DataAPIResponseException)
assert len(err4.exceptions[0].error_descriptors) == 1
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
assert err4.inserted_id_tuples == [("n0",)]
else:
assert err4.inserted_id_tuples == []
6 changes: 3 additions & 3 deletions tests/base/integration/tables/test_table_dml_sync.py
@@ -383,7 +383,7 @@ def _assert_tim_exceptions(exps: Sequence[Exception], count: int) -> None:
sync_table_simple.insert_many(
SIMPLE_SEVEN_ROWS_F2, ordered=True, chunk_size=2
)
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
_assert_consistency(["p1"], exc.value)
else:
_assert_consistency([], exc.value)
@@ -395,7 +395,7 @@ def _assert_tim_exceptions(exps: Sequence[Exception], count: int) -> None:
sync_table_simple.insert_many(
SIMPLE_SEVEN_ROWS_F4, ordered=True, chunk_size=2
)
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
_assert_consistency(["p1", "p2", "p3"], exc.value)
else:
_assert_consistency(["p1", "p2"], exc.value)
@@ -537,7 +537,7 @@ def test_table_insert_many_failures_sync(
assert len(err4.exceptions) == 1
assert isinstance(err4.exceptions[0], DataAPIResponseException)
assert len(err4.exceptions[0].error_descriptors) == 1
-if "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ:
+if not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"):
assert err4.inserted_id_tuples == [("n0",)]
else:
assert err4.inserted_id_tuples == []
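
The test hunks above switch the flag check from `"LEGACY_INSERTMANY_BEHAVIOUR_PRE2193" not in os.environ` to `not os.environ.get("LEGACY_INSERTMANY_BEHAVIOUR_PRE2193")`. The two forms differ only when the variable is defined but empty, which the following sketch makes explicit. The helper names are hypothetical and take the environment as a plain mapping so the comparison can run without touching `os.environ`:

```python
from __future__ import annotations

from typing import Mapping

FLAG = "LEGACY_INSERTMANY_BEHAVIOUR_PRE2193"


def legacy_by_presence(env: Mapping[str, str]) -> bool:
    # Old check: legacy mode whenever the variable exists, even if empty.
    return FLAG in env


def legacy_by_truthiness(env: Mapping[str, str]) -> bool:
    # New check: legacy mode only when the variable has a non-empty value.
    return bool(env.get(FLAG))


# The only case where the two checks disagree: defined but empty.
empty_env = {FLAG: ""}
disagreement = legacy_by_presence(empty_env) != legacy_by_truthiness(empty_env)
```

Note that any non-empty value, including something like `"no"`, still counts as set under the new check.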