Merged
3 changes: 2 additions & 1 deletion docs/source/ExamplesDocDBRestApi.rst
@@ -193,7 +193,8 @@ It's possible to attach a custom Session to retry certain requests errors:
Updating Metadata
~~~~~~~~~~~~~~~~~~~~~~

1. **Permissions**: Request permissions for AWS Credentials to write to DocDB through the API Gateway.
1. **Permissions**: Request permissions for AWS Credentials to write to DocDB through the API Gateway.
Note that the asset de/registration endpoints are intended for administrative use and require elevated AWS credentials/permissions.
2. **Query DocDB**: Filter for the records you want to update.
3. **Update DocDB**: Use ``upsert_one_docdb_record`` or ``upsert_list_of_docdb_records`` to update the records.
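
Steps 2 and 3 above can be sketched as follows. This is a minimal sketch rather than the library's documented recipe: ``InMemoryDocDBClient`` is a hypothetical stand-in so the example runs without AWS credentials, and it only mimics the two calls used here (``retrieve_docdb_records``, assumed to accept a ``filter_query``, and ``upsert_one_docdb_record``). The flat ``subject_id`` field, the ``notes`` field, and matching on ``_id`` during an upsert are illustrative assumptions.

```python
from typing import Any, Dict, List, Optional


class InMemoryDocDBClient:
    """Hypothetical stand-in for a DocDB client; not part of the library."""

    def __init__(self, records: List[Dict[str, Any]]) -> None:
        self._records = {r["_id"]: dict(r) for r in records}

    def retrieve_docdb_records(
        self, filter_query: Optional[Dict[str, Any]] = None
    ) -> List[Dict[str, Any]]:
        # Only supports exact-match filters on top-level fields.
        query = filter_query or {}
        return [
            dict(r)
            for r in self._records.values()
            if all(r.get(k) == v for k, v in query.items())
        ]

    def upsert_one_docdb_record(
        self, record: Dict[str, Any]
    ) -> Dict[str, Any]:
        # Assumption: records are matched on "_id" and fields are merged.
        existing = self._records.setdefault(record["_id"], {})
        existing.update(record)
        return {"matched_id": record["_id"]}


def add_note_to_subject_records(client, subject_id: str, note: str) -> int:
    """Query the records for one subject, then upsert each with a note."""
    records = client.retrieve_docdb_records(
        filter_query={"subject_id": subject_id}
    )
    for record in records:
        client.upsert_one_docdb_record(
            record={"_id": record["_id"], "notes": note}
        )
    return len(records)


client = InMemoryDocDBClient(
    [
        {"_id": "a", "subject_id": "123", "name": "asset_a"},
        {"_id": "b", "subject_id": "456", "name": "asset_b"},
    ]
)
updated_count = add_note_to_subject_records(client, "123", "reviewed")
```

With a real client, the same query-then-upsert loop would need write permissions, and each upsert response should be checked before moving on to the next record.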

58 changes: 53 additions & 5 deletions src/aind_data_access_api/document_db.py
@@ -631,7 +631,9 @@ def _add_qc_evaluation_url(self) -> str:
return f"https://{self.host}/{self.version}/add_qc_evaluation"

def generate_data_summary(self, record_id: str) -> Dict[str, Any]:
"""Get an LLM-generated summary for a data asset."""
"""
Get an LLM-generated summary for a data asset with the given record id.
"""
url = f"{self._data_summary_url}/{record_id}"
signed_header = self._signed_request(method="GET", url=url)
response = self.session.get(
@@ -641,8 +643,21 @@ def generate_data_summary(self, record_id: str) -> Dict[str, Any]:
return response.json()

def register_asset(self, s3_location: str) -> Dict[str, Any]:
"""Register a data asset to Code Ocean and the DocDB metadata index."""
"""
Register a data asset to Code Ocean and add its metadata to DocDB
given the metadata exists at the top level of the provided S3 location.
Collaborator

Given that the de/registration endpoints are intended for admin use, should we mention that they require elevated AWS credentials?

Collaborator

Although pretty much all endpoints other than the public READ ones require some sort of elevated permissions, so I'm also ok to leave it out. Up to you.

Member

I feel like that context would be great to clarify in the readthedocs somewhere; we probably don't need it in every docstring.

Collaborator

@helen-m-lin, Feb 20, 2026

Good idea. Maybe somewhere in the User Guide or REST API examples? This section briefly mentions that you need AWS credentials for updating metadata, but maybe we can expand this: https://aind-data-access-api.readthedocs.io/en/latest/ExamplesDocDBRestApi.html#updating-metadata


Parameters
----------
s3_location : str
The S3 location containing the asset and its metadata.

Returns
-------
Dict[str, Any]
The response from the registration API, including registration
status and details.
"""
data = json.dumps({"s3_location": s3_location})
signed_header = self._signed_request(
method="POST", url=self._register_asset_url, data=data
@@ -662,8 +677,28 @@ def register_co_result(
co_asset_id: str,
co_computation_id: str,
) -> Dict[str, Any]:
"""Register a Code Ocean result asset to the DocDB metadata index."""
"""
Register a Code Ocean result asset and add its metadata to DocDB
given the metadata exists at the top level of the Code Ocean
computation result.

Parameters
----------
s3_location : str
The S3 location containing the result asset and its metadata.
name : str
The name of the result asset.
co_asset_id : str
The Code Ocean asset ID for the result.
co_computation_id : str
Member

I'm curious what the computation ID is used for, doesn't seem necessary in addition to asset ID - and what happens if they don't match (asset is generated from a different computation ID)?

Collaborator

The co_computation_id is used to download the metadata using Code Ocean's list_computation_results API. We do a sanity check that the co_asset_id exists and has provenance attributes that match the co_computation_id.

In retrospect, maybe we could have done this with just s3_location and co_asset_id... I think this was originally due to a mix of reasons: ease of using the Code Ocean API, access control, and confirming that it is in fact a result asset.

The Code Ocean computation ID associated with the result.

Returns
-------
Dict[str, Any]
The response from the registration API, including registration
status and details.
"""
data = json.dumps(
{
"s3_location": s3_location,
@@ -684,9 +719,22 @@ def register_co_result(
return response.json()

def deregister_asset(self, s3_location: str) -> Dict[str, Any]:
"""De-register (delete) a data asset in Code Ocean and the
DocDB metadata index."""
"""
De-register (delete) a data asset from Code Ocean and remove its
metadata from DocDB given that the asset and its metadata are located
at the provided S3 location.

Parameters
----------
s3_location : str
The S3 location containing the asset and metadata to be removed.

Returns
-------
Dict[str, Any]
The response from the deregistration API, including deregistration
status and details.
"""
data = json.dumps({"s3_location": s3_location})
signed_header = self._signed_request(
method="DELETE", url=self._deregister_asset_url, data=data