datasets-service

A serverless AWS Lambda service that provides dataset management endpoints for the Pennsieve platform. This service handles dataset-related operations including trashcan management and dataset manifest generation.

Service Overview

The datasets-service is a Go-based serverless application deployed as an AWS Lambda function. It connects to a PostgreSQL database through RDS Proxy and integrates with S3 for manifest storage and SNS for asynchronous processing.
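As a rough illustration of how one Lambda function can serve both endpoints, the sketch below dispatches on method and path using only the standard library. The `Request`, `Response`, and `Dispatch` names and the placeholder handler bodies are assumptions made for this example; the actual service most likely works with API Gateway event types and its own handler layout.

```go
package main

import "fmt"

// Request and Response are simplified, hypothetical stand-ins for the
// API Gateway event types a real Lambda handler would receive and return.
type Request struct {
	Method string
	Path   string
	Query  map[string]string
}

type Response struct {
	Status int
	Body   string
}

// HandlerFunc handles one route.
type HandlerFunc func(Request) Response

// routes maps "METHOD path" to a handler. The handler bodies are
// placeholders; real handlers would query PostgreSQL or S3.
var routes = map[string]HandlerFunc{
	"GET /datasets/trashcan": func(r Request) Response {
		return Response{Status: 200, Body: "trashcan page for " + r.Query["dataset_id"]}
	},
	"GET /datasets/manifest": func(r Request) Response {
		return Response{Status: 200, Body: "manifest for " + r.Query["dataset_id"]}
	},
}

// Dispatch looks up the handler for the request and returns 404 when
// no route matches.
func Dispatch(r Request) Response {
	if h, ok := routes[r.Method+" "+r.Path]; ok {
		return h(r)
	}
	return Response{Status: 404, Body: "not found"}
}

func main() {
	resp := Dispatch(Request{
		Method: "GET",
		Path:   "/datasets/trashcan",
		Query:  map[string]string{"dataset_id": "example"},
	})
	fmt.Println(resp.Status, resp.Body)
}
```

Keeping the route table in a map makes it easy to see which endpoints the function exposes and to add new ones without touching the dispatch logic.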

Endpoints

`/datasets/trashcan`

Method: GET
Description: Retrieves a paginated list of deleted items from a dataset's trashcan
Authentication: Requires ViewFiles permission
Query Parameters:

dataset_id (required): The dataset node ID
root_node_id (optional): Filter by root node/folder ID
limit (optional): Number of items per page (default: 10, max: 100)
offset (optional): Pagination offset (default: 0)

Response: Returns a paginated list of trashcan items including package ID, name, node ID, type, and deletion state.
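The query parameters above could be parsed and validated along the following lines. This is a minimal sketch, not the service's actual code: the `TrashcanParams` type and `ParseTrashcanQuery` function are hypothetical, and rejecting an out-of-range `limit` (rather than clamping it to 100) is an assumption, since the documentation only states the default and the maximum.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// Defaults and bounds as documented for /datasets/trashcan.
const (
	defaultLimit  = 10
	maxLimit      = 100
	defaultOffset = 0
)

// TrashcanParams is a hypothetical holder for the documented parameters.
type TrashcanParams struct {
	DatasetID  string
	RootNodeID string // empty means no root node filter
	Limit      int
	Offset     int
}

// ParseTrashcanQuery parses a raw query string and applies the documented
// defaults. Returning an error for out-of-range values is an assumption
// of this sketch; the real service may clamp them instead.
func ParseTrashcanQuery(rawQuery string) (TrashcanParams, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return TrashcanParams{}, err
	}
	p := TrashcanParams{
		DatasetID:  q.Get("dataset_id"),
		RootNodeID: q.Get("root_node_id"),
		Limit:      defaultLimit,
		Offset:     defaultOffset,
	}
	if p.DatasetID == "" {
		return p, errors.New("dataset_id is required")
	}
	if v := q.Get("limit"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 || n > maxLimit {
			return p, fmt.Errorf("limit must be an integer between 1 and %d", maxLimit)
		}
		p.Limit = n
	}
	if v := q.Get("offset"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n < 0 {
			return p, errors.New("offset must be a non-negative integer")
		}
		p.Offset = n
	}
	return p, nil
}

func main() {
	p, err := ParseTrashcanQuery("dataset_id=example&limit=25")
	fmt.Printf("%+v %v\n", p, err)
}
```

A caller paging through the trashcan would typically keep `limit` fixed and increase `offset` by `limit` on each request until fewer than `limit` items come back.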

`/datasets/manifest`

Method: GET
Description: Generates and retrieves a dataset manifest containing metadata about all files in the dataset
Authentication: Requires ViewFiles permission
Query Parameters:

dataset_id (required): The dataset node ID

Response: Returns a manifest with dataset metadata and file information including:

Dataset details (name, description, license, tags, contributors)
File paths and metadata (node IDs, file names, sizes, checksums)
A presigned URL for downloading the manifest, which is stored in S3
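To make the shape of the manifest more concrete, here is a sketch of Go types that mirror the fields listed above, together with a small decoder. The type names, the JSON key names, and the overall layout (including the `url` field for the presigned link) are assumptions made for illustration; the service's real response schema may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ManifestFile mirrors the per-file metadata listed above.
// All JSON key names here are assumptions, not the service's schema.
type ManifestFile struct {
	Path     string `json:"path"`
	NodeID   string `json:"node_id"`
	FileName string `json:"file_name"`
	Size     int64  `json:"size"`
	Checksum string `json:"checksum"`
}

// Manifest mirrors the dataset-level details listed above.
type Manifest struct {
	Name         string         `json:"name"`
	Description  string         `json:"description"`
	License      string         `json:"license"`
	Tags         []string       `json:"tags"`
	Contributors []string       `json:"contributors"`
	Files        []ManifestFile `json:"files"`
	URL          string         `json:"url"` // hypothetical presigned S3 URL
}

// DecodeManifest parses a JSON document into a Manifest.
func DecodeManifest(data []byte) (Manifest, error) {
	var m Manifest
	err := json.Unmarshal(data, &m)
	return m, err
}

// TotalSize returns the combined size in bytes of all files in the manifest.
func (m Manifest) TotalSize() int64 {
	var total int64
	for _, f := range m.Files {
		total += f.Size
	}
	return total
}

func main() {
	// Sample payload invented for this example.
	sample := []byte(`{
		"name": "demo",
		"tags": ["example"],
		"files": [
			{"path": "data/a.csv", "file_name": "a.csv", "size": 3},
			{"path": "data/b.csv", "file_name": "b.csv", "size": 4}
		]
	}`)
	m, err := DecodeManifest(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(m.Name, len(m.Files), m.TotalSize())
}
```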

Architecture

Runtime: Go with AWS Lambda (ARM64 architecture)
Database: PostgreSQL via RDS Proxy
Storage: S3 for manifest files
Infrastructure: Terraform for IaC
VPC: Deployed in private subnets with security group configuration

