alialkhalidi/ecr-sync

ECR Sync

ECR sync script with support for container images and Helm charts.

Architecture Overview

1:1:1 Repository Model

Each repository contains:

  • At most one container image, with one or more tags
  • At most one Helm chart, with one or more tags
  • Optional lifecycle and permission policies
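
The model above can be pictured as a pair of small data classes. This is only an illustrative sketch: the class and field names (Artifact, Repository, kind) are assumptions and are not taken from src/models/.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Artifact:
    """A container image or Helm chart: one source URL and its tags."""
    pull_url: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Repository:
    """One ECR repository holding at most one image and at most one chart."""
    name: str
    image: Optional[Artifact] = None
    chart: Optional[Artifact] = None
    lifecycle_policy: Optional[dict] = None
    permission_policy: Optional[dict] = None

    def kind(self) -> str:
        # Classify the repository by which artifacts it declares.
        if self.image and self.chart:
            return "mixed"
        if self.image:
            return "image-only"
        if self.chart:
            return "chart-only"
        return "empty"


repo = Repository(
    name="nginx-app",
    image=Artifact("docker.io/library/nginx:latest", ["latest", "1.21"]),
)
print(repo.kind())
```

Making both artifacts optional fields, rather than lists, is what keeps a repository from ever holding a second image or chart.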

Core Components

src/
├── models/                 # Data models
│   ├── registry.py        # AWS ECR registry representation
│   ├── repository.py      # Repository with 1:1:1 artifact model
│   └── artifacts.py       # Container image and Helm chart artifacts
├── runtime/               # Container runtime abstraction
│   ├── base.py           # Abstract runtime interface
│   └── podman_runtime.py # Podman implementation
├── inventory/             # Repository discovery
│   ├── filesystem.py     # YAML file loading
│   └── online.py         # AWS ECR querying
├── services/              # Business logic
│   ├── action_logger.py  # Action tracking and reporting
│   └── reconciliation.py # Core sync logic
├── config.py             # Configuration management
└── main.py               # Application entry point
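
As a rough illustration of the runtime abstraction, the sketch below defines an abstract interface and a Podman implementation that only builds command lines. The class and method names are hypothetical and do not come from src/runtime/; the podman subcommands used (pull, tag, push, login with --username and --password-stdin) are standard Podman CLI.

```python
from abc import ABC, abstractmethod
from typing import List


class ContainerRuntime(ABC):
    """Hypothetical interface: each method returns an argv list to execute."""

    @abstractmethod
    def pull_command(self, image: str) -> List[str]: ...

    @abstractmethod
    def tag_command(self, source: str, target: str) -> List[str]: ...

    @abstractmethod
    def push_command(self, image: str) -> List[str]: ...

    @abstractmethod
    def login_command(self, registry: str, username: str) -> List[str]: ...


class PodmanRuntime(ContainerRuntime):
    binary = "podman"

    def pull_command(self, image: str) -> List[str]:
        return [self.binary, "pull", image]

    def tag_command(self, source: str, target: str) -> List[str]:
        return [self.binary, "tag", source, target]

    def push_command(self, image: str) -> List[str]:
        return [self.binary, "push", image]

    def login_command(self, registry: str, username: str) -> List[str]:
        # The password is expected on stdin so it never appears in argv.
        return [self.binary, "login", "--username", username,
                "--password-stdin", registry]


runtime = PodmanRuntime()
print(runtime.pull_command("docker.io/library/nginx:latest"))
```

Each list could be handed to subprocess.run; a Docker runtime would then only need a different binary name.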

Features

GitLab Dependency Proxy Support

When running in GitLab CI, the application detects GitLab's dependency proxy and pulls Docker Hub images through it to avoid Docker Hub rate limits. The dependency proxy feature:

  • Activates automatically when the required GitLab CI environment variables are present
  • Rewrites Docker Hub image URLs to use GitLab's dependency proxy
  • Authenticates with the GitLab CI token credentials
  • Leaves private registry images unchanged

Environment Variables

The dependency proxy is enabled when all of the following CI variables are present:

  • CI_DEPENDENCY_PROXY_USER - GitLab CI token username
  • CI_DEPENDENCY_PROXY_PASSWORD - GitLab CI token password
  • CI_DEPENDENCY_PROXY_SERVER - GitLab dependency proxy server
  • CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX - GitLab proxy URL prefix

URL Rewriting Examples

Original: grafana/loki:latest
Rewritten: gitlab.example.com/group/dependency_proxy/containers/grafana/loki:latest

Original: docker.io/library/nginx:alpine
Rewritten: gitlab.example.com/group/dependency_proxy/containers/library/nginx:alpine

Private registry images (e.g., registry.example.com/app:tag) are not rewritten and continue to use their original URLs.
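
The rewriting rule in these examples can be sketched as follows. The function name, the set of Docker Hub host aliases, and the handling of bare names such as nginx:alpine (prefixed with library/, the Docker Hub convention for official images) are assumptions rather than the project's actual logic.

```python
# Host names treated as Docker Hub (assumed list).
DOCKER_HUB_HOSTS = {"docker.io", "index.docker.io", "registry-1.docker.io"}


def rewrite_for_proxy(image: str, prefix: str) -> str:
    """Route Docker Hub images through the proxy prefix; leave others alone."""
    first, sep, rest = image.partition("/")
    # A first path component containing "." or ":" (or "localhost") is a
    # registry host; otherwise the image implicitly lives on Docker Hub.
    has_host = bool(sep) and ("." in first or ":" in first or first == "localhost")
    if has_host:
        if first not in DOCKER_HUB_HOSTS:
            return image  # private registry: not rewritten
        path = rest
    else:
        path = image
    if "/" not in path:
        path = "library/" + path  # official images live under library/
    return prefix.rstrip("/") + "/" + path


PREFIX = "gitlab.example.com/group/dependency_proxy/containers"
print(rewrite_for_proxy("grafana/loki:latest", PREFIX))
print(rewrite_for_proxy("registry.example.com/app:tag", PREFIX))
```

In CI, the prefix would come from CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX.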

How It Works

The application uses a probe-based reconciliation approach:

  1. Loads filesystem inventory from YAML configuration files
  2. Probes ECR on-demand for each repository defined in the filesystem inventory (instead of loading all ECR repositories)
  3. Compares and reconciles by creating missing repositories, updating policies, and syncing artifacts
  4. Handles ECR authentication automatically: obtains authorization tokens and logs the container runtime in before push operations
  5. Provides detailed logging of all operations performed

Because ECR is queried only for the repositories defined in the filesystem inventory, the full registry is never listed, which reduces memory usage and improves performance.
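
The probe-based approach can be sketched as a planning loop that looks up one repository at a time. Everything here is illustrative: reconcile, the probe callable, and the shape of the returned state are assumptions. In a real run the probe might wrap a per-repository ECR lookup instead of a dictionary.

```python
from typing import Callable, Iterable, List, Optional


def reconcile(
    desired: Iterable[dict],
    probe: Callable[[str], Optional[dict]],
) -> List[str]:
    """Return the planned actions for each repository in the inventory."""
    actions: List[str] = []
    for repo in desired:
        name = repo["name"]
        current = probe(name)  # one lookup per repository, no full listing
        if current is None:
            actions.append(f"create {name}")
            current = {}
        for policy in ("lifecyclePolicy", "permissionPolicy"):
            if policy in repo and repo[policy] != current.get(policy):
                actions.append(f"update {policy} on {name}")
        wanted = set((repo.get("image") or {}).get("tags", []))
        for tag in sorted(wanted - set(current.get("tags", []))):
            actions.append(f"push {name}:{tag}")
    return actions


# A dictionary stands in for live ECR state in this sketch.
ecr_state = {"nginx-app": {"tags": ["latest"]}}
desired = [
    {"name": "nginx-app",
     "image": {"pull_url": "docker.io/library/nginx:latest",
               "tags": ["latest", "1.21"]}},
    {"name": "new-repo"},
]
plan = reconcile(desired, ecr_state.get)
print(plan)
```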

ECR Authentication

The application automatically manages ECR authentication:

  • Authentication tokens are obtained from AWS ECR using the get_authorization_token API
  • Token expiry is tracked so that tokens are refreshed before they expire (with a 5-minute buffer)
  • Automatic login occurs before push operations (not required for pull or tag operations)
  • Username is always "AWS" and password is the ECR authorization token

No manual ECR login is required when using this application.
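
A small sketch of the token handling described above. It relies on the fact that an ECR authorization token is the base64 encoding of "AWS:<password>"; the function names and the exact comparison used for the 5-minute buffer are assumptions.

```python
import base64
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

REFRESH_BUFFER = timedelta(minutes=5)


def decode_auth_token(token_b64: str) -> Tuple[str, str]:
    """Split the base64 'AWS:<password>' token into (username, password)."""
    decoded = base64.b64decode(token_b64).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


def needs_refresh(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the token is within REFRESH_BUFFER of its expiry time."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at - REFRESH_BUFFER


sample = base64.b64encode(b"AWS:example-password").decode("ascii")
print(decode_auth_token(sample)[0])
```

Both datetimes are expected to be timezone-aware; mixing naive and aware values raises a TypeError in Python.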

Configuration Format

YAML Structure

  - name: nginx-app                    # Required: repository name
    image:                             # Optional: exactly one image
      pull_url: "docker.io/library/nginx:latest"
      tags: ["latest", "1.21", "1.20"]
    chart:                             # Optional: exactly one chart
      pull_url: "https://charts.bitnami.com/bitnami/nginx"
      tags: ["15.0.2", "15.0.1"]
    lifecyclePolicy:                   # Optional: ECR lifecycle policy
      rules:
        - rulePriority: 1
          description: "Keep last 10 images"
          selection:
            tagStatus: "any"
            countType: "imageCountMoreThan"
            countNumber: 10
          action:
            type: "expire"
    permissionPolicy:                  # Optional: ECR repository permissions
      Version: "2012-10-17"
      Statement:
        - Sid: "AllowPull"
          Effect: "Allow"
          Principal:
            AWS: "arn:aws:iam::123456789012:root"
          Action:
            - "ecr:GetDownloadUrlForLayer"
            - "ecr:BatchGetImage"
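
To make the required and optional keys concrete, here is a hypothetical validator for one entry after it has been parsed from YAML into a dictionary. It is not the project's built-in validation, and treating tags as a required non-empty list is an assumption.

```python
from typing import List


def validate_entry(entry: dict) -> List[str]:
    """Return a list of problems found in one repository entry."""
    errors: List[str] = []
    if not entry.get("name"):
        errors.append("name is required")
    for key in ("image", "chart"):
        artifact = entry.get(key)
        if artifact is None:
            continue  # both artifacts are optional
        if not isinstance(artifact, dict):
            errors.append(f"{key} must be a mapping")
            continue
        if not artifact.get("pull_url"):
            errors.append(f"{key}.pull_url is required")
        tags = artifact.get("tags")
        if not isinstance(tags, list) or not tags:
            errors.append(f"{key}.tags must be a non-empty list")
    return errors


entry = {
    "name": "nginx-app",
    "image": {"pull_url": "docker.io/library/nginx:latest",
              "tags": ["latest", "1.21", "1.20"]},
}
print(validate_entry(entry))
```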

Repository Types Supported

  1. Mixed Repositories: Both image and chart (1:1:1 model)
  2. Image-Only Repositories: Container images only
  3. Chart-Only Repositories: Helm charts only
  4. Empty Repositories: For future use

Helm Chart Repository Support

The system supports multiple types of Helm chart repositories:

  • HTTP-based repositories: Traditional Helm repositories with index.yaml
  • OCI-based repositories: Charts stored as OCI artifacts

Supported chart repository formats:

  • https://charts.bitnami.com/bitnami (HTTP)
  • oci://registry-1.docker.io/bitnamicharts (OCI)
  • oci://quay.io/jetstack (OCI)

Example configurations are available in images/helm-examples.yml.
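
One plausible way to tell the two repository types apart is by URL scheme, as sketched below. The function name chart_repo_kind is hypothetical.

```python
from urllib.parse import urlparse


def chart_repo_kind(url: str) -> str:
    """Classify a chart source as 'oci' or 'http' from its URL scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "oci":
        return "oci"
    if scheme in ("http", "https"):
        return "http"
    raise ValueError(f"unsupported chart repository URL: {url}")


for url in ("https://charts.bitnami.com/bitnami", "oci://quay.io/jetstack"):
    print(url, chart_repo_kind(url))
```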

Installation & Setup

Prerequisites

  • Python 3.8+
  • Podman (Docker support is planned)
  • AWS credentials configured
  • Access to target ECR registry

Install Dependencies

pip install -r requirements.txt

Environment Variables

# Required
export ECR_ACCOUNT_ID="123456789012"
export ECR_REGION="us-east-1"

# Optional
export CONTAINER_RUNTIME="podman"                               # podman or docker (future)
export REPOSITORIES_DIR="./repositories"                        # Path to YAML files
export DRY_RUN="false"                                          # Set to 'true' for dry run
export LOG_LEVEL="INFO"                                         # DEBUG, INFO, WARNING, ERROR
export MAX_PARALLEL_OPERATIONS="5"                              # Concurrent operations
export DEFAULT_LIFECYCLE_POLICY_FILE="ecr-lifecycle.json"       # Filename of default ECR repository lifecycle policy
export DEFAULT_PERMISSION_POLICY_FILE="ecr-permissions.json"    # Filename of default ECR IAM permission policy
export DEFAULT_RESOURCE_TAGS_FILE="default_resource_tags.json"  # Filename with default resource tags
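
A sketch of how these variables could be read into a settings object. The Settings class and load_settings function are hypothetical and unrelated to src/config.py, and the example values shown above are assumed to be the defaults. The registry host follows the standard ECR format <account>.dkr.ecr.<region>.amazonaws.com.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    account_id: str
    region: str
    container_runtime: str = "podman"
    repositories_dir: str = "./repositories"
    dry_run: bool = False
    log_level: str = "INFO"
    max_parallel_operations: int = 5

    @property
    def registry_host(self) -> str:
        return f"{self.account_id}.dkr.ecr.{self.region}.amazonaws.com"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing on missing ones."""
    missing = [k for k in ("ECR_ACCOUNT_ID", "ECR_REGION") if not env.get(k)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return Settings(
        account_id=env["ECR_ACCOUNT_ID"],
        region=env["ECR_REGION"],
        container_runtime=env.get("CONTAINER_RUNTIME", "podman"),
        repositories_dir=env.get("REPOSITORIES_DIR", "./repositories"),
        dry_run=env.get("DRY_RUN", "false").strip().lower() == "true",
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        max_parallel_operations=int(env.get("MAX_PARALLEL_OPERATIONS", "5")),
    )


settings = load_settings({"ECR_ACCOUNT_ID": "123456789012",
                          "ECR_REGION": "us-east-1"})
print(settings.registry_host)
```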

Usage

Basic Usage

# Sync repositories from current directory
python -m src.main

# Sync with specific configuration directory
python -m src.main --config /path/to/repositories

# Perform dry run to see what would be changed
python -m src.main --dry-run

# Enable verbose logging
python -m src.main --verbose

Command Line Options

python -m src.main --help

Feature Summary

Core Functionality

  • 1:1:1 Repository Model: Each repository contains at most one image and one chart
  • Policy Management: Both lifecycle and permission policies supported
  • Container Runtime Abstraction: Podman primary, extensible to Docker
  • Comprehensive Logging: Detailed action tracking and summary reporting
  • Dry Run Mode: Preview changes without making modifications
  • Parallel Operations: Configurable concurrency for performance
  • Error Handling: Robust error handling with retry mechanisms
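
Configurable concurrency and retries could look roughly like the sketch below, which uses a thread pool from the standard library. The function names, the three-attempt default, and the linear backoff are assumptions, not the project's actual behavior.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def with_retry(op: Callable[[], None], attempts: int, delay: float) -> None:
    """Run op, retrying on any exception up to `attempts` times in total."""
    for attempt in range(1, attempts + 1):
        try:
            op()
            return
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear backoff


def run_parallel(
    names: Iterable[str],
    sync_one: Callable[[str], None],
    max_workers: int = 5,
    attempts: int = 3,
    delay: float = 1.0,
) -> Dict[str, str]:
    """Sync each repository in a worker thread and collect its status."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(with_retry, lambda n=name: sync_one(n),
                              attempts, delay)
            for name in names
        }
        for name, future in futures.items():
            try:
                future.result()
                results[name] = "success"
            except Exception as exc:
                results[name] = f"failure: {exc}"
    return results


print(run_parallel(["repo-a", "repo-b"], lambda name: None, delay=0.0))
```

Here max_workers plays the role of MAX_PARALLEL_OPERATIONS.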

Inventory Management

  • FileSystem Inventory: Load repository definitions from YAML files
  • On-demand Probing: Efficiently probe ECR repositories as needed
  • Backward Compatibility: Supports both new and legacy YAML formats

Reconciliation Engine

  • Difference Analysis: Compare filesystem vs ECR state
  • Repository Creation: Create missing repositories
  • Policy Synchronization: Apply lifecycle and permission policies
  • Image Synchronization: Pull, tag, and push container images
  • Chart Synchronization: Helm chart support (Phase 4)

Action Logging

  • Comprehensive Tracking: All operations logged with timestamps
  • Status Reporting: Success, failure, and skip status for each action
  • Summary Generation: Human-readable operation summaries
  • Error Details: Detailed error messages and context
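
The tracking and summary ideas above can be sketched with two small classes. The names Action and ActionLog are hypothetical, and excluding skipped actions from the success rate is an assumption about how the rate is computed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Action:
    repository: str
    operation: str
    status: str  # "success", "failure", or "skipped"
    detail: str = ""
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class ActionLog:
    actions: List[Action] = field(default_factory=list)

    def record(self, repository: str, operation: str, status: str,
               detail: str = "") -> None:
        self.actions.append(Action(repository, operation, status, detail))

    def success_rate(self) -> float:
        """Percentage of non-skipped actions that succeeded."""
        attempted = [a for a in self.actions if a.status != "skipped"]
        if not attempted:
            return 100.0
        succeeded = sum(1 for a in attempted if a.status == "success")
        return round(100.0 * succeeded / len(attempted), 1)


log = ActionLog()
log.record("nginx-app", "create_repository", "success")
log.record("nginx-app", "push_image", "success")
log.record("loki", "push_image", "failure", "registry timeout")
log.record("redis", "push_image", "skipped", "tag already present")
print(f"Overall Success Rate: {log.success_rate()}%")
```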

Examples

Example Repository Configuration

See examples/repository-1-1-1-model.yml for comprehensive examples of:

  • Mixed repositories (image + chart)
  • Image-only repositories
  • Chart-only repositories
  • Policy configurations

Example Output

================================================================================
ECR SYNC SUMMARY
================================================================================
Start Time: 2025-10-23 14:30:15 UTC
End Time: 2025-10-23 14:32:45 UTC
Duration: 150.3 seconds

Repositories Processed: 12
├── Created: 3
├── Updated: 7
└── Skipped: 2

Artifacts Synchronized:
├── Container Images: 8
└── Helm Charts: 4

Policies Applied: 15

Overall Success Rate: 96.7%
================================================================================

Testing

# Run tests with coverage
pytest tests/ --cov=src --cov-report=html

# Run specific test categories
pytest tests/unit/
pytest tests/integration/

Repository Configuration Formats

repositories:
  - name: nginx-app
    image:
      pull_url: "docker.io/library/nginx:latest"
      tags: ["latest"]

Configuration Best Practices

  1. Test with Dry Run: Use --dry-run to validate configurations
  2. Monitoring: Use action logging to track sync operations
  3. Validation: Leverage the built-in configuration validation

Contributing

  1. Follow the established architecture patterns
  2. Maintain backward compatibility
  3. Include comprehensive tests
  4. Update documentation for any changes
  5. Use type hints throughout

License

MIT
