Deploy to Cloud and Migrate to Cloud-Based Datastore #10

@Rich-T-kid

Objective

Deploy the scraper to a real cloud server (AWS or Google Cloud) to ensure it works in a production environment. Migrate to a cloud-based database for persistent storage and link traversal tracking.

Tasks

  1. Cloud Deployment:

    • Deploy the scraper to a cloud server (e.g., AWS EC2, Google Cloud Compute Engine).
    • Test the scraper's performance and functionality in a cloud environment.
  2. Database Migration:

    • Migrate the local database to a cloud-based datastore (e.g., AWS RDS, Google Cloud SQL, or Firestore).
    • Ensure all data (e.g., scraped links, traversed link tracking) is correctly stored in the new cloud database.
  3. Link Traversal Tracking:

    • Integrate a caching mechanism (e.g., Redis or a cloud-based equivalent) to check whether a link has already been traversed before it is fetched again.
    • Ensure this mechanism is efficient and scalable for large datasets.
  4. Testing:

    • Test the scraper end-to-end in the cloud environment, including database integration and link traversal tracking.
    • Fix any issues encountered during deployment or migration.
  5. Documentation:

    • Document the deployment process for AWS/Google Cloud.
    • Provide instructions for managing the cloud database and scaling the scraper.
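
As a starting point for task 3, here is a minimal sketch of a traversed-link check. It assumes the cache exposes a Redis-style `SADD` (redis-py's `Redis.sadd` returns the number of members newly added), so one round trip both records a link and reports whether it was already seen. The names `VisitedLinks`, `InMemorySetStore`, `normalize_url`, and the key `scraper:visited` are hypothetical and not taken from the existing scraper.

```python
import hashlib
from typing import Protocol
from urllib.parse import urlsplit, urlunsplit


class SetStore(Protocol):
    """The only operation this sketch needs; assumed to behave like Redis SADD."""

    def sadd(self, name: str, *values: str) -> int: ...


class InMemorySetStore:
    """Local stand-in for the cache, intended for tests and development only."""

    def __init__(self) -> None:
        self._sets: dict[str, set[str]] = {}

    def sadd(self, name: str, *values: str) -> int:
        bucket = self._sets.setdefault(name, set())
        added = 0
        for value in values:
            if value not in bucket:
                bucket.add(value)
                added += 1
        return added


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, strip a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


class VisitedLinks:
    """Tracks traversed links as fixed-size digests in a single set."""

    def __init__(self, store: SetStore, key: str = "scraper:visited") -> None:
        self._store = store
        self._key = key  # hypothetical key name

    def mark_if_new(self, url: str) -> bool:
        """Record the link; return True only the first time it is seen."""
        digest = hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()
        return self._store.sadd(self._key, digest) == 1
```

In the cloud setup, the in-memory store would be swapped for a real client, for example `VisitedLinks(redis.Redis(host=...))`, assuming the `redis` package is installed. Storing digests rather than raw URLs keeps memory per link constant, which matters for large crawls.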

Acceptance Criteria

  • The scraper is deployed to a cloud server and works without issues in a production environment.
  • Data is successfully migrated to a cloud-based database, and all interactions work as intended.
  • Link traversal tracking is functional and scalable in the cloud setup.
  • Documentation is complete and easy to follow.
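
One way to check the migration criterion is to compare per-table row counts between the local database and the cloud database after the copy. The sketch below assumes both sides are reachable through Python DB-API connections and that the table names passed in are trusted. The function names are hypothetical, and matching counts are only a first sanity check, not proof that the contents are identical.

```python
from typing import Dict, Iterable, Tuple


def table_count(conn, table: str) -> int:
    """Count rows in one table over a DB-API connection."""
    # Table names cannot be bound as query parameters, so pass trusted names only.
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]


def compare_row_counts(
    source, target, tables: Iterable[str]
) -> Dict[str, Tuple[int, int]]:
    """Return {table: (source_count, target_count)} for tables whose counts differ."""
    mismatches: Dict[str, Tuple[int, int]] = {}
    for table in tables:
        source_count = table_count(source, table)
        target_count = table_count(target, table)
        if source_count != target_count:
            mismatches[table] = (source_count, target_count)
    return mismatches
```

An empty result means every listed table has the same number of rows on both sides. Any entry in the result names a table to investigate before the local database is retired.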

Additional Notes

  • Use Terraform or similar tools for infrastructure as code to simplify deployment and future scaling.
  • Optimize the cloud environment for cost-effectiveness while maintaining performance.
