Description
name: Help Wanted
about: Request community help on a specific task or feature
title: '[HELP WANTED] Implement a new adapter for Amazon Redshift'
labels: 'help wanted'
assignees: ''
What We Need Help With
We need help creating a new adapter to connect Intugle to Amazon Redshift. This will allow users to profile, analyze, and generate data products from their Redshift data warehouses.
Background & Context
Intugle is designed to be extensible, and we want to expand our support for common data warehouses. Redshift is a popular choice, and adding an adapter for it will significantly increase the tool's utility. This task involves creating a new adapter that implements the same interface as our existing adapters, like the one for Postgres.
Current State
Currently, Intugle does not have a native connector for Amazon Redshift. Users who want to work with Redshift data have to export it manually to a format that Intugle supports (such as CSV or a pandas DataFrame).
Desired Outcome
A new, fully functional Redshift adapter that is integrated into the Intugle framework. This means a user should be able to configure their Redshift connection in profiles.yml and use Intugle to:
- Connect to a Redshift instance.
- Profile tables and columns.
- Execute queries.
- Create new tables/views from queries.
- Perform semantic searches and build data products.
Scope of Work
- Create the basic scaffolding for the Redshift adapter (src/intugle/adapters/types/redshift/).
- Define the Pydantic models for Redshift connection and data configuration (models.py).
- Implement the RedshiftAdapter class, inheriting from intugle.adapters.adapter.Adapter (redshift.py).
- Implement all the abstract methods of the Adapter class for Redshift.
- Register the new adapter in the adapter factory so it's discoverable by the system.
- Add optional dependencies for the Redshift driver (e.g., redshift_connector) to pyproject.toml.
- Add unit tests for the new adapter.
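As a starting point for models.py, the connection and data configuration could look roughly like the sketch below. Every class and field name here is an assumption made for illustration; the real models should mirror whatever the existing Postgres adapter defines. The only Redshift-specific fact used is the default port, 5439.

```python
from pydantic import BaseModel, Field


class RedshiftConnectionConfig(BaseModel):
    """Hypothetical connection settings; field names are illustrative only."""

    host: str
    database: str
    user: str
    password: str
    # 5439 is the default port for Amazon Redshift clusters.
    port: int = Field(default=5439, ge=1, le=65535)
    # Named db_schema (not schema) to avoid clashing with BaseModel attributes.
    db_schema: str = "public"


class RedshiftDataConfig(BaseModel):
    """Hypothetical description of a single table or view to profile."""

    identifier: str  # table or view name
    type: str = "redshift"
```

Validating the settings up front means a bad port or a missing host fails when profiles.yml is loaded, not when the first query runs.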
Technical Details
Check out the documentation on how to implement an adapter: Implementing a Connector
The new adapter should follow the pattern of the existing PostgresAdapter. It will need to handle the specifics of connecting to Redshift and translating Intugle's operations into Redshift-compatible SQL.
Relevant Files/Modules
- src/intugle/adapters/types/postgres/ (as a reference)
- src/intugle/adapters/types/redshift/ (to be created)
- src/intugle/adapters/factory.py (to register the new adapter)
- pyproject.toml (to add optional dependencies)
- tests/adapters/ (to add new tests)
Key Concepts/Technologies
- Amazon Redshift & its SQL dialect
- Python
- Pydantic
Related Documentation
Suggested Approach
Use the PostgresAdapter as a template. Much of the logic will be similar, as Redshift is based on PostgreSQL. The main differences will likely be in the connection handling and the specific SQL syntax for certain operations.
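To make the shape of the work concrete, here is a minimal sketch of an adapter skeleton. The Adapter base class below is a local stand-in, not the real intugle.adapters.adapter.Adapter, and the method names (execute, create_view, from_settings) are assumptions; the actual abstract methods are defined by the project. The redshift_connector.connect call uses that driver's documented keyword arguments and is imported lazily so the optional dependency is only needed when a connection is opened.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence


class Adapter(ABC):
    """Stand-in for Intugle's Adapter base class; the real interface may differ."""

    @abstractmethod
    def execute(self, query: str) -> Sequence[Any]: ...

    @abstractmethod
    def create_view(self, name: str, query: str) -> str: ...


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


class RedshiftAdapter(Adapter):
    def __init__(self, connection: Any) -> None:
        # Accepting any DB-API 2.0 connection keeps the class easy to test.
        self._conn = connection

    @classmethod
    def from_settings(cls, host: str, database: str, user: str,
                      password: str, port: int = 5439) -> "RedshiftAdapter":
        # Lazy import: the driver is an optional dependency.
        import redshift_connector

        conn = redshift_connector.connect(
            host=host, database=database, port=port, user=user, password=password
        )
        return cls(conn)

    def execute(self, query: str) -> Sequence[Any]:
        cursor = self._conn.cursor()
        try:
            cursor.execute(query)
            # Statements such as DDL produce no result set.
            return cursor.fetchall() if cursor.description else []
        finally:
            cursor.close()

    def create_view(self, name: str, query: str) -> str:
        statement = f"CREATE OR REPLACE VIEW {quote_ident(name)} AS {query}"
        self.execute(statement)
        self._conn.commit()  # persist the DDL when autocommit is off
        return statement
```

Injecting the connection through the constructor, with a separate factory method for real connections, is one possible design; it lets unit tests pass in a fake connection without touching a Redshift cluster.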
Skills Needed
- Python development
- Data engineering
- SQL and database knowledge
- Testing (pytest)
- Documentation writing
Difficulty Level
- Intermediate - Familiarity with data engineering concepts
Time Estimate
- Medium (a few days)
Getting Started
- Fork the repository.
- Set up your development environment (see CONTRIBUTING.md).
- Follow the Implementing a Connector guide.
- Use the PostgresAdapter implementation as a reference for your RedshiftAdapter.
Testing Requirements
- Unit tests for the new RedshiftAdapter class.
- Integration tests (if possible, with a mock or local Redshift instance).
- Documentation examples.
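Unit tests do not need a live cluster. The sketch below shows one way to test query logic against a mocked DB-API connection using unittest.mock from the standard library. The count_rows helper is a hypothetical stand-in for an adapter method, not part of Intugle; the test function follows pytest's naming convention so pytest would collect it.

```python
from unittest.mock import MagicMock


def count_rows(connection, table: str) -> int:
    """Hypothetical stand-in for an adapter method that runs a COUNT(*) query."""
    cursor = connection.cursor()
    cursor.execute(f'SELECT COUNT(*) FROM "{table}"')
    (count,) = cursor.fetchone()
    return count


def test_count_rows_issues_expected_sql():
    # The mock plays the role of a redshift_connector connection.
    connection = MagicMock()
    connection.cursor.return_value.fetchone.return_value = (42,)

    assert count_rows(connection, "orders") == 42
    connection.cursor.return_value.execute.assert_called_once_with(
        'SELECT COUNT(*) FROM "orders"'
    )
```

Asserting on the exact SQL string catches dialect regressions early, at the cost of making the test sensitive to formatting changes in the query.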
Questions & Support
- Ask questions in the comments below.
- Join our Discord for real-time help.
Additional Context
Note on Dependencies: When choosing a database driver or any other external library, please ensure its license is permissive (e.g., MIT, Apache 2.0) to be compatible with this project's Apache 2.0 license.
We appreciate your interest in contributing to Intugle! 🙏