AdrienDuval/njango_DRF

Data Lake API - Django REST Framework

A Django REST Framework API for managing and accessing data lake resources with role-based access control and comprehensive logging.

🚀 Features

  • Role-based Access Control: Secure access to data lake resources based on user permissions
  • Data Lake Integration: Direct access to CSV, JSON Lines (JSONL), and Parquet files in an organized data lake structure
  • Advanced Data Filtering: Filter transaction data by country, status, amount, rating, and more
  • Pagination Support: Efficient handling of large datasets with configurable page sizes
  • Field Projection: Select specific columns to reduce data transfer and improve performance
  • JWT Authentication: Secure token-based authentication using Simple JWT
  • API Documentation: Auto-generated OpenAPI/Swagger documentation with drf-spectacular
  • Access Logging: Comprehensive logging of all API access with middleware
  • Resource Management: Admin interface for managing resources and access rules
  • Data Preview: Quick preview functionality for data lake files
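To make the access-logging feature more concrete, here is a minimal sketch of a logging middleware. It only follows Django's middleware calling convention (constructed with get_response, then called once per request) and is written in plain Python so it runs on its own. The AccessLogEntry fields and the in-memory entries list are assumptions for illustration, not the project's actual core/middleware.py or AccessLog model.

```python
import time
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Callable, List


@dataclass
class AccessLogEntry:
    """Hypothetical stand-in for one row of the AccessLog model."""
    user: str
    method: str
    path: str
    status_code: int
    duration_ms: float


class AccessLogMiddleware:
    """Records one entry per request, following Django's middleware protocol."""

    def __init__(self, get_response: Callable):
        self.get_response = get_response
        # Assumption: entries are kept in memory here; a real project would
        # more likely persist them to the database.
        self.entries: List[AccessLogEntry] = []

    def __call__(self, request):
        started = time.perf_counter()
        response = self.get_response(request)  # run the view and inner middleware
        duration_ms = (time.perf_counter() - started) * 1000
        user = getattr(request, "user", None)
        self.entries.append(
            AccessLogEntry(
                user=str(user) if user is not None else "anonymous",
                method=request.method,
                path=request.path,
                status_code=response.status_code,
                duration_ms=duration_ms,
            )
        )
        return response


def fake_view(request):
    """Fake view used only for this stand-alone demo (no Django needed)."""
    return SimpleNamespace(status_code=200)


middleware = AccessLogMiddleware(fake_view)
middleware(SimpleNamespace(user="alice", method="GET", path="/api/resources/"))
print(middleware.entries[0])
```

Wrapping the call to get_response lets the same object see both the incoming request and the final status code, which is why logging is usually done in middleware rather than in each view.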

📁 Project Structure

njango_drf/
├── api_project/                 # Django project
│   ├── api_project/            # Main project settings
│   │   ├── settings.py         # Django configuration
│   │   ├── urls.py            # Main URL routing
│   │   └── wsgi.py            # WSGI configuration
│   ├── core/                  # Core application
│   │   ├── models.py          # Data models (Customer, Resource, AccessRule, AccessLog)
│   │   ├── views.py           # API views and endpoints
│   │   ├── serializers.py     # DRF serializers
│   │   ├── permissions.py     # Custom permission classes
│   │   ├── middleware.py      # Access logging middleware
│   │   ├── admin.py           # Django admin configuration
│   │   └── urls.py            # App URL routing
│   └── manage.py              # Django management script
├── data_lake/                 # Data lake directory
│   ├── transactions_flat/     # Transaction data (organized by date)
│   ├── AMOUNT_5MIN_PER_TYPE/  # Aggregated transaction amounts
│   ├── STATUS_PAR_TRANSACTION/ # Transaction status data
│   └── ...                    # Other data lake resources
├── requirements.txt           # Python dependencies
├── .gitignore                # Git ignore rules
└── README.md                 # This file

🛠️ Prerequisites

Before you begin, ensure you have the following installed:

  • Python 3.8+
  • PostgreSQL 12+
  • Git

📦 Installation

1. Clone the Repository

git clone <your-repository-url>
cd njango_drf

2. Create and Activate Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Database Setup

Install PostgreSQL

  • Windows: Download the installer from the official PostgreSQL website
  • macOS: brew install postgresql
  • Ubuntu/Debian: sudo apt-get install postgresql postgresql-contrib

Create Database

# Connect to PostgreSQL as superuser (run this in your shell)
psql -U postgres

-- Then run the following statements at the psql prompt.

-- Create database
CREATE DATABASE api_project;

-- Create user (optional, you can use an existing user)
CREATE USER api_user WITH PASSWORD 'your_password';

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE api_project TO api_user;

Update Database Configuration

Edit api_project/api_project/settings.py:

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "api_project",
        "USER": "your_username",  # Change this (e.g. api_user if you created it above)
        "PASSWORD": "your_password",  # Change this to that user's password
        "HOST": "localhost",
        "PORT": "5432",  # Default PostgreSQL port
    }
}

5. Run Migrations

cd api_project
python manage.py migrate

6. Create Superuser

python manage.py createsuperuser

7. Load Sample Data (Optional)

# Load sample data lake resources
python manage.py shell

In the Django shell:

from core.models import Resource

# Add sample resources (adjust paths to your data_lake directory)
resources = [
    Resource(name="transactions_flat", path="../data_lake/transactions_flat", kind="folder"),
    Resource(name="AMOUNT_5MIN_PER_TYPE", path="../data_lake/AMOUNT_5MIN_PER_TYPE", kind="folder"),
    Resource(name="STATUS_PAR_TRANSACTION", path="../data_lake/STATUS_PAR_TRANSACTION", kind="folder"),
    # Add more resources as needed
]

for resource in resources:
    resource.save()

🚀 Running the Application

Development Server

cd api_project
python manage.py runserver

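The API will be available at http://127.0.0.1:8000/ (the default runserver address, also used in the curl examples in this README).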

📚 API Endpoints

Authentication

  • POST /api/token/ - Obtain JWT access token
  • POST /api/token/refresh/ - Refresh JWT token

Resources

  • GET /api/resources/ - List all resources (Admin only)
  • POST /api/resources/ - Create new resource (Admin only)
  • GET /api/resources/{id}/ - Get resource details (Admin only)
  • PUT /api/resources/{id}/ - Update resource (Admin only)
  • DELETE /api/resources/{id}/ - Delete resource (Admin only)

Access Rules

  • GET /api/access-rules/ - List all access rules (Admin only)
  • POST /api/access-rules/ - Create access rule (Admin only)
  • GET /api/access-rules/{id}/ - Get access rule details (Admin only)
  • PUT /api/access-rules/{id}/ - Update access rule (Admin only)
  • DELETE /api/access-rules/{id}/ - Delete access rule (Admin only)

Data Access

  • GET /api/transactions-flat/ - Access transactions_flat data with advanced filtering, pagination, and field projection (Requires read permission)
    • Query Parameters:
      • page (int): Page number (default: 1)
      • page_size (int): Rows per page (default: 10)
      • country (str): Filter by country
      • status (str): Filter by transaction status
      • category (str): Filter by product category
      • method (str): Filter by payment method
      • amount_gt (float): Amount greater than
      • amount_lt (float): Amount less than
      • rating_gt (int): Customer rating greater than
      • rating_lt (int): Customer rating less than
      • fields (str): Comma-separated list of columns to return
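The query parameters above combine in a fixed order: filter first, then paginate, then project fields. The sketch below illustrates that order on in-memory rows using only the standard library. The column names (amount, rating, and equality columns named like their parameters), the strict greater-than/less-than comparison, and the response shape are assumptions for illustration; the real TransactionsFlatView may differ.

```python
import operator
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]

# Assumption: each equality parameter filters a column with the same name.
EQUALITY_PARAMS = ("country", "status", "category", "method")

# Assumption: parameter -> (column, type used for comparison, comparison).
RANGE_PARAMS = {
    "amount_gt": ("amount", float, operator.gt),
    "amount_lt": ("amount", float, operator.lt),
    "rating_gt": ("rating", int, operator.gt),
    "rating_lt": ("rating", int, operator.lt),
}


def query_rows(rows: Iterable[Row], params: Dict[str, str]) -> Dict[str, Any]:
    """Apply filters, then pagination, then field projection."""
    filtered: List[Row] = list(rows)

    for name in EQUALITY_PARAMS:
        if name in params:
            filtered = [r for r in filtered if str(r.get(name)) == params[name]]

    for name, (column, cast, compare) in RANGE_PARAMS.items():
        if name in params:
            bound = cast(params[name])
            filtered = [r for r in filtered if compare(cast(r[column]), bound)]

    # Defaults match the documented ones: page=1, page_size=10.
    page = int(params.get("page", 1))
    page_size = int(params.get("page_size", 10))
    start = (page - 1) * page_size
    page_rows = filtered[start:start + page_size]

    if "fields" in params:
        wanted = [f.strip() for f in params["fields"].split(",") if f.strip()]
        page_rows = [{k: r[k] for k in wanted if k in r} for r in page_rows]

    return {"count": len(filtered), "page": page,
            "page_size": page_size, "results": page_rows}


sample = [
    {"id": 1, "country": "FR", "status": "ok", "amount": "12.5", "rating": "4"},
    {"id": 2, "country": "FR", "status": "failed", "amount": "99.0", "rating": "2"},
    {"id": 3, "country": "US", "status": "ok", "amount": "50", "rating": "5"},
]
print(query_rows(sample, {"country": "FR", "amount_gt": "20", "fields": "id,amount"}))
```

Projecting after pagination means only the rows on the requested page are copied, which keeps the cost of the fields parameter proportional to page_size rather than to the whole dataset.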

Customers

  • GET /api/customers/ - List customers
  • POST /api/customers/ - Create customer
  • GET /api/customers/{id}/ - Get customer details
  • PUT /api/customers/{id}/ - Update customer
  • DELETE /api/customers/{id}/ - Delete customer

🔐 Authentication & Authorization

JWT Authentication

The API uses JWT tokens for authentication. To access protected endpoints:

  1. Obtain a token:
curl -X POST http://127.0.0.1:8000/api/token/ \
  -H "Content-Type: application/json" \
  -d '{"username": "your_username", "password": "your_password"}'
  2. Use the token in subsequent requests:
curl -X GET http://127.0.0.1:8000/api/resources/ \
  -H "Authorization: Bearer your_access_token"
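The same two steps can be scripted from Python. The sketch below builds the requests with the standard library only and does not send them, so it runs without a server. BASE_URL and the helper names are assumptions for illustration; the endpoints and headers mirror the curl commands above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the development server address used in the curl examples.
BASE_URL = "http://127.0.0.1:8000"


def build_token_request(username: str, password: str) -> Request:
    """Step 1: POST credentials as JSON to the token endpoint."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/token/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_data_request(access_token: str, **params) -> Request:
    """Step 2: GET transactions-flat with a Bearer token and optional filters."""
    query = urlencode(params)
    url = f"{BASE_URL}/api/transactions-flat/" + (f"?{query}" if query else "")
    return Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


token_request = build_token_request("your_username", "your_password")
data_request = build_data_request("your_access_token", country="FR", page=2)
print(token_request.get_method(), token_request.full_url)
print(data_request.get_method(), data_request.full_url)
```

To actually send a request, pass it to urllib.request.urlopen and decode the JSON body. With Simple JWT's default token view, the response should contain access and refresh keys, though this project's configuration may differ.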

Role-based Access Control

  • Admin Users: Full access to all endpoints
  • Regular Users: Access based on AccessRule permissions
  • Resource Access: Users can only access resources they have been granted access to
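The three rules above reduce to a small decision function. The following is a sketch in plain Python, not the project's core/permissions.py: the User and AccessRule dataclasses and their fields are hypothetical stand-ins for the Django models.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for a Django user."""
    username: str
    is_admin: bool = False


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical stand-in for the AccessRule model."""
    username: str
    resource: str
    can_read: bool = True


def has_read_access(user: User, resource: str, rules: Iterable[AccessRule]) -> bool:
    """Admins always pass; other users need a matching rule that grants read."""
    if user.is_admin:
        return True
    return any(
        rule.username == user.username
        and rule.resource == resource
        and rule.can_read
        for rule in rules
    )


rules = [AccessRule("alice", "transactions_flat")]
print(has_read_access(User("alice"), "transactions_flat", rules))
print(has_read_access(User("bob"), "transactions_flat", rules))
print(has_read_access(User("root", is_admin=True), "transactions_flat", rules))
```

Access is denied by default: a regular user with no matching rule gets False, so forgetting to create a rule never exposes data.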

📊 Data Lake Integration

The API provides access to various data lake resources:

  • Transaction Data: Raw and processed transaction data
  • Aggregated Data: Time-based aggregations (5-minute intervals)
  • Status Data: Transaction status tracking
  • Anonymized Data: Privacy-compliant transaction data

Data Formats Supported

  • CSV: Comma-separated values
  • JSONL: JSON Lines format
  • Parquet: Columnar storage format
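As an illustration of handling several formats behind one preview function, the sketch below dispatches on the file extension and returns the first few rows. It covers CSV and JSONL with the standard library only; Parquet is deliberately left out because reading it needs a third-party library such as pyarrow or pandas. The function name and signature are assumptions, not the project's actual preview code.

```python
import csv
import io
import json
from itertools import islice
from pathlib import PurePath
from typing import Any, Dict, List


def preview(filename: str, text: str, limit: int = 5) -> List[Dict[str, Any]]:
    """Return up to `limit` rows parsed from `text`, based on the file extension."""
    suffix = PurePath(filename).suffix.lower()

    if suffix == ".csv":
        # The first line is treated as the header; all values stay strings.
        reader = csv.DictReader(io.StringIO(text))
        return [dict(row) for row in islice(reader, limit)]

    if suffix == ".jsonl":
        # One JSON document per non-empty line.
        parsed = (json.loads(line) for line in text.splitlines() if line.strip())
        return list(islice(parsed, limit))

    # Parquet and anything else is not handled in this stdlib-only sketch.
    raise ValueError(f"Unsupported format for preview: {suffix or filename}")


print(preview("sample.csv", "id,amount\n1,10\n2,20\n", limit=1))
print(preview("sample.jsonl", '{"id": 1}\n\n{"id": 2}\n'))
```

Using islice stops parsing as soon as the limit is reached, so a preview stays cheap even on large files when the text is streamed line by line.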

🧪 Testing

Run the test suite:

# Run all tests
python manage.py test

# Run with coverage (requires the pytest-cov plugin; Django projects typically also need pytest-django)
pytest --cov=core

🚀 Deployment

Production Settings

For production deployment, update settings.py:

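import os

DEBUG = False
ALLOWED_HOSTS = ['your-domain.com']

# Read sensitive values from environment variables instead of hard-coding them.
# os.environ[...] raises KeyError at startup if a variable is missing,
# rather than silently setting the value to None.
SECRET_KEY = os.environ['SECRET_KEY']
DATABASES['default']['PASSWORD'] = os.environ['DB_PASSWORD']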

Environment Variables

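Create a .env file for production. Django does not read .env files on its own, so load it with a tool such as python-dotenv or django-environ, or export the variables in the environment that starts the server: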

SECRET_KEY=your-secret-key
DEBUG=False
DB_PASSWORD=your-db-password
ALLOWED_HOSTS=your-domain.com
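One pitfall with these variables: every environment value is a string, so DEBUG=False read naively becomes the truthy string "False", and ALLOWED_HOSTS must be split into a list. The helper below is a minimal sketch of parsing them safely. The function names are hypothetical, and it takes a plain mapping so it can be tried without touching the real environment.

```python
import os
from typing import Any, Dict, List, Mapping, Optional


def env_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def env_list(value: Optional[str]) -> List[str]:
    """Split a comma-separated value, dropping empty items and whitespace."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build the production values; missing secrets raise KeyError early."""
    return {
        "SECRET_KEY": env["SECRET_KEY"],
        "DB_PASSWORD": env["DB_PASSWORD"],
        "DEBUG": env_bool(env.get("DEBUG"), default=False),
        "ALLOWED_HOSTS": env_list(env.get("ALLOWED_HOSTS")),
    }


example_env = {
    "SECRET_KEY": "your-secret-key",
    "DEBUG": "False",
    "DB_PASSWORD": "your-db-password",
    "ALLOWED_HOSTS": "your-domain.com, www.your-domain.com",
}
print(load_settings(example_env))
```

In settings.py, the returned values could be assigned to the corresponding settings, for example DEBUG = load_settings()["DEBUG"].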

📝 Development

Code Style

The project uses Black for code formatting, flake8 for linting, and isort for import sorting:

# Format code
black .

# Check linting
flake8 .

# Sort imports
isort .

Adding New Data Resources

  1. Create a new view in core/views.py following the pattern of TransactionsFlatView
  2. Add the resource to the database via admin interface
  3. Grant appropriate access permissions to users
  4. Update API documentation

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

If you encounter any issues or have questions:

  1. Check the API Documentation
  2. Review the Django logs for error messages
  3. Ensure database connectivity and permissions
  4. Verify data lake file paths and permissions

🔄 Version History

  • v1.1.0 - Enhanced data access with filtering and pagination
    • Advanced filtering for transaction data (country, status, amount, rating)
    • Pagination support with configurable page size
    • Field projection to select specific columns
    • Enhanced API documentation with OpenAPI schema
    • Extended JWT token lifetime configuration
  • v1.0.0 - Initial release with core functionality
    • JWT authentication
    • Resource-based access control
    • Data lake integration
    • API documentation

Happy coding! 🎉
