Data Engineer | Technical Writer | Building Scalable, Reliable Data Pipelines | Cloud & Workflow Automation
Passionate about modern data stack tooling, I specialize in building production-ready data pipelines with Python and emerging frameworks like dlt (data load tool). My focus is on clean, maintainable ingestion, reliable orchestration, and cloud-native data workflows.
- Core scripting and advanced querying for data engineering workflows.
- Building robust batch and real-time data ingestion pipelines at scale.
- Cloud data architecture and modern data warehousing solutions.
- Orchestrating reliable, production-grade data workflows.
- Data Ingestion: dlt (data load tool), PySpark
- Orchestration & Workflow: Apache Airflow, Kestra, Prefect
- Data Transformation: dbt, SQL
- Infrastructure & Deployment: Docker, Terraform
- CI/CD: GitHub Actions
- Version Control: Advanced Git
- Monitoring & Reliability: Structured logging, pipeline health checks, alerting (see the sketch after this list)
- Documentation: Pipeline lineage, runbooks, data dictionaries
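
A minimal flavor of the monitoring approach above: one JSON object per log line, plus a health check that fails loudly. Event names and the zero-row threshold are illustrative, not taken from a specific project.

```python
import json
import logging
import sys
import time

# One JSON object per line: trivially parseable by any log aggregator.
logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")
log = logging.getLogger("pipeline")


def emit(event: str, **fields) -> None:
    log.info(json.dumps({"event": event, "ts": time.time(), **fields}))


rows_loaded = 1_000  # in a real run, returned by the load step
emit("load_finished", rows=rows_loaded)

# Health check: treat a zero-row load as a failure so alerting can key
# off the non-zero exit code.
if rows_loaded == 0:
    emit("health_check_failed", reason="zero rows loaded")
    sys.exit(1)
```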
Built a production-ready dlt (data load tool) pipeline to ingest, normalize, and load NYC taxi trip data into a cloud data warehouse.
Impact: Automated end-to-end data loading with schema inference, incremental loading, and built-in data quality checks.
Key Challenge: Handling schema evolution across different taxi dataset versions while maintaining idempotent, reliable loads.
Stack: Python · dlt · SQL · GitHub Actions
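
A minimal sketch of the pattern (not the project's actual code), assuming a hypothetical paginated endpoint and a `trip_id` primary key. The merge write disposition plus an incremental cursor is what keeps re-runs idempotent, while dlt's schema inference absorbs drift between dataset versions.

```python
import dlt
from dlt.sources.helpers import requests  # dlt's requests wrapper with retries


@dlt.resource(name="taxi_trips", write_disposition="merge", primary_key="trip_id")
def taxi_trips(
    cursor=dlt.sources.incremental("pickup_datetime", initial_value="2024-01-01T00:00:00"),
):
    # Hypothetical paginated endpoint serving trip records as JSON.
    url = "https://example.com/nyc-taxi/trips"
    page = 1
    while True:
        resp = requests.get(url, params={"page": page, "since": cursor.start_value})
        rows = resp.json()
        if not rows:
            break
        yield rows  # dlt infers (and evolves) the schema from these records
        page += 1


if __name__ == "__main__":
    pipeline = dlt.pipeline(
        pipeline_name="nyc_taxi",
        destination="duckdb",  # swap for a cloud warehouse destination in production
        dataset_name="taxi_data",
    )
    print(pipeline.run(taxi_trips))
```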
Designed and implemented a modular dlt-based analytics engineering pipeline with structured ingestion layers and transformation workflows.
Impact: Reduced manual data wrangling effort with automated schema management, enabling clean separation between ingestion and transformation layers.
Key Challenge: Structuring incremental pipeline runs that are both efficient and replayable from any checkpoint.
Stack: Python · dlt · dbt · SQL · Jupyter Notebook
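
A sketch of how that ingestion/transformation split can look using dlt's dbt runner; the dbt project path and table names here are hypothetical. dlt persists incremental state alongside the loaded data, which is what lets a run resume from its last checkpoint instead of reloading everything.

```python
import dlt

# Ingestion layer: dlt lands raw records in a staging dataset and manages the schema.
pipeline = dlt.pipeline(
    pipeline_name="analytics",
    destination="duckdb",
    dataset_name="staging",
)
raw_events = [{"event_id": 1, "kind": "signup"}, {"event_id": 2, "kind": "login"}]
pipeline.run(raw_events, table_name="events",
             write_disposition="merge", primary_key="event_id")

# Transformation layer: a dbt project (hypothetical path) builds models
# on top of the staged tables; requires dbt to be installed.
dbt = dlt.dbt.package(pipeline, "dbt_analytics")
for model in dbt.run_all():
    print(model.model_name, model.status)
```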
Leveraged PySpark to process and analyze large-scale datasets using distributed computing techniques.
Impact: Turned raw, large-scale datasets into structured, analysis-ready formats using distributed processing.
Key Challenge: Optimizing Spark jobs for performance while maintaining code clarity and reproducibility in Jupyter notebooks.
Stack: PySpark · Python · Jupyter Notebook
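
A minimal sketch of that kind of job, assuming a Parquet dataset with `pickup_datetime` and `total_amount` columns and illustrative storage paths. Keeping the whole transformation as one lazy DataFrame expression lets Spark's optimizer plan the shuffle once, and it stays readable in a notebook.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_trip_metrics").getOrCreate()

# Hypothetical input location and column names.
trips = spark.read.parquet("s3a://my-bucket/raw/trips/")

# Lazily built transformation; Spark only executes on the write below.
daily = (
    trips
    .withColumn("pickup_date", F.to_date("pickup_datetime"))
    .groupBy("pickup_date")
    .agg(
        F.count("*").alias("trip_count"),
        F.round(F.sum("total_amount"), 2).alias("revenue"),
    )
)

# Columnar output keeps downstream reads cheap.
daily.write.mode("overwrite").parquet("s3a://my-bucket/marts/daily_trips/")
```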
Open to Remote & Hybrid Opportunities
GitHub: github.com/Derrick-Ryan-Giggs
Open to collaborating on interesting data infrastructure projects and discussions about data engineering, cloud architecture, and modern data stack tooling!
Last Updated: 2026-03-29 17:56:14
