lucafurrer/modern-data-platform-stack

 
 

Modern Analytical Data Platform Stack

This project sets up the infrastructure for testing a modern data analytics stack with services such as

  • Kafka
  • Spark
  • Hadoop Ecosystem
  • StreamSets & NiFi
  • Zeppelin & Jupyter
  • NoSQL

and many others.

Each service runs in its own Docker container, and the whole stack is composed using Docker Compose. The stack can be provisioned either locally or in the cloud. See Provisioning of Analytics Platform for the different ways to deploy the stack.

The full stack can be found in the full-stack folder. The idea is to keep a complete version of the stack in a single docker-compose definition, which can be used as a template and reduced to the services needed.

Customised versions of the stack (mainly downsized to only the services needed for a given project) can be found under the customer-poc folder.
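
As an illustration of what a downsized definition might look like, here is a minimal sketch of a docker-compose file reduced to two services. It is not taken from the repository: the service names, images, tags and ports are all assumptions chosen for the example, and the actual definitions in the full-stack and customer-poc folders may differ.

```yaml
# docker-compose.yml: hypothetical reduced stack, not copied from the repository.
# Service names, images, tags and ports below are illustrative assumptions.
services:
  kafka:
    # Assumed image: a single-node broker started with the image's default settings.
    image: apache/kafka:3.7.0
    ports:
      - "9092:9092"

  jupyter:
    # Assumed image: a notebook server that bundles Spark.
    image: jupyter/pyspark-notebook:latest
    ports:
      - "8888:8888"
```

Running `docker compose up -d` in the folder containing such a file would start only these two containers, and `docker compose down` would stop and remove them again.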

Changes

The change log can be found here.

About

A modern data analytics stack built on containers with Kafka, Spark, StreamSets, HDFS, ...

Languages

  • HTML 90.6%
  • Shell 7.5%
  • Dockerfile 1.9%