This guide provides instructions on how to run the Starlake Data Stack using Docker Compose. The configuration supports multiple orchestrators and services through Docker Compose profiles.
- Docker: Ensure Docker is installed and running on your machine.
- Docker Compose: Ensure Docker Compose is installed (usually included with Docker Desktop).
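To confirm both prerequisites are met, you can check the installed versions and that the Docker daemon is running:

```bash
# Verify Docker and Docker Compose are installed and the daemon is reachable
docker --version
docker compose version
docker info --format '{{.ServerVersion}}'
```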
Starlake provides prebuilt data stacks that can be run with a single command. These data stacks are designed to provide a ready-to-use data management solution out of the box.
The Pragmatic Open Data Stack is a ready-to-use data stack that includes Starlake with Airflow and Gizmo. It is designed to provide a comprehensive data management solution out of the box.
The Pragmatic BigQuery Data Stack is a ready-to-use data stack that includes Starlake with Airflow. It is designed to provide a comprehensive data management solution out of the box for BigQuery users.
The Pragmatic Snowflake Data Stack is a ready-to-use data stack that includes Starlake with Snowflake Tasks. It is designed to provide a comprehensive data management solution out of the box for Snowflake users.
The Pragmatic AWS Redshift Data Stack is a ready-to-use data stack that includes Starlake with Airflow. It is designed to provide a comprehensive data management solution out of the box for AWS Redshift users.
Running Starlake on Docker is as easy as running a single command. This guide will walk you through the steps to run Starlake on Docker.
- Clone this repository:

  ```bash
  git clone https://github.com/starlake-ai/starlake-docker.git
  ```

- Change directory to the cloned repository:

  ```bash
  cd starlake-docker/docker
  ```

- Run the following command to start Starlake UI with Airflow and Gizmo on Docker:

  ```bash
  docker compose --profile airflow --profile gizmo up
  ```

- Open your browser and navigate to http://localhost to access Starlake UI.
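End to end, the quick start amounts to:

```bash
# Clone the repository and start Starlake UI with Airflow and Gizmo
git clone https://github.com/starlake-ai/starlake-docker.git
cd starlake-docker/docker
docker compose --profile airflow --profile gizmo up
# Then open http://localhost in your browser
```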
The Docker Compose configuration uses environment variables which can be set in a .env file in this directory or exported in your shell.
Common variables include:
- `SL_API_HTTP_FRONT_URL`: URL for the Starlake Proxy/UI (default: `http://localhost:${SL_PORT:-80}`)
- `SL_API_DOMAIN`: Domain for the Starlake Proxy/UI (default: `localhost`). This must be set if you set `SL_API_HTTP_FRONT_URL` to a different host. Usually it is the same as the host of the `SL_API_HTTP_FRONT_URL` domain name.
- `SL_PORT`: Port for the Starlake Proxy/UI (default: `80`)
- `SL_DB_PORT`: Port for the Postgres database (default: `5432`)
- `SL_AI_PORT`: Port for the Starlake Agent (default: `8000`)
- `PROJECTS_DATA_PATH`: Path to your projects directory (default: `./projects`)
See the docker-compose.yml file for a full list of variables and their default values.
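For example, a minimal `.env` file in this directory could override the defaults like this (the values shown are illustrative):

```bash
# .env — overrides picked up by the Docker Compose configuration
SL_PORT=8080                                  # Port for the Starlake Proxy/UI
SL_API_HTTP_FRONT_URL=http://localhost:8080   # URL for the Starlake Proxy/UI
SL_API_DOMAIN=localhost                       # Must match the host of SL_API_HTTP_FRONT_URL
SL_DB_PORT=5432                               # Postgres port
SL_AI_PORT=8000                               # Starlake Agent port
PROJECTS_DATA_PATH=./projects                 # Path to your projects directory
```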
Starlake uses Docker Compose profiles to manage different configurations (e.g., Airflow 2 vs Airflow 3, Dagster). You must specify a profile when running commands.
- `airflow`: Runs Starlake with Airflow 2.
- `airflow3`: Runs Starlake with Airflow 3 (experimental).
- `dagster`: Runs Starlake with Dagster (requires `docker-compose-dagster.yml` if running separately, but defined in the main compose file for some services).
- `gizmo`: Runs the Starlake Gizmo service.
- `minio`: Runs MinIO Object Storage.
- `snowflake`: Profile for Snowflake integration.
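To preview which services a given combination of profiles would start, you can ask Docker Compose to list them without starting anything:

```bash
# List the services enabled by the selected profiles
docker compose --profile airflow --profile gizmo config --services
```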
To start the Pragmatic Duck Data Stack with Airflow, MinIO and Gizmo, use the following command:

```bash
SL_API_APP_TYPE=ducklake docker compose --profile airflow --profile minio --profile gizmo up -d
```

To start the Pragmatic Duck Data Stack with Airflow and Gizmo on the local file system, use the following command:

```bash
SL_API_APP_TYPE=ducklake docker compose --profile airflow --profile gizmo up -d
```

To start the stack with a specific profile (e.g., `airflow`) and target any cloud data warehouse, use the following command:

```bash
docker compose --profile airflow up -d
```

To run with Airflow 3 (experimental), use the following command:

```bash
docker compose --profile airflow3 up -d
```

To run with Dagster, use the following command:

```bash
docker compose --profile dagster up -d
```

To stop the services:

```bash
docker compose --profile airflow --profile minio --profile gizmo down
```

Note: You must specify the same profiles used to start the services to ensure they are all stopped correctly.
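To check which services are currently running for the profiles you started, you can use:

```bash
# Show the status of the containers started with these profiles
docker compose --profile airflow --profile minio --profile gizmo ps
```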
Once up, the services are accessible at the following default URLs:
- Starlake UI: http://localhost:80 (or the local port defined by `SL_PORT`)
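For example, if port 80 is already in use on your machine, you can move the UI to another port via `SL_PORT` (exported in your shell or set in `.env`):

```bash
# Start the stack with the UI on port 8080 instead of the default 80
SL_PORT=8080 docker compose --profile airflow --profile gizmo up -d
# The UI is then available at http://localhost:8080
```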
- Check Logs:

  ```bash
  docker compose --profile airflow logs -f
  ```

- Rebuild Images:

  If you need to pick up updated images or changes to the Dockerfiles:

  ```bash
  docker compose --profile airflow build
  ```

- Database Persistence:

  Postgres data is persisted in the `pgdata` volume. To reset the database, you may need to remove this volume:

  ```bash
  docker compose down -v
  ```
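If you prefer to remove only the database volume rather than all volumes, you can locate it first. Note that Docker Compose prefixes volume names with the project name, so the exact name may differ from `pgdata`:

```bash
# Find the Postgres volume (its full name is prefixed with the compose project name)
docker volume ls | grep pgdata
# Stop the services, then remove it (replace <project> with the actual prefix)
docker compose --profile airflow down
docker volume rm <project>_pgdata
```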
Note
Whenever you update using `git pull`, run `docker compose` with the `--build` flag:

```bash
docker compose up --build
```
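With profiles, a typical update cycle might look like this (the `airflow` and `gizmo` profiles are only examples):

```bash
# Pull the latest changes and rebuild the images before starting
git pull
docker compose --profile airflow --profile gizmo up --build -d
```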
If you are affected by this Docker issue, please upgrade your Docker install.
If you have any Starlake container projects and want to mount them:

- Run `setup_mac_nfs.sh` if you are on macOS in order to expose your folder via NFS. Modify the root folder to share if necessary; by default it is set to `/user`. This change is not specific to Starlake and may be used for other containers.
- Comment out `- external_projects_data:/external_projects` in the `volumes` section of the starlake-nas container (see the excerpt after the volume definition below).
- Uncomment `- starlake-prj-nfs-mount:/external_projects` right below the line above in the Docker Compose file.
- Go to the end of the file and uncomment the `starlake-prj-nfs-mount:` section as follows:
```yaml
starlake-prj-nfs-mount:
  driver: local
  driver_opts:
    type: nfs
    o: addr=host.docker.internal,rw,nolock,hard,nointr,nfsvers=3
    device: ":/path_to_starlake_project_container" # absolute path to the folder on your host where projects are located
```
The Starlake container folder should contain the Starlake project folders:
```
/path_to_starlake_project_container
|
- my_first_starlake_project
  |
  - metadata
  - ...
|
- my_second_starlake_project
  |
  - metadata
  - ...
```
If you have many container projects, create as many volumes as needed.
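For example, a second project container could be exposed through an additional named volume following the same pattern (the name and path below are illustrative):

```yaml
# Additional NFS-backed volume for a second projects folder (illustrative name and path)
starlake-prj-nfs-mount-2:
  driver: local
  driver_opts:
    type: nfs
    o: addr=host.docker.internal,rw,nolock,hard,nointr,nfsvers=3
    device: ":/path_to_second_starlake_project_container"
```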
To stop Starlake UI, run the following command in the same directory:

```bash
docker compose down
```


