Decorator-based framework for defining Databricks jobs and tasks as Python code. Define pipelines using `@task`, `@job`, and `job_cluster()` — they compile into Databricks Asset Bundle resources.

Writing Databricks jobs in raw YAML is tedious and disconnects task logic from orchestration configuration. `databricks-bundle-decorators` lets you express both in Python:
- Airflow TaskFlow-inspired pattern — define `@task` functions inside a `@job` body; dependencies are captured automatically from call arguments.
- IoManager pattern — large data (DataFrames, datasets) flows between tasks through external storage automatically.
- Explicit task values — small scalars (`str`, `int`, `float`, `bool`) can be passed between tasks via `set_task_value`/`get_task_value`, like Airflow XComs.
- Pure Python — write your jobs and tasks as decorated functions, run `databricks bundle deploy`, and the framework generates all Databricks Job configurations for you.
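To make the dependency-capture idea concrete, here is a minimal, self-contained sketch of how a TaskFlow-style decorator pair *can* record a task graph from call arguments. Everything in it (`TaskNode`, the module-level graph list) is hypothetical illustration, not the internals or API of `databricks-bundle-decorators`:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names for illustration only; not the library's internals.
_current_graph: List["TaskNode"] = []


@dataclass
class TaskNode:
    name: str
    fn: Callable
    upstream: List[str] = field(default_factory=list)


def task(fn):
    """Wrap a function so calling it records a graph node instead of running it."""
    def wrapper(*args):
        node = TaskNode(
            name=fn.__name__,
            fn=fn,
            # Any TaskNode passed as an argument becomes an upstream edge.
            upstream=[a.name for a in args if isinstance(a, TaskNode)],
        )
        _current_graph.append(node)
        return node  # the node can itself be passed to downstream tasks
    return wrapper


def job(fn):
    """Run the job body once at definition time to collect the task graph."""
    def build():
        _current_graph.clear()
        fn()
        return list(_current_graph)
    return build


@job
def my_pipeline():
    @task
    def extract():
        ...

    @task
    def transform(df):
        ...

    data = extract()
    transform(data)


graph = my_pipeline()
print({n.name: n.upstream for n in graph})
# -> {'extract': [], 'transform': ['extract']}
```

Because `extract()` returns a node rather than data, passing its result into `transform(...)` is enough to declare the edge — no explicit `depends_on` wiring is needed.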
```bash
uv add databricks-bundle-decorators
```

With cloud-specific extras for the built-in `PolarsParquetIoManager`:

```bash
uv add "databricks-bundle-decorators[azure]"  # or [aws], [gcp], [polars]
```

To scaffold a new project:

```bash
uv init my-pipeline && cd my-pipeline
uv add "databricks-bundle-decorators[azure]"
uv run dbxdec init
```

This scaffolds a complete pipeline project. Define your jobs in `src/<package>/pipelines/`:
```python
import polars as pl

from databricks_bundle_decorators import job, job_cluster, params, task
from databricks_bundle_decorators.io_managers import PolarsParquetIoManager

io = PolarsParquetIoManager(
    base_path="abfss://lake@account.dfs.core.windows.net/staging",
)

cluster = job_cluster(
    name="small",
    spark_version="16.4.x-scala2.12",
    node_type_id="Standard_E8ds_v4",
    num_workers=1,
)


@job(
    params={"url": "https://api.github.com/events"},
    cluster=cluster,
)
def my_pipeline():
    @task(io_manager=io)
    def extract() -> pl.DataFrame:
        import requests

        return pl.DataFrame(requests.get(params["url"]).json())

    @task
    def transform(df: pl.DataFrame):
        print(df.head(10))

    data = extract()
    transform(data)
```

Deploy:
```bash
databricks bundle deploy --target dev
```

Full documentation is available at boccileonardo.github.io/databricks-bundle-decorators:
- Getting Started — scaffolding, first pipeline, deploy
- How It Works — task dependencies, IoManager, task values
- Docker Deployment — pre-built container images
- API Reference — `@task`, `@job`, `IoManager`, and more
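The IoManager pattern above is easiest to see with a toy version: a task's return value is persisted to storage under a key, and the downstream task loads it back by name, so large objects never travel through task parameters. This stdlib-only sketch (pickle files in a temp directory standing in for a cloud `base_path`) illustrates the idea and is not the library's `PolarsParquetIoManager` implementation:

```python
import pickle
import tempfile
from pathlib import Path


class PickleIoManager:
    """Toy IoManager: stores each task's output as a pickle under base_path."""

    def __init__(self, base_path: str):
        self.base_path = Path(base_path)
        self.base_path.mkdir(parents=True, exist_ok=True)

    def write(self, task_name: str, value):
        (self.base_path / f"{task_name}.pkl").write_bytes(pickle.dumps(value))

    def read(self, task_name: str):
        return pickle.loads((self.base_path / f"{task_name}.pkl").read_bytes())


io = PickleIoManager(tempfile.mkdtemp())

# "extract" runs in one process and persists its output...
rows = [{"id": 1}, {"id": 2}]
io.write("extract", rows)

# ...and "transform", possibly on another cluster, loads it by task name.
loaded = io.read("extract")
print(loaded)  # -> [{'id': 1}, {'id': 2}]
```

The real io manager does the same hand-off with Parquet files at a cloud storage `base_path`, which is why the `extract`/`transform` example above can pass a `pl.DataFrame` between separate Databricks tasks.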
To work on the framework itself:

```bash
git clone https://github.com/<org>/databricks-bundle-decorators.git
cd databricks-bundle-decorators
uv sync
uv run pytest tests/ -v
```

To cut a release, run the release automation action and pick patch/minor/major. The workflow bumps the version in `pyproject.toml`, commits, tags, builds, creates a GitHub Release, and publishes to PyPI.
Or manually:

```bash
uv version --bump patch  # or minor, major
git commit -am "release: v$(uv version)" && git push
# Create a GitHub Release with the new tag → publish.yaml pushes to PyPI
```