dbt

pg_lake integrates with Data Build Tool (dbt) to update, insert, or transform data. Installing both the dbt-core (version 1.9.2+) and the database adapter dbt-postgres(version 1.9.0+) are required.

Connection

dbt runs externally over a Postgres connection. Superuser access might be needed depending on the number of changes expected.

my_dbt_project:
  outputs:
    dev:
      type: postgres
      host: atx5231hr.pgtest.us-west-2.aws.postgres.app
      user: postgres
      password: ....
      port: 5432
      dbname: postgres
      schema: public
      threads: 4
  target: dev

S3 file location

While configuring dbt, you will need to create an environment variable to point dbt at S3, like this:

export ICEBERG_LOCATION_PREFIX=<S3 location>

If you’re using the built-in Iceberg appliance for pg_lakee, your S3 bucket location can be accessed from psql with a show command, like this:

psql <connection-string> -c 'show pg_lake_iceberg.default_location_prefix'

Models configuration

The model configuration controls how the transformation process behaves.

materialized='incremental': This tells dbt to perform incremental updates instead of fully rebuilding the table each time.
unique_key='created_at': This specifies the unique identifier for each record, used to detect new records.
pre_hook and post_hook: These hooks are executed before and after the model runs. In this case, the pre_hook sets the default access method to iceberg and configures the location prefix for storing Iceberg tables in S3. The post_hook resets these settings after the model has completed.

{{ config(
    materialized='incremental',
    unique_key='created_at',
    pre_hook="SET default_table_access_method TO 'iceberg'; SET pg_lake_iceberg.default_location_prefix = '{{ env_var('ICEBERG_LOCATION_PREFIX', '') }}';",

DBT for Postgres and and pg_lake

With dbt you can run the full range of features in the dbt-postgres adaptor for loading data and performing SQL operations in Postgres.

Create and manage Postgres tables with pg_lake
Load data from outside sources to Postgres
Utilize incremental processing for new or updated data instead of full table refreshes. MERGE actions supported

DBT for Iceberg and pg_lake

dbt can be especially helpful for adding data to Iceberg for fast analytics with pg_lake. With any data source, dbt can be configured to create and populate Iceberg table data. Included in the support for dbt with Iceberg is:

Creating and managing Iceberg tables inside pg_lake
Data loading from outside sources to Iceberg
Incrementally processing and transforming data in Iceberg. This is done with a delete+insert strategy. Merge is not currently supported on Iceberg tables, but it can be used directly with the Postgres instance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dbt

Connection

Models configuration

DBT for Postgres and and pg_lake

DBT for Iceberg and pg_lake

FilesExpand file tree

dbt.md

Latest commit

History

dbt.md

File metadata and controls

dbt

Connection

Models configuration

DBT for Postgres and and pg_lake

DBT for Iceberg and pg_lake