nunoatgithub/postgres-is-not-enough


PostgreSQL vs ClickHouse for Reporting

This repository runs a repeatable workload against PostgreSQL and ClickHouse to compare how they behave under a mixed insert/update/query pattern that resembles a streaming analytics pipeline feeding reporting dashboards.

The focus is on:

  • Applying realistic ingestion and update techniques (batching)
  • Running a small set of analytical queries continuously
  • Emitting detailed Prometheus metrics so one can inspect behavior in Grafana

Workload overview

The test client (test-client/) does the following:

  • Generates synthetic reference data: users, products, and categories
  • Periodically generates and inserts events in batches (EVENTS_PER_BATCH events per batch); the same events are written to both PostgreSQL and ClickHouse
  • Periodically runs realistic UPDATE operations on both databases:

    • User lifetime value updates
    • User segment changes
    • Product price updates
    • Event revenue corrections
  • Periodically runs 7 analytical queries on both databases at a fixed interval

  • Exposes Prometheus metrics (insert/update/query timings, row counts, etc.)

  • Separates query operations from write operations and, within each, separates them by database: four processes in total, one per database/operation combination
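The batching step can be sketched as follows. This is a minimal, self-contained illustration with hypothetical names (the actual client code lives in test-client/); each batch would be handed to both database clients in one round trip:

```python
import itertools
import random
import time
from typing import Iterator

EVENTS_PER_BATCH = 50_000  # matches the batch size used in the benchmark

def generate_events(n: int) -> Iterator[dict]:
    """Yield synthetic events resembling the workload's fact rows (assumed shape)."""
    now = time.time()
    for i in range(n):
        yield {
            "event_id": i,
            "user_id": random.randint(1, 100_000),
            "product_id": random.randint(1, 5_000),
            "event_time": now - random.uniform(0, 3600),
            "revenue": round(random.uniform(0, 50), 2),
        }

def batches(events, size: int = EVENTS_PER_BATCH):
    """Group an event stream into insert batches; each batch is written
    to both PostgreSQL and ClickHouse as a single bulk insert."""
    it = iter(events)
    while chunk := list(itertools.islice(it, size)):
        yield chunk

sizes = [len(b) for b in batches(generate_events(120_000), size=50_000)]
print(sizes)  # → [50000, 50000, 20000]
```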

ClickHouse uses replicated CollapsingMergeTree tables and exposes *_current views for the latest state. PostgreSQL uses a normalized schema with a reduced index set on the events fact table, aggressive autovacuum tuning for streaming workloads, and a "streaming-friendly" durability configuration (WAL fsync on, but synchronous_commit=off).

Query characterization

This test suite runs seven analytical queries repeatedly; each targets a different workload pattern:

  • Query 1 (Simple Aggregation): tests scan performance and lightweight group-by, low-cardinality grouping, and index usage on timestamp filters.
  • Query 2 (Join Aggregation): tests multi-table joins plus aggregation (join scaling and join order effects).
  • Query 3 (Complex Join): exercises multi-dimensional group-by and joins with mixed-cardinality dimensions.
  • Query 4 (Window Functions): stresses windowing, ordering, re-sorting and memory/workspace usage.
  • Query 5 (Cohort Analysis): tests joining derived cohort tables, time bucketing and moderate group-by cardinality.
  • Query 6 (Funnel Analysis): tests sessionization and conditional aggregation across a user-level derived table.
  • Query 7 (Multi-Window Cohort): a worst-case style query for PostgreSQL combining multiple CTEs that scan the events table, many window functions, percentiles, self-joins and correlated subqueries. For real-time reporting workloads, ClickHouse uses incremental materialized views (realistic production approach) while PostgreSQL must run the direct query (only viable option). This demonstrates the fundamental architectural difference for streaming OLTP → real-time reporting.
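To make Query 6's pattern concrete, here is a toy, pure-Python version of funnel analysis via conditional aggregation over a user-level derived table. The real query runs in SQL; the event types and funnel stages here are illustrative, not the repository's actual schema:

```python
from collections import defaultdict

# Toy event log: (user_id, event_type), already time-ordered per user.
events = [
    (1, "view"), (1, "add_to_cart"), (1, "purchase"),
    (2, "view"), (2, "add_to_cart"),
    (3, "view"),
]

FUNNEL = ["view", "add_to_cart", "purchase"]

# Step 1: collapse to a user-level derived table - the furthest funnel
# stage each user reached, only counting stages completed in order.
furthest = defaultdict(lambda: -1)
for user, etype in events:
    stage = FUNNEL.index(etype)
    if stage == furthest[user] + 1:  # must have completed all earlier stages
        furthest[user] = stage

# Step 2: conditional aggregation - how many users reached each stage.
counts = [sum(1 for s in furthest.values() if s >= i) for i in range(len(FUNNEL))]
print(dict(zip(FUNNEL, counts)))  # → {'view': 3, 'add_to_cart': 2, 'purchase': 1}
```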

How to run – durable replicated ClickHouse setup

The experiment is intended to be run with the durable, replicated ClickHouse setup defined in docker-compose.yml.

This stack includes:

  • PostgreSQL (single node) configured for streaming ingestion
  • ZooKeeper
  • Two ClickHouse replicas using ReplicatedCollapsingMergeTree
  • Prometheus
  • Grafana
  • The Python test client
```bash
# From the repo root
cd postgres-is-not-enough

# Start the stack
docker compose up -d

# Check containers
docker compose ps
```

Services and their default ports are defined in docker-compose.yml.

The test client connects to PostgreSQL and clickhouse-replica1 and runs the workload continuously:

  • Generate reference data (if the DBs are empty)
  • Periodically run inserts and updates (in batches) and queries, and push metrics to Prometheus

Workload parameters for this setup are controlled via environment variables in docker-compose.yml.

Observing the experiment

All observability goes through Prometheus and Grafana.

  • Prometheus scrape config is in prometheus/prometheus.yml.
  • Grafana dashboards are provisioned from grafana/dashboards/.

Useful panels include:

  • Insert latency histograms for PostgreSQL vs ClickHouse
  • Update latency histograms per operation
  • Query latency per query name and database
  • Table row counts for both databases over time
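The test client exports these timings as labeled Prometheus histograms. A stripped-down, stdlib-only analogue of that instrumentation (hypothetical names; the real client uses a Prometheus client library) looks like this:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Per-(database, operation) latency samples, analogous to the labeled
# Prometheus histograms the test client exposes.
latencies = defaultdict(list)

@contextmanager
def timed(db: str, op: str):
    """Record the wall-clock duration of the wrapped block under (db, op)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        latencies[(db, op)].append(time.perf_counter() - start)

# Usage: wrap each insert/update/query call site.
with timed("postgres", "query_simple_aggregation"):
    time.sleep(0.01)  # stand-in for actually running the query

samples = latencies[("postgres", "query_simple_aggregation")]
print(len(samples), samples[0] >= 0.01)
```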

The underlying SQL for queries and updates lives in:

  • test-client/queries.py – 7 analytical queries (Postgres + ClickHouse variants)
  • test-client/postgres_client.py – Postgres inserts/updates
  • test-client/clickhouse_client.py – ClickHouse inserts/updates (CollapsingMergeTree)

Results

Based on metrics collected from processing 14.3M-15.2M events over a sustained benchmark run:

Note: PostgreSQL is configured with aggressive autovacuum (vacuum at 1% table change, analyze at 0.5% change) to prevent table bloat and statistics decay. Both databases use configurations that should be appropriate for streaming analytics workloads.
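In postgresql.conf terms, the stated thresholds correspond to scale-factor settings along these lines (a sketch; the repository may set these per table, and the base threshold values below are assumptions):

```ini
# Vacuum when ~1% of a table has changed, analyze at ~0.5%
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.005
# Base thresholds (assumed values) so small tables are also covered
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
```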

Query Performance

| Query | PostgreSQL | ClickHouse | Speedup |
|---|---|---|---|
| Simple Aggregation | 6.00s | 0.11s | 53x |
| Join Aggregation | 2.45s | 0.16s | 15x |
| Complex Join | 6.64s | 0.49s | 14x |
| Window Function | 1.38s | 0.17s | 8x |
| Cohort Analysis | 3.96s | 1.14s | 3.5x |
| Funnel Analysis | 1.00s | 0.14s | 7x |
| Multi-Window Cohort (MV) | 46.37s | 0.09s | 532x |

Query 7 demonstrates the difference between PostgreSQL's direct query approach (which degrades with data volume) and ClickHouse's incremental materialized views (which maintain constant query time regardless of total data volume). At 14M+ events, PostgreSQL's 46-second query time makes real-time dashboards infeasible, while ClickHouse's 87ms response time remains dashboard-ready.

Insert Performance

Average batch insert latency (50K events per batch): 6.97s for PostgreSQL versus 1.04s for ClickHouse (6.7x faster). ClickHouse's columnar storage and optimized bulk insert path significantly outperform PostgreSQL's row-based approach at scale.

Update Performance

| Operation | PostgreSQL Avg | ClickHouse Avg | Notes |
|---|---|---|---|
| User Lifetime Value (1.43M updates) | 1.24s | 0.15s | CH 8.4x faster |
| User Segments (713K updates) | 0.45s | 0.09s | CH 5.0x faster |
| Product Prices (28.5K updates) | 0.02s | 0.03s | Similar performance |
| Event Revenue (28.2K updates) | 0.02s | 2.70s | PG 135x faster |

Event revenue updates are slower in ClickHouse because the update flow reads current state to emit compensating sign=-1/+1 rows, which forces on-the-fly collapsing. PostgreSQL uses MVCC point updates, so updating historical fact rows is cheaper in latency (at the cost of WAL/index work and eventual VACUUM).
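The compensating-row mechanics can be sketched in plain Python. The row shape below is hypothetical; in the real flow the current row comes from a *_current view before the sign=-1/+1 pair is inserted:

```python
def revenue_correction_rows(current_row: dict, new_revenue: float) -> list[dict]:
    """For CollapsingMergeTree, an 'update' is two inserts: a sign=-1 copy
    of the current row that cancels it on merge, and a sign=+1 row
    carrying the corrected value."""
    cancel = {**current_row, "sign": -1}
    replacement = {**current_row, "revenue": new_revenue, "sign": 1}
    return [cancel, replacement]

current = {"event_id": 42, "user_id": 7, "revenue": 19.99, "sign": 1}
rows = revenue_correction_rows(current, 24.99)
print([r["sign"] for r in rows], rows[1]["revenue"])  # → [-1, 1] 24.99
```

Reading the current row back before every correction is what makes this path expensive: it forces on-the-fly collapsing of any not-yet-merged parts.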

Overall Time Distribution

PostgreSQL spent approximately 2,522 CPU-seconds across all operations over the benchmark duration; ClickHouse spent approximately 2,878 CPU-seconds, reflecting its more aggressive parallelism and vectorization (more total CPU burned in exchange for far lower wall-clock latency).

The key difference is in wall-clock time for queries: PostgreSQL's Query 7 alone takes 46 seconds per execution, making it unsuitable for real-time dashboards, while ClickHouse maintains sub-100ms query times across the entire workload through incremental materialized views.

ClickHouse's architectural advantage becomes more pronounced as data volume grows: query performance remains nearly constant while PostgreSQL degrades linearly with data size.

About

How far can PostgreSQL go for reporting?
