Column Stores, Time Series, and Analytical Workloads
Column Stores, Time Series, and Analytical Workloads
The Black Box
The logistics platform’s operations team runs a daily report: “total packages delivered per warehouse, per carrier, per day, for the last 90 days.” The query runs against the package_events table in PostgreSQL. It scans 270 million rows, takes 4 minutes, and causes a 200ms latency spike on the transactional queries running concurrently. The operations team blames the database. The engineering team blames the report.
Neither is wrong. The problem is running an analytical query on a row-oriented database. The storage layout is optimized for the wrong access pattern.
The Mechanism
Row Storage: PostgreSQL
PostgreSQL stores data in 8KB pages, each containing complete rows:
Page 1: [PKG-001, DELIVERED, 2024-11-15, WH-042, carrier-A, 2.4kg, ...]
[PKG-002, IN_TRANSIT, 2024-11-15, WH-042, carrier-B, 1.1kg, ...]
[PKG-003, DELIVERED, 2024-11-14, WH-019, carrier-A, 5.0kg, ...]
The daily report needs three columns: status, warehouse_id, timestamp. But every page read loads all columns. For a row with 10 columns averaging 82 bytes, the three needed columns are 28 bytes. The other 54 bytes are read and discarded. At 270 million rows, the wasted I/O is $270M \times 54 = 14.6\text{GB}$.
Column Storage: ClickHouse
A column store organizes data by column. Each column is stored in its own file (or segment):
status.bin: [DELIVERED, IN_TRANSIT, DELIVERED, DELIVERED, SCANNED, ...]
warehouse_id.bin: [WH-042, WH-042, WH-019, WH-042, WH-007, ...]
timestamp.bin: [2024-11-15, 2024-11-15, 2024-11-14, 2024-11-15, 2024-11-13, ...]
carrier.bin: [carrier-A, carrier-B, carrier-A, carrier-C, carrier-A, ...]
weight.bin: [2.4, 1.1, 5.0, 0.8, 3.2, ...]
The daily report reads only status.bin, warehouse_id.bin, and timestamp.bin. No wasted I/O on carrier, weight, or any other column.
Compression
Column data compresses well because consecutive values in a column tend to be similar.
Run-length encoding (RLE): [DELIVERED, DELIVERED, DELIVERED, DELIVERED] becomes [DELIVERED x 4]. If the data is sorted by status, RLE compresses the status column to a few entries per distinct value.
Dictionary encoding: [WH-042, WH-019, WH-042, WH-007] becomes [0, 1, 0, 2] with a dictionary {0: WH-042, 1: WH-019, 2: WH-007}. The column stores 1-byte integers instead of 6-byte strings.
Delta encoding: timestamps [1731628800, 1731628801, 1731628802] become [1731628800, +1, +1]. For monotonically increasing values, delta encoding reduces each value to a single byte.
-- Concept: ClickHouse table for logistics analytics
CREATE TABLE package_events_analytics (
package_id String,
status LowCardinality(String), -- Dictionary-encoded automatically
timestamp DateTime,
warehouse_id LowCardinality(String),
carrier LowCardinality(String),
weight_kg Float32
)
ENGINE = MergeTree()
ORDER BY (warehouse_id, timestamp) -- Sort order determines compression and query speed
PARTITION BY toYYYYMM(timestamp); -- Monthly partitions for efficient range queries
-- Insert 270 million rows
-- Storage comparison:
-- PostgreSQL (row store): 22 GB
-- ClickHouse (columnar): 3.1 GB (compression ratio: 7.1x)
-- The daily report query:
SELECT
warehouse_id,
carrier,
toDate(timestamp) AS day,
count() AS delivered_count
FROM package_events_analytics
WHERE status = 'DELIVERED'
AND timestamp >= '2024-08-01'
AND timestamp < '2024-11-01'
GROUP BY warehouse_id, carrier, day
ORDER BY day, warehouse_id;
-- ClickHouse: 1.2 seconds (reads 3 columns, compressed)
-- PostgreSQL: 240 seconds (reads all columns, uncompressed)
The Observable Consequence
The performance difference is not magic. It is arithmetic:
| Factor | PostgreSQL | ClickHouse |
|---|---|---|
| Data read from disk | 22 GB (all columns) | 0.9 GB (3 columns, compressed) |
| CPU for decompression | None | 120ms (LZ4 is fast) |
| Rows processed per second | 1.1M/s | 225M/s |
| Total query time | 240s | 1.2s |
ClickHouse processes 200x more rows per second because it reads 24x less data. The CPU cost of decompression is negligible compared to the I/O savings.
The flip side: inserting a single package event into ClickHouse updates every column file. ClickHouse handles this by buffering inserts and writing in large batches (65,536 rows minimum by default). Single-row inserts are not supported efficiently.
The diagram compares row-oriented and column-oriented storage for the same dataset. An analytical query reading 3 of 10 columns touches 30% of the data in a column store but 100% in a row store. With compression on the columnar data, the actual I/O difference is even larger. This is the fundamental reason column stores outperform row stores for aggregation queries by orders of magnitude.
The Decision Rule
Keep transactional data in PostgreSQL. Run analytical queries in a column store. Replicate the data between them using CDC (Chapter 9) or batch ETL.
Do not run heavy aggregation queries on your OLTP database. The sequential scan competes with transactional queries for I/O bandwidth and buffer pool space. A 4-minute analytical query that fills the PostgreSQL buffer pool with report data evicts the hot pages used by your application’s transactional queries, causing a latency spike that persists until the working set is reloaded.
Do not use a column store for transactional OLTP workloads. Single-row inserts, point lookups by primary key, and update-in-place are all anti-patterns for columnar storage. The per-row overhead is too high, and the lack of row-level locking makes concurrent updates impractical.
For the logistics platform: PostgreSQL handles package creation, status updates, assignment, and real-time dashboard queries. ClickHouse handles daily/weekly/monthly delivery analytics, warehouse throughput reports, and carrier performance analysis. Debezium (Chapter 9) streams changes from PostgreSQL to ClickHouse with sub-second latency.