Choosing the Right Database: The 5-Question Architectural Test
These articles are AI-generated summaries. Please check the original sources for full details.
Time-Series, Document, or Relational? The 5-Question Test Every New Project Needs
Gabriel Anhaia outlines a five-question framework to prevent database migration failures that cost developer-weeks. One team saw telemetry latency drop from 14 seconds to under 200ms by switching to ClickHouse, while others faced incident-heavy rollbacks after moving mutable data to document stores.
Why This Matters
Database selection often fails when teams ignore the design assumptions of their storage engine, for example by serving point lookups from a columnar store. The mismatch creates severe operational overhead: a self-hosted ClickHouse cluster may require up to 1.0 full-time equivalent (FTE) of engineering effort, compared to just 0.1 FTE for managed Postgres.
Key Insights
- ClickHouse ingestion reaches 2-3M points/sec compared to TimescaleDB’s 100k-500k range (sanj.dev, 2026).
- Tag-indexed engines like InfluxDB experience performance degradation when active series cardinality exceeds millions.
- Postgres effectively handles mixed mutable workloads until hot data crosses the low-TB range, where specialized partitioning is required.
- Operational costs vary significantly; self-hosted Cassandra can require 2.0 FTEs for proper backup and monitoring.
- Single-row lookup latency is higher in ClickHouse (15ms) than TimescaleDB (10ms), impacting high-concurrency operational endpoints.
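The cardinality point above can be made concrete with a rough estimate. The sketch below assumes that a series is identified by a unique combination of tag values, which is the usual model for tag-indexed engines. The tag names and counts are hypothetical, and the helper functions are illustrative rather than part of any database's API.

```python
from math import prod
from typing import Iterable, Mapping


def cardinality_upper_bound(distinct_values_per_tag: Mapping[str, int]) -> int:
    """Worst case: every combination of tag values actually occurs."""
    return prod(distinct_values_per_tag.values())


def observed_cardinality(tag_sets: Iterable[Mapping[str, str]]) -> int:
    """Count the distinct tag combinations seen in a sample of points."""
    return len({tuple(sorted(tags.items())) for tags in tag_sets})


# Hypothetical tag layout for a telemetry measurement.
layout = {"device_id": 50_000, "region": 12, "firmware": 40}
print(cardinality_upper_bound(layout))  # 24000000

# A small hypothetical sample; the first two points share one series.
sample = [
    {"device_id": "d1", "region": "eu"},
    {"region": "eu", "device_id": "d1"},
    {"device_id": "d2", "region": "us"},
]
print(observed_cardinality(sample))  # 2
```

The upper bound is usually pessimistic, since a device typically reports from a single region and firmware version. Counting observed combinations on a sample of real data gives a better sense of whether the workload approaches the millions-of-series range.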
Working Examples
Decision script for database selection based on workload attributes.
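from dataclasses import dataclass
from typing import Literal


@dataclass
class Workload:
    write_shape: Literal["append_only", "mutable", "mixed"]
    read_pattern: Literal["range", "key", "fulltext", "mixed"]  # not consulted by recommend() below
    consistency: Literal["acid", "eventual"]
    series_cardinality: int  # number of active series
    ops_headcount: float  # FTEs available for database operations


def recommend(w: Workload) -> str:
    # ACID requirements keep the workload on Postgres.
    if w.consistency == "acid":
        return "Postgres + TimescaleDB extension" if w.write_shape == "append_only" else "Postgres (managed)"
    if w.write_shape == "append_only":
        # Above 10M active series, pick a columnar store; self-host only with at least 1 FTE.
        if w.series_cardinality > 10_000_000:
            return "ClickHouse" if w.ops_headcount >= 1 else "BigQuery"
        return "TimescaleDB on managed Postgres"
    # Mutable or mixed writes without ACID requirements still default to Postgres.
    return "Postgres"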
Practical Applications
- IoT Telemetry: Use ClickHouse for a fleet of 50k devices to benefit from 10-30x compression. Pitfall: High-cardinality tags in tag-indexed engines can saturate hardware.
- E-commerce: Use Postgres for ACID-compliant orders and returns. Pitfall: Migrating to MongoDB for flexibility and losing cross-entity reporting efficiency.
- SaaS Analytics: Use managed BigQuery or Snowflake when the team can spare less than 0.5 FTE for database operations. Pitfall: Adopting complex self-hosted clusters that consume the entire engineering team’s on-call capacity.
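For the IoT telemetry case, a quick back-of-the-envelope check of ingest rate against the throughput ranges quoted earlier can help frame the choice. This is a minimal sketch: the metrics-per-device and reporting-interval values are hypothetical, the safety factor is an assumption, and the low end of each quoted range is used as a conservative ceiling rather than a measured limit.

```python
def points_per_second(devices: int, metrics_per_device: int, interval_s: float) -> float:
    """Steady-state ingest rate if every device reports all metrics each interval."""
    return devices * metrics_per_device / interval_s


# Ingestion ranges quoted in this summary, in points/sec (low, high).
QUOTED_INGEST = {
    "TimescaleDB": (100_000, 500_000),
    "ClickHouse": (2_000_000, 3_000_000),
}


def engines_with_headroom(rate: float, safety_factor: float = 2.0) -> list[str]:
    """Engines whose low-end quoted throughput covers the rate with a margin."""
    return [
        name
        for name, (low, _high) in QUOTED_INGEST.items()
        if rate * safety_factor <= low
    ]


# Hypothetical fleet: 50k devices, 20 metrics each, reporting every 10 seconds.
rate = points_per_second(devices=50_000, metrics_per_device=20, interval_s=10)
print(rate)                         # 100000.0
print(engines_with_headroom(rate))  # ['ClickHouse']
```

Ingest rate is only one of the five questions. A fleet that fits comfortably within either range would still need the read-pattern, consistency, and operations-headcount checks before settling on an engine.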