Relational Normalization: Why Decomposition Forces Surrogate and Foreign Keys
You need foreign keys and surrogate keys because you broke your relationships
Franck Pachot argues that normalization is a decomposition process that removes structural associations between entities. In Third Normal Form, a cohesive e-commerce order is shattered into four independent tables, requiring explicit mechanisms to restore integrity.
Why This Matters
Normalization dissolves physical and logical cohesion to achieve an application-agnostic representation. This reduces redundancy, but it forces developers to introduce surrogate keys for value objects that have no natural identity and to manage explicit transaction scopes across rows. Foreign keys are therefore not a consequence of pre-existing relationships; they compensate for the structural decomposition that relational storage requires.
Key Insights
- Normalization removes co-location and direct pointers, requiring data reconstruction at query time via joins on shared attribute values.
- Surrogate keys (UUIDs or auto-incremented integers) are artificial identities created solely because value objects must live independently in 3NF.
- In an RDBMS the default granularity of locking is typically the row, and consistency across several rows requires an explicit transaction, whereas Domain-Driven Design (DDD) defines the aggregate as the natural consistency boundary.
- Relational models scale with functional complexity within a single team, while document models scale with organizational complexity across bounded contexts.
- ORMs such as Hibernate or SQLAlchemy offer a seemingly frictionless abstraction over the decomposition, but they can hide N+1 query patterns and transaction boundaries until production load exposes them.
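The N+1 pattern mentioned in the last insight can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The two-table schema, the sample rows, and the `run` helper that counts round trips are assumptions made for illustration; they are not taken from the original article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE order_items (
    item_id  INTEGER PRIMARY KEY,                  -- surrogate key
    order_id INTEGER REFERENCES orders(order_id),  -- foreign key
    sku      TEXT
);
INSERT INTO orders VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO order_items (order_id, sku) VALUES
    (1, 'WIDGET-A'), (1, 'GADGET-B'), (2, 'WIDGET-A'), (3, 'GADGET-B');
""")

# Hypothetical counter standing in for network round trips to the database.
stats = {"queries": 0}

def run(sql, params=()):
    """Execute one query and record it as one round trip."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

def load_n_plus_one():
    """One query for the orders, then one more query per order for its items."""
    orders = run("SELECT order_id, customer FROM orders ORDER BY order_id")
    return {
        customer: [
            sku
            for (sku,) in run(
                "SELECT sku FROM order_items WHERE order_id = ? ORDER BY item_id",
                (order_id,),
            )
        ]
        for order_id, customer in orders
    }

def load_with_join():
    """A single join reconstructs the same structure in one round trip."""
    rows = run(
        """SELECT o.customer, i.sku
           FROM orders o
           JOIN order_items i ON i.order_id = o.order_id
           ORDER BY o.order_id, i.item_id"""
    )
    result = {}
    for customer, sku in rows:
        result.setdefault(customer, []).append(sku)
    return result
```

Both functions return the same mapping of customers to SKUs. The first issues one query plus one per order, while the second issues a single query; an ORM's lazy loading can produce the first shape without the calling code making that visible.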
Working Examples
Order aggregate in a document database preserving structural cohesion.
    {
      "_id": "order_123",
      "customer": "Alice",
      "status": "confirmed",
      "shippingAddress": {
        "street": "123 Main St",
        "city": "Springfield",
        "zip": "62704"
      },
      "items": [
        { "sku": "WIDGET-A", "name": "Blue Widget", "qty": 2, "price": 9.99 },
        { "sku": "GADGET-B", "name": "Red Gadget", "qty": 1, "price": 24.99 }
      ],
      "payment": {
        "method": "credit_card",
        "last4": "4242",
        "authorized": 44.97
      }
    }
Normalized Third Normal Form (3NF) representation requiring foreign and surrogate keys.
    orders             (order_id PK, customer, status)
    order_items        (item_id PK, order_id FK, sku, name, qty, price)
    shipping_addresses (address_id PK, order_id FK, street, city, zip)
    payments           (payment_id PK, order_id FK, method, last4, authorized)
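To make the decomposition concrete, the following sketch creates the four tables in an in-memory SQLite database and rebuilds the order aggregate with joins on `order_id`. The column types, the `NOT NULL` constraints, and the helper names `build_db` and `load_order` are assumptions for illustration; only the table and column names come from the schema above.

```python
import sqlite3

# Hypothetical DDL: types and constraints are assumed, names follow the schema above.
SCHEMA = """
CREATE TABLE orders (
    order_id TEXT PRIMARY KEY,
    customer TEXT NOT NULL,
    status   TEXT NOT NULL
);
CREATE TABLE order_items (
    item_id  INTEGER PRIMARY KEY,  -- surrogate key: a line item has no natural identity
    order_id TEXT NOT NULL REFERENCES orders(order_id),
    sku TEXT NOT NULL, name TEXT NOT NULL, qty INTEGER NOT NULL, price REAL NOT NULL
);
CREATE TABLE shipping_addresses (
    address_id INTEGER PRIMARY KEY,  -- surrogate key
    order_id   TEXT NOT NULL REFERENCES orders(order_id),
    street TEXT, city TEXT, zip TEXT
);
CREATE TABLE payments (
    payment_id INTEGER PRIMARY KEY,  -- surrogate key
    order_id   TEXT NOT NULL REFERENCES orders(order_id),
    method TEXT, last4 TEXT, authorized REAL
);
"""

def build_db():
    """Create the schema and insert the sample order in one transaction."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
    conn.executescript(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO orders VALUES ('order_123', 'Alice', 'confirmed')")
        conn.executemany(
            "INSERT INTO order_items (order_id, sku, name, qty, price) VALUES (?, ?, ?, ?, ?)",
            [
                ("order_123", "WIDGET-A", "Blue Widget", 2, 9.99),
                ("order_123", "GADGET-B", "Red Gadget", 1, 24.99),
            ],
        )
        conn.execute(
            "INSERT INTO shipping_addresses (order_id, street, city, zip) "
            "VALUES ('order_123', '123 Main St', 'Springfield', '62704')"
        )
        conn.execute(
            "INSERT INTO payments (order_id, method, last4, authorized) "
            "VALUES ('order_123', 'credit_card', '4242', 44.97)"
        )
    return conn

def load_order(conn, order_id):
    """Rebuild the document-shaped aggregate by joining on the shared order_id."""
    head = conn.execute(
        """SELECT o.order_id, o.customer, o.status,
                  a.street, a.city, a.zip,
                  p.method, p.last4, p.authorized
           FROM orders o
           JOIN shipping_addresses a ON a.order_id = o.order_id
           JOIN payments p ON p.order_id = o.order_id
           WHERE o.order_id = ?""",
        (order_id,),
    ).fetchone()
    items = conn.execute(
        "SELECT sku, name, qty, price FROM order_items WHERE order_id = ? ORDER BY item_id",
        (order_id,),
    ).fetchall()
    return {
        "_id": head["order_id"],
        "customer": head["customer"],
        "status": head["status"],
        "shippingAddress": {"street": head["street"], "city": head["city"], "zip": head["zip"]},
        "items": [dict(row) for row in items],
        "payment": {"method": head["method"], "last4": head["last4"],
                    "authorized": head["authorized"]},
    }
```

Calling `load_order(build_db(), "order_123")` returns a dictionary with the same shape as the document example. The surrogate keys `item_id`, `address_id`, and `payment_id` never appear in the result: they exist only so the rows can live in their own tables.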
Practical Applications
- Use Case: E-commerce transaction processing. Use a document model to align the database lock boundary with the domain aggregate root, so that updates to one order are atomic without cross-table transactions. Pitfall: Forcing value objects into independent tables creates the risk of orphan rows and requires multi-row transactions, sometimes at stricter isolation levels, to keep the aggregate consistent.
- Use Case: Multi-application data sharing. Implement a relational model as a middle ground to serve unknown future access patterns through flexible join paths. Pitfall: As the monolithic schema grows, the blast radius of migrations (ALTER TABLE) increases, making it harder to split into microservices later.
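The orphan-row pitfall and the need for explicit transaction scopes can be sketched with SQLite, which leaves foreign key enforcement off unless it is switched on per connection. The reduced two-table schema, the `CHECK` constraint, and the helper names `connect` and `orphan_count` are assumptions made for this illustration.

```python
import sqlite3

# Hypothetical two-table subset of the order schema.
DDL = """
CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE order_items (
    item_id  INTEGER PRIMARY KEY,
    order_id TEXT NOT NULL REFERENCES orders(order_id),
    sku      TEXT NOT NULL,
    qty      INTEGER NOT NULL CHECK (qty > 0)
);
"""

def connect(enforce_fk):
    """Open an in-memory database with foreign key enforcement on or off."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_fk else 'OFF'}")
    conn.executescript(DDL)
    return conn

def orphan_count(conn):
    """Count items whose parent order no longer exists."""
    return conn.execute(
        """SELECT COUNT(*)
           FROM order_items i
           LEFT JOIN orders o ON o.order_id = i.order_id
           WHERE o.order_id IS NULL"""
    ).fetchone()[0]

INSERT_ORDER = "INSERT INTO orders VALUES (?, ?)"
INSERT_ITEM = "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)"

# 1. Without enforcement, deleting the parent silently strands its items.
loose = connect(enforce_fk=False)
with loose:
    loose.execute(INSERT_ORDER, ("order_123", "confirmed"))
    loose.execute(INSERT_ITEM, ("order_123", "WIDGET-A", 2))
    loose.execute("DELETE FROM orders WHERE order_id = 'order_123'")
orphans_without_fk = orphan_count(loose)

# 2. With enforcement, the same delete is rejected by the foreign key.
strict = connect(enforce_fk=True)
with strict:
    strict.execute(INSERT_ORDER, ("order_123", "confirmed"))
    strict.execute(INSERT_ITEM, ("order_123", "WIDGET-A", 2))
try:
    with strict:
        strict.execute("DELETE FROM orders WHERE order_id = 'order_123'")
    delete_rejected = False
except sqlite3.IntegrityError:
    delete_rejected = True

# 3. An explicit transaction makes the multi-row write all-or-nothing:
#    the invalid item (qty = 0 violates the CHECK) rolls back the order row too.
try:
    with strict:
        strict.execute(INSERT_ORDER, ("order_456", "pending"))
        strict.execute(INSERT_ITEM, ("order_456", "GADGET-B", 0))
except sqlite3.IntegrityError:
    pass
order_456_exists = strict.execute(
    "SELECT COUNT(*) FROM orders WHERE order_id = 'order_456'"
).fetchone()[0] == 1
```

In this sketch the loose connection ends with one orphaned item, the strict connection refuses the delete, and the failed insert leaves no trace of `order_456`. In a document model the items are embedded in the order, so neither the constraint nor the multi-row transaction is needed to get the same guarantees for a single order.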