Performance on a Budget

A Hetzner CX22 with 2 vCPUs, 4 GB RAM, and 40 GB NVMe storage costs €4.51/month. It handles Marketflow’s traffic at launch without breaking a sweat. But poorly written queries, missing indexes, and uncached repeated computations can make even a powerful server feel slow. This chapter is about making the code fast enough that the infrastructure does not need to be upgraded until the revenue justifies it.

The target: every API response under 200 milliseconds at the 95th percentile. If the application hits this target on a €4.51 server, upgrading to the next tier (CX32 at €7.49) is a business decision, not an emergency.

The Feature

Marketflow’s dashboard loads in under one second. Market listings with 50 vendors render instantly. Vendor search returns results in under 100 milliseconds. The application handles 50 concurrent users without degradation.

The Decision

Performance optimization follows a strict order: measure first, then fix the biggest bottleneck, then measure again. The common bottlenecks in order of impact are:

N+1 queries (the most frequent and easiest to fix)
Missing database indexes (the most impactful per line of code)
Uncached repeated queries (the best return on investment for read-heavy endpoints)
Connection pool exhaustion (the most confusing when it happens)

Each fix is applied only after measurement confirms it is the actual bottleneck. Premature optimization wastes developer time.

The Implementation

Measuring Response Times

# backend/app/middleware/timing.py
import time
import logging

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request

logger = logging.getLogger(__name__)

SLOW_REQUEST_THRESHOLD_MS = 200


class TimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start = time.perf_counter()
        response = await call_next(request)
        duration_ms = (time.perf_counter() - start) * 1000

        if duration_ms > SLOW_REQUEST_THRESHOLD_MS:
            logger.warning(
                f"SLOW REQUEST: {request.method} {request.url.path} "
                f"took {duration_ms:.1f}ms"
            )
        else:
            logger.info(
                f"{request.method} {request.url.path} "
                f"completed in {duration_ms:.1f}ms"
            )

        response.headers["X-Response-Time"] = f"{duration_ms:.1f}ms"
        return response

Fixing N+1 Queries

# TRAP: N+1 query loading vendors for each market
async def get_market_with_vendors(db: AsyncSession, market_id: str):
    market = await db.get(Market, market_id)
    # This triggers a separate query for EACH vendor
    for vendor in market.vendors:  # N queries
        print(vendor.name)

# SAFE: Eager loading with selectinload
from sqlalchemy.orm import selectinload

async def get_market_with_vendors(db: AsyncSession, market_id: str):
    result = await db.execute(
        select(Market)
        .options(selectinload(Market.vendors))
        .where(Market.id == market_id)
    )
    market = result.scalar_one()
    # All vendors loaded in 2 queries total (1 for market, 1 for vendors)
    for vendor in market.vendors:
        print(vendor.name)

Query Logging for Development

# backend/app/database.py (add to engine creation)
import logging

# Enable SQL query logging in development
if settings.environment == "development":
    logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)

# This logs every SQL query with its parameters.
# Watch for patterns like:
# SELECT * FROM vendors WHERE market_id = ?  (repeated 50 times)
# This means you have an N+1 query.

Database Indexes

# backend/app/models/market.py
from sqlalchemy import Index

class Market(Base):
    __tablename__ = "markets"

    # ... columns ...

    __table_args__ = (
        Index("ix_markets_organizer_id", "organizer_id"),
        Index("ix_markets_status", "status"),
        Index("ix_markets_location", "city", "state"),
    )


class Application(Base):
    __tablename__ = "applications"

    # ... columns ...

    __table_args__ = (
        # Composite index for the most common query:
        # "show me all pending applications for this market"
        Index("ix_applications_market_status", "market_id", "status"),
        Index("ix_applications_vendor_id", "vendor_id"),
    )

The rule for indexes: if a column appears in a WHERE clause or ORDER BY and the table will have more than a few hundred rows, it needs an index. For composite indexes, put the most selective column first (the one that narrows results the most).

Redis Caching

# backend/app/services/cache.py
import json
import logging
from typing import Any

import redis.asyncio as redis

from app.config import settings

logger = logging.getLogger(__name__)

pool = redis.ConnectionPool.from_url(
    settings.redis_url,
    max_connections=10,
    decode_responses=True,
)


async def get_redis() -> redis.Redis:
    return redis.Redis(connection_pool=pool)


async def cache_get(key: str) -> Any | None:
    """Get a value from cache. Returns None on miss or error."""
    try:
        r = await get_redis()
        value = await r.get(key)
        if value is not None:
            return json.loads(value)
    except redis.RedisError:
        logger.warning(f"Redis error on GET {key}", exc_info=True)
    return None


async def cache_set(key: str, value: Any, ttl_seconds: int = 300) -> None:
    """Set a value in cache with TTL. Fails silently."""
    try:
        r = await get_redis()
        await r.set(key, json.dumps(value), ex=ttl_seconds)
    except redis.RedisError:
        logger.warning(f"Redis error on SET {key}", exc_info=True)


async def cache_delete(pattern: str) -> None:
    """Delete all keys matching a pattern. Fails silently."""
    try:
        r = await get_redis()
        keys = []
        async for key in r.scan_iter(match=pattern):
            keys.append(key)
        if keys:
            await r.delete(*keys)
    except redis.RedisError:
        logger.warning(f"Redis error on DELETE {pattern}", exc_info=True)

Caching Expensive Queries

# backend/app/routers/markets.py
from app.services.cache import cache_get, cache_set, cache_delete


@router.get("/markets/{market_id}")
async def get_market(market_id: str, db: AsyncSession = Depends(get_db)):
    # Check cache first
    cache_key = f"market:{market_id}"
    cached = await cache_get(cache_key)
    if cached is not None:
        return cached

    # Cache miss: query database
    result = await db.execute(
        select(Market)
        .options(selectinload(Market.vendors))
        .where(Market.id == market_id)
    )
    market = result.scalar_one_or_none()
    if not market:
        raise HTTPException(status_code=404, detail="Market not found")

    response_data = market.to_dict()

    # Cache for 5 minutes
    await cache_set(cache_key, response_data, ttl_seconds=300)
    return response_data


@router.put("/markets/{market_id}")
async def update_market(
    market_id: str,
    updates: MarketUpdate,
    market: Market = Depends(get_market_for_organizer),
    db: AsyncSession = Depends(get_db),
):
    # Update the market
    for key, value in updates.dict(exclude_unset=True).items():
        setattr(market, key, value)
    await db.commit()

    # Invalidate cache
    await cache_delete(f"market:{market_id}")
    return market.to_dict()

Connection Pooling

# backend/app/database.py
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

engine = create_async_engine(
    settings.database_url,
    pool_size=5,        # Connections kept open
    max_overflow=10,     # Additional connections when pool is exhausted
    pool_timeout=30,     # Seconds to wait for a connection
    pool_recycle=3600,   # Recycle connections after 1 hour
    pool_pre_ping=True,  # Test connections before use
)

async_session = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

Five persistent connections with up to ten overflow connections. This means the application can handle 15 concurrent database operations. With FastAPI’s async model, a single request holds a connection only during actual database queries (not during the entire request lifecycle), so 15 connections handle well over 50 concurrent users.

The Trap

# TRAP: Caching user-specific data with a generic key
@router.get("/dashboard")
async def get_dashboard(user = Depends(get_current_user)):
    cached = await cache_get("dashboard")  # Same key for all users
    if cached:
        return cached  # User A sees User B's dashboard

# SAFE: Include user ID in cache key
@router.get("/dashboard")
async def get_dashboard(user = Depends(get_current_user)):
    cached = await cache_get(f"dashboard:{user.id}")
    if cached:
        return cached  # Each user gets their own cached dashboard

Cache keys must include every dimension that changes the response. User-specific data needs user ID in the key. Market-specific data needs market ID. Paginated data needs page number and page size. Forgetting a dimension means one user sees another user’s data.

The Cost

Component	Cost
Redis (Docker on same VPS)	$0 (uses existing VPS RAM)
Database indexes	$0 (slight write overhead)
Query optimization	$0 (developer time only)
Hetzner CX22	€4.51/month (unchanged)

Performance optimization on a budget is code optimization, not infrastructure upgrades. The Redis instance runs in a Docker container on the same VPS, using approximately 50 MB of RAM. The database indexes use minimal disk space. The total infrastructure cost remains €4.51/month.