Design a Social Media Platform - cracking the tech interview system design and algorithms in java 25 • Dev|Journal

Social media platforms serve billions of users generating and consuming content at extraordinary scale. The core challenge lies in delivering a personalized, low-latency news feed to hundreds of millions of daily active users while handling asymmetric follow graphs, viral content spikes, and diverse media types. This chapter walks through architecting such a system from requirements to production-ready design decisions.

Requirements

Functional Requirements

Create posts: Users publish text, images, and video content with metadata (location, tags, mentions).
Follow/Unfollow: Users establish directed follow relationships (Twitter-style) or bidirectional friendships (Facebook-style).
News feed: Each user sees a personalized feed of posts from accounts they follow, ranked by relevance.
Likes and comments: Users interact with posts through reactions, comments, and shares.
Trending topics: Surface globally or regionally popular topics based on real-time activity.
Search: Full-text search across posts, users, and hashtags.

Non-Functional Requirements

Requirement	Target
Feed generation latency	< 500ms (p99)
Total users	1 billion+
Daily active users (DAU)	500 million
Post read:write ratio	100:1
Consistency model	Eventual consistency for feed; strong consistency for follow graph mutations
Availability	99.99% uptime

Capacity Estimation

With 500M DAU and an average of 2 posts per day per active user:

Write throughput: 500M × 2 = 1 billion posts/day ≈ ~11,500 posts/sec
Feed reads: Each user checks feed ~10 times/day → 5 billion feed reads/day ≈ ~58,000 reads/sec
Average post size: ~1 KB text + metadata; media stored separately
Daily storage (text/metadata): 1B posts × 1 KB = ~1 TB/day
Media storage: Assume 20% of posts include media, average 2 MB → 200M × 2 MB = ~400 TB/day (stored in object storage)
Fanout calculation: Average user follows 200 accounts. A single post fans out to an average of 500 followers. At 11,500 posts/sec × 500 followers = 5.75 million feed writes/sec for push-based fanout.

These numbers make a pure push model infeasible for users with millions of followers — the celebrity problem demands a hybrid approach.

High-Level Design

┌─────────┐     ┌──────────────┐     ┌────────────────┐     ┌────────────┐
│  Client  │────▶│ Post Service  │────▶│ Fanout Service  │────▶│ Feed Cache │
└─────────┘     └──────────────┘     └────────────────┘     │  (Redis)   │
                                                             └─────┬──────┘
┌─────────┐     ┌──────────────┐                                   │
│  Client  │────▶│ Feed Service  │◀────────────────────────────────┘
└─────────┘     └──────┬───────┘
                       │
                       ▼
              ┌────────────────┐
              │ Ranking Engine  │
              └────────────────┘

Write path: Client → API Gateway → Post Service (persists post) → Fanout Service (distributes to follower feeds) → Feed Cache.

Read path: Client → API Gateway → Feed Service (fetches from Feed Cache, merges celebrity posts on-the-fly) → Ranking Engine (scores and orders) → Response.

Separating read and write paths allows independent scaling. The write path handles bursty post creation, while the read path optimizes for low-latency feed retrieval.

Deep Dive

News Feed Architecture

The central design question: when does feed computation happen?

Fan-Out on Write (Push Model)

When a user publishes a post, the system immediately writes that post reference into every follower’s feed cache.

Advantages:

Feed reads are a single cache lookup — extremely fast.
Client logic stays thin; no aggregation needed at read time.

Disadvantages:

The celebrity problem: a user with 50 million followers triggers 50 million write operations per post.
Wasted work: inactive users who never read their feed still receive writes.
Spikes in write load during viral moments.

Fan-Out on Read (Pull Model)

Feed assembly happens at read time. The system queries each followed account’s post timeline and merges results.

Advantages:

No wasted writes for inactive users.
Feeds are always fresh — no staleness window.

Disadvantages:

Slow reads: merging posts from hundreds of followed accounts on every request adds latency.
Complex aggregation logic at read time.

Hybrid Approach

The production-grade solution combines both models:

Normal users (< 10,000 followers): use push model. Their posts fan out immediately to all followers’ feed caches.
Celebrity users (≥ 10,000 followers): use pull model. Their posts are stored in a separate celebrity post index. At feed read time, the system merges cached feed entries with recent celebrity posts.

A follower threshold determines the boundary. This threshold is tunable — start at 10,000 and adjust based on observed fanout latency.

/// A fanout service that uses virtual threads and structured concurrency
/// to distribute a post to follower feed caches concurrently.
sealed interface FanoutResult permits FanoutResult.Success, FanoutResult.Partial {
    record Success(int deliveredCount) implements FanoutResult {}
    record Partial(int deliveredCount, List<Long> failedUserIds) implements FanoutResult {}
}

record Post(long postId, long authorId, String content, List<String> mediaUrls, long timestamp) {}

class FanoutService {
    private static final int CELEBRITY_THRESHOLD = 10_000;
    private final FollowerStore followerStore;
    private final FeedCache feedCache;

    FanoutService(FollowerStore followerStore, FeedCache feedCache) {
        this.followerStore = followerStore;
        this.feedCache = feedCache;
    }

    FanoutResult fanout(Post post) throws InterruptedException {
        long authorId = post.authorId();
        int followerCount = followerStore.getFollowerCount(authorId);

        if (followerCount >= CELEBRITY_THRESHOLD) {
            // Celebrity: store in celebrity index, skip push fanout
            feedCache.addToCelebrityIndex(post);
            return new FanoutResult.Success(0);
        }

        List<Long> followerIds = followerStore.getFollowerIds(authorId);
        var failedIds = new CopyOnWriteArrayList<Long>();

        try (var scope = StructuredTaskScope.open()) {
            List<StructuredTaskScope.Subtask<Void>> tasks = followerIds.stream()
                .map(followerId -> scope.fork(() -> {
                    feedCache.appendToFeed(followerId, post.postId());
                    return null;
                }))
                .toList();

            scope.join();

            for (int i = 0; i < tasks.size(); i++) {
                if (tasks.get(i).state() == StructuredTaskScope.Subtask.State.FAILED) {
                    failedIds.add(followerIds.get(i));
                }
            }
        }

        if (failedIds.isEmpty()) {
            return new FanoutResult.Success(followerIds.size());
        }
        return new FanoutResult.Partial(followerIds.size() - failedIds.size(), failedIds);
    }
}

The StructuredTaskScope from Java 25 manages virtual threads that fan out in parallel. Each follower’s feed update runs on its own virtual thread — millions of concurrent operations are feasible because virtual threads are lightweight and scheduled by the JVM, not the OS.

The social graph captures who-follows-whom relationships. Two fundamental models exist:

Unidirectional (Twitter-style): User A follows User B without reciprocation. The data model stores follower → followee edges.

Bidirectional (Facebook-style): Friendship requires mutual acceptance. Stored as a single edge with both directions.

Adjacency List in Key-Value Store

For follow lookups and fan-out, an adjacency list in a key-value store provides O(1) access:

following:{userId}  →  Set<followeeId>     // "who do I follow?"
followers:{userId}  →  Set<followerId>     // "who follows me?"

Redis sorted sets work well here — the score can be the follow timestamp, enabling chronological ordering.

Graph Database for Complex Queries

For queries like “friends of friends,” “mutual connections,” or “suggested follows,” a dedicated graph database (Neo4j, Amazon Neptune) excels. These queries involve multi-hop traversals that are expensive in relational or key-value stores.

/// BFS traversal to find mutual connections within a bounded depth.
record UserNode(long userId, Set<Long> followees) {}

class SocialGraphTraverser {

    List<Long> findMutualConnections(Map<Long, UserNode> graph, long userA, long userB) {
        Set<Long> followeesA = graph.getOrDefault(userA, new UserNode(userA, Set.of())).followees();
        Set<Long> followeesB = graph.getOrDefault(userB, new UserNode(userB, Set.of())).followees();

        return followeesA.stream()
            .filter(followeesB::contains)
            .toList();
    }

    List<Long> suggestFollows(Map<Long, UserNode> graph, long userId, int maxDepth) {
        Set<Long> visited = new HashSet<>();
        Queue<long[]> queue = new ArrayDeque<>(); // [userId, depth]
        List<Long> suggestions = new ArrayList<>();

        Set<Long> directFollowees = graph.getOrDefault(userId, new UserNode(userId, Set.of())).followees();
        visited.add(userId);
        visited.addAll(directFollowees);

        for (long followee : directFollowees) {
            queue.offer(new long[]{followee, 1});
        }

        while (!queue.isEmpty()) {
            long[] current = queue.poll();
            long currentUser = current[0];
            int depth = (int) current[1];

            if (depth >= maxDepth) continue;

            Set<Long> nextFollowees = graph.getOrDefault(currentUser, new UserNode(currentUser, Set.of())).followees();
            for (long next : nextFollowees) {
                if (!visited.contains(next)) {
                    visited.add(next);
                    suggestions.add(next);
                    queue.offer(new long[]{next, depth + 1});
                }
            }
        }
        return suggestions;
    }
}

This BFS-based traversal discovers second-degree and third-degree connections for follow suggestions. In production, this runs against a graph database, not an in-memory map.

Content Ranking

A chronological feed is straightforward but sub-optimal for engagement. Modern platforms rank posts using a scoring function that weighs multiple signals.

Feature Engineering

Feature	Description	Weight Range
Recency	Time decay — newer posts score higher	High
Engagement score	Likes + comments + shares (normalized)	Medium-High
Relationship strength	Interaction frequency between viewer and author	High
Content type	Video > Image > Text (platform-dependent)	Medium
Author credibility	Verified status, historical engagement rate	Low-Medium
Viewer preferences	Past interaction patterns with similar content	Medium

Scoring Formula

A weighted linear combination serves as the baseline:

$$ \text{score} = w_1 \cdot \text{recency}(t) + w_2 \cdot \text{engagement} + w_3 \cdot \text{relationship} + w_4 \cdot \text{contentBoost} $$

Where recency uses exponential decay: $\text{recency}(t) = e^{-\lambda \cdot \Delta t}$ with $\Delta t$ as hours since publication and $\lambda$ as the decay rate.

In production, a trained ML model (gradient-boosted trees or a neural ranker) replaces the linear formula. The model takes feature vectors and predicts engagement probability. Training data comes from logged impressions and user actions.

/// Ranked feed generator using a weighted scoring model.
record FeedItem(long postId, long authorId, long timestamp,
                int likes, int comments, int shares) {}

record ScoredItem(FeedItem item, double score) implements Comparable<ScoredItem> {
    @Override
    public int compareTo(ScoredItem other) {
        return Double.compare(other.score, this.score); // Descending
    }
}

class RankedFeedGenerator {
    private static final double DECAY_LAMBDA = 0.05;
    private static final double W_RECENCY = 0.35;
    private static final double W_ENGAGEMENT = 0.30;
    private static final double W_RELATIONSHIP = 0.25;
    private static final double W_CONTENT = 0.10;

    private final RelationshipStore relationshipStore;

    RankedFeedGenerator(RelationshipStore relationshipStore) {
        this.relationshipStore = relationshipStore;
    }

    List<FeedItem> rankFeed(long viewerId, List<FeedItem> candidates) {
        long now = System.currentTimeMillis();

        return candidates.stream()
            .map(item -> {
                double hoursAgo = (now - item.timestamp()) / 3_600_000.0;
                double recency = Math.exp(-DECAY_LAMBDA * hoursAgo);

                double engagement = normalize(item.likes() + item.comments() * 2 + item.shares() * 3);
                double relationship = relationshipStore.getStrength(viewerId, item.authorId());
                double contentBoost = 1.0; // Adjusted per content type in production

                double score = W_RECENCY * recency
                             + W_ENGAGEMENT * engagement
                             + W_RELATIONSHIP * relationship
                             + W_CONTENT * contentBoost;

                return new ScoredItem(item, score);
            })
            .sorted()
            .map(ScoredItem::item)
            .toList();
    }

    private double normalize(double raw) {
        return Math.log1p(raw) / 10.0; // Log normalization to dampen outliers
    }
}

The Comparator chain through ScoredItem.compareTo orders items by descending score. Log normalization on engagement counters prevents viral posts from completely dominating the feed.

Post Storage & Sharding

Schema

posts table:
  postId       BIGINT (Snowflake ID — embeds timestamp)
  authorId     BIGINT
  content      TEXT
  mediaUrls    JSON
  createdAt    TIMESTAMP
  likeCount    INT
  commentCount INT
  shareCount   INT

Sharding Strategies

Shard by authorId: All posts from a single user live on the same shard. This makes “user timeline” queries efficient (single-shard reads). Feed generation, however, requires cross-shard reads to merge posts from multiple followed users.

Shard by postId: Posts are distributed evenly across shards based on post ID hash. This prevents hotspots from prolific posters but makes user timeline queries cross-shard.

The recommended approach: shard by authorId for the post store (optimizing user profile views and celebrity post retrieval), and maintain a separate feed cache (sharded by viewerId) for news feed queries. This avoids cross-shard reads on both critical paths.

Interaction Counters

Like, comment, and share counts update frequently and must not create write contention on the post row. Use Redis atomic increments:

INCR post:{postId}:likes
INCR post:{postId}:comments

Periodically flush Redis counters back to the database in batches. This decouples high-frequency counter updates from the durable post store.

Trending detection requires real-time frequency analysis across a sliding time window.

Architecture

Event stream: Every post, hashtag usage, and interaction emits an event to a streaming pipeline (Kafka).
Sliding window counter: Count hashtag occurrences in the last 1 hour / 5 minutes / 15 minutes.
Top-K extraction: Maintain a MinHeap of size K. For each hashtag count update, if the count exceeds the heap minimum, replace it.
Approximate counting: At scale, exact counts are expensive. A Count-Min Sketch provides approximate frequency estimation with bounded error using sub-linear space.

/// Trending topic tracker using a MinHeap for top-K and sliding window counts.
record TrendingTopic(String hashtag, long count) implements Comparable<TrendingTopic> {
    @Override
    public int compareTo(TrendingTopic other) {
        return Long.compare(this.count, other.count); // Min-heap: smallest at top
    }
}

class TrendingTracker {
    private final int k;
    private final PriorityQueue<TrendingTopic> topK;
    private final Map<String, Long> windowCounts;

    TrendingTracker(int k) {
        this.k = k;
        this.topK = new PriorityQueue<>();
        this.windowCounts = new ConcurrentHashMap<>();
    }

    void recordHashtag(String hashtag) {
        long newCount = windowCounts.merge(hashtag, 1L, Long::sum);
        refreshTopK(hashtag, newCount);
    }

    private synchronized void refreshTopK(String hashtag, long count) {
        // Remove stale entry for this hashtag if present
        topK.removeIf(t -> t.hashtag().equals(hashtag));
        topK.offer(new TrendingTopic(hashtag, count));

        while (topK.size() > k) {
            topK.poll(); // Evict the least trending
        }
    }

    List<TrendingTopic> getTopK() {
        return topK.stream()
            .sorted(Comparator.reverseOrder())
            .toList();
    }

    void expireWindow(Set<String> expiredHashtags) {
        expiredHashtags.forEach(windowCounts::remove);
    }
}

In production, the Count-Min Sketch replaces the ConcurrentHashMap for memory efficiency. Multiple hash functions map each hashtag to counter arrays, and the minimum across all mapped counters provides the approximate count.

Bottlenecks & Scaling

Bottleneck	Mitigation
Celebrity fanout	Hybrid push/pull model with configurable follower threshold
Feed cache invalidation	TTL-based expiration with incremental append (no full rebuild)
Database hotspots	Consistent hashing with virtual nodes + read replicas for popular shards
Media storage	CDN for delivery, object storage (S3) for persistence, lazy transcoding for video
Feed ranking latency	Pre-compute feature vectors; use a lightweight model for online scoring
Write amplification	Batch fanout operations; use asynchronous message queues between Post Service and Fanout Service
Social graph updates	Write-behind caching; eventual consistency for follower counts

A key operational insight: monitor fanout latency percentiles per user tier. If the p99 fanout time for normal users exceeds 2 seconds, the celebrity threshold needs lowering. If celebrity post merge at read time exceeds 200ms, add more read replicas for the celebrity post index.

Interviewer Tips

Start with requirements clarification: Ask whether the platform is Twitter-like (public, asymmetric follow) or Facebook-like (private, mutual friendship). This changes graph storage and privacy filtering.
Quantify the celebrity problem early: Demonstrate you understand that a naive push model breaks at scale. The hybrid threshold is the key design insight interviewers look for.
Draw the read/write path separation: Interviewers want to see you decompose the system into independent scaling units.
Discuss trade-offs explicitly: For every design choice (push vs pull, shard by user vs post), state the trade-off and justify your decision with back-of-envelope math.
Common follow-ups:
- “How do you handle a post deletion propagating across cached feeds?” → Soft delete with lazy eviction on read.
- “How do you prevent feed staleness for users who follow only celebrities?” → Periodic background refresh of celebrity merge results.
- “How would you add stories (ephemeral content)?” → Separate TTL-based storage with a dedicated feed lane.
- “How do you handle content moderation at scale?” → Async moderation pipeline with ML classifiers; flagged content removed from feed cache.

Design a Social Media Platform (Twitter/Facebook)