Same exercise as last module, completely different constraints. Now the data isn't tiny strings — it's megabytes per row. The read/write ratio flips. The bill is real money. And the architecture decisions you make in the first 10 minutes determine whether your cloud bill is hundreds or hundreds of thousands.
Last module: a URL shortener. Tiny data, gigantic read volume, cache-everything strategy. This module: an image sharing app. The product looks superficially similar (both have an upload step and a fetch step), but every interesting constraint inverts. URLs are about 100 bytes; a 3 MB photo is tens of thousands of times larger. Shortener writes are rare and reads dominate; image apps have huge write volumes (people uploading) and very different read patterns (feed scrolling, profile views). Caching helps differently. Storage costs become real. The mental model from M.17 only takes you halfway here.
The lesson encoded in this comparison is something senior engineers internalize early: system design is not a generic skill. The patterns you apply depend on the shape of the data. A "build me a system that does X" question is only the start; the second sentence — "and the data looks like this" — is what determines whether you reach for Redis, S3, Kafka, or Postgres. Every architecture begins with the data.
Across the next four sections we'll re-run the same playbook from M.17 — requirements, estimation, decisions, lab — but applied to this very different beast. By the end you'll have a feel for when each pattern fits, and a calibrated sense of how much money each design decision actually costs.
The framework is the same: write functional and non-functional requirements first, then do back-of-envelope sizing. The numbers themselves will tell us where the hard problems are. Let's scope for an Instagram-like app at modest scale — 100K daily uploads, ~10× viewing ratio.
    POST /images
    GET  /images/:id

Two things to note immediately. First, durability matters more than availability here: a few seconds of outage is annoying; losing a user's wedding photos forever is unacceptable. This pushes us toward storage systems with deep replication built in (S3 Standard stores each object redundantly across at least three AZs by default). Second, upload latency is what users feel as "fast": they tap the upload button and want to see "uploaded ✓" instantly. The actual heavy lifting (resize, optimize, transcode) can happen asynchronously afterwards.
Those numbers immediately reshape what we worry about. 155 TB/year of permanent storage means we're not putting this in MySQL. At $0.023/GB/month for cloud object storage, holding a full year of uploads costs about $3,600/month (155,000 GB × $0.023), and the same data in a SQL DB would cost roughly 10× that. 7.5 TB/month of view traffic means egress is going to be one of our biggest line items if we serve users directly from origin. A CDN is no longer "nice to have"; it's mandatory. The math tells us so before we've drawn anything.
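The arithmetic behind those figures can be reproduced in a few lines. This is a sketch under stated assumptions: the per-variant sizes (50 KB, 200 KB, 1 MB, 3 MB) are the ones used in this module's bucket layout, while the 250 KB average payload per view is a hypothetical value chosen because it reproduces the 7.5 TB/month figure. Decimal units and a 30-day month are assumed throughout.

```python
# Back-of-envelope sizing for the image app (decimal units, 30-day month).
DAILY_UPLOADS = 100_000
VIEWS_PER_UPLOAD = 10                     # ~10x viewing ratio
VARIANT_KB = {"thumb": 50, "medium": 200, "full": 1_000, "original": 3_000}
AVG_VIEW_KB = 250                         # assumption, not stated in the text
STORAGE_USD_PER_GB_MONTH = 0.023

KB_PER_GB = 1_000_000
GB_PER_TB = 1_000

# Every upload is stored as all four variants.
stored_gb_per_day = DAILY_UPLOADS * sum(VARIANT_KB.values()) / KB_PER_GB
stored_tb_per_year = stored_gb_per_day * 365 / GB_PER_TB

# View traffic leaving the system each month.
views_per_day = DAILY_UPLOADS * VIEWS_PER_UPLOAD
egress_tb_per_month = views_per_day * AVG_VIEW_KB * 30 / KB_PER_GB / GB_PER_TB

# Monthly bill for holding one full year of uploads in object storage.
storage_usd_per_month = stored_tb_per_year * GB_PER_TB * STORAGE_USD_PER_GB_MONTH

print(f"stored per day:    {stored_gb_per_day:,.0f} GB")
print(f"stored per year:   {stored_tb_per_year:,.1f} TB")
print(f"egress per month:  {egress_tb_per_month:,.1f} TB")
print(f"storage bill:      ${storage_usd_per_month:,.0f}/month after one year")
```

Changing `VARIANT_KB` or `AVG_VIEW_KB` shows quickly which input dominates: storage grows with bytes per upload, egress with bytes per view.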
The biggest architectural decision in any media-heavy system is this one: where does the binary data live, and where does the structured info about it live? The naive answer — "put it all in MySQL" — works for ten images and breaks for ten million. The right answer separates the two. Tiny structured metadata (rows, indexed, queryable) goes in your SQL database. Large binary blobs (the actual JPEG bytes) go in object storage: S3, GCS, or equivalent. The metadata row just holds the URL or key pointing to the blob.
    CREATE TABLE images (
      id            UUID PRIMARY KEY,
      user_id       UUID,
      uploaded_at   TIMESTAMPTZ,
      caption       TEXT,
      width         INT,
      height        INT,
      thumb_key     VARCHAR(255),  -- "images/xyz/thumb.jpg"
      medium_key    VARCHAR(255),
      full_key      VARCHAR(255),
      original_key  VARCHAR(255),
      status        VARCHAR(20)    -- 'processing', 'ready', 'failed'
    );

    CREATE INDEX idx_user_recent ON images(user_id, uploaded_at DESC);
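~500 bytes per row. 100M images = 50 GB. Fits in one Postgres box comfortably. The composite index serves the query that matters most, "this user's recent images"; a global query such as "all images uploaded in the last 24h" is not covered by it and would need its own index on uploaded_at.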
    # S3 bucket layout
    s3://app-images/
    ├── 2024/01/15/
    │   ├── abc123-thumb.jpg     (50 KB)
    │   ├── abc123-medium.jpg    (200 KB)
    │   ├── abc123-full.jpg      (1 MB)
    │   └── abc123-original.jpg  (3 MB)
    └── 2024/01/16/...

    # Object stores:
    # - replicate across 3+ AZs automatically
    # - 11 nines of durability
    # - $0.023/GB/month (way cheaper than EBS)
    # - infinitely scalable, no resizing
The key in S3 is the file path. The metadata row holds these keys; the app fetches them from S3 on demand. S3 doesn't care if you have 10 objects or 10 trillion — same API, same pricing.
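A minimal sketch of that split, assuming an in-memory dict as a stand-in for the object store and SQLite as a stand-in for Postgres. The helper names `save_image` and `load_original` are hypothetical, and the table is trimmed to the columns the example needs.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Stand-in for an object store: key -> bytes. A real system would call S3/GCS.
blob_store: dict[str, bytes] = {}

# Stand-in for the metadata database.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE images (
        id           TEXT PRIMARY KEY,
        user_id      TEXT NOT NULL,
        uploaded_at  TEXT NOT NULL,
        original_key TEXT NOT NULL,
        status       TEXT NOT NULL
    )
""")
db.execute("CREATE INDEX idx_user_recent ON images(user_id, uploaded_at DESC)")


def save_image(user_id: str, data: bytes) -> str:
    """Write the bytes to the blob store, then record only the key in SQL."""
    image_id = uuid.uuid4().hex
    now = datetime.now(timezone.utc)
    key = f"{now:%Y/%m/%d}/{image_id}-original.jpg"
    blob_store[key] = data
    db.execute(
        "INSERT INTO images VALUES (?, ?, ?, ?, ?)",
        (image_id, user_id, now.isoformat(), key, "processing"),
    )
    return image_id


def load_original(image_id: str) -> bytes:
    """Look up the key in SQL, then fetch the blob by that key."""
    row = db.execute(
        "SELECT original_key FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise KeyError(image_id)
    return blob_store[row[0]]


image_id = save_image("user-1", b"\xff\xd8 fake jpeg bytes")
print(len(load_original(image_id)), "bytes fetched via key lookup")
```

The database never sees the image bytes; it stores a short string that says where they are.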
This split unlocks several wins. Your SQL DB stays tiny — 50 GB of metadata is trivial for Postgres; that same Postgres holding 155 TB of JPEG blobs would be a nightmare to back up, index, and replicate. Object storage scales without operational pain — you don't provision S3 capacity; you just put more objects in. Pricing is predictable — pay per GB stored and per GB transferred, no overhead for "the box we'd need to host all this."
The same pattern applies to videos, audio files, PDFs, backups, and any other "big opaque thing." Once you internalize the metadata/blob split, you'll see it everywhere. The rule of thumb: if it's queryable structured info, SQL. If it's a chunk of bytes you only ever fetch whole, object storage. Mix them at your peril.
Below: an interactive cost calculator. Pick your daily upload volume, the variants you'll generate, the storage class for originals, and whether you use a CDN. The architecture diagram updates and the live cost numbers show what each decision actually costs in dollars per month. Watch what happens when you turn off the CDN. Or move originals to Glacier. Or skip the medium variant. System design is sometimes engineering and sometimes accounting.
Baseline configuration. $414/month for 100K daily uploads with full quality variants and CDN. Try: turn off the CDN. Watch egress jump from $76 to $675. Try: move originals to Glacier. Storage drops by ~70%. Each toggle has a real-money consequence — and that's the senior-engineer lens for system design.
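Two of those toggles can be sanity-checked by hand. The sketch below assumes an origin egress price of $0.09/GB (a hypothetical list price, not a figure taken from the lab), the 7.5 TB/month of view traffic estimated earlier, and the variant sizes from the bucket layout. The CDN-on figure depends on the lab's own pricing model, so it is not reproduced here.

```python
EGRESS_GB_PER_MONTH = 7_500           # 7.5 TB/month of view traffic
ORIGIN_EGRESS_USD_PER_GB = 0.09       # assumed price, not from the lab
VARIANT_KB = {"thumb": 50, "medium": 200, "full": 1_000, "original": 3_000}

# Cost of serving every view straight from origin.
no_cdn_egress_usd = EGRESS_GB_PER_MONTH * ORIGIN_EGRESS_USD_PER_GB

# Fraction of stored bytes that are originals: an upper bound on what
# moving originals to a colder storage class can save.
original_share = VARIANT_KB["original"] / sum(VARIANT_KB.values())

print(f"egress without a CDN: ${no_cdn_egress_usd:,.0f}/month")
print(f"originals as a share of stored bytes: {original_share:.0%}")
```

Under these assumptions the no-CDN egress lines up with the lab's $675, and originals make up roughly 71% of stored bytes, which is why the storage class chosen for originals moves the bill so much.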
One thing the lab visualizes but doesn't dwell on: the upload flow is asynchronous. When the user taps "upload," they need a "success ✓" within a second or two. But resizing a 3 MB photo into four variants takes seconds per image, sometimes longer for tricky formats. Doing that work synchronously would make uploads feel painfully slow. So we split the flow into two halves: the synchronous "your bytes are safe" half, and the asynchronous "we're processing it" half.
The pattern shows up everywhere: any time a user action triggers heavy work, you split it. The synchronous half does the minimum required for the response — store the bytes, insert the row, return success. The asynchronous half consumes the resulting event from a queue and does the rest in the background, updating the row's status as it goes (processing → ready, or failed if something blew up). The client polls for status updates or receives a push notification once ready.
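The two halves can be sketched with the standard library alone. Everything here is a stand-in: dicts replace object storage and the `images` table, `queue.Queue` replaces a real message queue, and `make_variants` is a hypothetical placeholder that truncates bytes rather than resizing anything.

```python
import queue
import threading
import uuid

blobs: dict[str, bytes] = {}          # stand-in for object storage
rows: dict[str, dict] = {}            # stand-in for the images table
jobs: queue.Queue = queue.Queue()     # stand-in for a message queue


def handle_upload(user_id: str, data: bytes) -> str:
    """Synchronous half: store the bytes, insert the row, enqueue, return."""
    image_id = uuid.uuid4().hex
    key = f"images/{image_id}/original.jpg"
    blobs[key] = data
    rows[image_id] = {"user_id": user_id, "original_key": key, "status": "processing"}
    jobs.put(image_id)
    return image_id


def make_variants(data: bytes) -> dict[str, bytes]:
    """Placeholder for real resizing; it only truncates the bytes."""
    return {"thumb": data[:8], "medium": data[:32], "full": data}


def worker() -> None:
    """Asynchronous half: consume jobs and set status to ready or failed."""
    while True:
        image_id = jobs.get()
        if image_id is None:          # sentinel: shut the worker down
            jobs.task_done()
            break
        row = rows[image_id]
        try:
            variants = make_variants(blobs[row["original_key"]])
            for name, payload in variants.items():
                key = f"images/{image_id}/{name}.jpg"
                blobs[key] = payload
                row[f"{name}_key"] = key
            row["status"] = "ready"
        except Exception:
            row["status"] = "failed"
        finally:
            jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

image_id = handle_upload("user-1", b"raw image bytes" * 10)
jobs.join()                           # a real client would poll for status instead
print(rows[image_id]["status"])
jobs.put(None)
thread.join()
```

`handle_upload` returns as soon as the original is stored and the job is queued; the status field is the only contract between the two halves.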
This is the same async pattern as analytics events in M.17, the same pattern as sending emails after a signup, the same pattern as generating PDFs of long reports. Once you've internalized "drop a message on a queue, let workers do the slow thing," it becomes the default tool for any work that doesn't need to block the user. The user experience stays snappy; the heavy lifting still happens; nobody waits.
These show up in every architecture review involving media, files, or large objects. Lock them in.
You can now think through media-heavy systems and feel the dollar weight of each decision. Next: reading the diagrams others have drawn, using the C4 model.
The media-heavy system patterns are reusable everywhere blobs show up.
Tiny structured rows in SQL. Big opaque bytes in object storage. The row holds the key; the blob lives wherever. Never mix them.
Pre-generate sizes for each use case. Serve the smallest one a viewer needs. Saves egress, saves CDN budget, saves user bandwidth.
User waits only for "your bytes are safe." Resize, optimize, transcode all happen on a queue + worker pipeline. The user moved on.
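The "serve the smallest one a viewer needs" rule from the variants pattern can be sketched as a simple lookup. The pixel widths are hypothetical (the module gives file sizes for each variant, not dimensions), and `pick_variant` is an illustrative name.

```python
# Hypothetical pixel widths per variant, smallest first.
VARIANT_WIDTHS = [("thumb", 150), ("medium", 640), ("full", 1600)]


def pick_variant(display_width_px: int, device_pixel_ratio: float = 1.0) -> str:
    """Return the smallest variant that covers the viewer's physical pixels."""
    needed = display_width_px * device_pixel_ratio
    for name, width in VARIANT_WIDTHS:
        if width >= needed:
            return name
    return VARIANT_WIDTHS[-1][0]      # nothing is big enough: use the largest


print(pick_variant(120))          # grid thumbnail
print(pick_variant(400, 1.5))     # feed image on a dense phone screen
print(pick_variant(1400))         # desktop detail view
```

The original is deliberately absent from the list: in this sketch it is kept for reprocessing and downloads, not for routine viewing.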