Module 11 / 20 · Phase C — Scale & Reliability · 40 min

DNS, the internet's
phonebook.

Before any request reaches your load balancer, something has to turn google.com into 142.250.80.46. The system that does it is older than the web, runs everywhere, and is mostly invisible — until it isn't.

// What you'll know by the end

  • Why a name has to become a number
  • The four layers of the DNS hierarchy
  • The six record types that run the internet
  • How TTL trades freshness for speed

§ 01 — A small mismatch

Humans say names.
Routers speak numbers.

You type google.com into your browser. But routers, the machines that actually move packets, have no idea what google.com means. They route by IP address: for IPv4, a 32-bit number like 142.250.80.46. Before a single byte of your request can travel, something has to translate the name to the number. That something is DNS, the Domain Name System. It is the phonebook of the internet, and it answers an enormous volume of these queries every second without you ever noticing.
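To make "a 32-bit number" concrete, here is a small Python sketch using the standard-library ipaddress module. It only illustrates that the dotted form is four bytes written in decimal; the helper name to_uint32 is invented for this example.

```python
import ipaddress

def to_uint32(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))

addr = "142.250.80.46"
number = to_uint32(addr)

# Each of the four decimal parts is one byte of the 32-bit value.
octets = [int(part) for part in addr.split(".")]
rebuilt = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

print(number, rebuilt == number)
print(ipaddress.IPv4Address(number))  # converts the integer back to dotted form
```

Routers work with the integer form; the dotted notation exists only for people.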

// WHAT YOU TYPE → WHAT YOUR COMPUTER ACTUALLY NEEDS
google.com → 142.250.80.46
github.com → 140.82.114.4
cloudflare.com → 104.16.132.229
netflix.com → 54.155.246.232

Without DNS, you'd be memorizing IP addresses like phone numbers, except harder. Worse: whenever Google moved a service to a different server (which happens constantly), every address book in the world would need updating. DNS solves both problems: it gives every machine a human-readable name, lets the underlying IP change freely, and lets updates spread worldwide as cached answers expire, often within minutes. That's why it's the foundation under every other system we build.

§ 02 — The hierarchy

No single server
knows everything.

Here's the genius bit: nobody has a complete copy of the phonebook. There are hundreds of millions of domains and billions of records; no single server could hold them all, and even if one could, it would be a catastrophic single point of failure. Instead, DNS is structured as a hierarchy: each layer knows only enough to point you to the next. To find systemdesigntutorial.com, a lookup is passed from server to server, each one more specific than the last.

// THE DNS HIERARCHY · FOUR LAYERS, NOBODY HOLDS IT ALL

ROOT (.) · level 0 · 13 root servers worldwide
    asked "who runs .com?" → answers "ask the .com TLD"
TLD · level 1 · .com (Verisign) · .dev (Google) · .org (PIR)
    asked "who runs that domain?" → answers "ask the Cloudflare nameserver"
AUTHORITATIVE · level 2 · ns1.cloudflare.com
    asked "what's the IP?" → answers "systemdesigntutorial.com = 104.21.5.42"
→ 104.21.5.42 returned to client

The flow is always the same. Root servers (there are 13 logical ones, replicated worldwide via Anycast) know only one thing: which servers are in charge of each top-level domain. TLD servers (.com, .dev, .org, etc.) know only one thing: which nameservers each registered domain has delegated to. Authoritative servers (operated by whoever owns the domain, often through Cloudflare, Route 53, or similar) hold the actual records. Each layer is small, fast, and replicated. The system has no single point of failure.

Each server knows only enough to point you to the next. The hierarchy is the design.

One more thing: you don't ask all those servers yourself. There's a fourth player called the recursive resolver, usually your ISP's DNS server or a public one like 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare). You ask it once, and it walks the hierarchy (root, then TLD, then authoritative) on your behalf, caches the result, and hands it back to you. That's the part the next lab walks through, step by step.
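The walk a recursive resolver performs can be sketched with three lookup tables standing in for the three layers. This is a toy model, not a real resolver: nothing here touches the network, and the server labels (tld-com and so on) and the resolve function are invented for illustration. Only the domain, nameserver, and IP come from this section's example.

```python
# Toy data only: real servers answer over the network.
ROOT = {"com": "tld-com", "dev": "tld-dev", "org": "tld-org"}
TLD = {
    "tld-com": {"systemdesigntutorial.com": "ns1.cloudflare.com"},
}
AUTHORITATIVE = {
    "ns1.cloudflare.com": {"systemdesigntutorial.com": "104.21.5.42"},
}

def resolve(name: str) -> tuple[str, list[str]]:
    """Walk root -> TLD -> authoritative, as a recursive resolver would."""
    steps = []
    tld_label = name.rsplit(".", 1)[-1]

    tld_server = ROOT[tld_label]            # root: who runs this TLD?
    steps.append(f"root -> ask {tld_server}")

    nameserver = TLD[tld_server][name]      # TLD: who runs this domain?
    steps.append(f"{tld_server} -> ask {nameserver}")

    ip = AUTHORITATIVE[nameserver][name]    # authoritative: what's the IP?
    steps.append(f"{nameserver} -> {ip}")
    return ip, steps

ip, steps = resolve("systemdesigntutorial.com")
for step in steps:
    print(step)
```

Note that no table contains the whole answer. Each one holds just enough to name the next server to ask.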

§ 03 — Record types

DNS isn't just
names → IPs.

The phonebook metaphor undersells DNS a little. It actually stores several kinds of records about a domain — IP addresses, yes, but also email server hints, security tokens, verification proofs. Each kind is called a record type, and you'll meet most of these in your first month of dealing with any production system.

A · address
Name to IPv4
The classic record. Maps a hostname to a 32-bit IPv4 address. It is the record type you will query and configure most often.
systemdesigntutorial.com.   A   104.21.5.42
AAAA · "quad-A"
Name to IPv6
Same idea as A, but for the much-larger 128-bit IPv6 address space. Modern services usually offer both.
systemdesigntutorial.com.   AAAA  2606:4700:3::6815
CNAME · canonical name
Alias to another name
Maps one hostname to another. Used heavily by CDNs: your www. CNAMEs to a Cloudflare hostname which then resolves to an A record.
www.systemdesigntutorial.com.  CNAME  systemdesigntutorial.com.
MX · mail exchange
Where to send email
Lists the mail servers that accept email for this domain, with priorities. The reason emails to you@yoursite.com reach the right place.
systemdesigntutorial.com.  MX 10  mail.systemdesigntutorial.com.
TXT · text
Arbitrary metadata
Free-form strings. Used heavily for email anti-spam (SPF, DKIM, DMARC) and for proving you own a domain (Google verification, SSL challenges).
systemdesigntutorial.com.  TXT  "v=spf1 include:_spf.google.com ~all"
NS · name server
Who's authoritative
Identifies the nameservers responsible for this domain. The TLD layer returns these to point the resolver at the authoritative server.
systemdesigntutorial.com.  NS  ns1.cloudflare.com.

There are more (PTR for reverse lookups, SRV for service discovery, CAA for declaring which certificate authorities may issue certificates for the domain, SOA for zone administration details), but those six cover the vast majority of what you'll actually encounter. The mental model: a DNS record is a (name, type, value) triple, and querying DNS means asking "what's the X record for Y?", where X is one of these types.
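The (name, type, value) model maps directly onto a small data structure. The sketch below is a toy in-memory zone built from the example records in this section. The Record class and query function are invented for illustration, and the CNAME handling is simplified (it follows one alias at a time and does not guard against loops).

```python
from typing import NamedTuple

class Record(NamedTuple):
    name: str
    rtype: str
    value: str

# Records copied from the examples in this section; the MX priority is kept in the value.
ZONE = [
    Record("systemdesigntutorial.com.", "A", "104.21.5.42"),
    Record("www.systemdesigntutorial.com.", "CNAME", "systemdesigntutorial.com."),
    Record("systemdesigntutorial.com.", "MX", "10 mail.systemdesigntutorial.com."),
    Record("systemdesigntutorial.com.", "NS", "ns1.cloudflare.com."),
]

def query(name: str, rtype: str) -> list[str]:
    """Answer "what's the <rtype> record for <name>?" from the toy zone."""
    answers = [r.value for r in ZONE if r.name == name and r.rtype == rtype]
    if answers or rtype == "CNAME":
        return answers
    # No direct answer: follow a CNAME alias, if one exists, and ask again.
    aliases = [r.value for r in ZONE if r.name == name and r.rtype == "CNAME"]
    return query(aliases[0], rtype) if aliases else []

print(query("systemdesigntutorial.com.", "A"))
print(query("www.systemdesigntutorial.com.", "A"))  # resolved via the CNAME
print(query("systemdesigntutorial.com.", "MX"))
```

The same name can carry several record types at once, which is why a query always names both the hostname and the type it wants.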

§ 04 — DNS resolution walkthrough · interactive lab

Watch a query
climb the ladder.

Below: a working DNS resolution. Pick a domain, hit Resolve, and watch the query travel through every cache and every server. Then resolve it again — the second time, almost everything is cached. Hit Clear caches to start over.

// INTERACTIVE LAB · DNS_RESOLVER.SIM (m.11)
Client side: BROWSER (Chrome / Firefox cache) → OS RESOLVER (stub + system cache) → RECURSIVE (8.8.8.8 / 1.1.1.1, does the recursive climb)
DNS hierarchy: ROOT (a.root-servers.net) → .dev TLD (operated by Google) → AUTHORITATIVE (ns1.cloudflare.com)
Panels: stats (total time, network hops, cache hits, cache misses), cache state (browser, OS resolver, recursive), step log
Initial state: all caches empty; the first resolution is cold (all cache misses)

A cold first resolution makes 4 network queries: the client asks the recursive resolver, which in turn asks a root server, a TLD server, and the authoritative server. Each adds latency. But every layer caches the result, so the second lookup is dramatically faster, usually a single browser-cache hit. This caching is why DNS feels invisible most of the time.
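The cold-versus-warm difference can be modeled by counting network queries. The sketch below is a simplification under stated assumptions: the names caches, UPSTREAM, and lookup are invented, the upstream table stands in for root, TLD, and authoritative servers combined, and TTL expiry is ignored.

```python
# Toy model of the three cache layers. All data is illustrative.
caches = {"browser": {}, "os": {}, "recursive": {}}
UPSTREAM = {"systemdesigntutorial.com": "104.21.5.42"}  # root + TLD + authoritative

def lookup(name: str) -> tuple[str, int]:
    """Return (ip, network_queries) and fill every cache that missed."""
    if name in caches["browser"]:
        return caches["browser"][name], 0      # local hit, no network
    if name in caches["os"]:
        ip, queries = caches["os"][name], 0    # local hit, no network
    else:
        queries = 1                            # client -> recursive resolver
        if name in caches["recursive"]:
            ip = caches["recursive"][name]
        else:
            ip = UPSTREAM[name]
            queries += 3                       # recursive -> root, TLD, authoritative
            caches["recursive"][name] = ip
        caches["os"][name] = ip
    caches["browser"][name] = ip
    return ip, queries

cold = lookup("systemdesigntutorial.com")
warm = lookup("systemdesigntutorial.com")
print(cold, warm)
```

A third case sits between the two: if only the recursive resolver has the answer cached (for example, because another user asked first), the lookup costs one network query instead of four.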

§ 05 — TTL · the freshness dial

Every record has
an expiry date.

Every DNS record carries a TTL — Time To Live — which tells every cache between you and the authoritative server how long it's allowed to hold the answer. After the TTL expires, the next request triggers a fresh lookup. This is the dial that balances speed (long TTL means more cache hits) against flexibility (short TTL means changes propagate fast).
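A minimal sketch of TTL-based expiry follows. It is not how any particular resolver implements its cache; the TTLCache class is invented for illustration, and the clock is injectable so the example can "wait" an hour without sleeping.

```python
import time

class TTLCache:
    """Hold answers until their TTL runs out. `clock` is injectable for testing."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expires_at)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: caller must do a fresh lookup
            return None
        return value

# A fake clock stands in for real time.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("systemdesigntutorial.com", "104.21.5.42", ttl_seconds=3600)

now[0] = 3599
fresh = cache.get("systemdesigntutorial.com")    # still within the TTL
now[0] = 3600
expired = cache.get("systemdesigntutorial.com")  # TTL elapsed
print(fresh, expired)
```

A longer ttl_seconds means more calls return from the cache; a shorter one means a changed record is picked up sooner.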

// PICKING A TTL · WHAT EACH RANGE BUYS YOU

60 seconds // very short
Changes propagate within a minute. Each lookup likely triggers a real query — load on your authoritative server multiplies. Use it temporarily before a planned IP change ("lowering TTL in preparation"). Don't leave it here.
VOLATILE
5 minutes // short
Good balance for things that might need to fail over quickly — your application's load balancer endpoint, for instance. Caches refresh often, so a region failure can be redirected within a few minutes.
FAILOVER-READY
1 hour // typical default
The sweet spot for most records. Most lookups hit caches, but you can still update a record and see changes everywhere within the hour. 3600 seconds is one of the most common values in production.
DEFAULT
24 hours // long
For very stable records that effectively never change — MX records to your email provider, NS records for your nameservers. Minimizes load on authoritative servers and gives near-perfect cache hit rates.
STABLE
7 days // very long
For things you're confident will never change. Rare outside the deepest infrastructure, where multi-day TTLs are normal (the root zone's NS records are one example). The cost of being wrong is high: changes take a week to fully propagate.
SET-AND-FORGET

The pragmatic workflow: set TTL to your normal default (1 hour, say) for stable records. When you know a change is coming, lower it to 5 minutes at least a day in advance, so every cache still holding the record under the old, longer TTL has time to expire. Make the change. Watch traffic shift. Then raise the TTL back up. This is one of the small operational rituals that distinguishes a calm migration from a 3-hour outage.
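The timing behind that workflow is simple arithmetic, sketched below. The function name migration_plan and the dates are hypothetical, and the model assumes every cache honors TTLs exactly, which real clients do not always do.

```python
from datetime import datetime, timedelta

def migration_plan(lower_at: datetime, old_ttl: int, new_ttl: int) -> dict:
    """Timeline for a planned record change, assuming caches honor TTLs.

    A cache that fetched the record just before `lower_at` may keep it for
    up to `old_ttl` seconds, so the change should wait at least that long.
    """
    earliest_change = lower_at + timedelta(seconds=old_ttl)
    # If the change is made at earliest_change, caches hold the old
    # answer for at most new_ttl seconds after it.
    converged_by = earliest_change + timedelta(seconds=new_ttl)
    return {"earliest_change": earliest_change, "converged_by": converged_by}

plan = migration_plan(datetime(2024, 1, 1, 9, 0), old_ttl=3600, new_ttl=300)
print(plan["earliest_change"])  # 2024-01-01 10:00:00
print(plan["converged_by"])     # 2024-01-01 10:05:00
```

Lowering the TTL a full day ahead, as suggested above, leaves a wide margin over this one-hour minimum.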

One last subtlety worth knowing: browsers and OSes don't always honor TTLs exactly. Chrome caches DNS for ~60 seconds regardless of what the record says; some Windows versions hold for much longer. The TTL is a hint, not a contract. Plan for some clients to take 2× the TTL to update.

§ 06 — Eight words for the name layer

Vocabulary,
for the phonebook.

You'll see these in every postmortem about a DNS outage — which is to say, in every postmortem. Get fluent.

DNS
/diː ɛn ɛs/
Domain Name System. The distributed database that maps human names (google.com) to machine addresses (142.250.80.46). Older than the web, foundational to everything.
A record
/eɪ ˈrɛkɔːd/
"Address record." Maps a hostname to a single IPv4 address. The most common record type and the answer to "what's the IP of X?"
TTL
/tiː tiː ɛl/
"Time To Live." Seconds a record can be cached before requiring a fresh lookup. Trades latency for freshness — the central knob in DNS engineering.
Recursive Resolver
/rɪˈkɜːsɪv/
The server that does the multi-step climb (root → TLD → authoritative) on your behalf and caches the result. Public ones: 8.8.8.8, 1.1.1.1.
Authoritative
/ɔːˌθɒrɪˈteɪtɪv/
The "source of truth" nameserver for a domain — the server that holds the actual records. Run by the domain owner (often through Cloudflare, Route 53, etc.).
Propagation
/ˌprɒpəˈɡeɪʃən/
The time for a DNS change to become visible everywhere on the internet. Usually bounded by your TTL, sometimes longer due to misbehaving clients.
CNAME
/ˈsiː neɪm/
"Canonical Name." An alias from one hostname to another. The chain ends at an A or AAAA record. Used heavily by CDNs and managed services.
Anycast
/ˈɛnɪkɑːst/
A network technique where the same IP address is hosted in many physical locations. Requests automatically route to the nearest. How root servers and modern CDNs work.

§ 07 — Knowledge check

Five questions.
Trust the cache.

Locking in the DNS intuition. Click an answer; the explanation arrives instantly.


Resolved.

The phonebook is no longer mysterious. Next stop: CDNs — how the internet uses DNS to deliver content from servers near you.

§ 08 — The recap

Three ideas to
carry forward.

DNS is the invisible substrate. Knowing how it works heads off a large share of "production is broken and nobody knows why" incidents.

i

Hierarchy, not central registry

Root → TLD → Authoritative. Each layer holds little, points to the next. No single point of failure, scales to billions of domains.

ii

Caching is everywhere

Browser, OS, recursive resolver: all cache results. The first lookup is slow; every subsequent one is near-instant until the TTL expires. This is what makes DNS feel fast.

iii

TTL is the freshness dial

Long TTL = fast and stable. Short TTL = nimble but loaded. Most records are 1 hour by default; lower before planned changes.

↓ UP NEXT

M.12 — CDNs,
and the global cache.

DNS made names work. Now the next layer turns it into a superpower: serving content from a copy near every user. Time to look at content delivery networks — how Netflix, Cloudflare, and Fastly hide latency behind geography.
