CDN Architecture: Delivering Content at Scale

Explore how Content Delivery Networks use edge servers and caching hierarchies to serve web content faster from locations close to users.

If you have ever noticed that a popular website loads quickly from anywhere in the world, a Content Delivery Network (CDN) is likely responsible. CDNs are distributed networks of servers designed to deliver web content — HTML pages, images, videos, JavaScript files, APIs — from locations physically close to the user. They are a foundational layer of the modern internet, handling a significant fraction of all internet traffic.

What Problem CDNs Solve

The fundamental issue is physics: light travels through fiber at approximately 200,000 km/s. New York and Sydney are roughly 16,000 km apart, so a signal needs at least ~80 ms one way, or ~160 ms for a round trip, before any other delay is counted. In practice, indirect cable routes, server processing, and queuing push the round trip well past 200 ms.
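The propagation bound is simple arithmetic: path length divided by signal speed. The sketch below uses the 200,000 km/s and 16,000 km figures from the text; the function name is illustrative, and the result is only a floor because it ignores routing, processing, and queuing.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in fiber (figure from the text)

def propagation_delay_ms(path_km: float) -> float:
    """Lower bound on latency, in milliseconds, for a signal covering path_km of fiber."""
    return path_km * 1000 / FIBER_SPEED_KM_PER_S

# A 16,000 km path costs at least 80 ms.
print(propagation_delay_ms(16_000))      # 80.0
# A request plus its response covers the path twice, so the bound doubles.
print(propagation_delay_ms(2 * 16_000))  # 160.0
# A user 50 km from an edge server pays well under a millisecond.
print(propagation_delay_ms(2 * 50))      # 0.5
```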

For most web content, this means:

  • An image hosted on a single server in Virginia loads slowly in Singapore.
  • A video stream hosted in Frankfurt buffers in São Paulo.
  • A JavaScript bundle served from one location creates a performance bottleneck for global users.

CDNs solve this by replicating content at edge servers positioned close to users worldwide, so that a user in Singapore is served from a CDN node in Singapore rather than from a server in Virginia.
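As a rough illustration of "serve from the closest node", the sketch below picks the PoP with the smallest great-circle distance to the user. This is a simplification: real CDNs typically steer traffic with anycast routing or DNS, and they weigh network conditions rather than raw distance. The PoP list, coordinates, and function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Hypothetical PoPs with approximate (latitude, longitude) in degrees.
POPS = {
    "singapore": (1.35, 103.82),
    "ashburn": (39.04, -77.49),
    "frankfurt": (50.11, 8.68),
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def nearest_pop(user, pops=POPS):
    """Return the name of the PoP geographically closest to the user."""
    return min(pops, key=lambda name: haversine_km(user, pops[name]))

kuala_lumpur = (3.14, 101.69)
print(nearest_pop(kuala_lumpur))  # singapore
```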

Points of Presence (PoPs)

A CDN's geographic footprint is measured in Points of Presence (PoPs) — physical locations where the CDN has servers. A PoP might be a small cluster of servers in a shared colocation facility, or a large purpose-built deployment in a major internet exchange.

Large CDNs have dramatically different PoP strategies:

  • Akamai — 3,000+ PoPs in ~130 countries, with servers deployed deep into ISP networks ("last mile" placement).
  • Cloudflare — 300+ cities, typically in large colocation/IX facilities, using software-defined routing rather than extreme physical distribution.
  • Fastly — ~90 PoPs, but with very high-capacity hardware per PoP and a focus on programmable edge logic.
  • Amazon CloudFront — 550+ PoPs and 13 regional edge caches integrated into the AWS ecosystem.

Cache Hierarchy

CDNs use a multi-tier cache hierarchy to balance storage cost, cache hit rates, and latency:

User
  |
Edge PoP (L1 cache) — small, fast, close to user
  |
Regional / Shield PoP (L2 cache) — larger, aggregates misses from multiple edge PoPs
  |
Origin (the actual web server)

Edge Cache (L1)

The edge server is the first point of contact for the user. It caches content locally. If the requested object is in cache (a cache hit), it is served immediately without contacting the origin. Cache hit rates for popular content can exceed 95%.

If the object is not cached (a cache miss), the edge server fetches it from the next tier.

Shield / Mid-Tier Cache (L2)

Rather than sending every cache miss directly to the origin (which could overwhelm it), many CDNs use a shield or mid-tier layer. A regional PoP aggregates cache misses from multiple edge PoPs and serves as a second-level cache. Only misses at the regional level reach the origin.

This dramatically reduces origin load. For a CDN with 300 edge PoPs, misses for the same object from every European PoP can be collapsed into a single origin request by the European regional cache, which then stores the result to answer later misses.
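The effect of a shield tier can be seen in a small simulation. This is a minimal sketch, not any vendor's implementation: the class names and PoP names are invented, and eviction, TTLs, and concurrent request coalescing are deliberately left out.

```python
class Origin:
    """Stands in for the origin web server and counts the requests it receives."""
    def __init__(self, content):
        self.content = content
        self.requests = 0

    def get(self, url):
        self.requests += 1
        return self.content[url]

class CacheTier:
    """A cache that falls through to the next tier (shield or origin) on a miss."""
    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1
            return self.store[url]
        self.misses += 1
        body = self.upstream.get(url)  # ask the next tier
        self.store[url] = body         # keep a copy for later requests
        return body

origin = Origin({"/logo.png": b"image-bytes"})
shield = CacheTier("eu-shield", origin)
edges = [CacheTier(f"edge-{city}", shield) for city in ("paris", "berlin", "madrid")]

for edge in edges:
    edge.get("/logo.png")  # first request at each edge is a miss
    edge.get("/logo.png")  # second request is an edge hit

print(origin.requests)             # 1
print(shield.hits, shield.misses)  # 2 1
```

Three edge misses turn into one origin request because the shield answers the second and third from its own cache.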

Cache Keys and Vary

What gets cached, and under what key? By default, the cache key is the URL. But HTTP caching is more nuanced:

  • Query strings: ?v=1.2 and ?v=1.3 are different cache entries. CDNs can be configured to normalize or ignore query parameters.
  • Vary header: Vary: Accept-Encoding tells CDNs to cache separate versions for each encoding, such as gzip and brotli. Vary: User-Agent is dangerous because the number of distinct User-Agent strings is huge, so it fragments the cache into an enormous number of variants.
  • Cookies: Requests with session cookies usually bypass caches entirely (for personalized content), though CDNs can strip cookies for specific paths to enable caching.
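The rules above can be sketched as a function that builds a cache key from a request. This is an illustrative sketch only: the function name, the list of ignored parameters, and the key shape are assumptions, and production CDNs usually normalize header values (for example collapsing Accept-Encoding into a few buckets) rather than using them raw as done here.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical set of tracking parameters that should not split the cache.
IGNORED_PARAMS = frozenset({"utm_source", "utm_medium", "utm_campaign"})

def cache_key(url, request_headers, vary=()):
    """Build a (normalized URL, header variant) key for a cache lookup."""
    parts = urlsplit(url)
    # Drop ignored parameters and sort the rest so that order does not matter.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    normalized = f"{parts.scheme}://{parts.netloc.lower()}{parts.path}"
    if params:
        normalized += "?" + urlencode(params)
    # One key component per header named in the response's Vary header.
    headers = {k.lower(): v for k, v in request_headers.items()}
    variant = tuple((name.lower(), headers.get(name.lower(), ""))
                    for name in sorted(vary, key=str.lower))
    return (normalized, variant)

a = cache_key("https://example.com/app.js?v=1.2&utm_source=mail", {})
b = cache_key("https://example.com/app.js?v=1.2", {})
c = cache_key("https://example.com/app.js?v=1.3", {})
print(a == b)  # True: the tracking parameter is ignored
print(b == c)  # False: a different version is a different entry
```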

Cache Invalidation and Purging

One of the hardest problems in computer science is cache invalidation — ensuring that when content changes at the origin, the stale version is not served indefinitely.

TTL-Based Expiry

The simplest approach: set a Time to Live (TTL) via the Cache-Control: max-age=N HTTP header. After N seconds, the cached object is considered stale and the CDN revalidates with the origin (using If-Modified-Since or If-None-Match / ETag).

  • Long TTLs (days/weeks) maximize cache efficiency but mean changes propagate slowly.
  • Short TTLs (seconds/minutes) ensure freshness but increase origin load.
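The freshness check and revalidation step can be sketched as follows. This is a simplified model under stated assumptions: it reads only max-age (ignoring s-maxage, Age, Expires, and other directives), the class and function names are invented, and timestamps are plain numbers supplied by the caller.

```python
import re
from dataclasses import dataclass

@dataclass
class CachedObject:
    body: bytes
    etag: str
    stored_at: float  # seconds, on whatever clock the caller uses
    max_age: int      # TTL in seconds, taken from Cache-Control

def parse_max_age(cache_control: str) -> int:
    """Extract max-age=N from a Cache-Control value; 0 if absent."""
    match = re.search(r"\bmax-age=(\d+)", cache_control)
    return int(match.group(1)) if match else 0

def is_fresh(obj: CachedObject, now: float) -> bool:
    return (now - obj.stored_at) < obj.max_age

def conditional_headers(obj: CachedObject) -> dict:
    """Headers for a revalidation request once the object is stale."""
    return {"If-None-Match": obj.etag}

def on_revalidated(obj, status, now, body=None, etag=None):
    """Apply the origin's answer: 304 restarts the TTL, 200 replaces the object."""
    if status == 304:
        obj.stored_at = now
    elif status == 200:
        obj.body, obj.etag, obj.stored_at = body, etag, now
    return obj

obj = CachedObject(b"v1", '"abc"', stored_at=0.0,
                   max_age=parse_max_age("public, max-age=3600"))
print(is_fresh(obj, now=100.0))   # True
print(is_fresh(obj, now=3600.0))  # False: time to revalidate
on_revalidated(obj, 304, now=3600.0)
print(is_fresh(obj, now=3700.0))  # True: unchanged content, TTL restarted
```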

Instant Purge

Most enterprise CDNs offer an instant purge API — you call an API endpoint with a URL (or a list of URLs, or a tag/key pattern) and the CDN immediately invalidates that content across all PoPs. Cloudflare's purge typically propagates globally in under 150 ms. Fastly's surrogate keys allow purging thousands of objects simultaneously with a single API call.
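Tag-based purging can be pictured as an index from tag to the URLs that carry it. The sketch below models a single cache node in memory; it is not Fastly's or Cloudflare's API, and the class, method names, and tags are hypothetical.

```python
from collections import defaultdict

class TaggedCache:
    """In-memory cache where each object may carry tags used for group purges."""
    def __init__(self):
        self._objects = {}
        self._urls_by_tag = defaultdict(set)

    def put(self, url, body, tags=()):
        self._objects[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url):
        return self._objects.get(url)

    def purge_url(self, url):
        """Remove one object; returns True if it was cached."""
        return self._objects.pop(url, None) is not None

    def purge_tag(self, tag):
        """Remove every object carrying the tag; returns how many were removed."""
        urls = self._urls_by_tag.pop(tag, set())
        return sum(self.purge_url(url) for url in urls)

cache = TaggedCache()
cache.put("/products/42", b"page", tags=["product-42"])
cache.put("/products/42/reviews", b"reviews", tags=["product-42"])
cache.put("/about", b"about", tags=["static"])

print(cache.purge_tag("product-42"))  # 2
print(cache.get("/about"))            # b'about'
```

A real CDN has to fan the purge out to every PoP; the index lookup shown here is only the per-node part.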

Soft Purge / Stale-While-Revalidate

A softer approach: mark objects as stale without removing them from cache. The CDN continues serving the stale version while it revalidates in the background. This prevents thundering herds — the surge of requests to the origin when a popular cached object expires simultaneously across many edge nodes.
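A minimal, single-threaded sketch of stale-while-revalidate follows. Time is passed in explicitly and the "background" refresh is modeled as a queue that the caller drains; a real edge server would refresh asynchronously. All names and the TTL values are illustrative assumptions.

```python
class SwrCache:
    """Serves stale entries during a grace window while one refresh is pending."""
    def __init__(self, fetch, ttl, stale_window):
        self.fetch = fetch                # callable(url) -> body, hits the origin
        self.ttl = ttl                    # seconds an entry counts as fresh
        self.stale_window = stale_window  # extra seconds it may be served stale
        self.entries = {}                 # url -> (body, stored_at)
        self.refreshing = set()           # urls with a refresh already queued

    def get(self, url, now):
        entry = self.entries.get(url)
        if entry is None or now - entry[1] >= self.ttl + self.stale_window:
            body = self.fetch(url)        # nothing usable: blocking fetch
            self.entries[url] = (body, now)
            return body, "miss"
        body, stored_at = entry
        if now - stored_at < self.ttl:
            return body, "fresh"
        self.refreshing.add(url)          # a set, so only one refresh per url
        return body, "stale"

    def run_pending_refreshes(self, now):
        for url in list(self.refreshing):
            self.entries[url] = (self.fetch(url), now)
            self.refreshing.discard(url)

origin_calls = []
def fetch(url):
    origin_calls.append(url)
    return f"body-{len(origin_calls)}"

swr = SwrCache(fetch, ttl=10, stale_window=30)
swr.get("/home", now=0)                       # miss: first origin fetch
results = [swr.get("/home", now=12) for _ in range(100)]
print(results[0], len(origin_calls))          # ('body-1', 'stale') 1
swr.run_pending_refreshes(now=12)
print(swr.get("/home", now=13), len(origin_calls))  # ('body-2', 'fresh') 2
```

One hundred requests arriving after expiry produce a single origin fetch instead of one hundred.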

CDN Security Features

Modern CDNs have evolved from pure caching layers into security platforms:

  • DDoS mitigation — Absorbing volumetric attacks (hundreds of Gbps) at the edge before they reach the origin.
  • Web Application Firewall (WAF) — Filtering malicious requests (SQL injection, XSS, etc.) at the CDN layer.
  • Bot management — Distinguishing legitimate users from scrapers, credential stuffers, and other automated traffic.
  • TLS termination — CDNs handle HTTPS at the edge, reducing TLS overhead on origin servers and providing certificate management.
  • Rate limiting — Capping requests per IP or per user to prevent abuse.
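Of the features above, rate limiting is the easiest to sketch. A common technique is the token bucket, shown below per client IP. This is one possible approach, not a description of how any particular CDN implements it; the class name and the rate and burst values are assumptions.

```python
class RateLimiter:
    """Token bucket per client: `rate_per_s` sustained, up to `burst` at once."""
    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self._buckets = {}  # client -> (tokens, time of last request)

    def allow(self, client, now):
        # A new client starts with a full bucket.
        tokens, last = self._buckets.get(client, (float(self.burst), now))
        # Refill for the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self._buckets[client] = (tokens, now)
        return allowed

limiter = RateLimiter(rate_per_s=1, burst=3)
print([limiter.allow("203.0.113.7", now=0.0) for _ in range(4)])
# [True, True, True, False]
print(limiter.allow("203.0.113.7", now=2.0))  # True: two tokens refilled
```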

Edge Computing

The latest evolution of CDN architecture is edge computing: running application logic at CDN PoPs, not just caching static content. Examples include:

  • Cloudflare Workers — JavaScript/WASM functions running at 300+ edge locations.
  • Fastly Compute — WASM-based edge execution.
  • AWS Lambda@Edge and CloudFront Functions — Serverless functions attached to CDN events.

These allow developers to run authentication, A/B testing, personalization, and API logic at the edge. Requests that can be answered without a trip to the origin often complete in a few tens of milliseconds or less for users near a PoP.
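A/B testing is a good fit for the edge because variant assignment can be computed from the request alone, with no shared state or origin call. The sketch below shows the idea in Python for readability; the platforms listed above run JavaScript or WASM, and the function name, experiment name, and variants are hypothetical.

```python
import hashlib

def ab_bucket(user_id: str, experiment: str, variants=("control", "treatment")) -> str:
    """Deterministically map a user to a variant by hashing, with no stored state."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]

# The same user always lands in the same variant, at whichever PoP serves them.
first = ab_bucket("user-123", "new-checkout")
second = ab_bucket("user-123", "new-checkout")
print(first == second)  # True
```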

Major CDN Providers

Provider          | Strength                                         | Notable Customers
------------------+--------------------------------------------------+-----------------------------------
Cloudflare        | Security, price/performance, Workers             | Millions of sites, Discord, Canva
Akamai            | Enterprise, media delivery, deep ISP integration | Major broadcasters, banks
Fastly            | Developer-friendly, instant purge, programmable  | GitHub, The New York Times, Stripe
Amazon CloudFront | AWS integration, global scale                    | AWS customers, Netflix (in part)
Google Cloud CDN  | GCP integration, Google's backbone               | GCP customers
Bunny CDN         | Cost-efficient for indie developers              | Small/medium sites

Choosing a Cache Strategy

The optimal CDN strategy depends on your content:

Content Type                    | Cache Strategy
--------------------------------+--------------------------------------------------------------
Static assets (CSS, JS, images) | Long TTL (1 year) + cache-busting via hash in filename
HTML pages                      | Short TTL (minutes) or no-cache + ESI/edge logic
API responses                   | Varies; often no-cache unless explicitly designed for caching
Video segments (HLS/DASH)       | Long TTL per segment; manifest files shorter
User-specific content           | Bypass cache or cache with user-specific key
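One way to apply these strategies is a small function that picks a Cache-Control value per response. The sketch below is an assumption-laden example: the path patterns, the /api/ prefix, and the specific TTLs (5 seconds for manifests, one day for segments, 5 minutes for HTML) are illustrative choices, not recommendations from any CDN.

```python
import re

# Filenames with a content hash, e.g. app.3f9a1c2b.js (pattern is an assumption).
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(css|js|png|jpg|svg|woff2)$")

def cache_control_for(path: str, user_specific: bool = False) -> str:
    """Choose a Cache-Control header value based on the kind of content."""
    if user_specific:
        return "private, no-store"                    # bypass shared caches
    if HASHED_ASSET.search(path):
        return "public, max-age=31536000, immutable"  # 1 year; the hash busts the cache
    if path.endswith((".m3u8", ".mpd")):
        return "public, max-age=5"                    # manifests change often
    if path.endswith((".ts", ".m4s")):
        return "public, max-age=86400"                # segments do not change
    if path.startswith("/api/"):
        return "no-cache"                             # revalidate on every request
    return "public, max-age=300"                      # HTML and everything else

print(cache_control_for("/static/app.3f9a1c2b.js"))
# public, max-age=31536000, immutable
print(cache_control_for("/account", user_specific=True))
# private, no-store
```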

CDNs have transformed the web from a centralized, origin-server-dependent model into a distributed edge network. For any site with global users, a CDN is not a luxury — it is a fundamental requirement for competitive performance.