RMRM Full Stack & AI Engineer · All guides · Roadmaps
Web · guide

HTTP Caching Explained

HTTP caching is a mechanism that stores copies of HTTP responses so future requests can be served faster, reducing latency, bandwidth, and server load. Understanding it helps developers build faster, more efficient web applications.

What Is HTTP Caching?

HTTP caching allows browsers, proxies, and CDNs to store copies of responses (HTML, CSS, JS, images, API data) and reuse them without hitting the origin server every time. A cache may live in the browser itself or sit between the client and the origin (a proxy or CDN), and it intercepts matching requests. When a fresh cached copy exists, it is returned directly; this is called a cache hit. When no copy exists, the request travels to the origin server; this is a cache miss. When a copy exists but is stale, the cache can revalidate it with the origin before reusing it.
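The hit and miss behavior can be sketched with a toy in-memory cache. This is a minimal illustration, not how any particular browser or CDN is implemented: TinyCache, Entry, and the fetch_origin callable are hypothetical names, and the cache is keyed by URL only.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    body: bytes
    stored_at: float  # reading of the injected clock when the entry was stored
    max_age: float    # freshness lifetime in seconds


class TinyCache:
    """Toy cache keyed by URL only (an assumption; real caches also
    consider the request method and any headers named in Vary)."""

    def __init__(self, fetch_origin, clock=time.monotonic):
        # fetch_origin is a hypothetical callable: url -> (body, max_age)
        self.fetch_origin = fetch_origin
        self.clock = clock
        self.entries = {}

    def get(self, url):
        entry = self.entries.get(url)
        # Hit: a stored copy exists and is still within its freshness lifetime.
        if entry is not None and self.clock() - entry.stored_at < entry.max_age:
            return entry.body, "HIT"
        # Miss (or stale copy): go to the origin and store the new response.
        body, max_age = self.fetch_origin(url)
        self.entries[url] = Entry(body, self.clock(), max_age)
        return body, "MISS"
```

For simplicity, this sketch refetches a stale copy in full. A real cache would usually try a conditional request first.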

Why It Matters

Caching cuts page load times by serving assets from the browser's local memory or disk, or from a nearby edge node, rather than from a distant origin server. It lowers bandwidth costs and reduces server CPU load, which improves scalability. For end users, a well-cached site can feel nearly instant on repeat visits. For APIs, caching reduces database queries and downstream service calls.

Key HTTP Cache-Control Headers

The Cache-Control header is the primary tool for controlling caching behavior. When a response carries both, its max-age directive takes precedence over the older Expires header. Common directives include max-age=<seconds> (how long a response stays fresh), no-cache (the response may be stored but must be revalidated with the server before each reuse), no-store (never store the response), and public or private (whether shared caches like CDNs may store it). For example, Cache-Control: public, max-age=86400 lets any cache store the response and reuse it for up to one day (86,400 seconds). Combining directives, as in private, no-cache, is common for user-specific data that still benefits from conditional revalidation.
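To make the directives concrete, here is a minimal sketch of how a cache might interpret them. The function names are hypothetical. The parser splits naively on commas, so it would mishandle quoted arguments that contain commas, and the sketch ignores heuristic freshness and the Expires header.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(directives: dict, shared_cache: bool) -> bool:
    """Can this cache keep a copy of the response at all?"""
    if "no-store" in directives:
        return False
    # private responses are for a single user's cache, not a CDN or proxy.
    if shared_cache and "private" in directives:
        return False
    return True


def can_reuse_without_revalidation(directives: dict, age_seconds: float) -> bool:
    """Is a stored copy of this age still fresh enough to serve directly?"""
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        return False  # simplification: no explicit lifetime means not fresh
    return age_seconds < int(max_age)
```

With public, max-age=86400, a copy that is one hour old can be reused directly. With private, no-cache, a browser may store the copy but must revalidate it every time.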

Cache Validation: ETags and Last-Modified

When a cached response becomes stale, the browser can ask the server whether it has changed rather than downloading it again; this is revalidation. The server includes an ETag (an opaque identifier for a specific version of the content, often a hash) or a Last-Modified timestamp with the original response. On the next request, the browser sends If-None-Match: <etag> or If-Modified-Since: <date>. If the resource is unchanged, the server replies with 304 Not Modified and no body, saving bandwidth. ETags are generally preferred because Last-Modified has only one-second resolution and can miss changes made within the same second, while an ETag derived from the content changes exactly when the content does.
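The server side of revalidation can be sketched as follows. This is an illustration under stated assumptions, not a production handler: make_etag and respond are hypothetical names, the ETag is a truncated SHA-256 of the body, and request headers are assumed to arrive as a plain dict keyed exactly as "If-None-Match".

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the content (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",")]
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    normalized = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
    if "*" in normalized or etag in normalized:
        return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body
```

A client that stored the ETag from a 200 response sends it back in If-None-Match. As long as the body is unchanged, the function answers 304 with an empty body.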

Cache Busting

Long max-age values improve performance but mean users could receive stale assets after a deployment. The standard solution is cache busting: embedding a content hash or version string in the asset URL (e.g., main.a3f9c2.js). Because the URL changes whenever the content changes, caches treat it as a brand-new resource and fetch it the first time it is requested. This only works if the document that references the asset (typically the HTML page) is itself revalidated or cached briefly, so that clients discover the new URL. With that in place, you can safely set max-age=31536000 (one year) on static assets while still shipping updates on the next page load.
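A content-hashed filename like main.a3f9c2.js can be produced with a few lines. This is a minimal sketch: hashed_name is a hypothetical helper, and the choice of SHA-256 truncated to six hex characters is an assumption; build tools each use their own scheme.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 6) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    # "main.js" -> "main.<digest>.js"; any directory prefix is preserved.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Identical content always yields the same name, so unchanged assets stay cached across deployments and only modified files get new URLs.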

Key Gotcha: Vary Header and Shared Caches

The Vary header tells caches that a response may differ based on certain request headers, such as Vary: Accept-Encoding or Vary: Accept-Language. If it is omitted when the server returns different representations for the same URL, a shared cache might return a gzip-encoded response to a client that did not advertise gzip support, or the wrong language variant. Vary: * tells caches that no stored response can be matched to a later request, which effectively disables caching for that resource, so list only the headers that actually affect the response. Always audit Vary headers when debugging unexpected cache behavior in CDNs or reverse proxies.
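One way to picture Vary is as extra components in the cache key. The sketch below is a simplification under stated assumptions: cache_key is a hypothetical function, header values are compared as raw strings (real caches often normalize them), and a return value of None stands for "never reuse a stored response".

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str):
    """Build a cache key from the method, URL, and the request headers
    named in the response's Vary header."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        return None  # Vary: * never matches a later request
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    selected = tuple((name, lowered.get(name, "")) for name in sorted(names))
    return (method.upper(), url, selected)
```

Without Vary, a request that sends Accept-Encoding: gzip and one that sends no Accept-Encoding map to the same key, which is how the wrong representation gets served. With Vary: Accept-Encoding, the two requests get distinct keys.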

