...

Caching hierarchies: opcode, page, browser & edge - using all levels effectively for optimal performance

Caching hierarchies deliver the fastest load times when I use each layer deliberately: opcode, page, browser and edge. In clear steps, I show how I combine these layers, avoid conflicts and configure them so that request paths become shorter and TTFB drops visibly.

Key points

To ensure a clear overview, I first summarize the core topics and align them directly with the performance targets. I explain every level with specific settings so that implementation succeeds without detours. I clearly delimit dynamic parts to preserve personalization. I optimize headers and cache keys so that nothing is wasted in the cache. Finally, I bring everything together in a coherent chain so that every request takes the fastest route.

  • Opcode cache accelerates PHP
  • Page cache shortens TTFB
  • Browser cache saves bandwidth
  • Edge cache reduces latency
  • Orchestration prevents conflicts

What does "caching hierarchies" actually mean?

By hierarchy I mean staggered caching from the server core to the end device. Each layer answers a different question: does the server have to recompile code, does PHP have to re-render the page, does the browser have to reload assets, or does an edge node deliver ready-made content close to the user? I avoid duplicated work by harmonizing the levels and assigning clear responsibilities. In this way, I reduce CPU load, backend queries and network latency without losing functionality. A brief introduction to the levels can be found in this compact guide: Caching levels simple.

Opcode caching: Accelerate PHP immediately

With opcode caching, I keep compiled PHP bytecode in RAM and save myself repeated parsing. This speeds up every request that touches PHP, especially CMS workloads like WordPress. I enable OPcache and size the memory generously enough that frequently used scripts are never evicted. I set moderate revalidation so that changes become visible promptly without checking too often. In this way, I noticeably reduce both CPU load and response times.

I deliberately set typical OPcache parameters in php.ini conservatively, monitor the hit rate and adjust as necessary. I keep the number of accelerated files high enough for the project to fit completely. I use preloading for central classes so that even cold starts run faster. I deploy changes with a cache reset to avoid risking inconsistent states. I use the configuration block as a starting point and not as a rigid dogma.

opcache.enable=1
opcache.memory_consumption=256
opcache.max_accelerated_files=20000
opcache.validate_timestamps=1
opcache.revalidate_freq=2
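
For the preloading mentioned above, a minimal sketch needs only two additional directives (PHP 7.4 or newer); the script path and user are project-specific placeholders:

; Optional preloading sketch - path and user are assumptions to adapt per project
opcache.preload=/var/www/project/preload.php
opcache.preload_user=www-data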

I regularly check the OPcache statistics, because only measurement shows whether the cache is holding up or thrashing. Hosting dashboards or PHP status pages help me to minimize the number of misses. I avoid memory values that are too small and lead to evictions. I also avoid validation that is too infrequent, so that production changes don't get stuck. With this balance, I work efficiently and safely.

Page caching: HTML without waiting time

With the page cache I store the finished HTML so that PHP and the database are not involved at all. This drastically reduces TTFB and brings the biggest gains under load. I consistently exclude personalized paths such as the shopping cart, checkout and user accounts. At the same time, I encapsulate small dynamic parts via AJAX or edge-side includes so that the rest can be served straight from the cache. This keeps the site fast without losing important individuality.

I decide whether to cache at server level or work with a plugin. On the server, I achieve the best latency; plugins give me flexible control in the CMS. Preload mechanisms prefill the cache so that initial requests don't wait. I clean up stale entries using purge rules when I update content. For particularly expensive areas, I also combine an object cache so that database accesses become less frequent.
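
As an illustration of server-level page caching, a minimal FastCGI cache sketch for NGINX with PHP-FPM could look like this; the zone name, paths, socket and the bypass cookies are assumptions for a WordPress/WooCommerce setup and need to be adapted:

# Minimal FastCGI page-cache sketch (zone name, paths and cookie names are placeholders)
fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=PAGECACHE:100m inactive=60m;
fastcgi_cache_key "$scheme$request_method$host$request_uri";

location ~ \.php$ {
  set $skip_cache 0;
  # bypass for POST requests, logged-in users and personalized paths
  if ($request_method = POST) { set $skip_cache 1; }
  if ($http_cookie ~* "wordpress_logged_in|woocommerce_cart_hash") { set $skip_cache 1; }
  if ($request_uri ~* "/cart|/checkout|/my-account") { set $skip_cache 1; }

  fastcgi_cache PAGECACHE;
  fastcgi_cache_valid 200 301 302 10m;
  fastcgi_cache_bypass $skip_cache;
  fastcgi_no_cache $skip_cache;
  include fastcgi_params;
  fastcgi_pass unix:/run/php/php-fpm.sock;
}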

Browser caching: Keep assets local

With browser caching, I keep static files such as images, CSS and JS in the local cache. Returning visitors then load almost nothing and the server stays free. I set long max-age values for immutable assets, which I combine with file-name versioning. I give dynamic endpoints short lifetimes or must-revalidate to keep the app up to date. This reduces bandwidth and improves perceived speed.

I pay attention to a clean mix of Cache-Control, ETag and Last-Modified. For unchangeable files, I set immutable so that the browser does not revalidate unnecessarily. For resources with frequent updates, I use conditional requests via ETag. I avoid ambiguous headers, because contradictory signals lead to inconsistent behavior. I keep control directly in the web server or via a CMS plugin, depending on my environment.
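
In outline, such a conditional revalidation looks like this; the path and ETag value are purely illustrative:

// Conditional request: the browser revalidates instead of re-downloading (values illustrative)
GET /assets/app.css HTTP/1.1
If-None-Match: "a1b2c3"

HTTP/1.1 304 Not Modified
ETag: "a1b2c3"
Cache-Control: public, max-age=600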

Edge caching: proximity to the user

Via edge networks, I deliver content from global PoPs, which minimizes latency and smooths out peaks. HTML, images and APIs can be served close to the user, depending on the rules. I work with cache keys that contain only the necessary variables so that fragmentation stays low. Rules such as stale-while-revalidate and stale-if-error ensure that users immediately see a valid copy, even if the origin is just warming up. International audiences benefit in particular because routing times drop noticeably.

I separate variants when mobile and desktop differ significantly. I deliberately leave the checkout and account areas out of the edge cache to avoid collisions with sessions and cookies. I regularly check the hit rate and adjust TTLs until the ratio is right. A practical deep dive is provided by this Edge Caching Guide with a focus on latency and network paths. I keep clean purge strategies to hand so that updates take effect immediately worldwide.
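
If I have to separate mobile and desktop, I prefer a normalized device class over the raw User-Agent; the header name below is an illustrative assumption and depends on what the edge or proxy actually sets:

// Variant separation via a normalized device class (X-Device-Class is an illustrative name set at the edge)
Vary: Accept-Encoding, X-Device-Class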

Set HTTP header correctly

Headers control how far content is allowed to travel and when it must be revalidated. I use Cache-Control to determine visibility, lifetime and revalidation obligations. ETag uniquely identifies a resource version and enables If-None-Match requests. Last-Modified provides a fallback for clients that ignore ETags. I keep the combination clear so that client, CDN and origin share the same expectations.

I use the following overview as a practical reference during configuration. I check each line against the resource type and the change behavior. For static files, I set long max-age values with immutable. For frequently updated content, I reduce the duration and rely on conditional requests. This keeps the data path efficient and correct.

Header | Function
Cache-Control | Controls duration, visibility and revalidation (e.g. max-age, public, must-revalidate)
ETag | Unique identifier of a version, basis for conditional requests
Last-Modified | Timestamp as an alternative to ETag, used for validation
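
Put together, a response for a versioned static asset might look like this; all values are illustrative:

// Example response headers for a versioned static asset (values illustrative)
HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable
ETag: "9f3c1"
Last-Modified: Tue, 01 Oct 2024 10:00:00 GMT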

Cache invalidation and freshness strategies

I plan invalidation as carefully as the caching itself. Selective purging by ID, tag or path avoids full flushes, which are costly. When deploying, I only purge what has really changed. Stale-while-revalidate keeps users fast while fresh copies are fetched in the background. Stale-if-error absorbs failures at the origin without degrading the user experience.

I combine short TTLs with a high hit rate when content rotates frequently. For archives, media and libraries, I choose long lifetimes, version file names and eliminate revalidation traffic. Dashboards on the CDN or server side show me where cache storage is too small. I then adjust the number of slots and the object sizes. This constant fine-tuning makes all the difference in everyday operation.
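
For path-based purging, a single request is often enough; whether an HTTP PURGE endpoint exists (as with Varnish, for example) depends on how the cache or CDN is configured:

# Illustrative purge of a single path (requires a cache/CDN that exposes an HTTP PURGE endpoint)
curl -X PURGE https://example.org/article/slug-123/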

Cache keys, cookies and Vary

With slim cache keys I keep the number of variants small. Only parameters that really change the result end up in the key. I use Vary headers deliberately, for example on Accept-Encoding or coarse User-Agent classes where necessary. Too many cookies in the key fragment the cache and reduce the hit rate. I clean up unused cookies and keep tracking parameters out of the key.

If I need to vary languages, currencies or layouts, I use specific keys such as lang=de or currency=EUR. I limit this variety to the cases that I really need. For A/B tests, I only separate the segments that have differences in content. I manage everything else on the client side or via edge logic without key explosion. This is how I keep the global cache efficient.
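
As a sketch in the VCL-like style used further below, the tracking cleanup might look like this; the parameter list is an assumption:

// VCL-like sketch: strip tracking parameters before the cache key is built (parameter list illustrative)
set req.url = regsuball(req.url, "(\?|&)(utm_[a-z_]+|fbclid|gclid)=[^&]*", "\1");
set req.url = regsub(req.url, "\?&", "?");
set req.url = regsub(req.url, "[?&]$", "");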

Object cache and transients

An object cache reduces expensive database queries by keeping results in memory. For WordPress, I choose Redis or Memcached to ensure fast access to frequently requested options, queries and sessions. I use transients to temporarily store expensive calculations. I clean up these values during deployment when dependencies change. This keeps the page dynamic and still fast.
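
A minimal transient sketch with the WordPress core functions get_transient and set_transient could look like this; the expensive query is a hypothetical stand-in for project-specific code:

// Transient sketch: cache an expensive result for ten minutes (helper function is hypothetical)
$related = get_transient('related_posts_123');
if (false === $related) {
    $related = my_expensive_related_query(123); // hypothetical project-specific query
    set_transient('related_posts_123', $related, 10 * MINUTE_IN_SECONDS);
}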

For projects with intensive data loads, this comparison helps me: Redis vs Memcached. It shows the typical strengths of both systems depending on the workload. I size the RAM and check eviction strategies so that rarely used objects make room. Monitoring the hit/miss rate shows whether the configuration is working. This level ideally complements the page cache.

Combination: The optimized chain

I combine the levels so that each request takes the shortest path. OPcache accelerates generation whenever HTML actually has to be created. The page cache delivers ready-made markup for anonymous visitors. Browser caching prevents repeated asset transfers, and the edge distributes content globally. At the very end comes a clean purge and versioning strategy so that updates take effect immediately.

I keep the following table handy as a cheat sheet when I tweak the settings. I read the "Configuration" column like a to-do list during implementation. I make sure that the levels complement each other and don't cancel each other out. This keeps the overall architecture clear and efficient. This overview prevents mistakes during planning.

Cache level | Advantage | Typical contents | Configuration
Opcode | Fast PHP execution | PHP bytecode | php.ini, server panel
Page | Low TTFB | Finished HTML | Server level or plugin
Browser | Local reuse | CSS, JS, images | HTTP headers, versioning
Edge | Global proximity | HTML and assets | CDN rules, keys, purge

Measurement: TTFB, LCP and hit rates

I measure TTFB to see how quickly the first byte arrives. LCP shows me whether the visible content appears on time. I use cache analytics to check hit rates and identify routes where misses accumulate. I correlate the metrics with deployments, crawler load and traffic peaks. Only numbers show where I need to tighten the screws.

I log response headers such as Age and CF-Cache-Status to make edge hits visible. Server logs tell me whether the page cache is working properly. If there are large deviations, I look for cookies, query parameters or Vary variants that split the cache. I test variants with and without a logged-in state. In this way, I quickly find the levers for stable speed.

Typical errors and fixes

Too many variants in the cache are a frequent brake on performance. I reduce query parameters in the key and neutralize tracking parameters. Another classic is contradictory headers, such as no-store together with a long max-age. Missing or incorrect purges can also give the impression that the cache is not working. I resolve such problems quickly with clear rules and logs.

Another issue is plugins that write dynamic content hard-coded into the HTML. I move such elements to fragmented endpoints that cache or reload independently. Cookies often block the edge cache unintentionally; I drop unnecessary cookies early on. Poor versioning forces browsers to reload again and again; I version files consistently. This keeps the pipeline clean and resilient.

Decision tree: Who responds to a request?

I define a clear decision path to determine which level is allowed to deliver. This avoids unnecessary origin hits and reduces TTFB reproducibly.

  • 1) Is the resource immutable (a versioned file)? → Browser cache with long max-age and immutable.
  • 2) Is the request anonymous, a GET and without sensitive cookies? → Edge/page cache with public, s-maxage and stale-while-revalidate.
  • 3) Does the request carry auth cookies or an Authorization header, or is it a POST? → Origin, optionally with the object cache.
  • 4) Does the URL contain only cosmetic parameters (utm, fbclid)? → I remove them from the cache key.
  • 5) Are small live parts needed (e.g. the cart count)? → Fragmented via AJAX or ESI.
// pseudo logic
if (immutable_asset) return browser_cache;
if (is_get && is_anonymous && cacheable) return edge_or_page_cache;
if (needs_fragment) return cached_html + dynamic_fragment;
return origin_with_object_cache;

Mastering fragmentation: ESI, AJAX and partial rendering

I isolate dynamic islands so that the rest can cache hard. ESI is suitable for server-side injections (e.g. personalized blocks), AJAX for client-side reload points. It is important that fragments receive their own, short TTLs so that they remain up-to-date without invalidating the entire document.

  • Static basic framework: long TTL, public, s-maxage, stale-while-revalidate.
  • Dynamic fragment: short TTL, must-revalidate or no-store, if personal.
  • Error case: stale-if-error on the HTML wrapper prevents white pages.
// Example header for HTML envelope
Cache-Control: public, max-age=0, s-maxage=600, stale-while-revalidate=60, stale-if-error=86400

// Example header for personal fragment
Cache-Control: private, no-store
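
For the ESI variant, a single tag in the cached wrapper is enough; the fragment path is illustrative and the proxy or CDN must support ESI:

<!-- Illustrative ESI include inside the cached HTML wrapper (fragment path is a placeholder) -->
<esi:include src="/fragments/cart-count" />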

Avoid cache stampede and control warm-up

I prevent herd effects in which many simultaneous misses flood the origin. Soft TTL/hard TTL, request coalescing and locking are my tools. I use preloaders that warm up sitemaps or important paths cyclically, and I stagger TTLs so that not everything expires at the same time.

  • Soft TTL: A worker may renew expired objects while other consumers still receive stale ones.
  • Coalescing: Simultaneous requests for the same key are merged.
  • Staggered TTLs: Critical pages receive staggered runtimes to smooth out purge waves.
// Example of graduated runtimes
/home, /category/* -> s-maxage=900
/article/* -> s-maxage=1800
/search -> s-maxage=120, stale-while-revalidate=30
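
A warm-up does not have to be complicated; a minimal sketch that requests every sitemap URL once could look like this, assuming a flat sitemap.xml and GNU grep:

# Minimal warm-up sketch: fetch every sitemap URL once (flat sitemap.xml and GNU grep assumed)
curl -s https://example.org/sitemap.xml \
  | grep -oP '(?<=<loc>)[^<]+' \
  | xargs -n1 -P4 curl -s -o /dev/null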

Align TTL design cleanly in the chain

I tune browser, edge and origin TTLs so that revalidation happens where it is most favorable. For HTML, I rely on s-maxage at the edge and keep max-age low in the browser to guarantee fast purges. For assets, I turn it around: very long browser TTL, because versioning ensures up-to-dateness.

// HTML
Cache-Control: public, max-age=0, s-maxage=600, stale-while-revalidate=60

// Versioned assets
Cache-Control: public, max-age=31536000, immutable

I avoid contradictory specifications such as no-cache together with immutable. Clear rules create consistent results throughout the hierarchy.

Compression, HTTP/2/3 and prioritization

I activate Gzip/Brotli and set the Vary header correctly so that variants are cleanly separated. With HTTP/2/3, I benefit from multiplexing and prioritization; this reduces head-of-line blocking when many assets are loaded in parallel.

# NGINX example
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;
brotli on;
brotli_types text/css application/javascript application/json image/svg+xml;
add_header Vary "Accept-Encoding" always;

# Long browser TTL for assets
location ~* \.(css|js|svg|woff2|jpg|png)$ {
  expires 1y;
  add_header Cache-Control "public, max-age=31536000, immutable";
}

Authentication, cookies and security

I never cache personal content publicly. I mark requests with authorization headers or session cookies as private or specifically bypass the edge cache. At the same time, I only whitelist essential cookies so that the cache key remains lean.

  • Login/account areas: Cache-Control: private or no-store.
  • Public HTML pages: public, s-maxage; avoid Set-Cookie.
  • Cookie hygiene: remove irrelevant cookies (e.g. tracking) from the key.
// VCL-like logic
if (req.http.Authorization) { return(pass); }
if (req.http.Cookie ~ "session=") { return(pass); }
// Only necessary cookies remain in the key; tracking cookies are stripped (sketch)
set req.http.Cookie = regsuball(req.http.Cookie, "(^|; )(_ga|_gid|utm_[^=;]+)=[^;]*", "");
if (req.http.Cookie ~ "^\s*$") { unset req.http.Cookie; }

Efficient caching of API and search endpoints

I make a strict distinction between methods: GET can be cached, POST usually cannot. For frequent search queries, I set short s-maxage values plus stale-while-revalidate to smooth out response times. I cache error responses (4xx/5xx) only briefly or not at all so that corrections take effect immediately.

// Example header for public GET API
Cache-Control: public, max-age=0, s-maxage=120, stale-while-revalidate=30

// Cache errors sparingly
Cache-Control: public, s-maxage=10

Observability: headers, logs and TTFB check

I use header inspection and logs to make the chain transparent. Age, hit/miss indicators and upstream status show me where time is being lost. I use simple tools to check TTFB reproducibly and find outliers.

# Measure TTFB
curl -o /dev/null -s -w "TTFB: %{time_starttransfer}s\n" https://example.org

# Check headers
curl -I https://example.org | sed -n '1,20p'
# NGINX log with cache status
log_format timed '$remote_addr "$request" $status $body_bytes_sent '
                 '$upstream_cache_status $upstream_response_time $request_time';
access_log /var/log/nginx/access.log timed;

I compare the log data with deployments and purges. High miss peaks directly after rollouts indicate a missing warmup or TTLs that are too short. If Age remains permanently low, I check whether cookies are unintentionally bypassing the edge cache.

Deployment: versioning and rolling purges

I build versions into file names (e.g. app.9f3c1.js) so that browser caching can be aggressive. For HTML, I use rolling purges that update critical pages first, followed by deeper pages and long-tail content. Blue/green deployments decouple build from release and give me time to warm up caches in a targeted way.

// Asset pipeline
style.[hash].css
app.[hash].js
// HTML always refers to new hashes

I plan purge windows outside peak times and monitor the hit rate immediately afterwards. This way I avoid load spikes at the origin.

Image variants, DPR and responsive caching

I generate image variants (size, format) deterministically so that the cache key remains stable. For WebP/AVIF variants, I separate explicitly via file path or parameters instead of just via Accept headers to avoid Vary explosions. For high-resolution displays (DPR), I use srcset/sizes, which allows the browser to select the best variant and the cache to take effect for each specific asset.

<img src="img/hero-1024.jpg"
     srcset="img/hero-768.jpg 768w, img/hero-1024.jpg 1024w, img/hero-1600.jpg 1600w"
     sizes="(max-width: 768px) 90vw, 1024px" alt="">

I keep the number of variants per motif small and clear outdated sizes from the pipeline so that the cache does not fragment.

Capacity planning: cache memory and object sizes

I size caches according to real access patterns: a few large objects (images, videos) require different strategies than many small ones (HTML, JSON). I set limits for maximum object size and check whether popular objects remain in memory. A high re-use rate is more important than absolute size; I therefore trim keys, merge variants and prevent duplicates.

// Example: Limits
max_object_size = 10m
default_ttl = 600
nuke_limit = moderate (evictions without stalls)

Practical checklist for implementation

I activate OPcache with sufficient memory and check the hit rate. I then set up page caching, exclude critical paths and preload important URLs. I then set browser headers with long times for unchangeable files and versioning. In the CDN, I define cache keys, TTLs and purge strategies and activate stale-while-revalidate. Finally, I use measurement tools to check whether TTFB, LCP and edge hit rate achieve the targets.

Brief summary

I use caching hierarchically: OPcache accelerates code, the page cache delivers HTML, browser headers keep assets local, and the edge brings content close to users. With clear keys, suitable TTLs and clever invalidation, I reduce server load, bandwidth and latency. Measured values secure progress and reveal optimization potential. This creates a reliable chain from the origin to the end device. Anyone looking for additional details on global delivery will find enough starting points in practice to make their own architecture noticeably faster.
