
PHP Garbage Collection: An Underestimated Factor in Web Hosting Performance

PHP garbage collection often determines whether a hosting stack runs smoothly under load or buckles during latency spikes. I'll show you how the collector eats into execution time, where it saves memory, and how I achieve measurably faster responses through targeted tuning.

Key points

In this overview I summarize the essentials in a few key statements so that you can immediately adjust the settings that really matter. I prioritize measurability because it lets me validate decisions clearly instead of groping around in the dark. I take hosting parameters into account because they strongly influence the effect of GC settings. I evaluate risks such as leaks and stalls because they determine stability and speed. I use current PHP versions because the improvements from PHP 8 onwards noticeably reduce the GC load.

  • Trade-off: Fewer GC runs save time; more RAM buffers objects.
  • FPM tuning: pm.max_children and pm.max_requests control process longevity and leaks.
  • OpCache: Fewer compiles reduce pressure on the allocator and the GC.
  • Sessions: Session GC via cron noticeably relieves requests.
  • Profiling: Blackfire, Tideways, and Xdebug show real hotspots.

How the garbage collector works in PHP

PHP uses reference counting for most variables and hands cycles to the garbage collector. I observe how the collector marks cyclic structures, checks roots, and frees memory. It does not run on every request, but based on triggers and internal heuristics. In PHP 8.5, optimizations reduce the number of potentially collectable objects, which means less frequent scanning. I use gc_status() to monitor runs, collected cycles, and the root buffer.
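
A minimal sketch of how I read these counters; gc_status() is available since PHP 7.3, and the field names below are the ones it returns:

    <?php
    // Snapshot of the collector's internal counters.
    $status = gc_status();

    printf(
        "GC runs: %d, collected cycles: %d, trigger threshold: %d, current roots: %d\n",
        $status['runs'],       // how often the collector has run
        $status['collected'],  // cycles freed so far
        $status['threshold'],  // root buffer level that triggers the next run
        $status['roots']       // current root buffer fill level
    );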

Understanding triggers and heuristics

In practice, collection starts when the internal root buffer exceeds a threshold, during request shutdown, or when I explicitly call gc_collect_cycles(). Long object chains with cyclic references fill the root buffer faster. This explains why certain workloads (ORM-heavy code, event dispatchers, closures with $this captures) show significantly more GC activity than simple scripts. Newer PHP versions reduce the number of candidates that end up in the root buffer, which noticeably lowers the frequency.
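
To illustrate the mechanism, a minimal cycle: after unset() the pair is unreachable, but reference counting alone cannot free it – only a cycle collection does:

    <?php
    class Node
    {
        public ?Node $peer = null;
    }

    $a = new Node();
    $b = new Node();
    $a->peer = $b;   // $a references $b ...
    $b->peer = $a;   // ... and $b references $a: a cycle.

    unset($a, $b);   // Refcounts stay above zero; without the cycle
                     // collector, this pair would leak.

    // Force a collection; a value > 0 confirms the cycle was reclaimed.
    printf("Collected: %d\n", gc_collect_cycles());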

Targeted control instead of blind deactivation

I do not disable collection across the board. However, in batch jobs or CLI workers it can pay off to disable GC temporarily with gc_disable(), run the job, and finish with gc_enable() plus gc_collect_cycles(). For FPM web requests, zend.enable_gc=1 remains my default setting – otherwise I risk hidden leaks with a growing RSS.
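
A minimal sketch of this batch pattern; loadJobs() and processJob() are placeholders for the actual workload:

    <?php
    // Hypothetical batch job: switch off cycle collection for the hot loop,
    // then clean up deliberately once at the end.
    gc_disable();

    foreach (loadJobs() as $job) {   // loadJobs() is a placeholder
        processJob($job);            // processJob() is a placeholder
    }

    gc_enable();
    $collected = gc_collect_cycles();   // one deliberate sweep at the end
    printf("Freed %d cycles after the batch\n", $collected);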

Performance impact under load

In projects, profiling regularly shows 10–21% of execution time spent in collection, depending on object graphs and workload. In individual workflows, temporary deactivation saved dozens of seconds while RAM consumption increased only moderately. I therefore always evaluate the trade-off: time versus memory. Frequent GC triggers cause stalls, which accumulate during high traffic. Properly dimensioned processes reduce such peaks and keep latencies stable.

Smooth tail latencies

I don't just measure the mean value, but p95–p99. This is exactly where GC stalls strike, because they coincide with peaks in the object graph (e.g., after cache misses or cold starts). Measures such as a larger opcache.interned_strings_buffer, less string duplication, and smaller batches reduce the number of objects per request – and thus the variance.

PHP Memory Management in Detail

References and cycles determine how memory flows and when the collector intervenes. I avoid global variables because they extend lifetimes and cause the graph to grow. Generators instead of large arrays reduce peak load and keep collections smaller. In addition, I check for memory fragmentation, because a fragmented heap weakens the effective use of RAM. Tight scopes and freeing large structures after use keep collection efficient.
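
A short sketch of the generator approach: records are yielded one at a time instead of being materialized in a single large array, which keeps the peak object count per request small. The file path is an example:

    <?php
    // Streams lines one by one instead of returning one big array.
    function readRecords(string $file): \Generator
    {
        $handle = fopen($file, 'rb');
        if ($handle === false) {
            throw new RuntimeException("Cannot open {$file}");
        }
        try {
            while (($line = fgets($handle)) !== false) {
                yield rtrim($line, "\n");   // one record at a time
            }
        } finally {
            fclose($handle);
        }
    }

    $count = 0;
    foreach (readRecords('/tmp/records.txt') as $record) {
        $count++;   // process each record without holding all of them in RAM
    }
    echo "Processed {$count} records\n";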

Typical sources of cycles

  • Closures that capture $this while the object in turn holds listeners.
  • Event dispatchers with long-lived listener lists.
  • ORMs with bidirectional relations and unit-of-work caches.
  • Global caches in PHP (singletons) that hold references and inflate scopes.

I deliberately break such cycles: looser coupling, lifecycle resets after batches, deliberate unset() on large structures. Where appropriate, I use WeakMap or WeakReference so that temporary object caches do not become a permanent load.
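
A brief sketch of the WeakMap idea (available since PHP 8.0): the cache entry lives exactly as long as its key object, so the cache never keeps the object alive on its own:

    <?php
    class Entity {}

    $cache = new WeakMap();

    $entity = new Entity();
    $cache[$entity] = ['expensive' => 'derived data'];   // cached per object

    echo count($cache), "\n";   // 1 while $entity is alive

    unset($entity);             // last strong reference gone ...
    echo count($cache), "\n";   // ... 0: the entry was released with it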

CLI workers and long-running processes

Cyclical cleanup becomes increasingly important for queues and daemons. I collect after N jobs (N between 50 and 500, depending on payload) via gc_collect_cycles() and monitor the RSS trend. If it keeps rising despite collection, I schedule an automatic restart of the worker once a threshold is exceeded. This mirrors the pm.max_requests logic of FPM in the CLI world.
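
Condensed into code, such a worker loop might look like this; fetchJob() and handle() are placeholders, and both thresholds are starting values that I calibrate against measurements:

    <?php
    const GC_EVERY_N_JOBS  = 200;               // between 50 and 500, per payload
    const RESTART_THRESHOLD = 256 * 1024 * 1024; // bytes; tune to the real footprint

    $jobs = 0;
    while ($job = fetchJob()) {                 // fetchJob() is a placeholder
        handle($job);                           // handle() is a placeholder

        if (++$jobs % GC_EVERY_N_JOBS === 0) {
            gc_collect_cycles();                // periodic sweep between jobs

            // memory_get_usage(true) approximates the RSS trend. If it keeps
            // growing despite collection, exit and let the supervisor
            // (systemd, supervisord, ...) restart the worker.
            if (memory_get_usage(true) > RESTART_THRESHOLD) {
                exit(0);
            }
        }
    }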

FPM and OpCache tuning that reduces GC load

PHP-FPM determines how many processes run in parallel and how long they live. I calculate pm.max_children roughly as (total RAM − 2 GB) / 50 MB per process and adjust it with real measurements. I use pm.max_requests to recycle processes regularly so that leaks don't stand a chance. OpCache reduces compile overhead and minimizes string duplication, which lowers the allocation volume and thus the pressure on collection. I work out the details in the OpCache configuration and monitor hit rates, restarts, and interned strings.

Process Manager: dynamic vs. on-demand

pm = dynamic keeps workers warm and cushions load peaks with minimal waiting time. pm = ondemand saves RAM during low-load phases but starts processes only when needed – the start time can be noticeable in p95. I select the model that matches the load curve and test how the change affects tail latencies.

Sample calculation and limits

As a starting point, (RAM − 2 GB) / 50 MB quickly yields high values: on a 16 GB host that would be roughly 280 workers. In reality, CPU cores, external dependencies, and the actual process footprint set the limit. I calibrate with measurement data (RSS per worker under peak payload, p95 latencies) and often end up with significantly lower values to avoid overloading CPU and IO.
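
Translated into a pool configuration, calibrated values for such a 16 GB host might look like this – example numbers after measurement, not universal defaults:

    ; /etc/php/8.3/fpm/pool.d/www.conf (path varies per distribution)
    pm = dynamic
    pm.max_children = 120        ; well below the naive 280 from the formula
    pm.start_servers = 30        ; roughly 25% of max_children
    pm.min_spare_servers = 20
    pm.max_spare_servers = 40
    pm.max_requests = 1000       ; recycle workers to neutralize slow leaks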

OpCache details with GC effect

  • interned_strings_buffer: Setting it higher reduces string duplication in userland and thus allocation pressure.
  • memory_consumption: Sufficient space prevents code eviction, reduces recompiles, and speeds up warm starts.
  • Preloading: Preloaded classes reduce autoload overhead and temporary structures – dimension them carefully (a configuration sketch follows after this list).
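
In php.ini, these three knobs might look like this; the values are hedged starting points that I validate against hit rate and restart metrics:

    ; php.ini – starting values, not universal defaults
    opcache.enable = 1
    opcache.memory_consumption = 192        ; MB; enough to avoid code eviction
    opcache.interned_strings_buffer = 32    ; MB; shared strings instead of duplicates
    opcache.max_accelerated_files = 20000   ; above the project's real file count
    ; opcache.preload = /var/www/preload.php  ; optional, dimension carefully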

Recommendations at a glance

This table bundles starting values, which I then fine-tune with benchmarks and profiler data. I adapt the figures to each project, as payloads vary greatly. The values provide a safe start without outliers. After rollout, I keep a load-test window open and react to the metrics. This keeps the GC load under control and response times short.

Context | Key | Starting value | Note
Process manager | pm.max_children | (RAM − 2 GB) / 50 MB | Weigh RAM against concurrency
Process manager | pm.start_servers | ≈ 25% of max_children | Warm start for peak phases
Process lifecycle | pm.max_requests | 500–5,000 | Recycling reduces leaks
Memory | memory_limit | 256–512 MB | Too small promotes stalls
OpCache | opcache.memory_consumption | 128–256 MB | High hit rate saves CPU
OpCache | opcache.interned_strings_buffer | 16–64 MB | Shared strings save RAM
GC | zend.enable_gc | 1 | Keep it on and measure; don't disable blindly

Controlling session garbage collection

Sessions have their own cleanup, which standard setups trigger probabilistically. I disable the probability via session.gc_probability=0 and invoke the cleaner via cron instead. This way, no user request blocks on deleting thousands of session files. I schedule the run every 15–30 minutes, depending on session.gc_maxlifetime. The key advantage is that web response times remain smooth while the cleanup happens out of band.
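
One possible wiring, assuming file-based sessions and a PHP CLI available to cron; session_gc() exists since PHP 7.1, and the paths are examples:

    <?php
    // session-gc.php – called from cron, e.g. every 20 minutes:
    // */20 * * * * php /usr/local/bin/session-gc.php
    // Assumes the same session settings as the web pool (example path below).
    ini_set('session.save_path', '/var/lib/php/sessions');
    session_start();            // session_gc() requires an active session
    $purged = session_gc();     // PHP 7.1+: returns the number of purged sessions
    session_write_close();
    echo "Purged {$purged} expired sessions\n";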

Session design and GC pressure

I keep sessions small and do not serialize large object trees into them. Externally stored sessions with low latency smooth out the request path because file accesses and cleanup runs no longer create a backlog in the web tier. It is important to match the lifetime (session.gc_maxlifetime) to usage behavior and to schedule cleanup runs in off-peak windows.

Profiling and monitoring: numbers instead of gut feelings

Profilers such as Blackfire or Tideways show whether collection really slows things down. I compare runs with GC active and with it temporarily disabled in an isolated job. Xdebug provides GC statistics, which I use for in-depth analysis. Important metrics are the number of runs, collected cycles, and time per cycle. Repeated benchmarks protect me against outliers and let me make reliable decisions.

Measurement Playbook

  1. Record a baseline without changes: p50/p95, RSS per worker, gc_status() values (see the logging sketch after this list).
  2. Change one variable (e.g., pm.max_requests or interned_strings_buffer), then measure again.
  3. Compare with identical data volume and warm-up, with at least 3 repetitions.
  4. Roll out in stages, monitor closely, and ensure rapid reversibility.
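
For step 1, a minimal per-request logging sketch that I would register in a front controller; the log path is an example:

    <?php
    // Capture GC and memory baselines once per request at shutdown.
    register_shutdown_function(static function (): void {
        $gc = gc_status();
        $line = sprintf(
            "%s gc_runs=%d collected=%d roots=%d peak_bytes=%d\n",
            date('c'),
            $gc['runs'],
            $gc['collected'],
            $gc['roots'],
            memory_get_peak_usage(true)
        );
        file_put_contents('/var/log/php-gc-baseline.log', $line, FILE_APPEND);
    });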

Limits, memory_limit, and RAM calculation

memory_limit sets the cap per process and indirectly influences the frequency of collections. I first plan the real footprint: baseline, peaks, plus OpCache and C extensions. Then I choose a cap with headroom for short-term load peaks, typically 256–512 MB. For details on how this works, please refer to the article on PHP memory_limit, which makes side effects transparent. A sensible limit prevents out-of-memory errors without unnecessarily increasing the GC load.

Container and NUMA influences

In containers, the cgroup limit counts, not the host RAM. I derive memory_limit and pm.max_children from the container limit and keep safety margins so that the OOM killer does not strike. On large hosts with NUMA, I make sure not to pack processes too tightly in order to keep memory access consistently fast.

Architecture tips for high traffic

I solve scaling in stages: first process parameters, then horizontal distribution. Read-heavy workloads benefit greatly from OpCache and short start-up times. For write paths, I offload expensive operations asynchronously so that the request stays light. Caching close to PHP reduces object volumes and thus the collector's checking effort. Good hosters with generous RAM and a clean FPM setup, such as webhoster.de, make this approach much easier.

Code and build aspects with GC impact

  • Optimize the Composer autoloader: fewer file accesses, smaller temporary arrays, more stable p95.
  • Keep the payload small: DTOs instead of huge arrays, streaming instead of bulk.
  • Strict scopes: function scope instead of file scope; release variables after use.

These seemingly minor issues reduce allocations and cycle sizes, which directly affects the collector's work.

Error patterns and anti-patterns

I recognize the symptoms by zigzag latencies, intermittent CPU spikes, and growing RSS per FPM worker. Common causes are large arrays as collection containers, global caches in PHP, and missing process restarts. Session cleanup in the request path also causes sluggish responses. I counter this with generators, smaller batches, and clear lifecycles. In addition, I check whether external services trigger retries that generate hidden floods of objects.

Practical checklist

  • Log gc_status() regularly: runs, time per run, root buffer utilization.
  • Choose pm.max_requests so that the RSS stays stable.
  • Set interned_strings_buffer high enough to avoid duplicates.
  • Cut batch sizes so that no massive peak object graphs arise.
  • Clean up sessions decoupled from requests, not within them.

Conclusion: What really matters

Bottom line: PHP garbage collection provides noticeable stability when I consciously control it instead of fighting it. I combine a lower collector frequency with sufficient RAM and use FPM recycling to eliminate leaks. OpCache and smaller data sets reduce pressure on the heap and help prevent stalls. I clean up sessions via cron so that requests can breathe freely. Metrics and profiling confirm the effectiveness and keep response times reliably low.
