Wednesday, February 13, 2013

APC Quirks

I recently ran into some head-scratching behavior from PHP's APC extension that I thought would be good to document, mostly because it isn't actually buggy behavior, just unexpected until you've dug into the settings involved.


Time Travel

Here is a script that stores a user cache entry with a TTL of 9 seconds and takes a total of at least 10 seconds to execute.
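
A minimal sketch of such a script, assuming the APC extension is loaded (with apc.enable_cli=1 when run from the command line). The key 'foo' and value 'bar' match the output shown in this post; the 5-second sleep interval is an assumption inferred from the timestamps.

```php
<?php
// Sketch only: requires the APC extension (apc.enable_cli=1 for CLI runs).
$start = time();

// Store a user cache entry with a TTL of 9 seconds.
apc_store('foo', 'bar', 9);

// Fetch three times, 5 seconds apart (assumed interval), so the last
// fetch happens at least 10 seconds after the store.
for ($i = 0; $i < 3; $i++) {
    $value = apc_fetch('foo'); // false on a miss
    echo '@' . (time() - $start) . ': ' . ($value === false ? 'false' : $value) . "\n";
    if ($i < 2) {
        sleep(5);
    }
}
```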

With default settings, the output may surprise you:
@0: bar
@5: bar
@10: bar
So why didn't the last apc_fetch return false like it should? The answer is the apc.use_request_time configuration setting, which defaults to "1". Every time we call apc_fetch, it compares the entry's expiration time to its own notion of the current time and expires the entry if needed. However, as a default optimization, APC forgoes the system call to get the current time (what we do every time we call time()) and just uses the timestamp recorded at the start of the request. Note that the request time is also used for all apc_store operations in this mode.

For scripts that finish in under a second, this will never be a problem, and it's a great default optimization. However, if you want to use APC within a long-running script, you need to understand this setting and its consequences. Luckily, the setting is changeable at run time, so we can keep the optimization by default and correct the behavior when needed in longer-running scenarios.
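
As a rough illustration of the run-time change, the sketch below turns the setting off at the top of a long-running script before touching the cache. It assumes APC is loaded and that ini_set() is allowed to change apc.use_request_time in your build. With the real clock in use, the fetch after the sleep is expected to miss.

```php
<?php
// Sketch only: requires the APC extension (apc.enable_cli=1 for CLI runs).
// ini_set() returns the previous value on success and false on failure.
if (ini_set('apc.use_request_time', '0') === false) {
    echo "could not change apc.use_request_time (is APC loaded?)\n";
}

apc_store('foo', 'bar', 9); // TTL of 9 seconds, now measured against the real clock
sleep(10);                  // outlive the TTL

var_dump(apc_fetch('foo')); // compare with the default-settings run above
```

Setting it per script keeps the request-time optimization for normal short web requests.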

Losing Hot Items

As we just saw, apc_fetch will expire the requested entry if it's stale when you call it. What happens to stale entries that never get an apc_fetch call may be less expected: they sit idle in a memory slot until all allocated memory is in use. At that point, APC scans for expired entries to retire. This is where it gets extra interesting. If there still isn't enough space after the expired entries are removed, the default behavior is to clear the entire cache indiscriminately and start over. The following script shows the behavior. We create lots of 1K-sized entries until we exhaust resources and trigger a purge event.
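
A sketch of what such a script could look like, not the exact listing. It assumes the APC extension with apc.enable_cli=1, and that apc_cache_info('user') reports the `num_entries` and `mem_size` keys as in APC 3.1. The `key_` prefix, the 1024-entries-per-second rate, the 32-second cap, and the early exit on a miss are assumptions chosen to resemble the output shown in this post.

```php
<?php
// Sketch only: requires the APC extension (apc.enable_cli=1 for CLI runs).
ini_set('apc.use_request_time', '0');  // long-running script, so use the real clock

$start   = time();
$payload = str_repeat('x', 1024);      // roughly 1K per entry
apc_store('foo', 'bar', 300);          // the "hot" item

$n = 0;
for ($second = 0; $second < 32; $second++) {
    // Add a batch of unique entries that are never read again.
    for ($i = 0; $i < 1024; $i++) {
        apc_store('key_' . $n++, $payload, 300);
    }

    $hot  = apc_fetch('foo');              // keep 'foo' hot by reading it each pass
    $info = apc_cache_info('user', true);  // true = leave out the per-entry list
    printf("[@%d] entries: %d mem: %d\n", time() - $start, $info['num_entries'], $info['mem_size']);

    if ($hot === false) {
        break;                             // the hot item was lost in a purge
    }
    sleep(1);
}

echo 'foo:' . apc_fetch('foo') . "\n";     // prints an empty value after a miss
```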

Even though the 'foo' cache entry was accessed again and again, it gets wiped:
[@0] entries: 1025 mem: 2278032
[@1] entries: 2049 mem: 4555408
[@2] entries: 3073 mem: 6832784
[@3] entries: 4097 mem: 9110160
[@4] entries: 5121 mem: 11387536
[@5] entries: 6145 mem: 13664912
[@6] entries: 7169 mem: 15942288
[@7] entries: 8193 mem: 18219664
[@8] entries: 9217 mem: 20497040
[@9] entries: 10241 mem: 22774416
[@10] entries: 11265 mem: 25051792
[@11] entries: 12289 mem: 27329168
[@12] entries: 13313 mem: 29606544
[@13] entries: 14337 mem: 31883920
[@14] entries: 823 mem: 1830352
foo:

The above example is a bit contrived, but it's meant to introduce the concept of a site usage pattern where there are a small number of hot items and a large number of unique entries with much less frequent cache hits. They all have valid TTLs, but caching all of them over that period results in memory exhaustion. So how can we help preserve the hot items when we reach our memory limit? The answer is the apc.ttl and apc.user_ttl settings. These act as a secondary entry expiration mechanism. Before triggering a full purge, APC will use them to remove entries that haven't actually expired but have been idle (not accessed) for longer than the configured period.

In my specific output, it took 13 seconds for me to hit the memory exhaustion. All my entry TTLs were set at 300, so by default this triggered a full purge (there were no removable entries). However, if I set my apc.user_ttl = 5, the script can run for its full duration and not lose its hottest item 'foo'. This is because APC is removing all the entries that haven't been accessed in the last 5 seconds.
[@0] entries: 1025 mem: 2278032
[@1] entries: 2049 mem: 4555408
[@2] entries: 3073 mem: 6832784
[@3] entries: 4097 mem: 9110160
[@4] entries: 5121 mem: 11387536
[@5] entries: 6145 mem: 13664912
[@6] entries: 7096 mem: 15779936
[@7] entries: 7936 mem: 17648096
[@8] entries: 8599 mem: 19122608
[@9] entries: 9023 mem: 20065584
[@10] entries: 9364 mem: 20823968
[@11] entries: 9152 mem: 20352480
[@12] entries: 9240 mem: 20548192
[@13] entries: 9491 mem: 21106416
[@14] entries: 9437 mem: 20986320
[@15] entries: 9214 mem: 20490368
[@16] entries: 9271 mem: 20617136
[@17] entries: 9429 mem: 20968528
[@18] entries: 9340 mem: 20770592
[@19] entries: 9674 mem: 21513408
[@20] entries: 9475 mem: 21070832
[@21] entries: 9131 mem: 20305776
[@22] entries: 8794 mem: 19556288
[@23] entries: 8956 mem: 19916576
[@24] entries: 8980 mem: 19969952
[@25] entries: 8917 mem: 19829840
[@26] entries: 8975 mem: 19958832
[@27] entries: 9146 mem: 20339136
[@28] entries: 9276 mem: 20628256
[@29] entries: 9503 mem: 21133104
[@30] entries: 9354 mem: 20801728
[@31] entries: 9036 mem: 20094496
foo:bar
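
For reference, a small sketch that records the configuration assumed for the run above and prints what the running process actually sees. The php.ini lines in the comment are illustrative values, not a recommendation. As far as I know, apc.user_ttl and apc.ttl are system-level settings, so they belong in php.ini (or a `-d` flag on the command line) rather than in an ini_set() call.

```php
<?php
// Assumed php.ini entries for the second run (illustrative values only):
//   apc.enable_cli = 1
//   apc.user_ttl   = 5
//
// Collect the effective values; ini_get() returns false for a setting
// that is not registered (for example when APC is not loaded).
$settings = array();
foreach (array('apc.enable_cli', 'apc.user_ttl', 'apc.ttl', 'apc.shm_size') as $name) {
    $settings[$name] = ini_get($name);
}

foreach ($settings as $name => $value) {
    echo $name . ' = ' . var_export($value, true) . "\n";
}
```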
Note that a full purge can still be triggered if this setting doesn't free enough space for new entries. Additionally, I want to make it clear that I'm not actually recommending you set your apc.ttl or apc.user_ttl to 5 seconds in production. This was a demonstration of what the setting does. The value you should use will depend on your own requirements and cache hit rates. If your APC cache never purges, it's probably not worth messing with anyway.
