With HTTP Caching, you cache the full output of a page (i.e. the response) and bypass your application entirely on subsequent requests. Of course, caching entire responses isn’t always possible for highly dynamic sites, or is it? With Edge Side Includes (ESI), you can use the power of HTTP caching on only fragments of your site.
Caching with a Gateway Cache
When caching with HTTP, the cache is separated from your application entirely and sits between your application and the client making the request.
The job of the cache is to accept requests from the client and pass them on to your application. The cache also receives responses from your application and forwards them to the client. The cache is the “middle-man” of the request-response communication between the client and your application.
This type of cache is known as an HTTP gateway cache, and many exist, such as Varnish, Squid in reverse proxy mode, and the Symfony reverse proxy.
Gateway caches are sometimes referred to as reverse proxy caches, surrogate caches, or even HTTP accelerators.
Symfony Reverse Proxy
Symfony comes with a reverse proxy (i.e. a gateway cache) written in PHP.
Enabling the proxy is easy: each application comes with a caching kernel (AppCache) that wraps the default one (AppKernel). The caching kernel is the reverse proxy.
$kernel = new AppKernel('prod', false);
$kernel->loadClassCache();

// add (or uncomment) this new line!
// wrap the default AppKernel with the AppCache one
$kernel = new AppCache($kernel);
The caching kernel will immediately act as a reverse proxy: caching responses from your application and returning them to the client.
The AppCache object has a sensible default configuration, but it can be finely tuned via a set of options you can set by overriding the getOptions() method.
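As a rough sketch, such an override might look like the following. The option names are the ones documented for Symfony's HttpCache; the values are illustrative assumptions rather than recommendations, and the file location assumes the standard app/ layout.

```php
<?php
// app/AppCache.php (assumed location in a standard Symfony application)

use Symfony\Bundle\FrameworkBundle\HttpCache\HttpCache;

class AppCache extends HttpCache
{
    protected function getOptions()
    {
        return array(
            // add debugging information to the response headers
            'debug' => false,
            // lifetime in seconds used when a response carries no explicit freshness information
            'default_ttl' => 0,
            // request headers that make a response private unless it says otherwise
            'private_headers' => array('Authorization', 'Cookie'),
            // whether a client may force a reload with "Cache-Control: no-cache"
            'allow_reload' => false,
            // whether a client may force revalidation with "Cache-Control: max-age=0"
            'allow_revalidate' => false,
            // seconds a stale response may be served while it is revalidated in the background
            'stale_while_revalidate' => 2,
            // seconds a stale response may be served when the application returns an error
            'stale_if_error' => 60,
        );
    }
}
```

Options that are left out of the returned array should fall back to the defaults of the caching kernel.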
When you’re in debug mode (either because you’re booting a debug kernel, like in app_dev.php, or because you manually set the debug option to true), Symfony automatically adds an X-Symfony-Cache header to the response. Use this to get information about cache hits and misses.
Making your Responses HTTP Cacheable
Once you’ve added a reverse proxy cache (e.g. the Symfony reverse proxy or Varnish), you’re ready to cache your responses. To do that, you need to tell your cache which responses are cacheable and for how long. This is done by setting HTTP cache headers on the response.
HTTP specifies four response cache headers that you can set to enable caching:
- Cache-Control
- Expires
- ETag
- Last-Modified
These four headers are used to help cache your responses via two different models:
- Expiration Caching: used to cache your entire response for a specific amount of time (e.g. 24 hours). Simple, but cache invalidation is more difficult.
- Validation Caching: more complex; used to cache your response, but allows you to dynamically invalidate it as soon as your content changes.
Expiration Caching
The expiration model can be accomplished using one of two nearly identical HTTP headers: Expires or Cache-Control.
You can use both validation and expiration within the same Response. Expiration wins over validation: while a cached response is still fresh, the cache serves it without revalidating it.
The setExpires() method automatically converts the date to the GMT timezone, as required by the specification.
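To make the GMT requirement concrete, here is a minimal plain-PHP sketch of the kind of conversion involved. The helper name formatExpiresHeader() is hypothetical and is not part of Symfony; it only illustrates how a local date ends up as an HTTP date in GMT.

```php
<?php
// Hypothetical helper (not a Symfony API): formats any date as an
// HTTP date expressed in GMT, the form expected in an Expires header.
function formatExpiresHeader(DateTimeInterface $date): string
{
    // A DateTime built from "@<timestamp>" is always in UTC,
    // whatever timezone the original date was expressed in.
    $utc = new DateTime('@' . $date->getTimestamp());

    return $utc->format('D, d M Y H:i:s') . ' GMT';
}
```

For instance, noon on 1 January 2024 in the Europe/Paris timezone is formatted as "Mon, 01 Jan 2024 11:00:00 GMT".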
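For example, to let shared caches keep a response for one hour using the Cache-Control header:
// cache for 3600 seconds
$response->setSharedMaxAge(3600);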
Thanks to this new code, your HTTP response will have the following header:
Cache-Control: public, s-maxage=3600
This tells your HTTP reverse proxy to cache this response for 3600 seconds. If anyone requests this URL again within those 3600 seconds, your application won’t be hit at all.
This is fast and simple to use, but cache invalidation is not supported. If your content changes, you’ll need to wait until your cache expires for the page to update.
If you need to set cache headers for many different controller actions, check out FOSHttpCacheBundle. It provides a way to define cache headers based on the URL pattern and other request properties.
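The decision the reverse proxy makes can be sketched in a few lines of plain PHP. This is a simplified illustration, not the code of any real proxy: the function names sharedMaxAge() and isFresh() are hypothetical, and only the s-maxage directive is considered.

```php
<?php
// Hypothetical helper: extracts the s-maxage value (in seconds)
// from a Cache-Control header, or returns null when it is absent.
function sharedMaxAge(string $cacheControl): ?int
{
    if (preg_match('/(?:^|,)\s*s-maxage=(\d+)/i', $cacheControl, $matches)) {
        return (int) $matches[1];
    }

    return null;
}

// A stored response is fresh while its age is strictly below s-maxage.
// $storedAt and $now are Unix timestamps.
function isFresh(string $cacheControl, int $storedAt, int $now): bool
{
    $ttl = sharedMaxAge($cacheControl);

    return $ttl !== null && ($now - $storedAt) < $ttl;
}
```

While isFresh() returns true, the proxy can answer from its store; once it returns false, the request goes through to the application again.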
Finally, for more information about expiration caching, see HTTP Cache Expiration.
Validation Caching
With expiration caching, you simply say “cache for 3600 seconds!”. But when someone updates cached content, you won’t see that content on your site until the cache expires.
If you need to see updated content immediately, you either need to invalidate your cache or use the validation caching model.
Under this model, the cache continues to store responses. The difference is that, for each request, the cache asks the application if the cached response is still valid or if it needs to be regenerated. If the cache is still valid, your application should return a 304 status code and no content. This tells the cache that it is OK to return the cached response.
As with expiration, there are two different HTTP headers that can be used to implement the validation model: ETag and Last-Modified.
ETag Header
The ETag header is a string header that uniquely identifies one representation of the target resource. It’s entirely generated and set by your application so that you can tell, for example, if the /about resource that’s stored by the cache is up-to-date with what your application would return. An ETag is like a fingerprint and is used to quickly compare if two different versions of a resource are equivalent. Like fingerprints, each ETag must be unique across all representations of the same resource.
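The exchange can be sketched in plain PHP as follows. This is a simplified illustration under stated assumptions: respondWithETag() is a hypothetical function, the ETag is an MD5 hash of the content, and the If-None-Match value is assumed to hold a single ETag (real requests may list several, or use "*").

```php
<?php
// Hypothetical sketch: returns array(status, headers, body) for a request
// that carries the given If-None-Match value (null when the header is absent).
function respondWithETag(string $content, ?string $ifNoneMatch): array
{
    // Assumption: the fingerprint is an MD5 hash of the content, wrapped in quotes.
    $etag = '"' . md5($content) . '"';

    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        // The cached copy is still valid: 304 and no content.
        return array(304, array('ETag' => $etag), '');
    }

    // No cached copy, or the content has changed: send the full response.
    return array(200, array('ETag' => $etag), $content);
}
```

Note that hashing the content requires generating the whole response first, so this approach saves bandwidth but not application work.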
Last-Modified Header
The Last-Modified header is the second form of validation. The application decides whether or not the cached content is still valid based on whether or not the resource has been modified since the response was cached.
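A minimal plain-PHP sketch of that decision follows. The function respondWithLastModified() is hypothetical; it assumes the application knows the modification time of the resource as a Unix timestamp and that the cache sends back the date it received in an If-Modified-Since header.

```php
<?php
// Hypothetical sketch: returns array(status, headers, body).
// $lastModified is a Unix timestamp; $ifModifiedSince is the raw
// If-Modified-Since header value, or null when the header is absent.
function respondWithLastModified(string $content, int $lastModified, ?string $ifModifiedSince): array
{
    $headers = array(
        'Last-Modified' => gmdate('D, d M Y H:i:s', $lastModified) . ' GMT',
    );

    // strtotime() returns false when the date cannot be parsed.
    $since = $ifModifiedSince !== null ? strtotime($ifModifiedSince) : false;

    if ($since !== false && $lastModified <= $since) {
        // Not modified since the cache stored its copy: 304 and no content.
        return array(304, $headers, '');
    }

    return array(200, $headers, $content);
}
```

Because only timestamps are compared, the application can answer with a 304 before doing any expensive work to build the page.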
For details, see HTTP Cache Validation.
Safe Methods: Only caching GET or HEAD requests
HTTP caching only works for “safe” HTTP methods (like GET and HEAD). This means two things:
- Don’t try to cache PUT, POST or DELETE requests. It won’t work, and with good reason.
- You should never change the state of your application when responding to a GET or HEAD request.
Cache Invalidation
Cache invalidation is not part of the HTTP specification. Still, it can be really useful to delete various HTTP cache entries as soon as some content on your site is updated.
For details, see Cache Invalidation.
While cache invalidation is powerful, avoid it when possible. If you fail to invalidate something, outdated caches will be served for a potentially long time. Instead, use short cache lifetimes or use the validation model, and adjust your controllers to perform efficient validation checks as explained in Optimizing your Code with Validation.
Sometimes, you need the extra performance you can get from explicit invalidation. For invalidation, your application needs to detect when content changes and tell the cache to remove the URLs which contain that data from its cache.
If you want to use cache invalidation, have a look at the FOSHttpCacheBundle. This bundle provides services to help with various cache invalidation concepts and also documents the configuration for a couple of common caching proxies.
If one piece of content corresponds to one URL, the purge model works well. You send a request to the cache proxy with the HTTP method PURGE instead of GET (using the word “PURGE” is a convention; technically this can be any string) and make the cache proxy detect this and remove the data from the cache instead of going to the application to get a response.
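The purge model can be illustrated with a tiny in-memory stand-in for a gateway cache. The class TinyGatewayCache is hypothetical and deliberately simplified: it keys entries by URL only, ignores cache headers, and returns plain strings instead of HTTP responses.

```php
<?php
// Hypothetical in-memory stand-in for a gateway cache, keyed by URL.
final class TinyGatewayCache
{
    private $store = array();
    private $backend;

    // $backend is called with the URL whenever the application must be hit.
    public function __construct(callable $backend)
    {
        $this->backend = $backend;
    }

    public function handle(string $method, string $url): string
    {
        if ($method === 'PURGE') {
            // Remove the entry instead of forwarding the request to the application.
            $existed = isset($this->store[$url]);
            unset($this->store[$url]);

            return $existed ? 'purged' : 'not cached';
        }

        if (!isset($this->store[$url])) {
            // Cache miss: ask the application and remember its answer.
            $this->store[$url] = ($this->backend)($url);
        }

        return $this->store[$url];
    }
}
```

A real proxy must also restrict the PURGE method to trusted clients, otherwise anyone could empty the cache.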
Invalidation becomes very difficult when the same content appears under several URLs.
Varying the Response for HTTP Cache
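The Vary response header lists the request headers whose values lead to a different representation of the same URL: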
Vary: Accept-Encoding, User-Agent
The Response object offers a clean interface for managing the Vary header:
// set one vary header
$response->setVary('Accept-Encoding');
The setVary() method takes a header name or an array of header names for which the response varies.
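One way to picture the effect of Vary on a cache is as an extended cache key: the URL plus the values of the listed request headers. The sketch below is a plain-PHP illustration; varyCacheKey() is a hypothetical function and real caches may organize their storage differently.

```php
<?php
// Hypothetical sketch: builds a cache key from the URL and from the values
// of the request headers named in the Vary response header.
function varyCacheKey(string $url, string $varyHeader, array $requestHeaders): string
{
    // Header names are case-insensitive, so compare them in lower case.
    $normalized = array_change_key_case($requestHeaders, CASE_LOWER);

    $parts = array($url);
    foreach (explode(',', $varyHeader) as $name) {
        $name = strtolower(trim($name));
        if ($name === '') {
            continue;
        }
        // A missing request header counts as an empty value.
        $parts[] = $name . '=' . ($normalized[$name] ?? '');
    }

    return sha1(implode("\n", $parts));
}
```

Two requests for the same URL share a cached response only when every header listed in Vary has the same value; headers that are not listed do not affect the key.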
Using Edge Side Includes
When pages contain dynamic parts, you may not be able to cache entire pages, but only parts of them. Read Working with Edge Side Includes to find out how to configure different cache strategies for specific parts of your page.
Gateway caches are a great way to make your website perform better, but they have one limitation: they can only cache whole pages. If your pages contain dynamic sections, such as the user name or a shopping cart, you are out of luck.
Symfony provides a solution for these cases, based on a technology called ESI, or Edge Side Includes.
Akamai wrote this specification almost 10 years ago, and it allows specific parts of a page to have a different caching strategy than the main page.
The ESI specification describes tags you can embed in your pages to communicate with the gateway cache. Only one tag is implemented in Symfony, include, as this is the only useful one outside of the Akamai context.
<!DOCTYPE html>
<html>
    <body>
        <!-- ... some content -->

        <!-- Embed the content of another page here -->
        <esi:include src="http://..." />

        <!-- ... more content -->
    </body>
</html>
When a request is handled, the gateway cache fetches the entire page from its cache or requests it from the backend application. If the response contains one or more ESI tags, these are processed in the same way. In other words, the gateway cache either retrieves the included page fragment from its cache or requests the page fragment from the backend application again. When all the ESI tags have been resolved, the gateway cache merges each into the main page and sends the final content to the client. All of this happens transparently at the gateway cache level.
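The merge step can be sketched in plain PHP. This is an illustration only, not how any particular gateway cache is implemented: resolveEsiIncludes() is a hypothetical function, it recognizes only self-closing include tags with a double-quoted src attribute, and the $fetchFragment callback stands in for "look in the cache, otherwise ask the application".

```php
<?php
// Hypothetical sketch: replaces each <esi:include src="..." /> tag with the
// content returned by $fetchFragment for that src value.
function resolveEsiIncludes(string $html, callable $fetchFragment): string
{
    $result = preg_replace_callback(
        '/<esi:include\s+src="([^"]*)"\s*\/>/',
        function (array $matches) use ($fetchFragment) {
            // $matches[1] is the value of the src attribute.
            return $fetchFragment($matches[1]);
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}
```

Each fragment is fetched independently, which is what allows it to carry its own cache lifetime, separate from that of the surrounding page.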
By using the esi renderer (via the render_esi() Twig function), you tell Symfony that the action should be rendered as an ESI tag.
The variables you pass are available as arguments to your controller. The variables passed through render_esi() also become part of the cache key, so that you have unique caches for each combination of variables and values.
When using a controller reference, the ESI tag should reference the embedded action as an accessible URL so the gateway cache can fetch it independently of the rest of the page. Symfony takes care of generating a unique URL for any controller reference and is able to route them properly thanks to the FragmentListener, which must be enabled in your configuration.
# app/config/config.yml
framework:
    # ...
    fragments: { path: /_fragment }
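One great advantage of the ESI renderer is that you can make your application as dynamic as needed and, at the same time, hit the application as little as possible.
Once you start using ESI, remember to always use the s-maxage directive instead of max-age. The browser only ever receives the aggregated page and knows nothing about its fragments, so it would obey max-age and cache the entire page.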
Caching Pages that Contain CSRF Protected Forms
CSRF tokens are meant to be different for every user. This is why you need to be cautious if you try to cache pages with forms that include them.
Typically, each user is assigned a unique CSRF token, which is stored in the session for validation. This means that if you do cache a page with a form containing a CSRF token, you will cache the CSRF token of the first user only. As a result, all users except the first will fail CSRF validation when submitting the form.
In fact, many reverse proxies like Varnish will refuse to cache a page with a CSRF token. This is because a cookie is sent in order to keep the PHP session open, and Varnish’s default behavior is to not cache HTTP requests with cookies.
To cache a page that contains a CSRF token, you can use more advanced caching techniques like ESI fragments, where you cache the full page and embed the form inside an ESI tag with no cache at all.