Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Cache it! (PHP) - Part 1
#1
In the good old days when building web sites was as easy as knocking up a few pages, the delivery of a web page to a browser was a simple matter of having the web server fetch a file. A site's visitors would see its small, text-only pages almost immediately, unless they were using particularly slow modems. Once the page was downloaded, the browser would store it somewhere on your local computer so that, should the page be requested again, after performing a quick check with the server to ensure the page hadn't been updated, the browser could display the locally cached version. Pages were served as quickly and efficiently as possible, and everyone was happy.

Then dynamic web pages came along and spoiled the party by introducing two problems:

• When a request for a dynamic web page is received by the server, some intermediate processing must be completed, such as the execution of scripts by the engine. This processing introduces a delay before the web server begins to deliver the output to the browser. This may not be a significant delay where simple PHP scripts are concerned, but for a more complex application, the PHP engine may have a lot of work to do before the page is finally ready for delivery. This extra work results in a noticeable time lag between the user's requests and the actual display of pages noticeable time lag between the user's requests and the actual display of pages in the browser.

• A typical web server, such as, uses the time of file modification to inform a web browser of a requested page's age, allowing the browser to take appropriate action. With dynamic web pages, the actual PHP script may change only occasionally; meanwhile, the content it display s, which is often fetched from a database, will change frequently. The web server has no way of discerning updates to the database, so it doesn't send a last modified date. If the client (that is, the user's browser) has no indication of how long the data will remain valid, it will take a guess. This is problematic if the browser decides to use a locally cached version of the page which is now out of date, or if the browser decides to request from the server a fresh copy of the page, which actually has no new content, making the request redundant. The web server will always respond with a freshly constructed version of the page, regardless of whether or not the data in the database has actually changed.

To avoid the possibility of a web site visitor viewing out-of-date content, most web developers use a meta tag or HTTP headers to tell the browser never to use a cached version of the page. However, this negates the web browser's natural ability to cache web pages, and entails some serious disadvantages. For example, the content delivered by a dynamic page may only change once a day, so there's certainly a benefit to be gained by having the browser cache a page--even if only for 24 hours.

If you're working with a small PHP application, it's usually possible to live with both issues. But as your site increases in complexity --and attracts more traffic--you'll begin to run into performance problems. Both these issues can be solved, however: the first with server-side caching; the second, by taking control of [6] caching from within your application. The exact approach you use to solve these problems will depend on your application, but in this chapter, we'll consider both PHP and a number of class libraries from as possible panaceas for your web page woes.

How do I prevent web browsers from caching a page?

If timely information is crucial to your web site and you wish to prevent out-of-date content from ever being visible, you need to understand how to prevent web browsers--and proxy servers--from caching pages in the first place.

Answer

There are two possible approaches we could take to solving this problem: using HTML meta tags, and using HTTP headers.

Using HTML Meta Tags

The most basic approach to the prevention of page caching is one that utilizes HTML meta tags:

<meta http-equiv="expires" content="Mon, 26 Jul 1997 05:00:00 GMT"/>
<meta http-equiv="pragma" content="no-cache" />

The insertion of a date that's already passed into the Expires meta tag tells the browser that the cached copy of the page is always out of date. Upon encountering this tag, the browser usually won't cache the page. Although the Pragma: no-cache meta tag isn't guaranteed, it's a fairly well-supported convention that most web browsers follow. However, the two issues associated with this approach, which we'll discuss below, may prompt you to look at the alternative solution.

Using HTTPHeaders

A better approach is to use the HTTP protocol itself, with the help of PHP's header function, to produce the equivalent of the two HTML meta tags above:

PHP Code:
<?php
header
('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Pragma: no-cache');
?>

We can go one step further than this, using the Cache-Control header that's supported by HTTP 1.1 -capable browsers:

PHP Code:
<?php
header
('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Cache-Control: post-check=0, pre-check=0'FALSE);
header('Pragma: no-cache');
?>

Summary

Using the Expires meta tag sounds like a good approach, but two problems are associated with it:

Using the Expires meta tag sounds like a good approach, but two problems are associated with it:

• The browser first has to download the page in order to read the meta tags. If a tag wasn't present when the page was first requested by a browser, the browser will remain blissfully ignorant and keep its cached copy of the original.
• Proxy servers that cache web pages, such as those common to ISPs, generally won't read the HTML documents themselves. A web browser might know that it shouldn't cache the page, but the proxy server between the browser and the web server probably doesn't--it will continue to deliver the same out-of-date page to the client.

On the other hand, using the HTTP protocol to prevent page caching essentially guarantees that no web browser or intervening proxy server will cache the page, so visitors will alway s receive the latest content. In fact, the first header should accomplish this on its own; this is the best way to ensure a page is not cached. The Cache-Control and Pragma headers are added for some degree of insurance. Although they don't work on all browsers or proxies, the Cache-Control and Pragma headers will catch some cases in which the Expires header doesn't work as intended--if the client computer's date is set incorrectly, for example.

Of course, to disallow caching entirely introduces the problems we discussed at the start of this chapter: it negates the web browser's natural ability to cache pages, and can create unnecessary overhead, as new versions of pages are always requested, even though those pages may not have been updated since the browser's last request. We'll look at the solution to these issues in just a moment.

To be continued …….
Support
Webnetics UK Ltd.
Reply


Forum Jump:


Users browsing this thread: 2 Guest(s)