Tag: optimization

HTTP Keep-Alive

Like most people, I did not know much about HTTP Keep-Alive headers other than that they could be very bad if used incorrectly. So I’ve kept them off, which is the default. But I ran across this blog post which explains the HTTP Keep-Alive, including its benefits and potential pitfalls pretty clearly.

It’s all pretty simple really. There is an overhead to opening and closing TCP connections. To alleviate this, Apache can agree to provide persistent connections by sending HTTP Keep-Alive headers. Then the browser can open a single connection to download multiple resources. But Apache won’t know when the browser is done downloading, so it simply keeps the connection open according to a Keep-Alive timeout, which is set to 15 seconds by default. The problem is the machine can only keep so many simultaneous requests open due to physical limitations (e.g. RAM, CPU, etc.) And 15 seconds is a long time.

To allow browsers to gain some parallelism on downloading files, without keeping persistent connections open too long, the Keep-Alive timeout value should be set to something very low, e.g. 2 seconds.

I’ve done this for static content only. Why only static content? It doesn’t really make much sense for the main page source itself since that’s the page the user wants to view.

I’ve mentioned before that by serving all static content on dedicated subdomains, we indirectly get the benefit of being able to optimize just those subdomains. So far, this meant:

  1. disabling .htaccess files
  2. setting a far-future Expires: header
  3. avoiding setting cookies on the subdomain

Now we can add to the list: enabling HTTP Keep-Alive headers. The VirtualHost block might look like this now:

<VirtualHost *:80>
    ServerName      static0.yourdomain.com
    ServerAlias     static1.yourdomain.com
    ServerAlias     static2.yourdomain.com
    ServerAlias     static3.yourdomain.com
    DocumentRoot    /var/www/vhosts/yourdomain.com
    KeepAlive On
    KeepAliveTimeout 2
    <Directory "/var/www/vhosts/yourdomain.com">
        AllowOverride None
        ExpiresActive On
        ExpiresByType text/css "access plus 1 year"
        ExpiresByType application/x-javascript "access plus 1 year"
        ExpiresByType image/jpeg "access plus 1 year"
        ExpiresByType image/gif "access plus 1 year"
        ExpiresByType image/png "access plus 1 year"
    </Directory>
</VirtualHost>

Serving Javascript and CSS

Editor’s note: This post formed the basis of the Front-End Optimization talk I’ve given in the past.

You’ve programmed websites for years, know the ins & outs of PHP, MySQL, why are Javascript and CSS files such a big deal? You put them in a directory, and link to them from your pages. Done. Right?

Not if you want maximum performance.

According to the Yahoo Exception Performance team:

…Only 10% of the time is spent here for the browser to request the HTML page, and for apache to stitch together the HTML and return the response back to the browser. The other 90% of the time is spent fetching other components in the page including images, scripts and stylesheets.

So static content is very important. The same Yahoo people provide us with a comprehensive list of Best (Front-end) Practices for Speeding Up Your Website.  IMO, some of the rules are more important than others, and some are more easily achieved.  Leaving aside hardware solutions (static server, CDN, etc.) for now, let’s look at six of the rules:

  1. Rule 1: Make Fewer HTTP Requests, or combine files. The less downloads the better. Simple file concatenation would do. Our goal is at most one Javascript and one CSS file per page.
  2. Rule 3: Add an Expires Header, or every static file must accompany a time-stamp so we can take advantage of the HTTP Expires: header. A time-stamp in the GET parameters might work, but some say that some CDN’s and browser/version/platform combinations will not request a new file if the query string changes. A better solution would be to put the time-stamp in the filename somewhere.
  3. Rule 4: Gzip Components. This is easily achieved by enabling mod_deflate in Apache.
  4. Rule 9: Reduce DNS Lookups. Okay, the real value in this rule is introducing parallel downloads by using at least two but no more than four host names. This is better explained here.
  5. Rule 10: Minify JavaScript, or at the least strip out all whitespace and comments. There are more sophisticated compressors out there that replace your actual variables with shorter symbols, but the chances of introducing bugs is higher.
  6. Rule 12: Remove Duplicate Scripts, which as they say is more common than you think.

Rule 3 is a matter of configuring Apache. How to achieve the other five?

As I see it, there are three broad ways to achieve them.

  1. Handle every request in real-time.  This means using a PHP file to serve the files (e.g. <link rel="stylesheet" type="text/css" href="custom_handler.php?file1.css,file2.css" /> or something like that).  It can also mean using mod_rewrite to direct incoming requests for CSS and Javascript to go to a PHP script. Either way, there is processing on every page load. Caching the end-product helps. Still, there must be a better way.
  2. Use a template or view plugin.  If you are using a templating system to dynamically generate your HTML, you can use some sort of plugin or function to read in a list of static files, check their last-modified times, and if changed build a combined, minified, time-stamped output file to serve up.  This is better than method #1 because by the time the page is built, there is a static file that is simply served to the browser.  Still, there must be a better way.
  3. The best way is to do it offline.  This means a job that checks static files to see if they’ve been modified.  If so, it processes the files and builds the output file that is directly served to the browser.  This job could be run in cron, or run manually by developers, but the best way is to make it a part of the build server.

Don’t have a build server?  That’s a whole other topic.

Slides from php|works 2008 in Atlanta, GA.  If the Sliderocket version below is problematic, here is the more popular Slideshare version.

Front End Website Optimization

View SlideShare presentation or Upload your own. (tags: yslow php)

Slides From php|works 2008

Slides from php|works 2008 in Atlanta, GA.