This one is filed under “that’s pretty picky, but I guess it couldn’t hurt.”
The entity tag (ETag) HTTP header is a string that uniquely identifies a specific version of a resource. When the browser first downloads a resource, it stores the ETag. When it requests the resource again, it sends the ETag along to the server in an If-None-Match header. If the server sees the same ETag, it will respond with a 304 Not Modified response, saving the download.
The problem is that the default format for the ETag (in Apache) is inode-size-timestamp. The inode will differ from server to server, meaning the server may compute a different ETag from the one the browser sent, even though it is in fact an identical file.
The end result is ETags generated by Apache and IIS for the exact same component won’t match from one server to another. If the ETags don’t match, the user doesn’t receive the small, fast 304 response that ETags were designed for; instead, they’ll get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn’t a problem. But if you have multiple servers hosting your web site, and you’re using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you’re consuming greater bandwidth, and proxies aren’t caching your content efficiently.
There is another scenario where it isn’t a problem: if you are using sticky sessions in your load balancer.
In any case, as stated above, it couldn’t hurt to rectify this. So I configured the ETag format in Apache to exclude the inode, and use only size and timestamp.
FileETag MTime Size
So files across servers have the same ETag.
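To make the mechanism concrete, here is a minimal PHP sketch of an ETag built only from size and modification time, plus the comparison a server makes against the value the browser sends back. The helper names `make_etag` and `is_not_modified` are made up for this example, and the hex format only loosely imitates Apache's; this is not Apache's actual implementation.

```php
<?php
// Hypothetical helper: build an ETag from size and mtime only (no inode),
// so two servers holding identical copies of a file produce the same tag.
function make_etag(int $mtime, int $size): string
{
    return sprintf('"%x-%x"', $size, $mtime);
}

// Hypothetical helper: true when the tag the browser sent in If-None-Match
// equals the current tag, i.e. a 304 Not Modified response is appropriate.
function is_not_modified(?string $ifNoneMatch, string $etag): bool
{
    return $ifNoneMatch !== null && trim($ifNoneMatch) === $etag;
}

// Demonstration with a temporary file standing in for a static asset.
$path = tempnam(sys_get_temp_dir(), 'etag');
file_put_contents($path, 'body { color: #333; }');
clearstatcache();

$etag = make_etag(filemtime($path), filesize($path));
$status = is_not_modified($etag, $etag) ? 304 : 200;
echo $status, "\n";
unlink($path);
```

Because the inode never enters the calculation, the same file copied to another server with the same size and timestamp yields the same tag.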
Editor’s note: This post formed the basis of the Front-End Optimization talk I’ve given in the past.
You’ve programmed websites for years and know the ins and outs of PHP and MySQL. Why are JavaScript and CSS files such a big deal? You put them in a directory, and link to them from your pages. Done. Right?
Not if you want maximum performance.
According to the Yahoo Exceptional Performance team:
…Only 10% of the time is spent here for the browser to request the HTML page, and for apache to stitch together the HTML and return the response back to the browser. The other 90% of the time is spent fetching other components in the page including images, scripts and stylesheets.
So static content is very important. The same Yahoo people provide us with a comprehensive list of Best (Front-end) Practices for Speeding Up Your Website. IMO, some of the rules are more important than others, and some are more easily achieved. Leaving aside hardware solutions (static server, CDN, etc.) for now, let’s look at six of the rules:
mod_deflate in Apache.
Rule 3 is a matter of configuring Apache. How to achieve the other five?
As I see it, there are three broad ways to achieve them.
<link rel="stylesheet" type="text/css" href="custom_handler.php?file1.css,file2.css" /> or something like that). It can also mean using mod_rewrite to direct incoming requests for CSS and JavaScript to go to a PHP script. Either way, there is processing on every page load. Caching the end-product helps. Still, there must be a better way.
Don’t have a build server? That’s a whole other topic.
In any system, the biggest bottlenecks will usually be related to I/O. What this means practically is two things:
But moving across the boundaries of memory, disk, and network is usually cumbersome. For example, storing things on disks is programmatically easy, but slow. Storing things in memory, in a persistent way, can be hard. This is more true for a shared-nothing architecture like PHP rather than Java, so you may have to deal with some shared memory libraries and SysV IPC-style calls.
Enter tmpfs, the Linux shared-memory file system. You can mount it just like ext3, create files, and otherwise treat it like a normal disk, but it’s in memory! Awesome!
On RHEL, Fedora, CentOS – not sure about others – there is a tmpfs drive mounted under /dev/shm by default. One other note: since it is memory, its contents will be lost upon reboot. I usually re-create any directories I need in the /etc/rc.d/rc.local script. Note, however, that this is the last file to run on boot, so if you have a service or daemon that assumes a folder in /dev/shm, you will need to create it in the service’s startup script (usually in /etc/init.d).
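As a rough illustration, the sketch below caches a string as a file under /dev/shm from PHP. The directory name `myapp_cache` and the helper names are made up for this example, and the code falls back to the system temp directory when /dev/shm is not available.

```php
<?php
// Pick a tmpfs-backed directory when available; otherwise fall back to the
// regular temp directory so the sketch still runs on other systems.
function cache_dir(): string
{
    $base = is_dir('/dev/shm') && is_writable('/dev/shm')
        ? '/dev/shm'
        : sys_get_temp_dir();
    $dir = $base . '/myapp_cache'; // hypothetical directory name
    if (!is_dir($dir)) {
        // tmpfs is emptied on reboot, so create the directory on demand.
        mkdir($dir, 0700, true);
    }
    return $dir;
}

// Store a value as an ordinary file; on tmpfs this never touches the disk.
function cache_put(string $key, string $value): void
{
    // LOCK_EX guards against two processes writing the same file at once.
    file_put_contents(cache_dir() . '/' . md5($key), $value, LOCK_EX);
}

// Return the cached value, or null when nothing is stored under the key.
function cache_get(string $key): ?string
{
    $file = cache_dir() . '/' . md5($key);
    if (!is_file($file)) {
        return null;
    }
    $data = file_get_contents($file);
    return $data === false ? null : $data;
}

cache_put('greeting', 'hello from memory');
echo cache_get('greeting'), "\n";
```

Creating the directory on demand is one way around the reboot problem for PHP code; a daemon that expects the folder to exist still needs it created in its startup script.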
When looking for something in an array of values, it is very tempting to use in_array(). After all, that’s what the name says. However, searching through an array, even with best-case search algorithms, will never be faster than a single index lookup, which is where isset() comes in. With isset(), you can use one operation to see if a value exists, provided those values exist as keys. I don’t know if it’s truly random access, but it’s pretty darn close.
So, instead of something like this:
    $exclude = array(1, 4, 6, 8);

    for ($i = 0, $size = count($data); $i < $size; $i++) {
        // in_array() scans $exclude from the start on every call
        if (in_array($data[$i]['id'], $exclude)) {
            // do something
        }
    }
do something like:
    // store the values as keys so each check is a single index lookup
    $exclude[1] = true;
    $exclude[4] = true;
    $exclude[6] = true;
    $exclude[8] = true;

    for ($i = 0, $size = count($data); $i < $size; $i++) {
        if (isset($exclude[$data[$i]['id']])) {
            // do something
        }
    }
So does this make a difference? Let’s write a little benchmark script.
    #!/usr/bin/php
    <?php

    $haystack = array();
    for ($i = 0; $i < 1000; $i++) {
        $haystack[] = rand(0, 1000);
    }

    $needles = array();
    for ($i = 0; $i < 1000; $i++) {
        $needles[] = rand(0, 1000);
    }

    for ($i = 0; $i < 1000; $i++) {
        foreach ($needles as $needle) {
            if (in_array($needle, $haystack));
        }
    }
We fill two arrays with 1000 random integers. One is the haystack – what we will search through. The other is the list of needles – we want to search for each one. For each needle, we look for it in the haystack. Then, we repeat this 1000 times.
Executing this, the script takes around 37 seconds:
    % time ./bench.php
    real    0m37.400s
    user    0m37.282s
    sys     0m0.068s
Now, let’s change the last for() loop to this:
    for ($i = 0; $i < 1000; $i++) {
        $tmp = array_flip($haystack);
        foreach ($needles as $needle) {
            if (isset($tmp[$needle]));
        }
    }
The new output:
    % time ./bench.php
    real    0m0.778s
    user    0m0.764s
    sys     0m0.008s
Execution time drops from around 37 seconds to under 0.8 seconds, even though the haystack is flipped again on every pass of the outer loop.
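Coming back to the exclude list from earlier: when the values already live in a list, the lookup table does not have to be written out by hand. Here is a small sketch, assuming the values are integers or strings (the only types PHP accepts as array keys). `filter_excluded` is a hypothetical helper name, and the sample rows are made up.

```php
<?php
// Build a key-based lookup table once, then test membership with isset().
function filter_excluded(array $rows, array $excludeIds): array
{
    // array_fill_keys(array(1, 4), true) gives array(1 => true, 4 => true).
    $lookup = array_fill_keys($excludeIds, true);

    $kept = array();
    foreach ($rows as $row) {
        // One index lookup per row instead of a scan through $excludeIds.
        if (!isset($lookup[$row['id']])) {
            $kept[] = $row;
        }
    }
    return $kept;
}

// Made-up sample rows for illustration.
$data = array(
    array('id' => 1, 'name' => 'a'),
    array('id' => 2, 'name' => 'b'),
    array('id' => 4, 'name' => 'c'),
);
$kept = filter_excluded($data, array(1, 4, 6, 8));
echo count($kept), "\n";
```

array_flip(), as used in the benchmark, works just as well here; array_fill_keys() simply makes it explicit that only the keys matter.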
A while ago, we were struggling with the question of whether or not to use a framework with some new code. Specifically, did we want to use Zend Framework or not? (The reasons for settling on ZF vs. others are the topic of a different post.) We had been using our own little framework for almost a year, and it had served us relatively well.
Reasons to use ZF:
Reasons not to use ZF:
My #1 concern was with performance. So I ran some tests using Apache Bench and Zend Framework 1.0. I won’t concern you with the details of the test because 1.0 is now a bit outdated, and the performance of Zend Framework is not really the point of this post. But I will say that ZF was much slower than what we were currently using.
Then I got to thinking about all the websites out there. Zend is a very nice framework – perhaps my #1 choice for frameworks at the time (though that changes frequently). And getting “Hello World” to work was easy and enjoyable.
But different websites have different audiences. If I am making a local storefront website, or even a chain website, the traffic patterns are going to be very different from those of a website aiming to be a national destination. If I were cranking out storefront websites every month, I think ZF would be the way to go. It’s quick, and I’m sure I would be doing similar things over and over again. ZF is very good for that. Same goes for a shrink-wrap application (by shrink-wrap, I mean something downloadable and installable). If I wanted to make the next great blogging platform, and thousands of people would come to my website just to download it and put it on their own hosts, again, ZF is very good for that. (In fact, I would make the application in ZF, and the website to download it from in ZF!)
But for my company, it’s different. If we are to become the national destination we want to be, our site will need every ounce of performance squeezed out of it. It will have a very custom environment, and require very custom features. That is why Zend Framework, and other frameworks, fail for us.
When we talk about performance, we mean the response time of a single request, web page, SQL query, etc. It is the actual execution time for something in the absence of load. To illustrate, suppose you wanted to test the performance of a web page using Apache Bench. You would run something like:
% ab -n 1000 -c 1 http://www.whatever.com/
The -n is the number of requests and the -c is the number of concurrent requests. Since we’re interested in end-to-end response time, we only need one concurrent request. Scalability is usually about throughput, or the number of requests served within a certain period of time under concurrent load. Using the example above, the Apache Bench command should be something like:
% ab -c 100 -t 60 http://www.whatever.com/
The -t is the amount of time to run the test. We can vary -c until individual response times begin to grow, at which point something in the system has reached its maximum capacity.