When we talk about performance, it is the response time of a single request, web page, SQL query, etc. It is the actual execution time for something in the absence of load. To illustrate, suppose you wanted to test the performance of a web page using Apache Bench. You should run something like:

% ab -n 1000 -c 1 http://www.whatever.com


The -n is the number of requests and the -c is the number of concurrent requests. Since we’re interested in end-to-end response time, we only need one concurrent request. Scalability is usually about throughput, or the number of concurrent requests within a certain period of time. Using the example above, the Apache Bench command should be something like:

% ab -c 100 -t 60 http://www.whatever.com


The -t is the amount of time to run the test. We can vary -c until individual response times begin to grow, at which point something in the system has reached its maximum capacity.