<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>Straylight Run &#187; Performance</title> <atom:link href="http://blog.straylightrun.net/category/performance/feed/" rel="self" type="application/rss+xml" /><link>http://blog.straylightrun.net</link> <description>Software, Technology, PHP</description> <lastBuildDate>Mon, 07 Nov 2011 19:26:59 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>86% Of Writes Were For Statistics</title><link>http://blog.straylightrun.net/2010/09/30/86-of-writes-were-for-statistics/</link> <comments>http://blog.straylightrun.net/2010/09/30/86-of-writes-were-for-statistics/#comments</comments> <pubDate>Thu, 30 Sep 2010 20:19:00 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Coding]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[mysql]]></category> <category><![CDATA[mysqltuner]]></category> <category><![CDATA[query cache]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2010/09/30/86-of-writes-were-for-statistics/</guid> <description><![CDATA[View counts, click counts, hit counts, traffic statistics… Analytics and reporting are a must-have for web products.&#160; The easiest way to collect them is to simply increment a database value on each event.&#160; The problem comes when those counts arrive hundreds of times per second.&#160; Writes are the most expensive [...]]]></description> <content:encoded><![CDATA[<p>View counts, click counts, hit counts, traffic statistics… Analytics and reporting are a must-have for web products.&#160; The easiest way to collect them is to simply increment a database value on each event.&#160; The problem comes when those counts arrive hundreds of times per second.&#160; Writes are the most expensive queries:</p><ol><li>Writes usually trigger an index update.</li><li>If you’re using the MyISAM storage engine, table-level locking can get out of hand.&#160;</li><li>Writes are not query-cacheable, and each one invalidates the cached results for the table it touches.</li></ol><p>After <a
href="http://blog.straylightrun.net/2010/09/29/mysql-slow-query-log-is-your-friend/">observing subpar write behavior</a>, I wanted to know just how many of our total writes were for updating statistics.</p><p>First, I ran <code><a
href="http://blog.mysqltuner.com/">mysqltuner</a></code>.</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% mysqltuner
...
[**] Reads / Writes: 93% / 7%
...
%</pre></div></div><p>So 7% of all queries were writes.&#160; That wasn’t bad.&#160; Then, I took the binary log of all DML statements for yesterday, starting at midnight.&#160; I figured 24 hours was a good sample.</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% mysqlbinlog --start-datetime='2010-06-06 00:00:00' binary-log.000152 &gt; cow</pre></div></div><p>I grepped out the INSERT and UPDATE lines to get rid of the binary log metadata.</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% grep -i '^insert' cow &gt; cow2
% grep -i '^update' cow &gt;&gt; cow2</pre></div></div><p>I counted up lines that wrote to our stat tables.</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% wc -l cow2
24898 cow2
% grep -i -c 'stat_' cow2
20880</pre></div></div><p>Doing the math: <code>20880 / 24898 = 0.86</code>. About 86% of all writes to our database were for statistics.&#160; Which wasn’t too surprising.&#160; Most web sites must store and log a lot of data to know where to improve and how users are using the site.</p><p>So what do we do?</p><p>That’s the subject of another post, but the short answer is that these writes can be batched somehow.&#160; Whether the queries are batched with some sort of write-through cache, or <a
href="http://activemq.apache.org/">job queues</a>, the database won’t suffer from constant write queries.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2010/09/30/86-of-writes-were-for-statistics/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>MySQL Slow Query Log Is Your Friend</title><link>http://blog.straylightrun.net/2010/09/29/mysql-slow-query-log-is-your-friend/</link> <comments>http://blog.straylightrun.net/2010/09/29/mysql-slow-query-log-is-your-friend/#comments</comments> <pubDate>Wed, 29 Sep 2010 20:07:00 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Coding]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[mysql]]></category> <category><![CDATA[query cache]]></category> <category><![CDATA[query log]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2010/09/29/mysql-slow-query-log-is-your-friend/</guid> <description><![CDATA[The MySQL Slow Query Log is a required tool in the database administrator’s toolbox.&#160; It’s great for troubleshooting specific issues, but it’s also great for some rainy day application tuning. My slow query log is in /var/lib/mysqld/db-001-slow.log and records any queries that take longer than 10 seconds (the default value for long_query_time). I can get [...]]]></description> <content:encoded><![CDATA[<p>The <a
href="http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html">MySQL Slow Query Log</a> is a required tool in the database administrator’s toolbox.&#160; It’s great for troubleshooting specific issues, but it’s also great for some rainy day application tuning.</p><p>My slow query log is in <code>/var/lib/mysqld/db-001-slow.log</code> and records any queries that take longer than 10 seconds (the default value for <code>long_query_time</code>). I can get information out of this log using <code>mysqldumpslow</code>.</p><p>Running <code>mysqldumpslow db-001-slow.log</code> prints out slow queries sorted by descending average execution time. But that’s not useful to me, because any query can get delayed by a blip in the system.</p><p>I like running <code>mysqldumpslow -s c db-001-slow.log</code>, which prints out the slow queries sorted by descending count of times each query occurred. Optimizing a query that takes 10 seconds to execute but occurs a dozen times every minute will be more beneficial than optimizing a query that takes 140 seconds to execute but rarely occurs.</p><p>The first time I tried this exercise, I uncovered the following 3 types of slow queries (can’t remember the exact order now):</p><ol><li>Queries with lots of logic and joins returning infrequently-changing data.</li><li>Queries using the <code>curdate()</code> function, <a
href="http://blog.straylightrun.net/2009/09/23/tip-of-the-day-avoid-mysql-functions/">which are not query cacheable</a>.</li><li>Queries to insert/update a stats table for content view counts.</li></ol><p>For #1, I used an <a
href="http://memcached.org/">in-memory cache</a> to cache the query results.&#160; For #2, I replaced the <code>curdate()</code> function with the PHP <code>date()</code> function everywhere I could find it.&#160; For #3, I noticed an extraneous index on the stats table; since every index slows down inserts and updates, I removed it.&#160; For more on handling these types of queries, see my <a
href="http://blog.straylightrun.net/2010/09/30/86-of-writes-were-for-statistics/">next post</a>.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2010/09/29/mysql-slow-query-log-is-your-friend/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Validation Of Object Model</title><link>http://blog.straylightrun.net/2010/09/28/validation-of-object-model/</link> <comments>http://blog.straylightrun.net/2010/09/28/validation-of-object-model/#comments</comments> <pubDate>Tue, 28 Sep 2010 19:31:50 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Coding]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[business objects]]></category> <category><![CDATA[caching]]></category> <category><![CDATA[frameworks]]></category> <category><![CDATA[memcache]]></category> <category><![CDATA[memcached]]></category> <category><![CDATA[object model]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2010/09/28/validation-of-object-model/</guid> <description><![CDATA[I was reading up on memcached, when I came across some validation of the Pox Framework object model.&#160; In the FAQ, a general design approach to storing lists of data is described. Storing lists of data into memcached can mean either storing a single item with a serialized array, or trying to manipulate a huge [...]]]></description> <content:encoded><![CDATA[<p>I was reading up on <a
href="http://memcached.org/">memcached</a>, when I came across some validation of the <a
href="http://www.straylightrun.net/pox-php:object_model">Pox Framework object model</a>.&#160; In the FAQ, a general design approach to <a
href="http://code.google.com/p/memcached/wiki/NewProgrammingTricks#Storing_sets_or_lists">storing lists of data is described</a>.</p><blockquote><p>Storing lists of data into memcached can mean either storing a single item with a serialized array, or trying to manipulate a huge “collection” of data by adding, removing items without operating on the whole set. Both should be possible.</p><p>One thing to keep in mind is memcached’s 1 megabyte limit on item size, so storing the whole collection (ids, data) into memcached might not be the best idea.</p><p>Steven Grimm explains a better approach on the mailing list: <a
href="http://lists.danga.com/pipermail/memcached/2007-July/004578.html">http://lists.danga.com/pipermail/memcached/2007-July/004578.html</a></p></blockquote><p>Following the link gives this quote:</p><blockquote><p>A better way to deal with this kind of thing is with a two-phase fetch. So instead of directly caching an array of event data, <strong>instead cache an array of event IDs</strong>. Query that list, then use it construct a list of the keys of individual event objects you want to fetch, then multi-get that list of keys.</p><p>…Another advantage of a scheme like this is that you can update an item’s data without having to read then write every list that contains that item. Just update it by ID (like you’d do in your database queries) and all the lists that contain it will magically get the correct information.</p></blockquote><p>That always feels nice.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2010/09/28/validation-of-object-model/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>HTTP Keep-Alive</title><link>http://blog.straylightrun.net/2010/09/10/http-keep-alive/</link> <comments>http://blog.straylightrun.net/2010/09/10/http-keep-alive/#comments</comments> <pubDate>Fri, 10 Sep 2010 21:13:51 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Performance]]></category> <category><![CDATA[Sysadmin]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[config]]></category> <category><![CDATA[headers]]></category> <category><![CDATA[http]]></category> <category><![CDATA[keep alive]]></category> <category><![CDATA[optimization]]></category> <category><![CDATA[subdomains]]></category> <category><![CDATA[virtual host]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2010/09/10/http-keep-alive/</guid> <description><![CDATA[Like most people, I did not know much about HTTP Keep-Alive headers other than that they could be very bad if used incorrectly. So I’ve kept them off, which is the default. But I ran across this blog post which explains the HTTP Keep-Alive, including its benefits and potential pitfalls pretty clearly. It’s all pretty [...]]]></description> <content:encoded><![CDATA[<p>Like most people, I did not know much about <a
href="http://httpd.apache.org/docs/2.2/mod/core.html#keepalive">HTTP Keep-Alive headers</a> other than that they could be <em>very </em>bad if used incorrectly. So I’ve kept them off, which is the default. But I ran across this blog post which <a
href="http://virtualthreads.blogspot.com/2006/01/tuning-apache-part-1.html">explains the HTTP Keep-Alive</a>, including its benefits and potential pitfalls pretty clearly.</p><p>It’s all pretty simple really. There is an overhead to opening and closing TCP connections. To alleviate this, Apache can agree to provide persistent connections by sending HTTP Keep-Alive headers. Then the browser can open a single connection to download multiple resources. But Apache won’t know when the browser is done downloading, so it simply keeps the connection open according to a Keep-Alive timeout, which is set to 15 seconds by default. The problem is the machine can only keep so many simultaneous connections open due to physical limitations (e.g. RAM, CPU, etc.). And 15 seconds is a <em>long</em> time.</p><p>To let browsers reuse a connection for several files, without keeping persistent connections open too long, the Keep-Alive timeout value should be set to something very low, e.g. 2 seconds.</p><p>I’ve done this for <em>static content only</em>. Why only static content? It doesn’t really make much sense for the main page source itself since that’s the page the user wants to view.</p><p>I’ve <a
href="http://blog.straylightrun.net/2008/11/16/slides-from-phpworks-2008-part-2/">mentioned before</a> that by serving all static content on dedicated subdomains, we indirectly get the benefit of being able to optimize just those subdomains. So far, this meant:</p><ol><li>disabling <code>.htaccess </code>files</li><li>setting a far-future Expires: header</li><li>avoiding setting cookies on the subdomain</li></ol><p>Now we can add to the list: enabling HTTP Keep-Alive headers. The <code>VirtualHost </code>block might look like this now:</p><div
class="wp_syntax"><div
class="code"><pre class="php" style="font-family:monospace;"><span style="color: #339933;">&lt;</span>VirtualHost <span style="color: #339933;">*:</span><span style="color: #cc66cc;">80</span><span style="color: #339933;">&gt;</span>
    ServerName      static0<span style="color: #339933;">.</span>yourdomain<span style="color: #339933;">.</span>com
    ServerAlias     static1<span style="color: #339933;">.</span>yourdomain<span style="color: #339933;">.</span>com
    ServerAlias     static2<span style="color: #339933;">.</span>yourdomain<span style="color: #339933;">.</span>com
    ServerAlias     static3<span style="color: #339933;">.</span>yourdomain<span style="color: #339933;">.</span>com
    DocumentRoot    <span style="color: #339933;">/</span><span style="color: #000000; font-weight: bold;">var</span><span style="color: #339933;">/</span>www<span style="color: #339933;">/</span>vhosts<span style="color: #339933;">/</span>yourdomain<span style="color: #339933;">.</span>com
    KeepAlive On
    KeepAliveTimeout <span style="color: #cc66cc;">2</span>
    <span style="color: #339933;">&lt;</span>Directory <span style="color: #0000ff;">&quot;/var/www/vhosts/yourdomain.com&quot;</span><span style="color: #339933;">&gt;</span>
        AllowOverride None
        ExpiresActive On
        ExpiresByType text<span style="color: #339933;">/</span>css <span style="color: #0000ff;">&quot;access plus 1 year&quot;</span>
        ExpiresByType application<span style="color: #339933;">/</span>x<span style="color: #339933;">-</span>javascript <span style="color: #0000ff;">&quot;access plus 1 year&quot;</span>
        ExpiresByType image<span style="color: #339933;">/</span>jpeg <span style="color: #0000ff;">&quot;access plus 1 year&quot;</span>
        ExpiresByType image<span style="color: #339933;">/</span>gif <span style="color: #0000ff;">&quot;access plus 1 year&quot;</span>
        ExpiresByType image<span style="color: #339933;">/</span>png <span style="color: #0000ff;">&quot;access plus 1 year&quot;</span>
    <span style="color: #339933;">&lt;/</span>Directory<span style="color: #339933;">&gt;</span>
<span style="color: #339933;">&lt;/</span>VirtualHost<span style="color: #339933;">&gt;</span></pre></div></div> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2010/09/10/http-keep-alive/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Clearing The Linux Buffer Cache</title><link>http://blog.straylightrun.net/2009/12/03/clearing-the-linux-buffer-cache/</link> <comments>http://blog.straylightrun.net/2009/12/03/clearing-the-linux-buffer-cache/#comments</comments> <pubDate>Thu, 03 Dec 2009 19:22:45 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Performance]]></category> <category><![CDATA[Sysadmin]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[buffer cache]]></category> <category><![CDATA[disk]]></category> <category><![CDATA[logs]]></category> <category><![CDATA[mysql]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2009/12/03/clearing-the-linux-buffer-cache/</guid> <description><![CDATA[According to these Munin memory graphs, the large orange area is the OS buffer cache – a buffer the OS uses to cache plain ol’ file data on disk.&#160; The graph below shows one of our web servers after we upgraded its memory.&#160; It makes sense that most of the memory not used by apps [...]]]></description> <content:encoded><![CDATA[<p>According to these <a
href="http://munin.projects.linpro.no/">Munin</a> memory graphs, the large orange area is the OS buffer cache – a buffer the OS uses to cache plain ol’ file data on disk.&#160; The graph below shows one of our web servers after we upgraded its memory.&#160;</p><p><img
style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="Web server memory usage" border="0" alt="Web server memory usage" src="http://blog.straylightrun.net/wp-content/uploads/2009/12/zsweb001memorymonth.png" width="495" height="408" /></p><p>It makes sense that most of the memory not used by apps would be used by the OS to improve disk access.&#160; So seeing the memory graphs filled with orange is generally a good thing.&#160; After a few days, I watched the orange area grow and thought, “Great!&#160; Linux is putting all that extra memory to use.”&#160; I figured that maybe it was caching images and CSS files for Apache to serve.&#160; But was that true?</p><p><strong><u>Looking At A Different Server</u></strong></p><p>Here is a memory graph from one of our database servers after the RAM upgrade.</p><p><img
style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="Database server memory usage" border="0" alt="Database server memory usage" src="http://blog.straylightrun.net/wp-content/uploads/2009/12/zsdb001memorymonth.png" width="495" height="408" /></p><p>Again, I first thought that the OS was caching all that juicy database data from disk.&#160; The problem is that we don’t have 12GB of data, and that step pattern growth was suspiciously consistent.</p><p>Looking again at the web server graph, I saw giant downward spikes of blue color, where the buffer cache was emptied.&#160; (The blue is unused memory.)&#160; These occurred every day at 4 am, and on Sundays there’s a huge one.&#160; What happens every day at 4 am?&#160; The logs are rotated.&#160; And on Sundays, the granddaddy log of them all – the Apache log – is rotated.</p><p><strong><u>The Problem</u></strong></p><p>It was starting to make sense.&#160; Log files seem to take up most of the OS buffer cache on the web servers.&#160; Not optimal, I’m sure.&#160; And when they’re rotated, the data in the cache is invalidated and thus freed.</p><p>Here is a memory graph for one of our other database servers.</p><p> <img
style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="Database server memory usage" border="0" alt="Database server memory usage" src="http://blog.straylightrun.net/wp-content/uploads/2009/12/zsdb002memorymonth.png" width="495" height="408" /></p><p>That step pattern growth is missing!&#160; In fact, most of RAM is unused.&#160; What is the difference between the first database server and this one?&#160; The first runs the <code>mysqldump</code> backup.&#160; It runs every night at 2:30 am, right when those step changes appear on its memory usage graph.</p><p>It was clear to me that most of the OS buffer cache was wasted on logs and backups and such.&#160; There had to be a way to tell the OS not to cache a file.&#160;</p><p><strong><u>The Solution</u></strong></p><p>Google gave me this page: <a
href="http://insights.oetiker.ch/linux/fadvise.html">Improving Linux performance by preserving Buffer Cache State</a>.&#160; I copied the little C program into a file and ran it on all the <code>mysqldump</code> backups.&#160; Here is what happened to the memory usage.</p><p><img
style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="Database server memory usage" border="0" alt="Database server memory usage" src="http://blog.straylightrun.net/wp-content/uploads/2009/12/zsdb001memoryweek.png" width="495" height="408" /></p><p>Quite a bit of buffer cache was freed.&#160; On that night’s backup, I logged the buffer cache size before the backup and after.</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% cat 2008.08.21.02.30.log
Starting at Thu Aug 21 02:30:03 EDT 2008
=========================================
Cached:        4490232 kB
Cached:        5350908 kB
=========================================
Ending at Thu Aug 21 02:30:55 EDT 2008</pre></div></div><p>The buffer cache grew by 860,676 kB, a bit under a gigabyte.&#160; What was the size of the new backup file?</p><div
class="wp_syntax"><div
class="code"><pre class="sh" style="font-family:monospace;">% ll 2008.08.21.02.30.sql
-rw-r--r-- 1 root root 879727872 Aug 21 02:30 2008.08.21.02.30.sql</pre></div></div><p>About 880MB (859,109 kB), which accounts for nearly all of the cache growth.</p><p><strong><u>Did It Work?</u></strong></p><p>I used the C program on that page to ensure no database backups were cached by the OS.&#160; I did the same on the web servers in the <code>logrotate</code> config files.&#160; A couple days later, I checked the memory graph on the database server that performed the backup.&#160; Notice how the buffer cache did not fill up.&#160; It looked like the program worked, and the OS was free to cache more important things.</p><p><img
style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="Database server memory usage" border="0" alt="Database server memory usage" src="http://blog.straylightrun.net/wp-content/uploads/2009/12/zsdb001memoryweek2.png" width="495" height="408" /></p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/12/03/clearing-the-linux-buffer-cache/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Tip Of The Day: Avoid MySQL Functions</title><link>http://blog.straylightrun.net/2009/09/23/tip-of-the-day-avoid-mysql-functions/</link> <comments>http://blog.straylightrun.net/2009/09/23/tip-of-the-day-avoid-mysql-functions/#comments</comments> <pubDate>Wed, 23 Sep 2009 18:45:00 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Coding]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[mysql]]></category> <category><![CDATA[query cache]]></category> <category><![CDATA[tips]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2009/09/23/tip-of-the-day-avoid-mysql-functions/</guid> <description><![CDATA[Since I knew that the MySQL Query Cache used the literal queries as keys, it made sense that MySQL did not cache queries with certain SQL functions in them, such as this one: $sql = &#34;select event_id from events where event_dt &#62;= curdate()&#34;; Because MySQL knows that this query run today is not the [...]]]></description> <content:encoded><![CDATA[<p><a
href="http://blog.straylightrun.net/2009/03/13/mysql-query-cache-or-vertical-partitioning-intro/">Since I knew</a> that the MySQL Query Cache used the literal queries as keys, it made sense that MySQL did not cache queries with certain SQL functions in them, such as this one:</p><div
class="wp_syntax"><table><tr><td
class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sql</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;select event_id from events where event_dt &gt;= curdate()&quot;</span><span style="color: #339933;">;</span></pre></td></tr></table></div><p>MySQL skips the cache here because the same query text can return a different result tomorrow than it does today. There are other SQL functions such as <code>rand()</code> and <code>unix_timestamp()</code> that will bypass the query cache. These are <a
href="http://dev.mysql.com/doc/refman/5.0/en/query-cache-how.html">listed here</a>.</p><p>So I avoid these functions when possible by calculating the value in PHP. For example, I’d rewrite the above query as:</p><div
class="wp_syntax"><table><tr><td
class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$date</span> <span style="color: #339933;">=</span> <span style="color: #990000;">date</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Y-m-d'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$sql</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;select event_id from events where event_dt &gt;= '<span style="color: #006699; font-weight: bold;">$date</span>'&quot;</span><span style="color: #339933;">;</span></pre></td></tr></table></div> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/09/23/tip-of-the-day-avoid-mysql-functions/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Tip Of The Day: Removing .htaccess</title><link>http://blog.straylightrun.net/2009/06/17/tip-of-the-day-removing-htaccess/</link> <comments>http://blog.straylightrun.net/2009/06/17/tip-of-the-day-removing-htaccess/#comments</comments> <pubDate>Wed, 17 Jun 2009 20:48:21 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Performance]]></category> <category><![CDATA[Sysadmin]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[htaccess]]></category> <category><![CDATA[tips]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/?p=205</guid> <description><![CDATA[At work, every project has an .htaccess file containing at the least some mod_rewrite rules.  This way, all I need to do to run a project is check it out of version control.  I don&#8217;t need to modify my local Apache configuration. But turning this option on and allowing .htaccess files may be a performance [...]]]></description> <content:encoded><![CDATA[<p>At work, every project has an <code>.htaccess</code> file containing at the least some <code>mod_rewrite </code>rules.  This way, all I need to do to run a project is check it out of version control.  I don&#8217;t need to modify my local Apache configuration.</p><p>But turning this option on and allowing <code>.htaccess</code> files may be a performance hit.  More specifically, enabling the <code><a
href="http://httpd.apache.org/docs/2.2/mod/core.html#allowoverride">AllowOverride</a> </code>option in Apache is a performance hit.  The <a
href="http://httpd.apache.org/docs/2.2/misc/perf-tuning.html">Apache docs</a> sum up the problem best:</p><blockquote><p> &#8220;Wherever in your URL-space you allow overrides (typically <code>.htaccess</code> files) Apache will attempt to open <code>.htaccess</code> for each filename component. For example,</p><div
class="wp_syntax"><table><tr><td
class="code"><pre class="xml" style="font-family:monospace;">DocumentRoot /www/htdocs
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;Directory</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
   AllowOverride all
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/Directory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></td></tr></table></div><p>and a request is made for the URI <code>/index.html</code>. Then Apache will attempt to open <code>/.htaccess</code>, <code>/www/.htaccess</code>, and <code>/www/htdocs/.htaccess</code>.&#8221;</p></blockquote><p>So I disabled all <code>.htaccess </code>files in production, and inserted each file&#8217;s individual <code>mod_rewrite </code>rules into the main Apache config file. After a quick <a
href="http://blog.straylightrun.net/2009/04/23/apache-bench/">Apache Bench</a> run, one project looked around 3% faster. Note that there are a few other useful optimizations on that page.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/06/17/tip-of-the-day-removing-htaccess/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>httperf: Yet Another Website Load Testing Tool</title><link>http://blog.straylightrun.net/2009/04/23/httperf-yet-another-website-load-testing-tool/</link> <comments>http://blog.straylightrun.net/2009/04/23/httperf-yet-another-website-load-testing-tool/#comments</comments> <pubDate>Thu, 23 Apr 2009 20:25:41 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Performance]]></category> <category><![CDATA[httperf]]></category> <category><![CDATA[load]]></category> <category><![CDATA[scalability]]></category> <category><![CDATA[throughput]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/?p=142</guid> <description><![CDATA[I&#8217;ve mentioned Apache Bench before. Httperf serves the same purpose as ab, but has a few more features, and has one very nice value-add. While ab cannot really simulate a user visiting a website and performing multiple requests, httperf can. You can feed it a number of URL&#8217;s to visit, and specify how many requests [...]]]></description> <content:encoded><![CDATA[<p>I&#8217;ve mentioned <a
href="http://blog.straylightrun.net/2009/04/23/apache-bench/">Apache Bench</a> before. <a
href="http://www.hpl.hp.com/research/linux/httperf/">Httperf </a>serves the same purpose as <code>ab</code>, but has a few more features, and has one very nice value-add.</p><p>While <code>ab</code> cannot really simulate a user visiting a website and performing multiple requests, httperf can. You can feed it a number of URLs to visit, and specify how many requests to send within one session. You can also spread out requests over a time period, either at a constant interval or randomly according to a uniform or Poisson distribution.</p><p>But the big value-add is <a
href="http://www.xenoclast.org/autobench/">autobench</a>. Autobench is a Perl wrapper around httperf for automating the process of load testing a web server. Autobench runs httperf a specified number of times against a URI, increasing the number of requests per second on each run (which I loosely equate to <code>-c</code> in ab) so that the response rate or the response time can be graphed vs. requests per second. (So response rate or response time on the vertical axis, and requests per second on the horizontal axis.)</p><p>With this, you can generate pretty graphs like this:</p><p><a
href="http://blog.straylightrun.net/wp-content/uploads/2009/04/httperf-requests-sec.png"><img
style="border-width: 0px;" src="http://blog.straylightrun.net/wp-content/uploads/2009/04/httperf-requests-sec-thumb.png" border="0" alt="Requests per sec" width="399" height="280" /></a></p><p><a
href="http://blog.straylightrun.net/wp-content/uploads/2009/04/httperf-response-time.png"><img
style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" src="http://blog.straylightrun.net/wp-content/uploads/2009/04/httperf-response-time-thumb.png" border="0" alt="Response time" width="399" height="280" /></a></p><p>From the graphs above, you can determine the approximate capacity of your website. In the first graph, the number of responses received equals the number of requests sent until 16 req/sec. At 16 req/sec, the number of responses starts going <em>down</em> as requests begin to error out. In the second graph, the response time stays level at about 500 ms (a reflection of your code and database) until 15 req/sec. At 16 req/sec the time goes up to nearly 1 s, and at 17 req/sec the response time is over a second. You would conclude that the capacity of this website is around 15 requests per second.</p><p>The people who provide autobench also offer an <a
href="http://www.xenoclast.org/doc/benchmark/HTTP-benchmarking-HOWTO/HTTP-benchmarking-HOWTO.html">excellent HOWTO on benchmarking web servers in general</a>.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/04/23/httperf-yet-another-website-load-testing-tool/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Apache Bench</title><link>http://blog.straylightrun.net/2009/04/23/apache-bench/</link> <comments>http://blog.straylightrun.net/2009/04/23/apache-bench/#comments</comments> <pubDate>Thu, 23 Apr 2009 14:22:53 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Performance]]></category> <category><![CDATA[apache bench]]></category> <category><![CDATA[load]]></category> <category><![CDATA[scalability]]></category> <category><![CDATA[throughput]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/?p=135</guid> <description><![CDATA[Apache Bench is either the first or second most useful PHP tool (with Xdebug being the other). I described the basic theory of Apache Bench in an earlier post. That&#8217;s a short post, so I won&#8217;t repeat it. This will be another short post, with a small note on how I use it day-to-day. If [...]]]></description> <content:encoded><![CDATA[<p><a
href="http://httpd.apache.org/docs/2.2/programs/ab.html">Apache Bench</a> is either the first or second most useful PHP tool (with <a
href="http://blog.straylightrun.net/2009/02/23/xdebug/">Xdebug</a> being the other). I described the basic theory of Apache Bench in <a
href="http://blog.straylightrun.net/2008/11/25/performance-scalability/">an earlier post</a>. That&#8217;s a short post, so I won&#8217;t repeat it. This will be another short post, with a small note on how I use it day-to-day. If you are changing something in the system for performance reasons (a piece of code, a database setting, an OS setting&#8230; anything!) and you want to see if it makes any difference, use Apache Bench. Fire up a quick test before the change and again after it. ab runs very quickly (on the order of a few minutes on a slow machine), so you can run 1000 requests and not have to worry about your sample size. I even run it on my laptop. Even though my laptop introduces a lot of noise, the before-and-after results are still comparable. I usually run it two ways before the change, and two ways after.</p><div
class="wp_syntax"><div
class="code"><pre class="php" style="font-family:monospace;"><span style="color: #339933;">%</span> ab <span style="color: #339933;">-</span>n <span style="color: #cc66cc;">1000</span> <span style="color: #339933;">-</span>c <span style="color: #cc66cc;">1</span> http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.whatever.com/</span></pre></div></div><p>That usually gives me a good idea of whether <em>performance</em> has improved.</p><div
class="wp_syntax"><div
class="code"><pre class="php" style="font-family:monospace;"><span style="color: #339933;">%</span> ab <span style="color: #339933;">-</span>c <span style="color: #cc66cc;">100</span> <span style="color: #339933;">-</span>t <span style="color: #cc66cc;">60</span> http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.whatever.com/</span></pre></div></div><p>That usually gives me a good idea of <em>scaling</em> under load.</p><p><strong>UPDATE: </strong>There have been reports that <a
href="http://paul-m-jones.com/?p=413">Apache Bench is not reliable</a>.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/04/23/apache-bench/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>MySQL Query Cache, Or Vertical Partitioning Intro</title><link>http://blog.straylightrun.net/2009/03/13/mysql-query-cache-or-vertical-partitioning-intro/</link> <comments>http://blog.straylightrun.net/2009/03/13/mysql-query-cache-or-vertical-partitioning-intro/#comments</comments> <pubDate>Fri, 13 Mar 2009 14:12:05 +0000</pubDate> <dc:creator>gerard</dc:creator> <category><![CDATA[Coding]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[mysql]]></category> <category><![CDATA[partitioning]]></category> <category><![CDATA[query cache]]></category> <guid
isPermaLink="false">http://blog.straylightrun.net/2009/03/13/mysql-query-cache-or-vertical-partitioning-intro/</guid> <description><![CDATA[The MySQL Query Cache is not very hard to understand. It is at its most basic a giant hash where the literal queries are the keys and the array of result records are the values. So this query: SELECT event_name FROM events WHERE event_id = 8; is different from this query: SELECT event_name FROM events [...]]]></description> <content:encoded><![CDATA[<p>The MySQL Query Cache is not very hard to understand. It is at its most basic a giant hash where the <em>literal queries </em>are the keys and the array of result records are the values. So this query:</p><div
class="wp_syntax"><div
class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span> event_name <span style="color: #993333; font-weight: bold;">FROM</span> events <span style="color: #993333; font-weight: bold;">WHERE</span> event_id <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">8</span>;</pre></div></div><p>is different from this query:</p><div
class="wp_syntax"><div
class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span>  event_name <span style="color: #993333; font-weight: bold;">FROM</span> events <span style="color: #993333; font-weight: bold;">WHERE</span> event_id <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">10</span>;</pre></div></div><p>Important note!&nbsp; This means that even though your parameterized queries may look the same without the parameters, to the query cache, they are not!</p><p>As with all caches, the query cache is concerned with the freshness of its data. It takes perhaps the simplest approach possible to this problem by keeping track of any tables involved in your cached query. If <em>any</em> of these tables changes, it invalidates the query and removes it from the cache. This means that if your query reads from a table that changes frequently, the query cache will invalidate the query frequently, leading to thrashing. For example, if you had a query that returned a view count of an event:</p><div
class="wp_syntax"><div
class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span> event_name<span style="color: #66cc66;">,</span> views <span style="color: #993333; font-weight: bold;">FROM</span> events <span style="color: #993333; font-weight: bold;">WHERE</span> event_id <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">8</span>;</pre></div></div><p>Every time an event is viewed and its count is updated, the cached query will be invalidated. What&#8217;s the solution?</p><p>In general, write queries so that their result sets do not change often. Specifically, mixing static attributes with frequently updated fields in a single table leads to thrashing, so separate out things like view counts and analytics into their own tables. The frequently updated data can be read with a separate query, or perhaps cached in your application in a data structure that periodically flushes to the DB.</p><p>This <a
href="http://en.wikipedia.org/wiki/Partition_(database)">vertical partitioning</a> of a single table&#8217;s columns into multiple tables helps immensely with the query cache. What&#8217;s more, the table with the unchanging data can be further optimized for reads, and the frequently updated table can be optimized for updates.</p> ]]></content:encoded> <wfw:commentRss>http://blog.straylightrun.net/2009/03/13/mysql-query-cache-or-vertical-partitioning-intro/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>
