Archive for September, 2009

We’ve never had long-lived sessions.  It was never a requirement.  I think we had a “Remember me” checkbox that didn’t work at one point, but we soon removed it.  But suddenly, customer requests started coming in.  They asked, “Why do I have to log in every time I use the site?  Why can’t I stay logged in forever, like Facebook or Twitter?”  That was a good question.

Basic User Login

Like most sites, we used the PHP session to maintain a logged-in user for our site.  We started a session, kept track of some data indicating whether the user was logged in or not, and that was about it.

I had never looked at sessions and cookies in depth before.  I knew generally how sessions worked.  PHP sets a cookie in the client’s browser.  The cookie contains a session ID.  When a request comes in, PHP reads the session ID, looks for a file corresponding to the ID on disk (or in a database, memcached, etc.), reads in the file containing the session data, and loads the session into the request.  When the request finishes, the session data is saved to the file again.
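As a rough illustration of the "some data indicating if the user is logged in" part, here is a minimal sketch. The helper names and the user_id key are assumptions made for this example, not what our site actually used. The session data is passed in as an array so the logic can be run without a web server.

```php
<?php
// Hypothetical helpers: names and the 'user_id' key are assumptions.

/** Record a successful login in the session data. */
function markLoggedIn(array &$session, int $userId): void
{
    $session['user_id'] = $userId;
}

/** Report whether the session data belongs to a logged-in user. */
function isLoggedIn(array $session): bool
{
    return isset($session['user_id']);
}

/** Forget the login, for example when the user clicks "log out". */
function markLoggedOut(array &$session): void
{
    unset($session['user_id']);
}
```

In a real page you would call session_start() first and pass $_SESSION to these helpers.  PHP loads the stored data for the session ID in the cookie at that point, and writes $_SESSION back when the request ends.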

Implementing The “Remember Me” Checkbox

First, naively, I thought all I had to do was find the right php.ini directive to make sessions last forever.  Browsing the PHP manual and googling, I came across the session.cookie_lifetime directive, configured in either php.ini or by session_set_cookie_params().

session.cookie_lifetime specifies the lifetime of the cookie in seconds which is sent to the browser. The value 0 means “until the browser is closed.” Defaults to 0.

I set this to 24 hours.  Well, that was easy, I thought.
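For reference, a change like that might look as follows when done in code rather than in php.ini.  This is only a sketch of the setting, assuming it runs before the session is started.

```php
<?php
// Ask PHP to send a session cookie that lives for 24 hours.
// This is the runtime equivalent of session.cookie_lifetime = 86400
// and must be called before session_start().
session_set_cookie_params(24 * 60 * 60);
session_start();
```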

Except it didn’t work.  Users reported logging in, going out to lunch, coming back, and getting logged out on the first link clicked.  I dug deeper and found another directive.

session.gc_maxlifetime specifies the number of seconds after which data will be seen as ‘garbage’ and cleaned up. Garbage collection occurs during session start.

It defaults to 1440 seconds, or 24 mins.

It’s important to know that session.cookie_lifetime starts when the cookie is set, regardless of last user activity, so it is an absolute expiration time.  session.gc_maxlifetime starts from when the user was last active (clicked), so it’s more like a maximum idle time.

Starting To Understand

Now I could see that both of these directives must cooperate to get the desired effect. Specifically, the shorter of these two values determines my session duration.

For example, let’s say I have session.cookie_lifetime set to its default of 0, and session.gc_maxlifetime is set to its default of 24 mins.  A user who logs in can stay logged in forever, provided he never closes his browser, and he never stops clicking for more than 24 mins.

Now, let’s say the same user takes a 30 min. lunch break, and leaves his browser open.  When he gets back, he’ll most likely have been logged out because his session data was garbage collected on the server, even though his browser cookie was still there.

Now, let’s change session.cookie_lifetime to 1 hour.  A user who logs in can stay logged in for up to an hour if he keeps clicking the whole time.  This is regardless of whether or not he closes/reopens his browser.  If he takes his 30 min. lunch break after working for 15 mins., he will most likely be logged out when he returns, even though his browser cookie had 15 more mins. of life.

Now, keeping session.cookie_lifetime at 1 hour, let’s set session.gc_maxlifetime to 2 hours.  A user who logs in can stay logged in for up to an hour, period.  He does not have to click at all in that time, but he’ll be logged out after an hour.
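The scenarios above can be condensed into a small model.  This is a simplified sketch, not how PHP implements sessions: the function name and parameters are made up for illustration, and it assumes garbage collection has actually run once the idle limit is passed (in practice it runs only on some requests, which is why I keep saying “most likely”).

```php
<?php
/**
 * Simplified model: would the user still be logged in right now?
 *
 * $cookieLifetime  session.cookie_lifetime in seconds (0 = until browser closes)
 * $gcMaxLifetime   session.gc_maxlifetime in seconds
 * $sinceLogin      seconds since the cookie was set
 * $idle            seconds since the user's last request
 * $browserClosed   whether the browser was closed since login
 */
function sessionSurvives(
    int $cookieLifetime,
    int $gcMaxLifetime,
    int $sinceLogin,
    int $idle,
    bool $browserClosed = false
): bool {
    // The cookie has an absolute lifetime, measured from when it was set.
    if ($cookieLifetime === 0) {
        $cookieAlive = !$browserClosed;
    } else {
        $cookieAlive = $sinceLogin < $cookieLifetime;
    }

    // The server-side data has an idle lifetime, measured from last activity.
    $dataAlive = $idle <= $gcMaxLifetime;

    return $cookieAlive && $dataAlive;
}

const MINUTE = 60;
const HOUR = 3600;

// Defaults, 30 min. lunch with the browser left open: data was collected.
var_dump(sessionSurvives(0, 24 * MINUTE, 45 * MINUTE, 30 * MINUTE)); // bool(false)

// 1 hour cookie, default gc, still clicking 50 mins. after login.
var_dump(sessionSurvives(HOUR, 24 * MINUTE, 50 * MINUTE, 1 * MINUTE)); // bool(true)

// 1 hour cookie, 2 hour gc, back after 90 mins.: the cookie has expired.
var_dump(sessionSurvives(HOUR, 2 * HOUR, 90 * MINUTE, 90 * MINUTE)); // bool(false)
```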

The Real “Remember Me” Solution

Back to my problem.  At this point, I could’ve just set both directives to something like 1 year.  But since session.gc_maxlifetime controls garbage collection of session data, I’d have session data up to a year old left on the server!  I did a quick check on the PHP session directory.  There were already several thousand sessions, and that was only for a 24-minute lifetime!

Clearly, this was not how Twitter did it.  A little more digging, and I realized that sites like those do not keep your specific session around for long periods of time.  What they do is set a long-lasting cookie that contains some sort of security token.  From that token, they can authenticate you, and re-create your session, even if your session data has already been removed from the server.  (The cookie name for Twitter is auth_token and looks to have a lifetime of 20 years.)

With the session recreation method, I could control when and how to log out users, if at all.  So this enabled us to give users indefinite sessions, while keeping all session directives at their default values.
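To make the idea concrete, here is a minimal sketch of the token part.  It is not the code we shipped and it is not how Twitter does it.  The function names are hypothetical, and where the hash is stored (a database table keyed by user, for instance) is left out entirely.

```php
<?php
// Hypothetical helpers: names are assumptions made for this sketch.

/**
 * Create a random token for the long-lived cookie, plus the hash that
 * would be stored on the server instead of the token itself.
 */
function createRememberToken(): array
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters sent to the browser

    return [
        'token' => $token,
        'hash'  => hash('sha256', $token),
    ];
}

/** Check a token read from the cookie against the stored hash. */
function verifyRememberToken(string $tokenFromCookie, string $storedHash): bool
{
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($storedHash, hash('sha256', $tokenFromCookie));
}
```

On login with “Remember me” checked, the token would go out in a cookie with a far-future expiry via setcookie(), and the hash would be saved alongside the user ID.  On a later request with no live session, a valid token is enough to look the user up and fill in a fresh session.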

Beyond Session Cookies

This only scratches the surface of authentication topics, of course.  We didn’t talk about security implications of the session re-creation method, though I will say that the best security practice against session-based attacks seems to be prompting for a password if the user attempts to change or view sensitive account information.  LinkedIn is the first example that comes to mind.

Shortly after implementing this, a request came down from high above to centralize the authentication for our multiple products.  I began to investigate single sign-on (like Google accounts) and federated identity (like OpenID), but those are topics of another post.

Here are a couple blogs that got me on my way to the final solution. Be sure to read the comments:

Since I knew that the MySQL Query Cache used the literal queries as keys, it made sense that MySQL did not cache queries with certain SQL functions in them, such as this one:

$sql = "select event_id from events where event_dt >= curdate()";

That is because MySQL knows that the same query text can return a different result tomorrow than it does today. Other SQL functions, such as rand() and unix_timestamp(), also bypass the query cache. These are listed here.

So I avoid these functions when possible by calculating the value in PHP. For example, I’d rewrite the above query as:

// Compute today's date in PHP so the query text contains a plain literal
$date = date('Y-m-d');
$sql = "select event_id from events where event_dt >= '$date'";