Attempt at removing locks #14

@mikeytag

@nicktacular

First off, thank you for creating the MongoSession class. It really helped me out on a project. I did something a bit crazy to it, so I didn't want to issue a Pull Request but I wanted to let you know how I modified your class so that locks aren't needed at all.

As you know, PHP's default session handler locks the session for the entire life of the script that called session_start(). The downside shows up whenever multiple script invocations fire for the same user at the same time: each one has to wait for the previous one to release the lock. In the old days this only happened with framesets, but we run into slowdowns from multiple async ajax calls all the time. I decided to remove this paradigm completely and I want to let you know how I did it.

I first removed any reference to the lock method. This was really only in the read method anyway. I changed the session.serialize_handler to php_serialize. I'll explain why in just a bit. Next I changed the write method to the following:

    /**
     * Save the session data.
     * @param  string $sid The session ID that PHP passes.
     * @param  string $data The session serialized data string.
     * @return boolean True always.
     */
    public function write($sid, $data) {
        //update/insert our session data
        $this->sessionDoc = $this->sessions->findOne(array('_id' => $sid));
        if (!$this->sessionDoc) {
            //print "COULDN'T FIND SID: $sid<p>";
            $this->sessionDoc = array();
            $this->sessionDoc['_id'] = $sid;
            $this->sessionDoc['started'] = new MongoDate();
        }

        //there could have been a session regen so we need to be careful with the $sid here and set it anyway
        if ($this->sid != $sid) {
            //set the new one
            $this->sid = $sid;

            //and also make sure we're going to write to the correct document
            $this->sessionDoc['_id'] = $sid;
        }

        //loop through the session array and only store things where now is more recent
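        //a brand new session document has no 'data' field yet, so start from an empty array
        $session_data_array = isset($this->sessionDoc['data']) ? unserialize($this->sessionDoc['data']->bin) : array();
        //print "<pre>DATA: $data\n\n\nSESSION DATA ARRAY: ".print_r($session_data_array, true);
        //an empty or invalid $data string unserializes to false, so fall back to an empty array
        $data_array = unserialize($data);
        if (!is_array($data_array)) {
            $data_array = array();
        }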
        //print "DATA ARRAY: ".print_r($data_array, true);
        $something_changed = false;
        foreach ($data_array as $key => $val) {
            if (substr($key, -4) === '_SMT') {
                //ignore the meta timestamp keys
                continue;
            }
            if (!isset($session_data_array[$key])) {
                //we are storing something new so just write it with a timestamp entry
                $session_data_array[$key] = $val;
                $session_data_array[$key.'_SMT'] = $_SERVER['REQUEST_TIME_FLOAT'];
                $something_changed = true;
            } else {
                //ok there is an existing key
                if (serialize($session_data_array[$key]) != serialize($val)) {
                    //session key val has changed so update only if the script that wrote it
                    //is older than the microtime of when our script started
                    if (!isset($session_data_array[$key.'_SMT'])) {
                        //for whatever reason there is no timestamp for this entry
                        //so assume that we are the winner here
                        $session_data_array[$key] = $val;
                        $session_data_array[$key.'_SMT'] = $_SERVER['REQUEST_TIME_FLOAT'];
                        $something_changed = true;
                    } else {
                        //there is a timestamp value so check to make sure that we are newer
                        if ($session_data_array[$key.'_SMT'] < $_SERVER['REQUEST_TIME_FLOAT']) {
                            $session_data_array[$key] = $val;
                            $session_data_array[$key.'_SMT'] = $_SERVER['REQUEST_TIME_FLOAT'];
                            $something_changed = true;
                        }
                    }
                }
            }
        }

        //now we need to loop through $session_data_array to see if any keys have been deleted
        foreach ($session_data_array as $key => $val) {
            if (substr($key, -4) === '_SMT') {
                //ignore the meta timestamp keys
                continue;
            }
            if (!isset($data_array[$key])) {
                //the key is no longer in our memory data array
                if (!isset($session_data_array[$key.'_SMT'])) {
                    //for whatever reason there is no timestamp for this entry
                    //so assume that we are the winner here
                    unset($session_data_array[$key]);
                    $something_changed = true;
                } else {
                    //there is a timestamp value so check to make sure that we are newer
                    //delete only if the script that wrote it
                    //is older than the microtime of when our script started
                    if ($session_data_array[$key.'_SMT'] < $_SERVER['REQUEST_TIME_FLOAT']) {
                        unset($session_data_array[$key]);
                        unset($session_data_array[$key.'_SMT']);
                        $something_changed = true;
                    }
                }
            }
        }

        //print "FINAL SESSION ARRAY: ".print_r($session_data_array, true);

        if ($something_changed) {
            //print "SOMETHING CHANGED!";
            $this->sessionDoc['last_accessed'] = new MongoDate();
            $this->sessionDoc['data'] = new MongoBinData(serialize($session_data_array), MongoBinData::BYTE_ARRAY);

            //print "sessionDoc: ".print_r($this->sessionDoc, true);

            $this->sessions->save($this->sessionDoc, $this->getConfig('write_options'));
        } else {
            //print "NOTHING CHANGED!";
        }
        return true;
    }

Basically, on write() I fetch the session document from Mongo and unserialize its data into an array called $session_data_array. I then unserialize the $data string into $data_array and diff the in-memory $data_array against the last entry in Mongo ($session_data_array) in the following way.

If there is an entry in $data_array that is not set in $session_data_array, it creates the key in $session_data_array along with a microtime timestamp from when the script was first invoked. Since PHP 5.4 this is available in $_SERVER['REQUEST_TIME_FLOAT']. (I know you are trying to support 5.2, and I could've created a property called startMicrotime and set it to microtime(true) in __construct, but I was a bit lazy.)
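The startMicrotime fallback mentioned above could look roughly like the following. This is only a sketch: session_start_microtime() is a hypothetical helper name and is not part of MongoSession or of my fork.

```php
<?php
// Hypothetical helper (not part of MongoSession): picks a request start time
// that works both with and without $_SERVER['REQUEST_TIME_FLOAT'].
function session_start_microtime(array $server)
{
    if (isset($server['REQUEST_TIME_FLOAT'])) {
        // PHP >= 5.4: the time the request started, with microsecond precision
        return (float) $server['REQUEST_TIME_FLOAT'];
    }
    // Older PHP: fall back to "now", so call this as early as possible
    // (for example in the session handler's constructor)
    return microtime(true);
}

$startMicrotime = session_start_microtime($_SERVER);
```

On older PHP the fallback value is the time the helper runs rather than the true request start, so it is only a fair stand-in if the handler is constructed at the very top of the script.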

Ok, if there is an entry in $data_array that is also set in $session_data_array, it first checks whether the value differs from the one already in $session_data_array. If it does, it looks up the timestamp of the last script that updated that key. If the timestamp of the running script (i.e. when it started) is newer than the timestamp on the session key, it updates that key's value and its corresponding $key.'_SMT' to the new timestamp. (_SMT was a suffix I thought wouldn't collide much with apps; it stands for Session Micro Time.)

Last but not least, it loops through $session_data_array looking for keys that are no longer present in $data_array (i.e. you did an unset($_SESSION['foo'])). It does a similar timestamp check, and if this script is newer then it wins and the key is removed.

After all is said and done, it only updates Mongo if something in the session data has in fact changed. If nothing has, it doesn't waste the network round trip by saving the same data over itself.
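The merge rules described above can be sketched as a standalone function with no Mongo dependency. The names merge_session_data() and is_smt_key() are hypothetical and exist only for this illustration. One deliberate difference from write(): it uses array_key_exists() rather than isset(), so a stored null value counts as an existing key.

```php
<?php
// Hypothetical helper: true for the '_SMT' timestamp bookkeeping keys.
function is_smt_key($key)
{
    return substr((string) $key, -4) === '_SMT';
}

// Hypothetical standalone version of the merge rules.
// $stored  = data last saved in Mongo, $current = in-memory session data,
// $now     = start time of the running script.
// Returns array(merged data, whether anything changed).
function merge_session_data(array $stored, array $current, $now)
{
    $changed = false;

    // Pass 1: keys that are new or modified in the in-memory data
    foreach ($current as $key => $val) {
        if (is_smt_key($key)) {
            continue;
        }
        $stamp   = $key . '_SMT';
        $isNew   = !array_key_exists($key, $stored);
        $differs = !$isNew && serialize($stored[$key]) != serialize($val);
        // We win when there is no stored timestamp or it is older than ours
        $weWin   = !isset($stored[$stamp]) || $stored[$stamp] < $now;
        if ($isNew || ($differs && $weWin)) {
            $stored[$key]   = $val;
            $stored[$stamp] = $now;
            $changed = true;
        }
    }

    // Pass 2: keys that were unset in memory
    foreach (array_keys($stored) as $key) {
        if (is_smt_key($key) || array_key_exists($key, $current)) {
            continue;
        }
        $stamp = $key . '_SMT';
        if (!isset($stored[$stamp]) || $stored[$stamp] < $now) {
            unset($stored[$key], $stored[$stamp]);
            $changed = true;
        }
    }

    return array($stored, $changed);
}

// Example: an older request (started at t=100) tries to overwrite 'cart',
// which a newer request (t=200) already wrote. The stored value is kept.
list($merged, $changed) = merge_session_data(
    array('cart' => 'new', 'cart_SMT' => 200.0),
    array('cart' => 'old'),
    100.0
);
```

In that example $changed stays false, which is the case where write() skips the save to Mongo entirely.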

Cons to this approach:

  • For every key in the $_SESSION array there is a corresponding $key.'_SMT' entry holding a float for the microtime. This adds a fair amount of bloat to what's in the session. However, since sessions don't tend to be massively huge, it may be a fair tradeoff for the no-lock functionality. Each app is different, so your results may vary.
  • Using unserialize() and serialize() instead of the default session_decode() and session_encode() also adds a fair amount of bloat, but for our app and purposes we were willing to accept it.
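To get a feel for the first point, one rough way to estimate the overhead is to serialize the same data with and without the timestamp entries. The sample data and the timestamp value below are made up for illustration, and the real numbers depend entirely on what your app stores.

```php
<?php
// Made-up sample session data
$plain = array('user_id' => 42, 'cart' => array(1, 2, 3));

// Same data plus one '_SMT' float per key, the way write() stores it
$withMeta = $plain;
foreach (array_keys($plain) as $key) {
    $withMeta[$key . '_SMT'] = 1400000000.1234; // illustrative timestamp
}

$plainBytes = strlen(serialize($plain));
$metaBytes  = strlen(serialize($withMeta));

printf("plain: %d bytes, with _SMT: %d bytes\n", $plainBytes, $metaBytes);
```

The fewer and larger your session values are, the smaller the relative overhead will be.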

You may be asking yourself why didn't he just use session_decode() and session_encode()? The problem is that session_decode only returns a boolean and actually manipulates $_SESSION. I can't have the write method potentially jacking up that super global so I wanted to be able to decode and encode it to my own variables without touching $_SESSION and having inadvertent side effects.
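Here is a minimal sketch of what the php_serialize setting buys you. With that handler PHP uses plain serialize()/unserialize() for session data, so the string handed to write() can be decoded into a local variable. The sample payload is made up, and the ini_set() call is left as a comment because it has to run before session_start() in a real app.

```php
<?php
// In the real app, before session_start():
// ini_set('session.serialize_handler', 'php_serialize');

// Made-up example of the kind of string write() receives under php_serialize
$payload = serialize(array('user_id' => 42, 'roles' => array('admin')));

// Decode into a local variable; $_SESSION is never touched
$decoded = unserialize($payload);
if (!is_array($decoded)) {
    $decoded = array(); // empty or corrupt payload
}

// Work on the local copy and encode it again for storage
$decoded['visits'] = 1;
$encoded = serialize($decoded);
```

Note that php_serialize needs PHP 5.5.4 or later.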

I also set out on this journey because, after using MongoSession in production for a day, I noticed stale locks in the locks table. I also had users report that they couldn't log in, and I was able to figure out that they were stuck in the immutable lock problem that I believe you referenced in your README.

My fork is located at https://github.com/mikeytag/php-mongo-session if you want to browse around and see the other changes I made in the file (I had to add an ini_set at the top to change the session serialize handler). My fork is now in production and will receive about 10,000 visitors each day. I'll report back with any issues that may arise. Like I said before, I am going out on a bit of a limb here. Given the way my app uses sessions I am confident that everything will be fine, but I realize that pulling my changes into your repo may not be the best course of action, as it's a pretty drastic change.

TL;DR: $_SESSION key level locking, saving of writes to Mongo, bloat of $_SESSION array and more.
