Returning a list of array keys in MongoDB

Here's how I return a list of keys from an associative array (i.e. an embedded document) without also returning their values.

The data structure I have is shown below. It consists of an associative array, [friends], whose keys are Twitter usernames and whose values are details about each user. My aim was to return a list of the Twitter usernames without the details associated with them, which in some cases could be quite a large amount of data.


        [friends] => Array (
            [RockSugarBand] => Array (
                [realName] => Rock Sugar
                ...
            )
            [twitterapi] => Array (
                [realName] => Twitter API
                ...
            )
            [biz] => Array (
                [realName] => Biz Stone
                ...
            )
        )
    

I solved this using MongoDB's group function. The JavaScript (mongo shell) command is:


        db.users.group(
            {
                 cond: {_id:ObjectId('4c3c32c24b065884c53f35bb')},
                 initial: { results: [] },
                 reduce: function(obj,prev)
                 {
                    for(var key in obj.friends) {
                        prev.results.push(key);
                    }
                 }
            }
        );
    

Line by line (counting db.users.group( as line 1), what this essentially does is:

  • Line 3 - Here I specify the document I want to apply the group function to. In my case I just want to return a list of keys from a specific document, thus I identify it using its document id.
  • Line 4 - I create an empty array - results - into which the keys are added.
  • Line 5 - Here I start to define the reduce function. It takes two arguments.
    • obj - The current document (In my case there is only one).
    • prev - This object contains all the variables defined on Line 4. In this case that's just results.
  • Line 7 - Here I iterate over all the keys in the [friends] array. Note that in JavaScript a for (var foo in bar) loop sets foo to each key of bar in turn, not to the value stored under that key.
  • Line 8 - I simply push the key - key - onto the results array.
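
The walkthrough above can be tried without a database. The sketch below is plain JavaScript for Node.js; the doc variable is a hand-written stand-in for one user document, and calling reduce by hand only approximates what group does when a single document matches. It is meant to show how the for...in loop collects keys, not how MongoDB runs the command internally.

```javascript
// Hand-written stand-in for one document from the users collection
// (an assumption for illustration; the values are abbreviated).
var doc = {
    friends: {
        RockSugarBand: { realName: "Rock Sugar" },
        twitterapi: { realName: "Twitter API" },
        biz: { realName: "Biz Stone" }
    }
};

// Same reduce function as in the group command above.
function reduce(obj, prev) {
    for (var key in obj.friends) {
        // key is the property name (the username), not the user details
        prev.results.push(key);
    }
}

// Approximate what group does for a single matching document:
// start from the "initial" object, then call reduce once.
var prev = { results: [] };
reduce(doc, prev);

console.log(prev.results); // the three usernames, in insertion order
```

Only the usernames end up in results; the realName values are never read, which is the point of the exercise.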

...and that's it! The result looks like:


        Array (
            [results] => Array (
                "RockSugarBand",
                "twitterapi",
                "biz"
            )
        )
    

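The reason that results is wrapped in an array is that group returns one entry per group, i.e. per distinct value of the grouping key. I don't specify a key, so every matching document falls into the same group and there is just one entry. Had I used a condition that matched more documents, their keys would all have been pushed onto that same results array; grouping by a key would instead produce one results array per distinct key value inside the wrapping array.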

Doing this in PHP


        // No grouping keys: all matching documents go into a single group
        $keys = array();

        // Set initial values
        $initial = array("results" => array());

        // JavaScript function to perform
        $reduce = new MongoCode('function(obj,prev)
             {
                for(var key in obj.friends) {
                    prev.results.push(key);
                }
             }');

        // The condition a document must match in order to be processed
        $condition = array('_id' => $userDocId);

        // Execute the query
        $result = $collection->group($keys, $initial, $reduce, $condition);
    

Things to note:

  1. According to the MongoDB documentation, the group function does not work in a sharded environment.
  2. This query can't use any indexes, with the exception of finding the document set to iterate over, i.e. the query defined under "cond". For my problem this isn't likely to ever become an issue, but it might if you start iterating over a large number of documents.

This is just how I solved it. There may well be better solutions. If you know of one, please feel free to share it with us all. :-)
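
One possible alternative, sketched under assumptions rather than tested against the setup above, is the aggregation framework's $objectToArray operator. It assumes a server new enough to support that operator (it was added in the 3.4 series). The helper name buildFriendKeysPipeline and the placeholder id are hypothetical; only the $match, $project, $map and $objectToArray stages are real MongoDB operators.

```javascript
// Hypothetical helper: builds an aggregation pipeline that returns only
// the key names of the embedded "friends" document for one user.
function buildFriendKeysPipeline(userDocId) {
    return [
        // Select the single document, as "cond" does in the group command.
        { $match: { _id: userDocId } },
        {
            $project: {
                _id: 0,
                // $objectToArray turns { a: ..., b: ... } into
                // [ { k: "a", v: ... }, { k: "b", v: ... } ];
                // $map then keeps only each entry's k (the key name).
                results: {
                    $map: {
                        input: { $objectToArray: "$friends" },
                        as: "friend",
                        in: "$$friend.k"
                    }
                }
            }
        }
    ];
}

// In the mongo shell this would be run as something like:
//   db.users.aggregate(buildFriendKeysPipeline(ObjectId('4c3c32c24b065884c53f35bb')));
// Here the pipeline is only built and printed, with a placeholder id.
var pipeline = buildFriendKeysPipeline("placeholder-id");
console.log(JSON.stringify(pipeline, null, 2));
```

As with the group approach, the user details are dropped on the server, so only the list of usernames is sent back to the client.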