/usr/share/doc/python-pymongo-doc/html/_sources/examples/aggregation.txt is in python-pymongo-doc 2.6.3-1build1.

Aggregation Examples
====================

MongoDB offers several ways to perform aggregations.  These examples cover
the aggregation framework, map/reduce, and the group method.

.. testsetup::

  from pymongo import MongoClient
  client = MongoClient()
  client.drop_database('aggregation_example')

Setup
-----
To start, we'll insert some example data which we can perform
aggregations on:

.. doctest::

  >>> from pymongo import MongoClient
  >>> db = MongoClient().aggregation_example
  >>> db.things.insert({"x": 1, "tags": ["dog", "cat"]})
  ObjectId('...')
  >>> db.things.insert({"x": 2, "tags": ["cat"]})
  ObjectId('...')
  >>> db.things.insert({"x": 2, "tags": ["mouse", "cat", "dog"]})
  ObjectId('...')
  >>> db.things.insert({"x": 3, "tags": []})
  ObjectId('...')

Aggregation Framework
---------------------

This example shows how to use the
:meth:`~pymongo.collection.Collection.aggregate` method to run an aggregation
framework pipeline.  We'll perform a simple aggregation to count the number of
occurrences of each tag in the ``tags`` array, across the entire collection.
To achieve this we need to pass three operations to the pipeline.
First, we unwind the ``tags`` array, then we group by tag and
sum the occurrences, and finally we sort by count.

As Python dictionaries don't maintain order, you should use :class:`~bson.son.SON`
or :class:`collections.OrderedDict` where explicit ordering is required,
e.g. for ``$sort``:

.. note::

    aggregate requires server version **>= 2.1.0**. The PyMongo
    :meth:`~pymongo.collection.Collection.aggregate` helper requires
    PyMongo version **>= 2.3**.

.. doctest::

  >>> from bson.son import SON
  >>> db.things.aggregate([
  ...         {"$unwind": "$tags"},
  ...         {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
  ...         {"$sort": SON([("count", -1), ("_id", -1)])}
  ...     ])
  ...
  {u'ok': 1.0, u'result': [{u'count': 3, u'_id': u'cat'}, {u'count': 2, u'_id': u'dog'}, {u'count': 1, u'_id': u'mouse'}]}
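
To make the three stages concrete, here is a minimal pure-Python sketch of what
the pipeline computes. It does not use PyMongo or a server; the ``things`` list
simply mirrors the documents inserted in the Setup section, and the variable
names are illustrative assumptions rather than part of any API:

```python
from collections import Counter

# The same four documents inserted in the Setup section.
things = [
    {"x": 1, "tags": ["dog", "cat"]},
    {"x": 2, "tags": ["cat"]},
    {"x": 2, "tags": ["mouse", "cat", "dog"]},
    {"x": 3, "tags": []},
]

# $unwind: one item per array element; an empty array contributes nothing.
unwound = [tag for doc in things for tag in doc["tags"]]

# $group with {"$sum": 1}: count how often each tag occurs.
counts = Counter(unwound)

# $sort: by count descending, then by _id descending.
result = sorted(
    ({"_id": tag, "count": n} for tag, n in counts.items()),
    key=lambda d: (d["count"], d["_id"]),
    reverse=True,
)
```

The resulting list has the same contents and order as the ``result`` field
returned by the server in the doctest above.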


As well as simple aggregations, the aggregation framework provides projection
capabilities to reshape the returned data. Using projections and aggregation,
you can add computed fields, create new virtual sub-objects, and extract
sub-fields into the top level of results.

.. seealso:: The full documentation for MongoDB's `aggregation framework
    <http://docs.mongodb.org/manual/applications/aggregation>`_

Map/Reduce
----------

Another option for aggregation is to use the map/reduce framework.  Here we
will define **map** and **reduce** functions that also count the number of
occurrences of each tag in the ``tags`` array, across the entire collection.

Our **map** function just emits a single ``(key, 1)`` pair for each tag in
the array:

.. doctest::

  >>> from bson.code import Code
  >>> mapper = Code("""
  ...               function () {
  ...                 this.tags.forEach(function(z) {
  ...                   emit(z, 1);
  ...                 });
  ...               }
  ...               """)

The **reduce** function sums over all of the emitted values for a given key:

.. doctest::

  >>> reducer = Code("""
  ...                function (key, values) {
  ...                  var total = 0;
  ...                  for (var i = 0; i < values.length; i++) {
  ...                    total += values[i];
  ...                  }
  ...                  return total;
  ...                }
  ...                """)

.. note:: We can't just return ``values.length`` as the **reduce** function
   might be called iteratively on the results of other reduce steps.
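
The effect of re-reducing can be sketched in plain Python. The two functions
below are hypothetical stand-ins for JavaScript reducers (one summing, one
returning the length), applied to the three values emitted for ``cat`` when the
server happens to reduce them in two steps:

```python
def reduce_sum(key, values):
    # Stand-in for the reducer above: add up the emitted values.
    return sum(values)


def reduce_len(key, values):
    # Stand-in for a reducer that returns values.length.
    return len(values)


emitted = [1, 1, 1]  # one value per occurrence of "cat"

# Step 1 reduces the first two values; step 2 re-reduces that partial
# result together with the remaining value.
partial_sum = reduce_sum("cat", emitted[:2])
total_sum = reduce_sum("cat", [partial_sum, emitted[2]])

partial_len = reduce_len("cat", emitted[:2])
total_len = reduce_len("cat", [partial_len, emitted[2]])
```

Here ``total_sum`` is 3, the correct count, while ``total_len`` is 2, because
the second call only sees two inputs: the partial result and one more value.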

Finally, we call :meth:`~pymongo.collection.Collection.map_reduce` and
iterate over the result collection:

.. doctest::

  >>> result = db.things.map_reduce(mapper, reducer, "myresults")
  >>> for doc in result.find():
  ...   print doc
  ...
  {u'_id': u'cat', u'value': 3.0}
  {u'_id': u'dog', u'value': 2.0}
  {u'_id': u'mouse', u'value': 1.0}

Advanced Map/Reduce
-------------------

PyMongo's API supports all of the features of MongoDB's map/reduce engine.
One interesting feature is the ability to get more detailed results when
desired, by passing ``full_response=True`` to
:meth:`~pymongo.collection.Collection.map_reduce`. This returns the full
response to the map/reduce command, rather than just the result collection:

.. doctest::

  >>> db.things.map_reduce(mapper, reducer, "myresults", full_response=True)
  {u'counts': {u'input': 4, u'reduce': 2, u'emit': 6, u'output': 3}, u'timeMillis': ..., u'ok': ..., u'result': u'...'}
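
The ``counts`` field can be reproduced with a small pure-Python sketch of the
map/reduce flow. This is an illustration under stated assumptions, not the
server's implementation: ``map_tags`` and ``reduce_sum`` are hypothetical Python
stand-ins for the JavaScript functions, and the sketch assumes that each key is
reduced in a single call and that keys with only one emitted value are not
passed to reduce at all:

```python
from collections import defaultdict

# The same four documents inserted in the Setup section.
things = [
    {"x": 1, "tags": ["dog", "cat"]},
    {"x": 2, "tags": ["cat"]},
    {"x": 2, "tags": ["mouse", "cat", "dog"]},
    {"x": 3, "tags": []},
]


def map_tags(doc):
    # Stand-in for the mapper: one (tag, 1) pair per tag.
    return [(tag, 1) for tag in doc["tags"]]


def reduce_sum(key, values):
    # Stand-in for the reducer: add up the emitted values.
    return sum(values)


counts = {"input": 0, "emit": 0, "reduce": 0, "output": 0}
emitted = defaultdict(list)

for doc in things:
    counts["input"] += 1
    for key, value in map_tags(doc):
        counts["emit"] += 1
        emitted[key].append(value)

output = {}
for key, values in emitted.items():
    if len(values) > 1:
        # Only keys with several values need a reduce call.
        counts["reduce"] += 1
        output[key] = reduce_sum(key, values)
    else:
        output[key] = values[0]

counts["output"] = len(output)
```

Under these assumptions ``counts`` matches the response above: four input
documents, six emitted pairs, two reduce calls (for ``cat`` and ``dog``), and
three output documents.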

All of the optional map/reduce parameters are also supported; simply pass them
as keyword arguments. In this example we use the ``query`` parameter to limit the
documents that will be mapped over:

.. doctest::

  >>> result = db.things.map_reduce(mapper, reducer, "myresults", query={"x": {"$lt": 2}})
  >>> for doc in result.find():
  ...   print doc
  ...
  {u'_id': u'cat', u'value': 1.0}
  {u'_id': u'dog', u'value': 1.0}

With MongoDB 1.8.0 or newer you can use :class:`~bson.son.SON` or
:class:`collections.OrderedDict` to specify a different database to store the
result collection:

.. doctest::

  >>> from bson.son import SON
  >>> db.things.map_reduce(mapper, reducer, out=SON([("replace", "results"), ("db", "outdb")]), full_response=True)
  {u'counts': {u'input': 4, u'reduce': 2, u'emit': 6, u'output': 3}, u'timeMillis': ..., u'ok': ..., u'result': {u'db': ..., u'collection': ...}}
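
An ordered mapping is used here because the order of the keys in the ``out``
specification is part of what gets sent to the server. The following sketch
uses only the standard library to show the property that matters; the variable
names are illustrative:

```python
from collections import OrderedDict

# Ordered "out" specification: the output action first, the database second.
out_spec = OrderedDict([("replace", "results"), ("db", "outdb")])

# Iterating an OrderedDict yields the keys in insertion order.
key_order = list(out_spec)

# The same pairs in a different order form a different OrderedDict.
swapped = OrderedDict([("db", "outdb"), ("replace", "results")])
order_matters = out_spec != swapped
```

``key_order`` is ``['replace', 'db']`` and ``order_matters`` is ``True``, which
is the guarantee a plain dictionary does not give on the Python versions this
documentation targets.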

.. seealso:: The full list of options for MongoDB's `map reduce engine <http://www.mongodb.org/display/DOCS/MapReduce>`_

Group
-----

The :meth:`~pymongo.collection.Collection.group` method provides some of the
same functionality as SQL's GROUP BY.  It is simpler than map/reduce: you need
to provide a key to group by, an initial value for the aggregation, and a
reduce function.

.. note:: :meth:`~pymongo.collection.Collection.group` doesn't work with
          sharded MongoDB configurations; use aggregation or map/reduce
          instead.

Here we do a simple group and count of the occurrences of each ``x`` value:

.. doctest::

  >>> reducer = Code("""
  ...                function(obj, prev){
  ...                  prev.count++;
  ...                }
  ...                """)
  >>> results = db.things.group(key={"x":1}, condition={}, initial={"count": 0}, reduce=reducer)
  >>> for doc in results:
  ...   print doc
  {u'count': 1.0, u'x': 1.0}
  {u'count': 2.0, u'x': 2.0}
  {u'count': 1.0, u'x': 3.0}
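
The roles of the key, the initial value, and the reduce function can be
sketched in pure Python. The ``group`` helper below is a hypothetical stand-in
written for illustration; it is not PyMongo's method and supports only a single
key field:

```python
import copy

# The same four documents inserted in the Setup section.
things = [
    {"x": 1, "tags": ["dog", "cat"]},
    {"x": 2, "tags": ["cat"]},
    {"x": 2, "tags": ["mouse", "cat", "dog"]},
    {"x": 3, "tags": []},
]


def group(docs, key, initial, reduce):
    """Group docs by one field, folding each doc into its group's state."""
    groups = {}
    order = []  # remember the order in which key values first appear
    for doc in docs:
        value = doc[key]
        if value not in groups:
            # Each group starts from its own copy of the initial document.
            state = copy.deepcopy(initial)
            state[key] = value
            groups[value] = state
            order.append(value)
        # Like the JavaScript reducer, this mutates the state in place.
        reduce(doc, groups[value])
    return [groups[value] for value in order]


def count_reducer(obj, prev):
    # Python counterpart of prev.count++.
    prev["count"] += 1


results = group(things, "x", {"count": 0}, count_reducer)
```

``results`` holds one document per distinct ``x`` value with the same counts as
the doctest above, although as Python integers rather than the floating point
numbers produced by the server's JavaScript engine.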

.. seealso:: The full list of options for MongoDB's `group method <http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group>`_