Jeg formoder, at problemet ikke var, at sammenlægningen ville bryde et indeks, men i stedet at sammenlægningen ikke brugte indekser og ville udføre en indsamlingsscanning.
Aggregeringer kan drage fordel af indekser, når der er $ kamp- og/eller $sort-stadier
placeres i begyndelsen af en rørledning. Denne aggregering er kun en enkelt $group
fase, hvilket betyder, at hele samlingen skal gentages for at beregne antallet.
Jeg sætter et simpelt eksempel nedenfor, der viser aggregeringen, der udfører en samlingsscanning, selv når arrayfeltet er indekseret.
> db.foo.insert({ "x" : [ 1, 2 ] } )
> db.foo.insert({ "x" : [ 1 ] } )
> db.foo.createIndex({ "x" : 1 } )
...
> db.foo.aggregate([ { $group: { _id: null, cnt: { $sum : { $size: "$x" } } } } ] )
{ "_id" : null, "cnt" : 3 }
// Results of a .explain() - see 'winningPlan' below
> db.foo.explain(true).aggregate([ { $group: { _id: null, cnt: { $sum : { $size: "$x" } } } } ] )
{
"stages" : [
{
"$cursor" : {
"query" : {
},
"fields" : {
"x" : 1,
"_id" : 0
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "stack.foo",
"indexFilterSet" : false,
"parsedQuery" : {
},
"winningPlan" : {
"stage" : "COLLSCAN",
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 2,
"executionTimeMillis" : 0,
"totalKeysExamined" : 0,
"totalDocsExamined" : 2,
"executionStages" : {
"stage" : "COLLSCAN",
"nReturned" : 2,
"executionTimeMillisEstimate" : 0,
"works" : 4,
"advanced" : 2,
"needTime" : 1,
"needYield" : 0,
"saveState" : 1,
"restoreState" : 1,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 2
},
"allPlansExecution" : [ ]
}
}
},
{
"$group" : {
"_id" : {
"$const" : null
},
"cnt" : {
"$sum" : {
"$size" : [
"$x"
]
}
}
}
}
],
"ok" : 1,
...
}