Friday 15 August 2014

python - MongoDB - Distinct, Limit, and Sort for better results


I am trying to develop a query that mixes the results of a search request in MongoDB. A very simplified example of my collection looks like this. Each document has a location to query on, a ranking for the quality of the listing, and the name of the provider who submitted the entry.

  [
    {"location": "Paris", "ranking": "944", "provider": "alpha"},
    {"location": "Paris", "ranking": "945", "provider": "alpha"},
    ... (further alpha listings for Paris)
    {"location": "Paris", "ranking": "700", "provider": "beta"},
    {"location": "Paris", "ranking": "745", "provider": "beta"},
    {"location": "Paris", "ranking": "885", "provider": "omega"},
    ... (further omega listings for Paris)
    {"location": "London", "ranking": "600", "provider": ...},
    {"location": "London", "ranking": "650", "provider": "beta"}
  ]

As you can see, provider alpha has the most listings and the best rankings.

So when I search for Paris and sort by ranking, all of alpha's entries come out on top and push the beta and omega entries far down.

What I would like to do is limit each provider to 3 results. That way, although the alphas will still be on top, they will be limited to 3, which leaves more room for the betas and omegas. The remaining listings can then be viewed on "page 2" when .skip is used.
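The intended behaviour can be illustrated with a minimal in-memory sketch. It assumes the documents are already loaded into a plain list of dicts, so it says nothing about how to do this efficiently in MongoDB. The helper name `page_per_provider` and its parameters are hypothetical, and the sample documents simply reuse rankings that appear in this post.

```python
from collections import defaultdict


def page_per_provider(docs, location, page=1, per_provider=3):
    """Return one page of results with at most `per_provider` listings per provider.

    Hypothetical helper for illustration only: it works on a plain list of
    dicts, not on a MongoDB cursor.
    """
    by_provider = defaultdict(list)
    for doc in docs:
        if doc["location"] == location:
            by_provider[doc["provider"]].append(doc)

    skip = (page - 1) * per_provider  # plays the role of .skip() in a query
    selected = []
    for listings in by_provider.values():
        # Rankings are stored as strings in the sample, so compare them as ints.
        listings.sort(key=lambda d: int(d["ranking"]), reverse=True)
        selected.extend(listings[skip:skip + per_provider])

    # Merge the per-provider slices into one list, best ranking first.
    return sorted(selected, key=lambda d: int(d["ranking"]), reverse=True)


# Sample data (assumed shape: lowercase keys, rankings as strings).
docs = [
    {"location": "Paris", "ranking": r, "provider": "alpha"}
    for r in ["983", "953", "933", "945", "965", "998"]
] + [
    {"location": "Paris", "ranking": r, "provider": "beta"}
    for r in ["745", "700"]
] + [
    {"location": "Paris", "ranking": r, "provider": "omega"}
    for r in ["500", "885", "670"]
] + [
    {"location": "London", "ranking": "650", "provider": "beta"},
]

page1 = page_per_provider(docs, "Paris", page=1)
page2 = page_per_provider(docs, "Paris", page=2)
```

With this sample, page 1 mixes the three best alpha listings with the beta and omega listings, and page 2 contains only the three remaining alpha listings.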

If I had to do it in Python, a synchronous example would look like this.

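  #!/usr/bin/env python
  # -*- coding: UTF-8 -*-
  from pymongo import DESCENDING

  # colc is the pymongo collection holding the listings
  results = []
  providers_available = colc.find({'location': 'Paris'}).distinct('provider')
  for provider in providers_available:
      # one query per provider: its 3 best-ranked listings
      search = colc.find({'provider': provider, 'location': 'Paris'}).sort('ranking', DESCENDING).limit(3)
      results = results + list(search)
  # merge the per-provider results, best ranking first
  results = sorted(results, key=lambda k: k['ranking'], reverse=True)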

It is heavy, slow, and overall just inefficient, especially with a collection of 2.5 million documents. How can I do all of this on the Mongo side? Thanks!

You can try some server-side JS.

  var providers = db.runCommand({distinct: "colc", key: "provider"}).values;
  for (var p in providers) {
      // top 3 listings for this provider, best ranking first
      var c = db.colc.find({"provider": providers[p]}).sort({"ranking": -1}).limit(3);
      c.forEach(printjson);
  }

But, as with all server-side JS, this will not be the fastest option.

You can also play with the aggregation framework, which does the grouping on the server side.

  db.colc.aggregate([
      {$match: {"location": "Paris"}},
      {$group: {_id: {"provider": "$provider", "location": "$location"}, "ranking": {$addToSet: "$ranking"}}}
  ]);

But you will need some client-side code to select the top rankings for each provider from the resulting arrays.

  {
      "result": [
          {"_id": {"provider": "omega", "location": "Paris"}, "ranking": ["500", "885", "670"]},
          {"_id": {"provider": "beta", "location": "Paris"}, "ranking": ["745", "700"]},
          {"_id": {"provider": "alpha", "location": "Paris"}, "ranking": ["983", "953", "933", "945", "965", "998"]}
      ],
      "ok": 1
  }
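The client-side step could look roughly like the sketch below. It assumes the grouped documents have already been fetched into a Python list with the shape of the aggregation output (an `_id` holding the provider and a `ranking` array of strings). The helper name `top_rankings` is hypothetical and not part of pymongo.

```python
def top_rankings(groups, n=3):
    """Keep the `n` highest rankings for each provider group.

    `groups` is assumed to be the list of grouped documents returned by the
    aggregation; this helper is illustrative only.
    """
    trimmed = {}
    for group in groups:
        provider = group["_id"]["provider"]
        # $addToSet does not guarantee any order, so sort on the client.
        # Rankings are strings here, so compare them as integers.
        trimmed[provider] = sorted(group["ranking"], key=int, reverse=True)[:n]
    return trimmed


# Grouped documents with the same shape as the aggregation output.
result = [
    {"_id": {"provider": "omega", "location": "Paris"},
     "ranking": ["500", "885", "670"]},
    {"_id": {"provider": "beta", "location": "Paris"},
     "ranking": ["745", "700"]},
    {"_id": {"provider": "alpha", "location": "Paris"},
     "ranking": ["983", "953", "933", "945", "965", "998"]},
]

top = top_rankings(result)
```

Note that `$addToSet` collapses duplicate rankings within a provider, so two listings with the same ranking would show up only once in the array.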

