I have a large threaded feed retrieval script in Python.
My question is: how can I load-balance outgoing requests so that I do not hit any one host too many times?
This is a big problem with FeedBurner, because a large percentage of sites serve their RSS through FeedBurner. To complicate matters, many sites alias a subdomain on their own domain to FeedBurner to obscure the fact that they are using it (for example, "MySite" sets its RSS URL to feeds.mysite.com/mysite, where feeds.mysite.com bounces to FeedBurner). Every so often my requests get blocked and redirected.
You should probably make a one-time request (repeated per week or month, whichever fits) for each feed, following the redirects to get its "true" address. Regardless of how you are throttled at that point, you should be able to resolve all the feeds, save that data, and then do it just once for each new feed you add to the list. The value to store is the final URL you end up at after the redirects, as opposed to the URL you put in. When you actually ping a feed, keep requesting the original URL so that it still redirects properly if the owner moves it or similar; keep the "real" address only for load balancing.
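As a rough illustration of the resolution step, here is a minimal sketch using the standard library. `urlopen` follows HTTP redirects by default and the response's `geturl()` reports the URL it ended up at. The names `resolve_host` and `build_host_map`, the `opener` parameter, and the 10-second timeout are my own assumptions for the example, not something from your script.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def resolve_host(feed_url, opener=urlopen, timeout=10):
    """Follow redirects and return the hostname that finally serves feed_url."""
    # urlopen follows HTTP redirects by default; geturl() reports the final URL.
    with opener(feed_url, timeout=timeout) as response:
        final_url = response.geturl()
    return urlsplit(final_url).hostname


def build_host_map(feed_urls, resolve=resolve_host):
    """Map each original feed URL to its resolved host, skipping failures."""
    host_map = {}
    for url in feed_urls:
        try:
            host_map[url] = resolve(url)
        except OSError:
            # URLError and socket timeouts are OSError subclasses; retry these later.
            continue
    return host_map
```

The map is keyed by the original URL on purpose: that is still the address you fetch, and the resolved host is only used to decide which feeds share a rate limit.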
You can then devise a simple load mechanism, such as allowing only X requests per hour for a given domain: go through each feed and skip the feeds whose hosts have hit the limit. If FeedBurner makes its limit public (not likely), you can use it for X; otherwise you will have to guess, and keep the guess low enough to stay under the real limit.
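One possible shape for the "X requests per hour per domain" idea is a sliding-window counter per host, guarded by a lock since the fetcher is threaded. This is only a sketch: `HostThrottle`, `pick_fetchable`, and the `host_map` dictionary (original feed URL to resolved host) are hypothetical names, and `max_requests` stands in for whatever X you settle on.

```python
import threading
import time
from collections import defaultdict, deque


class HostThrottle:
    """Allow at most max_requests per host within a sliding window of seconds."""

    def __init__(self, max_requests, window_seconds=3600.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # host -> timestamps of recent requests
        self._lock = threading.Lock()    # worker threads share this state

    def try_acquire(self, host):
        """Record a request and return True, or return False if host is at its limit."""
        now = self.clock()
        with self._lock:
            hits = self._hits[host]
            # Forget requests that have aged out of the window.
            while hits and now - hits[0] >= self.window_seconds:
                hits.popleft()
            if len(hits) >= self.max_requests:
                return False
            hits.append(now)
            return True


def pick_fetchable(feeds, host_map, throttle):
    """Return the feeds whose resolved host still has budget; skip the rest."""
    ready = []
    for feed_url in feeds:
        host = host_map.get(feed_url)
        if host is not None and throttle.try_acquire(host):
            ready.append(feed_url)
    return ready
```

Feeds that get skipped are not lost; they simply wait for a later pass, by which time older requests to that host will have dropped out of the window.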
Edit: Added the suggestion from a comment.