Friday, 15 April 2011

unix - Finding out when Google last crawled -


I need to find out how recent Google's cached copy of a wide set of our current pages is. My plan is to:

  • Check the access logs for requests with the user-agent "googlebot", then
  • Export a list of each page and when it was last visited.
  • I think this could be a cron job that runs weekly. If that is the right approach, how do I write the script? If not, what would be a better way? (One possible log-parsing approach is sketched below.)
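If you do end up parsing the logs yourself, a minimal sketch along the following lines could work. It assumes a combined-format access log at /var/log/apache2/access.log (both the path and the log format are assumptions to adjust for your server), filters on the "googlebot" user-agent, and prints each page with its most recent visit:

    #!/usr/bin/env python
    """Print the last Googlebot visit per page, parsed from an access log.

    Sketch only: the log path and the combined log format are assumptions;
    adjust them to match your server.
    """
    import re
    from datetime import datetime

    LOG_FILE = "/var/log/apache2/access.log"   # assumed location

    # Combined log format:
    # ip - - [time] "METHOD /path HTTP/1.x" status size "referrer" "user-agent"
    LINE_RE = re.compile(
        r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"\S+ (?P<path>\S+)[^"]*" '
        r'\d+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
    )

    last_visit = {}
    with open(LOG_FILE) as fh:
        for line in fh:
            m = LINE_RE.match(line)
            if not m or "googlebot" not in m.group("agent").lower():
                continue
            # Timestamp looks like 15/Apr/2011:09:30:01 +0000; ignore the offset.
            when = datetime.strptime(m.group("time").split()[0],
                                     "%d/%b/%Y:%H:%M:%S")
            path = m.group("path")
            if path not in last_visit or when > last_visit[path]:
                last_visit[path] = when

    # Export a simple tab-separated list: page, last Googlebot visit.
    for path, when in sorted(last_visit.items()):
        print("%s\t%s" % (path, when.isoformat()))

Running it weekly from cron and redirecting the output to a dated file would produce the exported list described above.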

    Google already provides this information. I've used it for the past three years and it works great.

    Add your site to Google Webmaster Tools and create a generic sitemap XML for your site on your web server (Google for websites that offer free sitemap generation), then relax; there is a section in Webmaster Tools named Crawl Stats that shows exactly what you want.
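    If you would rather not rely on one of those free generators, the short Python sketch below writes a minimal sitemap.xml by hand. The URL list and output path are placeholders, not part of the original answer; adjust them for your site.

        #!/usr/bin/env python
        """Write a minimal sitemap.xml for a fixed list of URLs.

        Sketch only: the URL list and output path are placeholders; real sites
        usually generate the list from a database or a crawl.
        """
        from xml.sax.saxutils import escape

        URLS = [                      # placeholder URLs; replace with your own pages
            "http://www.example.com/",
            "http://www.example.com/about.html",
        ]

        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        for url in URLS:
            lines.append("  <url><loc>%s</loc></url>" % escape(url))
        lines.append("</urlset>")

        # Place the resulting file in your web server's document root.
        with open("sitemap.xml", "w") as fh:
            fh.write("\n".join(lines) + "\n")

    Once the sitemap is in place and submitted in Webmaster Tools, the Crawl Stats section updates on its own, so no cron job is needed for this part.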

    Get Google's view of your site and diagnose problems

    See how Google crawls and indexes your site and learn about specific problems it encounters reaching your pages.

    Discover your link and query traffic

    View, classify, and download comprehensive data about your site's internal and external links with new link reporting tools. Find out which Google search queries drive traffic to your site, and see exactly how users arrive there.

    Share information about your site

    Tell us about your pages with Sitemaps: which ones are the most important to you and how often they change. You can also tell us how you would like the URLs we index to appear.

