One of the lovely parts of Google Search Console is the Settings > Crawl Stats report suite.
And something I only noticed today is the unexpectedly friendly URL structure:
/search-console/settings/crawl-stats/drilldown?resource_id=sc-domain%3Aohgm.co.uk&response=1
If you haven’t had enough coffee today, this probably means the following will do something:
/search-console/settings/crawl-stats/drilldown?resource_id=sc-domain%3Aohgm.co.uk&response=2
and it does.
(I don’t usually see the parameters at the end of the URL because I have too many pinned extensions in Chrome).
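If you wanted to build these yourself, here's a minimal sketch (Python; drilldown_url is just my illustrative helper, nothing official):

```python
from urllib.parse import urlencode

BASE = ("https://search.google.com/search-console/"
        "settings/crawl-stats/drilldown")

def drilldown_url(resource_id, **filters):
    """Build a Crawl Stats drilldown URL for a verified property."""
    # urlencode percent-encodes the colon in "sc-domain:..." to %3A,
    # exactly as it appears in the address bar.
    return f"{BASE}?{urlencode({'resource_id': resource_id, **filters})}"

print(drilldown_url("sc-domain:ohgm.co.uk", response=2))
# https://search.google.com/search-console/settings/crawl-stats/drilldown?resource_id=sc-domain%3Aohgm.co.uk&response=2
```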
This post is just me finding something interesting, rather than uncovering this THING they don’t want you to know.
And the documentation is really good here: https://support.google.com/webmasters/answer/9679690?hl=en
So it’s not like I’m discovering much here. Enjoy.
&response=
I am just going to run through each parameter and list what its values correspond to (the number on each item below is the parameter value).
1. OK (200)
2. Not found (404)
3. Unauthorised (401/407)
4. Other client error (4xx)
5. DNS error
6. Fetch error
7. Blocked by robots.txt
8. This returns a 400 for some reason.
9. Server error (5xx)
10. DNS unresponsive
11. robots.txt not available
12. Page could not be reached
13. Page timeout
14. This returns a 400 for some reason.
15. Redirect error
16. Other fetch error
17. Moved permanently (301)
18. Moved temporarily (302)
19. Moved (other)
20. Not modified (304)
&file_type=
1. HTML
2. Image
3. Video
4. JavaScript
5. CSS
6. Other file type
7. This returns a 400 for some reason.
8. Unknown (failed requests)
9. Other XML
10. JSON
11. Syndication
12. Audio
13. Geographic data
&purpose=
1. Discovery
2. Refresh
3. Other fetch purpose
( ͡° ͜ʖ ͡°)
&googlebot_type=
1. Smartphone
2. Desktop
3. Image
4. Video
5. Page resource load
6. Other agent type
7. Adsbot
8. Storebot
9. AMP
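That's every value list covered. Here's a sketch that spits out the full drilldown URL set for a property, assuming each numeric value is simply the item's position in the lists above:

```python
from urllib.parse import urlencode

BASE = ("https://search.google.com/search-console/"
        "settings/crawl-stats/drilldown")

# Number of values per parameter, counted off the lists above.
PARAM_COUNTS = {
    "response": 20,        # OK (200) ... Not modified (304)
    "file_type": 13,       # HTML ... Geographic data
    "purpose": 3,          # Discovery, Refresh, Other fetch purpose
    "googlebot_type": 9,   # Smartphone ... AMP
}

def all_drilldowns(resource_id):
    """Yield one drilldown URL per parameter/value pair."""
    for param, count in PARAM_COUNTS.items():
        for value in range(1, count + 1):
            yield f"{BASE}?{urlencode({'resource_id': resource_id, param: value})}"

for url in all_drilldowns("sc-domain:ohgm.co.uk"):
    print(url)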
Anything interesting?
Yes, please have a look yourself. I will dig in more once I have the will.
- Why are a few of these options returning a 400?
- I’ve never seen many of these in the wild – are they all functional or placeholders for planned/deprecated/currently broken reports?
For example, I don’t think I’ve ever seen Blocked by robots.txt (&response=7) show up in this report before. It does show up in Coverage with ‘last crawl’ dates (I had a look at this back in 2018 to see whether blocked crawl ‘attempts’ were recorded).
But this particular crawl stats report doesn’t seem to be populated for any sites I currently have access to.
And you would expect it to be, given the Coverage report shows recently ‘crawled’ URLs.
The documentation indicates it probably is intended to function.
Maybe it works for you – let me know! It might be temporarily broken (and I’ve previously been inattentive).
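If you want to check the 400s against your own property, here's a rough sketch. It assumes you paste in a Cookie header copied from a logged-in Search Console session in your browser's dev tools (the placeholder below isn't a real value); unauthenticated requests will just get bounced to a login page:

```python
import requests  # third-party: pip install requests

BASE = ("https://search.google.com/search-console/"
        "settings/crawl-stats/drilldown")

# Placeholder: copy the Cookie header from a logged-in
# Search Console request in your browser's dev tools.
COOKIE = "PASTE_COOKIE_HEADER_HERE"

for value in range(1, 21):  # &response= values 1-20 from the list above
    r = requests.get(
        BASE,
        params={"resource_id": "sc-domain:ohgm.co.uk", "response": value},
        headers={"Cookie": COOKIE},
        allow_redirects=False,  # a 302 here means the cookie didn't take
    )
    print(value, r.status_code)
```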
I hope some tool providers can use these URL formats for more efficient scraping.
This post could be a tweet thread, but if I did that, I’d never find it ever again.
(From the comments: “no, the ones that don’t work for you also don’t work for me. I tried two domain-level properties and two URL-prefix properties. Thank you for the parameter value list / overview.”)