Link Analysis: Filter Non-Indexed Domains

One thing I’ve found incredibly effective for at-scale link analysis is filtering for links from domains or subdomains that are not indexed. To check whether a domain is indexed, run an info:URL query on it. If no result shows for the URL, or a different URL shows, then the URL you are checking does not appear in the index. If this is true for the homepage of the domain, then something is probably wrong with that domain.
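The decision rule above can be sketched in a few lines of Python. This is only an illustration: it assumes you have already collected the URL (if any) returned for each info: query by some other means, and the names `normalise` and `index_status` are hypothetical helpers, not part of any tool mentioned here. The normalisation (case of scheme and host, trailing slash) is also an assumption about which differences should be ignored.

```python
from typing import Optional
from urllib.parse import urlsplit


def normalise(url: str) -> str:
    """Lower-case the scheme and host and drop a trailing slash,
    so trivial variants of the same URL compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def index_status(checked_url: str, result_url: Optional[str]) -> str:
    """Classify one info: lookup.

    result_url is the URL shown for the query, or None if nothing was shown.
    Both non-"indexed" outcomes mean the checked URL is not in the index;
    they are kept separate only to make reports easier to read.
    """
    if result_url is None:
        return "not indexed"
    if normalise(result_url) != normalise(checked_url):
        return "different URL shown"
    return "indexed"


if __name__ == "__main__":
    # Hypothetical lookups: (URL checked, URL returned by the info: query)
    lookups = [
        ("http://example.com/", "http://example.com"),
        ("http://example.org/", None),
        ("http://example.net/", "http://www.example.net/"),
    ]
    for checked, shown in lookups:
        print(checked, "->", index_status(checked, shown))
```

Filtering a link list then comes down to keeping the linking domains whose homepage does not come back as "indexed".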

You can use the index check function from Scrapebox to perform this check at scale: read more

How Accurate Are Site:URL Numbers?

Many SEOs don’t really trust the Site:URL command. Most SEOs also don’t trust the “About X results” numbers that appear when you make a Google search. I didn’t either, and had always assumed that they were pulled out of the air and pretty much useless.

Have you ever scrolled to the end of the results to see how closely the numbers match up? For example:

We know that Moz has more than 700 pages in the index (anyone from Moz, please comment your business-critical WMT numbers below), and that the real number is probably much closer to 226,000 than it is to 700. read more

Know When a Canonical is Obeyed

If you want a quick and dirty way to see whether an implemented canonical is being obeyed by Google, you can make use of the info:URL query. This query is typically used to check whether a URL is indexed (as in Scrapebox), but it can also be used to check whether the canonical URL declared on the page you entered appears as the result. If it does, then the canonical is being obeyed.
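As a rough sketch of that comparison, the Python below pulls the canonical href out of a page’s HTML and compares it with the URL shown for the info: query. It assumes you supply both the HTML and the result URL yourself; `canonical_of` and `canonical_obeyed` are hypothetical names, and relative canonical hrefs are not resolved.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be missing
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def canonical_of(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_obeyed(page_html: str, info_result_url: Optional[str]) -> bool:
    """True when the URL shown for the info: query matches the page's
    declared canonical (ignoring a trailing slash)."""
    canonical = canonical_of(page_html)
    if canonical is None or info_result_url is None:
        return False
    return canonical.rstrip("/") == info_result_url.rstrip("/")
```

For a parameterised page, `canonical_obeyed(html, shown_url)` returning True would correspond to the parameter-free URL appearing as the result.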

For example, Clarins UK have some lovely parameters in use that they don’t want indexed or working against them. They’ve used the canonical tag on these pages to direct Google to the parameter-free URL. If we search info:,en_GB,sc.html?prefn1=collection&prefv1=Multi-Active in Google, we receive this result: read more

Diminishing Returns with rel="author"

If links from different domains or different IPs are given their weight because each is seen as an editorial vote from a single entity (an individual), then tying together link building efforts to an authorship profile may not be the best thing to do. If we’re following Mr Cutts’ line that some forms of what we’d call “attribution linking” aren’t strictly editorial, then there might be reason for concern. To use authorship markup at all you have to be drinking at least a little from Google’s trough. read more

Identify Which URLs in Your Sitemap Aren’t Indexed

This shouldn’t be your first port of call for looking at indexation issues. Ideally, the site should have logically separated sitemaps to make any indexation problems easier to spot (e.g. is it the product pages that are suffering, or the subcategory pages?). The URLs in the sitemap should all return a 200 status code, and (nearly always) include only pages you want indexed (nothing blocked in robots.txt, or carrying restrictive meta directives, etc.). If we have something more esoteric causing the issues, then this post might help. In webmaster tools we get this familiar report – read more
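A minimal Python sketch of the general idea, not necessarily the method the full post goes on to describe: read the URLs out of a sitemap and list the ones missing from a set of URLs you have already confirmed as indexed. How that indexed set is gathered is left open here, and `sitemap_urls` and `not_indexed` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

# Namespace used by standard XML sitemaps
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> List[str]:
    """Return every <loc> value listed under <url> in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def not_indexed(sitemap_xml: str, indexed_urls: Iterable[str]) -> List[str]:
    """Return sitemap URLs absent from indexed_urls, in sitemap order.

    indexed_urls is assumed to come from your own index checks;
    URLs are compared as exact strings.
    """
    indexed = set(indexed_urls)
    return [url for url in sitemap_urls(sitemap_xml) if url not in indexed]
```

Running this once per sitemap keeps the benefit of logically separated sitemaps: you can see at a glance whether the missing URLs cluster in one section of the site.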