Whether the Google Disavow Links tool is a good thing or a bad thing doesn’t really matter. There will be enough published recoveries to get penalized webmasters salivating. Tin Foil Hat Time…
Machine Learning
reCaptcha, as I understand it, is an exercise in machine learning. The machine uses image recognition to assign a confidence value to some image (in this case text). reCaptcha then uses humans to improve the algorithm, by having them enter the images of text as they read them. Typically this will be the same, or similar to, what the machine thinks. This data is then fed back into the OCR program.
I think spam flags might work in a similar way. The Disavow Links tool is an excellent way to collect data that would allow Google to improve its algorithmic detection of spam – webmasters improving the machine’s confidence that certain types of links are less-than-kosher. Some webmasters are probably thinking “damn, I’m not dumb enough to fall for that one!”, and they’re right; they aren’t going to fall for it. But they might fall prey to it.
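To make that feedback loop concrete, here’s a minimal, purely hypothetical sketch (in Python, as a doodle rather than anything real): each disavowed link is read as a human label that nudges the machine’s confidence that the underlying tactic is spam. The tactic names, the prior and the update rule are all invented for illustration; nothing here reflects how Google actually scores links.

```python
# Toy model of the feedback loop described above: each disavowed link acts as
# a human "this is spam" label that nudges the machine's confidence that its
# underlying link pattern is manipulative. The tactic names, prior and update
# rule are all invented; nothing here reflects how Google actually scores links.
from collections import defaultdict

# Prior confidence (0-1) that links built with a given tactic are spam.
spam_confidence = defaultdict(lambda: 0.2)

def record_disavow(tactic: str, learning_rate: float = 0.1) -> None:
    """Treat one disavow submission as a human label and move the
    confidence for that tactic a step toward 1.0."""
    current = spam_confidence[tactic]
    spam_confidence[tactic] = current + learning_rate * (1.0 - current)

# A handful of penalized webmasters all disavow links from the same tactics.
for tactic in ["blog_network", "forum_profile", "paid_directory"] * 3:
    record_disavow(tactic)

for tactic, confidence in sorted(spam_confidence.items()):
    print(f"{tactic}: {confidence:.2f}")
```

The point isn’t the arithmetic; it’s that every submission is free, human-flagged training data, whether or not the submitter recovers.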
Imagine there are three types of spammy link-building tactics being employed by three different webmasters, carrying a small, medium and large risk respectively. Webmaster A happens to be using the low-risk method. She knows the other webmasters have been penalized, but as she has not suffered a penalty herself, she sees no reason to disavow her links. Webmasters B and C have been using all three tactics and have been algorithmically penalized. Not knowing how to recover (“I’ve got nothing to lose but my sweet, sweet AdSense revenue!”), they submit links from all three methods – increasing the future risk associated with all of them.
- Not all spam will get you penalized.
- If you are penalized and don’t know what form of spam caused it, you’ll at least have some ideas about what could have.
- There is a cautious way to approach this tool when penalized, but enough people will use a shotgun approach that you might as well too.
The effect of submission apparently isn’t overnight. I don’t see a reason why Google couldn’t crawl those links pretty much right away – but not doing so has a helpful consequence. The associated wait will push webmasters toward a “more is more” approach to submission. Agencies will get itchy. Why risk submitting only the stuff you’re certain Google can tell is bad, when it might not get you recovered? “It’s safest to confess”, unless you want to wait a few weeks (or however long, it’s undisclosed) to see if you’re still on the naughty list.

If you aren’t under penalty but have a high proportion of questionable links to your domain, it’s reasonable to be worried right now. If you can afford to do so, have your spam categorized, and when it comes to submission time, don’t submit it all at once – some of it just might be helping you rank (for now). The likelihood is that the proportion of your links helping you rank today is only going to get smaller.
Submissions at Scale (For Whatever Reason)
Firstly, what Google knows and what Google tells you it knows are two different things. It would be nice if they gave you every link they’ve crawled in the wild, but they won’t. Telling Google you know more than this (by submitting links from other indexes besides Webmaster Tools) may not be the best idea, since it suggests you might be in the business of SERP manipulation (another spam flag). But this doesn’t matter, because other people will do it anyway. Just consider the scale of submissions that will be coming from agencies with penalized clients: nice, human-flagged examples of manipulative link practices. Think of all the smart people testing the tool (using Xrumer or whatever for temporary rankings, to see if they can disavow before a penalty is incurred). The majority of testing will be on links that are clearly and obviously risky. But as for the minority: dwarves always dig too deep. Result: the algo gets smarter. This isn’t strictly a prisoner’s dilemma (or a bad thing), but it certainly feels like one.
I didn’t do any reading on this. Sorry if it’s just adding more noise.
That’s what I said when I first heard about it: https://twitter.com/SemMetric/status/258308110039523329
Of course the effect won’t be overnight; that would make it way too easy to reverse engineer. Google may still crawl the links earlier, but the effect will be delayed.
Nice round-up of the disavow tool. I’ve heard of a few people getting one keyword penalised who are afraid to submit a reconsideration request for it, because then Google might realise their whole backlink profile is spam! So disavow could go the same way!