You can use robots.txt to remove pages from the index, including those blocked from crawling using robots.txt. This has advantages over the on-page meta robots noindex method.
Server Logs After Excel Fails – BrightonSEO 2016
This post is a write-up of the talk on server log analysis I gave at BrightonSEO in April 2016.
Wayback Machine for Historical Redirect Chains
A post on scraping the wayback machine for URLs to redirect.
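As a rough illustration of the idea (not necessarily the method used in the post), the Wayback Machine's CDX endpoint can list every URL it has archived for a domain, which gives a starting list of legacy URLs to check for redirects. The function names, the `limit` default and the choice of query parameters below are assumptions made for this sketch.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Wayback Machine CDX search endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=1000):
    """Build a CDX query URL listing the unique archived URLs for a domain."""
    params = {
        "url": domain,
        "matchType": "domain",   # include subdomains
        "output": "json",
        "fl": "original",        # only return the original URL field
        "collapse": "urlkey",    # one row per distinct URL
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def rows_to_urls(rows):
    """Drop the header row that the JSON output starts with."""
    return [row[0] for row in rows[1:]]


def archived_urls(domain, limit=1000):
    """Fetch the archived URLs for a domain (makes a network request)."""
    with urllib.request.urlopen(build_cdx_query(domain, limit), timeout=30) as response:
        body = response.read().decode("utf-8")
    return rows_to_urls(json.loads(body)) if body.strip() else []
```

Calling `archived_urls("example.com")` returns a list of historical URLs, which can then be checked against the live site to find pages that need redirecting.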
ICANN Drop Your Domain
If you’ve ever taken a look at more competitive SERPs, you’ve likely run into the completely bogus whois data that’s used to preserve anonymity. This is frustrating but makes sense; no-one wants to be directly linked to spam-fuelled domains, and spammers don’t want to link their domains together.
Interestingly, this opens the domains up to a serious vulnerability.
You’re breaking the rules if you provide fake whois information, and breaking these rules (even accidentally) can get your site disabled, and even make your domain name available for purchase by others.
In this blog post we’re going to report one of my own domains to ICANN (Internet Corporation for Assigned Names and Numbers). Here’s the ICANN procedure in brief:
For our test, I’ve registered fakewhois.xyz (you can register a site on this questionable TLD for $1 on Namecheap, with free WhoisGuard):
We aren’t using the free WhoisGuard. After registration, I update the whois records with my registrar to the following:
Mr. Fake
123 Fake Street
Springfield; NA W8 1BF
GB
tel: +44.1234567891
fax: +1.1234567891
[email protected]
The majority of fake whois data is set up with completely fake accounts, which do not forward to a monitored account. As a result they are vulnerable to this method.
The email used will forward to my own, but it will be ignored. We want to see how the registrar acts, and if they raise the issue of the reported fake data at the registrar account level (rather than simply contacting [email protected]).
Domains under WHOIS protection still reveal their information to ICANN, just not the general public. Still, since my plan is to anonymously snitch on myself here, I make this information public:
Once this fake information is public and verified with external whois services, we report the site to ICANN:
Date of submission: 24.11.2015
One week after submission, ICANN respond with the following:
Thank you for submitting a Whois inaccuracy complaint concerning the domain name http://fakewhois.xyz. Your report has been entered into ICANN's database. For reference your ticket ID is: OUM-161-68604. A 1st Notice will be sent to the registrar, and the registrar will have 15 business days to respond. For more information about ICANN's process and approach, please visit http://www.icann.org/en/resources/compliance/approach-processes . Sincerely, ICANN Contractual Compliance
Bulk Inspect http Response Headers
There are plenty of SEO reasons you might want to look at HTTP headers. Google love offering them as an alternative implementation for a number of directives, including:
- Vary: User-Agent
- Canonical
- Hreflang Implementation
- X-Robots-Tag (noindex, nofollow)
Link: <http://es.example.com/>; rel="alternate"; hreflang="es"
Link: <http://www.example.com/>; rel="canonical"
X-Robots-Tag: googlebot: nofollow
Vary: User-Agent
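To make headers like these easier to audit, here is a minimal sketch of splitting a `Link` header value into its URL and parameters. The function name and the regular expressions are my own for illustration, and they only cover the simple forms shown above rather than every case the specification allows.

```python
import re

# One link-value: <url> followed by zero or more ;name=value parameters.
LINK_PATTERN = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
# A single parameter, with a quoted or bare value.
PARAM_PATTERN = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of dicts, one per link, with 'url' plus any parameters."""
    links = []
    for url, raw_params in LINK_PATTERN.findall(value):
        link = {"url": url}
        for name, quoted, bare in PARAM_PATTERN.findall(raw_params):
            link[name.lower()] = quoted or bare
        links.append(link)
    return links


if __name__ == "__main__":
    print(parse_link_header('<http://es.example.com/>; rel="alternate"; hreflang="es"'))
```

Filtering the result on `rel` then separates canonical declarations from hreflang alternates.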
If anyone’s doing anything a little sneaky, you can sometimes spot it in the response headers.
There are a number of tools that let you inspect the headers for a single URL, including your browser (press F12 and poke about to get something like the following).
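For more than a handful of URLs, a short script is quicker than the browser. The following is a minimal sketch using only the Python standard library. The function names and the choice of which headers count as SEO-relevant are assumptions for illustration, and it is not necessarily the approach the full post takes.

```python
import urllib.error
import urllib.request

# Headers that commonly carry SEO directives (an illustrative selection).
SEO_HEADERS = ("link", "x-robots-tag", "vary")


def pick_seo_headers(header_pairs):
    """Keep only SEO-relevant headers from (name, value) pairs.

    Repeated headers (for example several Link headers) are all kept.
    """
    found = {}
    for name, value in header_pairs:
        key = name.lower()
        if key in SEO_HEADERS:
            found.setdefault(key, []).append(value.strip())
    return found


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as pairs.

    urlopen follows redirects, so these are the headers of the final response.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.getheaders()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry headers worth inspecting.
        return list(error.headers.items())


def inspect_urls(urls):
    """Map each URL to its SEO-relevant response headers (network requests)."""
    return {url: pick_seo_headers(fetch_headers(url)) for url in urls}
```

Passing a list of URLs to `inspect_urls` returns one dictionary per URL. Connection failures are not handled here, so a real crawl would want to catch `urllib.error.URLError` and add some rate limiting.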
Installing Applications to SD Card in Windows 10
A few months ago I bought a cheap tablet running Android and Windows 10 (no regrets so far). With this came the desire to run full versions of Windows-specific applications portably, using either a tethered connection or readily accessible WiFi.
This is impractical, given that tablets have limited on-board storage. We may only have 8GB to work with. But SD cards are cheap (I’ve seen branded 128GB micro SD cards for £35 at the moment).
The release of Windows 10 included a disabled ‘install to SD card‘ feature pegged for a future release, so I was unable to write this post until then. The ‘Threshold 2‘ or ‘November‘ update re-enabled this feature.