How Googlebot reacts to broken links (404 errors)

November 17th, 2008

Well, sometimes you have to learn it the hard way. Three days ago I was really amazed by the huge increase in traffic I was getting after I had started blogging. I wrote new blog entries and new content and thought there was only one way for the traffic to go: up. Wrong! Yesterday and today have been far worse than the two days before, which I know thanks to Google Analytics and WordPress.com Stats. So I kept asking myself what was going wrong here… And I just found out the answer: I literally pissed off Googlebot with broken links, and as a consequence a whole bunch of pages from my blog were dropped from the search index.

All this happened because I wanted to optimize my blog for search engines and changed the URL format. I was also playing around with linking to the blog entry IDs instead of the permalinks… bad idea! If you omit the domain part and use relative paths instead of absolute ones, the links work fine from the main page of the blog but break as soon as you click one of them from within a blog entry. So I recommend linking only to permalinks, and don’t - I mean really NEVER - change your URL structure once it has been set up! My post about installing cakePHP on Mac became quite popular (by my standards of course…) and I’m still losing a lot of traffic because the original URL all those sites link to is gone.
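To make the relative-link problem concrete, here is a minimal sketch of how a browser (or a crawler) resolves the same link from two different pages. The domain `blog.example.com` and all paths are made up for illustration; they are not the real URLs of this blog.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve a link the way a browser does: relative to the page it appears on."""
    return urljoin(page_url, href)


# Hypothetical URLs, for illustration only.
HOME = "http://blog.example.com/"
ENTRY = "http://blog.example.com/2008/11/some-entry/"
PERMALINK = "http://blog.example.com/2008/11/installing-cakephp/"

# The same relative link, written without the domain part.
relative_href = "2008/11/installing-cakephp/"

from_home = resolve(HOME, relative_href)    # lands on the intended post
from_entry = resolve(ENTRY, relative_href)  # gets appended to the entry's own path

print(from_home)
print(from_entry)

# A full permalink resolves to the same target no matter where it is used.
print(resolve(ENTRY, PERMALINK))
```

From the main page the relative link points at the right post, but from inside an entry it is appended to that entry's path and ends up at a URL that does not exist. The full permalink is immune to this, which is why it is the safer choice.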

So what happened? Googlebot started to crawl my site as usual but soon stumbled across a couple of 404 errors (page not found). It seems that at some point this just got to be too much (17 errors): Google stopped crawling the site, and several pages that still existed were dropped from the index! I checked this by searching for terms where I usually ranked pretty high and by using the site: query. I found all this out using the great Google Webmaster Tools, which showed me exactly where the errors occurred. After fixing all the broken links I also used those tools to - hopefully - repair the damage as quickly as possible: I used the URL removal tool to mark several directories as outdated and re-submitted my Sitemap with the working URLs. As I already posted a couple of days ago, Googlebot is fast, so I hope this will be fixed soon.
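You don't have to wait for a crawler to find these errors. Below is a rough sketch of a check you could run yourself over a list of URLs, for example the ones in your Sitemap. It is not what Google Webmaster Tools does internally; the function names `http_status` and `find_broken_links` are my own, and the sketch assumes your server answers HEAD requests.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    request = Request(url, method="HEAD")  # HEAD: headers only, no page body
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is on the error.
        return err.code
    except URLError:
        # DNS failure, refused connection, timeout and similar.
        return None


def find_broken_links(urls, get_status=http_status):
    """Return (url, status) pairs for every URL that is unreachable or answers 4xx/5xx."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Calling `find_broken_links(list_of_urls)` returns the URLs that need fixing, with 404 showing up as the status for pages that no longer exist. The `get_status` parameter is only there so the check can be tried out without touching the network.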

The two major lessons I take away from this: get your links right so that you have ZERO 404 errors on your blog, and get your permalink structure right the first time and NEVER change it! I hope this is useful for some of you (I make the mistakes so you don’t have to…) and if you have other tips on this topic, please share them by commenting below!

Project WebMoney, Search Engine Optimization