Getting all your pages in the Google index
If you have followed my blog lately, you will know that I had some problems with entries of my blog being dropped from the Google index. This was several days back, and I assumed it had to do with a lot of 404 errors Googlebot was getting. I tried to fix this by resubmitting the sitemap and requesting that old content be removed. This took a while, but eventually today it seemed to work. Well, at least partly.
The first removal request with the Google Webmaster Tools took around 2 days. The second request took only 10 hours or so, and my third request was processed within a couple of hours! I really don’t know why it took longer in the beginning and was super fast at the end, but now the number of pages on tobman.com listed in the Google index has dropped from around 120 to 15! Great news! You can check this yourself using this search query:
site:www.tobman.com
Of course you should try to run this query with your domain as well. The goal is to have only relevant content indexed. While getting rid of the outdated content worked out, I still have the problem that not all of my pages are being indexed. It’s much better now than a couple of days back, but currently out of my 22 posts only 12 are indexed. The Webmaster Tools even say that 0 out of the 23 links submitted with the sitemap are indexed. It seems that it takes some time for this report to update (although I think I have never seen anything other than 0 so far).
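If you want to see exactly which pages are missing instead of just counting them, a small script can compare the sitemap with the URLs you found via the site: query. This is only a rough sketch: the function names, the example.com URLs and the sample sitemap are made up for illustration, and the list of indexed URLs is something you would have to collect by hand from the search results.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap format (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def _normalize(url):
    """Ignore a trailing slash so /post-a and /post-a/ count as the same page."""
    return url.rstrip("/")


def missing_from_index(xml_text, indexed_urls):
    """Return the sitemap URLs that are not in the hand-collected indexed list."""
    indexed = {_normalize(u) for u in indexed_urls}
    return sorted(u for u in sitemap_urls(xml_text) if _normalize(u) not in indexed)


# Made-up sample data, not my real sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/post-a</loc></url>
  <url><loc>http://www.example.com/post-b</loc></url>
</urlset>"""

SAMPLE_INDEXED = ["http://www.example.com", "http://www.example.com/post-a/"]

if __name__ == "__main__":
    for url in missing_from_index(SAMPLE_SITEMAP, SAMPLE_INDEXED):
        print("not indexed:", url)
```

With your own sitemap file and your own list of indexed URLs, the output is simply the pages that still need attention.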
I sat down earlier and tried to figure out what the reasons for not being indexed could be. I honestly could not find any pattern. I had a look at the number of onsite and offsite links and whether pictures are included in the post or not - this doesn’t seem to affect the indexing behaviour. The only thing that might be the case is that the posts not indexed seem to be a little shorter than the others. Maybe Google thinks their content is too short and therefore not worth being listed? I don’t know… I will try to link again to some of the “lost” posts, just as I do to my great post on why content is the key, and see whether this changes anything. If not, I will try to lengthen those posts as an alternative approach.
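The "shorter posts" hunch is easy to check with numbers instead of by eye. Here is a minimal sketch, assuming you have the text of each post in a dictionary keyed by some identifier and know which ones are indexed. The names and the sample posts are hypothetical, and a difference in word count would of course not prove that length is what Google cares about.

```python
from statistics import median


def word_count(text):
    """Count whitespace-separated words in a post."""
    return len(text.split())


def compare_lengths(posts, indexed_ids):
    """Return (median words of indexed posts, median words of the others).

    posts: dict mapping a post identifier to its text.
    indexed_ids: set of identifiers that showed up in the index.
    A side with no posts is reported as None.
    """
    indexed = [word_count(t) for pid, t in posts.items() if pid in indexed_ids]
    missing = [word_count(t) for pid, t in posts.items() if pid not in indexed_ids]
    return (
        median(indexed) if indexed else None,
        median(missing) if missing else None,
    )


if __name__ == "__main__":
    # Made-up example posts, just to show the call.
    sample_posts = {
        "long-post": "word " * 400,
        "medium-post": "word " * 250,
        "short-post": "word " * 80,
    }
    print(compare_lengths(sample_posts, {"long-post", "medium-post"}))
```

I use the median rather than the average here so that one very long post does not distort the comparison.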
Just as a side note: this has been the fourth blog entry on search engine optimization in a row. I hope you find it useful… If not, don’t worry, I have some new ideas for CakePHP posts and will try to write them as soon as possible!