9 May 2007
[Google] Blog Search indexes blogs by their site feeds
So when I recently started to receive Google Blog Search alerts for Digg articles that didn’t contain any of my search terms, I figured I’d found a bug, posted about it in the forum and emailed the Google Alert and Google Blog Search teams. After receiving more of these alerts on a daily basis, and having heard nothing from Google, I decided to investigate whether there was a reason for them.
The search phrase in question was my name in double quotes, so it was quite obvious to me that this phrase didn’t appear in any of the feed items being returned. After carrying out a few searches, I discovered a blog* that was automatically aggregating RSS feeds from Digg and Google Blogoscoped (amongst others). However, due to a bug in their code, my name was occasionally being used as the link text for various Digg articles whenever I was incorrectly identified as the post’s author.
Since this was the only way my name could have been linked to the Digg articles being returned in the search results, it appears that Google Blog Search doesn’t index blogs by their feeds alone; it also seems to use link text from other feeds (or perhaps just other web pages), probably in much the same way as Google’s web search does.
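To make the suspected mechanism concrete, here is a minimal sketch of link-text indexing in Python. It is only an illustration of the general idea, not Google’s implementation, which isn’t public. The URLs, the name “Jane Doe” and the helpers `build_index` and `search` are all hypothetical, and the search matches terms in any order rather than as an exact quoted phrase.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collects a page's visible text and its (href, link text) pairs."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # all visible text on the page
        self.links = []        # (target URL, link text) pairs
        self._href = None      # target of the <a> currently open, if any
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._link_text = []

    def handle_data(self, data):
        self.text_parts.append(data)
        if self._href is not None:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._link_text).strip()))
            self._href = None


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the URLs it is associated with.

    A URL is associated with the terms in its own text AND with the
    terms in the link text of any page that links to it.
    """
    index = defaultdict(set)
    for url, body in documents.items():
        parser = LinkTextParser()
        parser.feed(body)
        parser.close()
        for term in tokenize(" ".join(parser.text_parts)):
            index[term].add(url)
        for target, link_text in parser.links:
            for term in tokenize(link_text):
                index[term].add(target)  # credited to the link's target
    return index


def search(index, query):
    """Return the URLs associated with every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


# Hypothetical data: an aggregator mislabels a Digg story with a person's name.
STORY_URL = "http://digg.example/story"
AGGREGATOR_URL = "http://aggregator.example/item"
documents = {
    STORY_URL: "<p>A story about widgets</p>",
    AGGREGATOR_URL: '<p>Posted by <a href="%s">Jane Doe</a></p>' % STORY_URL,
}

index = build_index(documents)
results = search(index, "Jane Doe")
```

In this toy example `results` includes the Digg-style story even though the name never appears in the story’s own text, which mirrors the alerts I was receiving.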
Although this produced irrelevant results for me in this instance, the theory is good. Presuming link text is relevant, this should improve Google Blog Search results. However, I think this raises some interesting questions:
- Is Google Blog Search just using link text from indexed feeds or is it being integrated with Google’s standard web index?
- Would it be possible to create a Googlebomb for Google Blog Search using this knowledge?
- Does PageRank count when it comes to blogs and feeds?
- Is Googlebot crawling links found only in feeds and does it obey the
- Will Google Blog Search try to ignore paid links?
What does this mean for SEO? Will webmasters now start to focus their efforts on getting inbound links from third-party feeds?
* Rather than being one of those annoying blogs that “borrows” RSS content in order to generate revenue, the owner of this blog states, “i am not scraping content for money reasons, just because i am lazy to visit the blogs every day.”