Search Friendly Website – Talking to Spiders

As it appeared in the Presence Pointers column of the April 2008 issue of “Business Watch” magazine.

On the simplest level, a Web site visitor can be classified as human or spider. Human visitors are the obvious and highly desirable kind, since they are the only ones of the two that can buy from us. Back in November of 2007, I talked about the importance of making sure your Web site meets your human visitors’ needs in the article It’s not about you.

But what about these spiders? Is your Web site about them, and how could it be if it is about your human visitors? Thankfully, your site is still about your human visitors, even when the spiders come crawling. While your message doesn’t need to change, we may need to change how it is delivered to get the most out of the message for both humans and spiders.

Search engine spiders, sometimes called crawlers or bots (short for robots), spend all of their time crawling the Web. Crawling is the first step that enables a Web page to be returned for a search in any of the search engines. Much more goes on than just crawling – in fact, crawling is the easiest part of a search engine to understand; the indexing and retrieval aspects are far more complex.

For our sake, what is important to understand is that these spiders see the Web differently. Actually, they don’t see at all, which is one of the challenges. Without question, a Web site needs to speak to its human visitors, but we also want it to be meaningful to spiders, since it may be the search engines that help deliver many of our human visitors.

Spiders are all about text. Not only are they able to consider all of the words on a page through complex processing, they are also often able to understand some of the basic meaning and overall context of a page. Because of this, they can often determine the correct meaning of a word based on the other words around it and on the page.

Certain aspects of a page carry greater importance in establishing context. The title of a page, which appears in the top “chrome” of the browser window, is the most important element to a spider in understanding what the page is about. Headings on a page (the h1, h2, h3 and similar tags) carry considerable importance too, and their proper usage helps create a content hierarchy.
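To make this concrete, here is a minimal sketch of how a text-only reader might pull the title and heading outline out of a page. It uses Python's standard html.parser module. The names OutlineParser, outline and SAMPLE_PAGE are made up for illustration, and the sample markup is invented. Real search engine spiders are far more sophisticated than this, so treat it as a way to preview the structure of your own page, not as a description of how any search engine works.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects the page title and headings in document order (illustrative only)."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []     # list of (level, text) tuples
        self._current = None   # tag whose text is being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        # Collapse runs of whitespace so the text reads as a single line.
        text = " ".join("".join(self._buffer).split())
        if tag == "title":
            if not self.title:  # keep the first title found
                self.title = text
        else:
            self.headings.append((int(tag[1]), text))
        self._current = None


def outline(html):
    """Return (title, headings) for an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.title, parser.headings


# Invented sample page, used only to demonstrate the sketch.
SAMPLE_PAGE = """
<html><head><title>Handmade Oak Furniture | Example Workshop</title></head>
<body>
<h1>Handmade Oak Furniture</h1>
<h2>Dining Tables</h2>
<h2>Bookcases</h2>
</body></html>
"""

if __name__ == "__main__":
    title, headings = outline(SAMPLE_PAGE)
    print("Title:", title)
    for level, text in headings:
        print("  " * (level - 1) + text)
```

If the printed outline does not read like a sensible table of contents for the page, the headings are probably being used for visual styling rather than for structure.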

Perhaps the next most important element that you can control on your site is links. Links are a little different from titles and headings, though, as their primary signal is about the page the link leads to, rather than the page they are on.
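The idea that a link describes its destination can be sketched in a few lines. The example below, again using Python's standard html.parser, pairs each link's destination with the words a text-only reader would find in it, falling back to the alt text when the link wraps an image. LinkTextParser, link_texts and the sample markup are hypothetical names and data chosen for this illustration, not part of any search engine's actual processing.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collects (href, text) pairs; image links contribute their alt text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # destination of the link being read, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag == "a" and href:
            self._href = href
            self._parts = []
        elif tag == "img" and self._href is not None:
            # An image has no words of its own; only its alt text is readable.
            self._parts.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join(" ".join(self._parts).split())
            self.links.append((self._href, text))
            self._href = None


def link_texts(html):
    """Return a list of (destination, link text) pairs for an HTML string."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample markup, used only to demonstrate the sketch.
SAMPLE_LINKS = """
<p>See our <a href="/tables">oak dining tables</a> or
<a href="/bookcases"><img src="bookcase.jpg" alt="Solid oak bookcases"></a>.
<a href="/more">click here</a></p>
"""

if __name__ == "__main__":
    for href, text in link_texts(SAMPLE_LINKS):
        print(f"{href!r} is described as {text!r}")
```

In the sample, the first two links tell a reader something about the pages they lead to, while the third says nothing about its destination.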

So there we have three absolutely critical elements to focus on: page titles, headings and links. Of course you want to make sure that the rest of your site is meaningful as well. Doing so will make sure that both types of visitors find what they want and what you have to offer, and keep coming back for more.

Being Spider Friendly

    • Page titles (found within the title tags) should be topically relevant to the page they are on. Ideally they should use some of the most important keyword phrases related to the page. Most importantly, every page should have a unique page title.
    • Page headings should reinforce the page titles.
    • Links should preferably be text-based or, if image-based, include alt (alternative text) attributes. The link text, or alt text for images, should be topically relevant to the destination page, rather than generic phrases like “more,” “next,” or “click here.”
    • Any important text on a page should be real HTML text rather than part of an image.
    • Flash®, which I covered in last August’s article Is your Web site Flash in the Pan?, is best used for accent and user interaction, but not as the primary page content or to deliver important information since spiders have a difficult time accessing and understanding text within Flash.
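
Several items on this checklist can be checked mechanically. The following is a rough sketch, not a complete audit tool: it flags missing or duplicate page titles, generic link text and image links without alt text across a small set of pages. The names PageFacts, audit, GENERIC_LINK_TEXT and SAMPLE_SITE are hypothetical, the generic phrases are simply the three examples named above, and the sample pages are invented.

```python
from collections import defaultdict
from html.parser import HTMLParser

# The generic phrases named in the checklist; extend as needed.
GENERIC_LINK_TEXT = {"more", "next", "click here"}


class PageFacts(HTMLParser):
    """Gathers the title, link texts and alt-less image links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.link_texts = []
        self.linked_images_missing_alt = 0
        self._in_title = False
        self._in_link = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self._in_link = True
            self._parts = []
        elif tag == "img" and self._in_link:
            alt = (attrs.get("alt") or "").strip()
            if alt:
                self._parts.append(alt)
            else:
                self.linked_images_missing_alt += 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_link:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._in_link:
            self.link_texts.append(" ".join(" ".join(self._parts).split()))
            self._in_link = False


def audit(pages):
    """pages maps a URL to its HTML; returns a list of readable warnings."""
    warnings = []
    urls_by_title = defaultdict(list)
    for url, html in pages.items():
        facts = PageFacts()
        facts.feed(html)
        facts.close()
        title = " ".join(facts.title.split())
        if title:
            urls_by_title[title.lower()].append(url)
        else:
            warnings.append(f"{url}: missing page title")
        for text in facts.link_texts:
            if text.lower() in GENERIC_LINK_TEXT:
                warnings.append(f"{url}: generic link text '{text}'")
        if facts.linked_images_missing_alt:
            count = facts.linked_images_missing_alt
            warnings.append(f"{url}: {count} image link(s) without alt text")
    for urls in urls_by_title.values():
        if len(urls) > 1:
            warnings.append("duplicate title shared by: " + ", ".join(urls))
    return warnings


# Invented three-page site, used only to demonstrate the sketch.
SAMPLE_SITE = {
    "/": '<title>Example Workshop</title><a href="/tables">click here</a>',
    "/tables": '<title>Example Workshop</title><a href="/"><img src="logo.png"></a>',
    "/bookcases": '<title>Solid Oak Bookcases</title><a href="/tables">Oak dining tables</a>',
}

if __name__ == "__main__":
    for warning in audit(SAMPLE_SITE):
        print(warning)
```

A check like this only catches the mechanical problems. Whether a title or a link's text is actually relevant to its page is still a judgment call for a human reader.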