Straight From The Horse’s Mouth – Get Googlized

Date: 2011.02.24 | Category: Search Engine Optimization

Many webmasters wonder how to ensure their sites will be included in Google’s index of web sites. Although Google crawls more than a billion pages, it’s inevitable some sites will be missed. When Google does miss a site, it’s frequently for one of these reasons:

* The site is not well connected through multiple links to others on the web.
* The site launched after Google’s last crawl was completed.
* The design of the site makes it difficult for Google to effectively crawl its content (excessive frames, tables, etc.).

Google’s intent is to represent the content of the Internet fairly and accurately. To help make that goal a reality, we offer this guide to building a “crawler-friendly” site. There are no guarantees a site will be found by our crawler, but following these guidelines should increase the probability that your site will show up in Google search results.

Do…
Provide high-quality content on your page – especially your home page.
If you follow only one tip from this page, this should be it. Our crawler indexes web pages by analyzing the content of the pages themselves. Google will index your site better if your pages contain useful information. Plus, your site has a better chance of becoming a favorite among web surfers and being linked to by others if the information it contains is relevant and useful.

Submit your site to the appropriate category in a web directory.
Listing your site in the Open Directory Project (http://www.dmoz.org/) or Yahoo! (http://www.yahoo.com/) increases the likelihood it will be seen by robot crawlers and web surfers.

Pay attention to HTML conventions.
Make sure that your <TITLE> tags and ALT attributes are accurate and descriptive. Also, check your <A HREF> tags for errors, since broken or improperly formatted links can prevent Google from indexing your page.
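
As a rough illustration of this kind of check, the sketch below uses Python's standard html.parser module to flag an empty title, images without ALT text, and links with an empty HREF. The class name, the function name, and the sample page are hypothetical, and the three checks are only a starting point, not a full validator.

```python
from html.parser import HTMLParser


class MarkupAudit(HTMLParser):
    """Collect basic problems with the title, image ALT text, and link targets."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.problems.append("image without ALT text: %s" % attrs.get("src"))
        elif tag == "a" and "href" in attrs and not (attrs["href"] or "").strip():
            self.problems.append("link with empty HREF")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable problems found in the markup."""
    parser = MarkupAudit()
    parser.feed(html)
    parser.close()
    problems = list(parser.problems)
    if not parser.title.strip():
        problems.insert(0, "missing or empty TITLE")
    return problems


# Hypothetical page used only to exercise the checks.
SAMPLE = """<HTML><HEAD><TITLE></TITLE></HEAD>
<BODY>
<IMG SRC="logo.gif">
<A HREF="">Products</A>
<A HREF="contact.html">Contact</A>
</BODY></HTML>"""

if __name__ == "__main__":
    for problem in audit(SAMPLE):
        print(problem)
```

A check like this only catches missing or empty values. Whether a title or ALT text is accurate and descriptive still needs a human reader.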

Make use of the robots.txt file on your web server.
This file tells crawlers which directories can or cannot be crawled. Make sure it’s current for your site so that you don’t accidentally block our crawler. See http://www.robotstxt.org/wc/faq.html for an FAQ that answers questions about robots and how to control them when they visit your site.
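
One way to confirm that a robots.txt file is not blocking pages by accident is to test it against the URLs you expect to be crawled. The sketch below uses Python's standard urllib.robotparser module. The robots.txt contents, the example.com URLs, and the function name are hypothetical, and the standard-library parser may not interpret every rule exactly as a given crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; substitute your site's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

# Hypothetical pages that are supposed to stay crawlable.
MUST_BE_CRAWLABLE = [
    "http://www.example.com/",
    "http://www.example.com/products/index.html",
    "http://www.example.com/private/pricing.html",
]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, MUST_BE_CRAWLABLE):
        print("Blocked for Googlebot:", url)
```

In this example the pricing page sits under /private/, so it is reported as blocked even though it was meant to be indexed.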

Ensure that your site is accessible through HTML hyperlinks.
Generally, your site is crawlable if the pages are connected to each other with ordinary HTML links. If certain areas are not linked, you may be excluding older browsers, differently-abled users, and Google. Google can crawl content from a database or other dynamically generated content as long as it can be found by following links. If you have many unlinked pages, you may want to create a jump page from which the crawler can find all of your pages.
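
To make the idea concrete, the sketch below treats a site as a map from each page to the pages it links to, walks the links from the home page, and lists whatever was never reached. It then builds a simple jump page for those leftovers. The link map, the paths, and the function names are all hypothetical, and the walk is a simplification that is not a description of how Google's crawler actually works.

```python
from collections import deque
from html import escape

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/index.html": ["/products.html", "/about.html"],
    "/products.html": ["/index.html", "/widgets.html"],
    "/about.html": ["/index.html"],
    "/widgets.html": [],
    "/archive/2010.html": [],   # nothing links here
    "/search?q=widgets": [],    # reachable only through a form
}


def reachable(links, start):
    """Breadth-first walk over ordinary links, starting from one page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unlinked_pages(links, start="/index.html"):
    """Pages that exist on the site but cannot be reached by following links."""
    return sorted(set(links) - reachable(links, start))


def jump_page(urls, title="Site index"):
    """Build a plain HTML page that links to every URL in the list."""
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (escape(u, quote=True), escape(u))
        for u in urls
    )
    return (
        "<html><head><title>%s</title></head>\n<body>\n<ul>\n%s\n</ul>\n</body></html>"
        % (escape(title), items)
    )


if __name__ == "__main__":
    orphans = unlinked_pages(LINKS)
    print(orphans)
    print(jump_page(orphans))
```

The jump page only helps if it is itself linked from a page that can already be reached, such as the home page.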

Build your site with a logical link structure.
A hierarchical link structure is not only beneficial to you, but also to Google. More of your site can be crawled if it is laid out with a clear architecture.

Don’t…
Fill your page with lists of keywords, attempt to “cloak” pages, or put up “crawler only” pages.
If your site contains pages, links or text that you do not intend visitors to see, Google considers them deceptive and may ignore your site.

Feel obligated to purchase a search optimization service.
Some companies “guarantee” your site a place near the top of a results page. While legitimate consulting firms can improve your site’s flow and content, others employ deceptive tactics to try to fool search engines. Be careful: if your domain is affiliated with one of these services, it could be permanently banned from our index. We have found that search engine optimization software like Web Position Gold works best but, again, use it in moderation.

Use images to display important names, content or links.
Our crawler does not recognize text contained in graphics. Use ALT attributes if the main content and keywords on your page cannot be formatted in regular HTML.

Provide multiple copies of a page under different URLs.
Many sites offer text-only or printer-friendly versions of pages that contain the same content as the graphic-enriched version of the page. While Google crawls these pages, duplicates are removed from our index. In order to ensure that we have the desired version of your page, place the other versions in separate directories and use the robots.txt file to block our crawler from those directories.
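
A minimal sketch of that setup follows. It assumes the alternate copies live under two hypothetical directories, /print/ and /text-only/, writes the matching robots.txt rules, and uses Python's standard urllib.robotparser module to confirm that the main version stays crawlable while the copies do not. The directory names, the example.com URLs, and the function names are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical directories holding the alternate copies of each page.
DUPLICATE_DIRS = ["/print/", "/text-only/"]


def build_robots_txt(blocked_dirs, user_agent="*"):
    """Return robots.txt text that disallows each listed directory."""
    lines = ["User-agent: %s" % user_agent]
    lines += ["Disallow: %s" % directory for directory in blocked_dirs]
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Check one URL against the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(DUPLICATE_DIRS)
    print(rules)
    for url in (
        "http://www.example.com/articles/widgets.html",
        "http://www.example.com/print/widgets.html",
    ):
        print(url, "crawlable" if is_crawlable(rules, url) else "blocked")
```

Keeping every alternate copy under its own directory is what makes a short rule set like this possible. Copies scattered among the main pages would each need their own Disallow line.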

Article written by a Google employee
