
session 30 - SEARCH ENGINES
Search engine spiders take information back to their database or index, which contains a copy of every page that the spider retrieves. When a spider returns to your website, it brings your changes back to the catalog, so check the search engine to be sure that you are indexed. Only Alta Vista "Instant Indexes", which means that the information is put into the database of websites within days. Many spiders cannot crawl or index frame pages or image maps. Well-behaved robots will read your robots.txt file. Have as many well-regarded websites as possible link to yours. This is known as Link Popularity, because the search engines can determine how many links are going to and from your page, and some decide whether to index a page based on this. All of the major search engines index the full body text, although certain words, called "stop" words, may be missing. Stop words are words that many spiders skip in order to move faster.
Choose descriptive keywords that reflect the content. Have a really good TITLE of 6 to 12 words (60 characters max) and a Meta Description of 20 to 25 words (150 characters max), and use the word "FREE" at the start if practical. That entices people to click on your link when it comes up in a search.
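The length guidelines above are easy to check mechanically. The following is a minimal Python sketch, not a tool mentioned in these notes; the function name check_snippet and the sample strings are made up for illustration, and the limits are simply the ones quoted in the paragraph above.

```python
def check_snippet(title: str, description: str) -> list[str]:
    """Return a list of warnings; an empty list means both fields fit the guidelines."""
    problems = []
    title_words = len(title.split())
    desc_words = len(description.split())
    # TITLE: 6 to 12 words, 60 characters max
    if not 6 <= title_words <= 12:
        problems.append(f"title has {title_words} words (want 6 to 12)")
    if len(title) > 60:
        problems.append(f"title has {len(title)} characters (max 60)")
    # Meta Description: 20 to 25 words, 150 characters max
    if not 20 <= desc_words <= 25:
        problems.append(f"description has {desc_words} words (want 20 to 25)")
    if len(description) > 150:
        problems.append(f"description has {len(description)} characters (max 150)")
    return problems


if __name__ == "__main__":
    # Hypothetical page title and description
    title = "Free Widget Buying Guide for Small Home Workshops"
    description = ("Free guide to picking the right widget for a small home shop, "
                   "with size charts, price tips and a list of good brands.")
    for warning in check_snippet(title, description) or ["title and description fit"]:
        print(warning)
```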
When the spiders get to a website, they index most of the words on the pages. In ranking web pages, search engines look for the location and frequency of keywords and phrases in the web page document and, often, in the META tags. They check the title, the headers and the TEXT (about 150 to 200 words) near the top of the page. Include your keyword phrases in that text.
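To see whether a keyword phrase actually appears near the top of a page, you can count it in the first 200 words of the body text. This is only a rough Python sketch under that assumption: the function name, the simple word-splitting rule and the sample text are illustrative, and no search engine is claimed to count keywords this way.

```python
import re

WORD = re.compile(r"[a-z0-9']+")


def keyword_hits_near_top(text: str, phrase: str, window: int = 200) -> int:
    """Count how often `phrase` occurs within the first `window` words of `text`."""
    words = WORD.findall(text.lower())[:window]
    target = WORD.findall(phrase.lower())
    if not target:
        return 0
    n = len(target)
    # Slide over the word list and compare each run of n words with the phrase
    return sum(words[i:i + n] == target for i in range(len(words) - n + 1))


if __name__ == "__main__":
    # Hypothetical opening text of a page
    body = "Acme widgets are award-winning widgets. Buy Acme widgets today."
    print(keyword_hits_near_top(body, "acme widgets"))
```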
Subject directories are created by people, not spiders or robots. There are general directories, academic directories, commercial directories, and portals that have been created by commercial interests to offer additional services such as email, current news, stock quotes etc. Examples are Yahoo!, Excite, Snap and Magellan.
Library gateways are collections of reviewed links that have been created by specialists to support research needs and to point to specialized databases created by professors, researchers, governmental agencies, business interests, or other subject specialists. Some examples are: Librarians' Index to the Internet, Academic Information, AlphaSearch, Digital Librarian, Internet Public Library and WWW Virtual Library. Some examples of specialty search engines are: Lycos Directory, Search.com and WebData.
Field searching on the web allows you to specify where you want the search engine to look in the web document. Some search engines allow you to retrieve info by using the correct field label in combination with your search term(s).
TITLE SEARCHING - The title appears in the banner at the top of your browser's window. EXAMPLE: title:"search engine tutorial" returns pages that have these words in the title.
DOMAIN SEARCHING - EXAMPLE: domain:edu and "Theory of Creation" limits your search to educational sites dealing with the Theory of Creation. SearchEdu does it for you by limiting all its searches to the .edu domain. For an international domain, search geographically using the two-letter country code. EXAMPLE: domain:UK and "Shakespeare" limits your search to sites in the United Kingdom dealing with Shakespeare. Other countries have their own two-letter codes: CA for Canada, FR for France, etc. For a list of ISO Internet Country Codes, go to HotBot's List of Country Codes.
URL SEARCHING - EXAMPLE: url:goodnews returns sites in which the filename goodnews is incorporated into the URL.
IMAGE SEARCHING - EXAMPLE: IMAGE:computer.gif
Other searchable fields include anchor, applet, object, text, language, sound and pictures.
NOTE: In all field searching, do not leave any spaces between the field term, the colon, and the first keyword.
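The no-spaces rule is the easiest one to get wrong when typing a field search by hand. Here is a small Python sketch that assembles the query strings shown in the examples. The helper name field_query is made up, and which field labels a given search engine accepts is something you still have to check with that engine.

```python
def field_query(field: str, term: str) -> str:
    """Join a field label and a search term with a colon and no spaces around it."""
    term = term.strip()
    already_quoted = term.startswith('"') and term.endswith('"')
    # A multi-word term has to be quoted so it stays attached to the field label
    if any(ch.isspace() for ch in term) and not already_quoted:
        term = f'"{term}"'
    return f"{field.strip()}:{term}"


if __name__ == "__main__":
    print(field_query("title", "search engine tutorial"))
    print(field_query("domain", "edu") + ' and "Theory of Creation"')
    print(field_query("url", "goodnews"))
```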
The "KEYWORDS" <Meta> tag allows you to specify a list of words that are related to your page. When someone enters these words into a suitable search engine, your page has more chance of being displayed. Spiders only read these tags up to a certain point (perhaps the first 50 words at most). Don't use hidden keywords in any form. The keywords can include up to 874 characters of text.
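As a rough illustration, the Python sketch below builds a keywords tag from a list, skips repeated keywords and stops before the 874-character figure quoted above. The function name keywords_meta and the sample keywords are hypothetical, and the comma-separated layout is the common convention rather than something these notes specify.

```python
from html import escape


def keywords_meta(keywords: list[str], max_chars: int = 874) -> str:
    """Build a keywords META tag, dropping duplicates and staying within max_chars."""
    kept: list[str] = []
    seen: set[str] = set()
    length = 0
    for kw in keywords:
        kw = " ".join(kw.split())          # collapse stray whitespace
        if not kw or kw.lower() in seen:   # skip blanks and repeats
            continue
        extra = len(kw) + (2 if kept else 0)  # 2 accounts for the ", " separator
        if length + extra > max_chars:
            break
        kept.append(kw)
        seen.add(kw.lower())
        length += extra
    content = escape(", ".join(kept), quote=True)
    return f'<meta name="keywords" content="{content}">'


if __name__ == "__main__":
    print(keywords_meta(["widgets", "Widgets", "acme widgets", "widget prices"]))
```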
Searching the Net for a Directory or Web Site for your product or service - Type your product or service into any search engine and see who comes out on top. Go to the site and see if it is a directory or site that lists your product or service, or is directly related to your product or service.
There is a new ranking system that is now being used by Alta Vista, Excite, Google, Lycos and the search portion of Yahoo. This new system will track and rank sites according to the number of links pointing to a particular web site. The quality of links is considered as well. With these new changes now in effect, not only are web sites ranked by keyword relevancy, meta tags, title and text, but the overall popularity of the web site as well. By obtaining just a few links from high-traffic web sites your traffic will increase considerably.
What search engines are most frequently searched?
MSN 35.6%
Yahoo! 23.7%
Google 14.1%
Source: MyComputer.com
According to a Forrester Research Media Field Study, getting a loyal audience in the first place is best done by Search Engine Placement. According to a GVU Users Survey, 84.8% of Internet users use Search Engines to find Websites. In a study released by ActivMedia Research in September 1999, Search Engine Positioning was ranked as the #1 Website promotional method used by eCommerce sites. Source: Target Marketing Magazine.
"Top Ways Websites are Discovered"
Banner ads: 1%
Targeted email: 1.2%
TV spots: 1.4%
"By accident": 2.1%
Magazine ads: 4.4%
Word-of-mouth: 20%
Random Surfing: 20%
Search Engines: 46%
Very rarely will anyone look beyond the first 30 results returned from a search. This makes perfect sense because the most relevant sites are always listed at the top. So if your prospect doesn't find what they want within the first 20 to 30 listings, they'll simply do a new search.
If your site falls anywhere below the 30th listing, you don't stand a chance against anyone in the TOP-20. Hence, it should be your goal to achieve Top 20 positions.
How do you get your Website listed in the top 20?
1) You can attempt to gain these Top 20 rankings yourself. However, this can easily become a full time job. (I think this is why so many marketers advise against focusing on search engine positioning.)
2) You can hire a reputable company who can achieve AND maintain your Top 20 rankings for you (be sure they guarantee their service and have several verifiable clients that currently have multiple Top 20 rankings).
Submit to HotBot.com (an Inktomi partner) to get your site indexed by Inktomi. Besides HotBot, Japan's Goo.ne.jp engine, which is based on Inktomi, will also index your page. The first box on the page is for your URL and the second box is for your e-mail address: <http://www.goo.ne.jp/help/info/url.html>. Alternatively, you can get your pages indexed within 48 hours and updated multiple times a week with Inktomi's paid submission.
Goto.com, in its persistent quest to display its paid listings just about everywhere, cut a deal with Iwon.com. Iwon now serves up an almost mirror image of Goto.com. This is a big win for Goto.com but a loss for Inktomi. Inktomi results are still displayed on Iwon, but only after all the Goto.com paid listings are displayed. What's particularly disturbing about Iwon's implementation is that unlike most other sites that mark Goto results as "sponsored listings," Iwon gives no clear indication to the user that the results are paid listings. Anyone who still thinks they're getting an unbiased look at the Web from Iwon.com better think again.
AOL still appears to search the full-text Inktomi engine. However, as in the past, they appear to display only sites that appear in both Inktomi and the Open Directory.
We design your website so that each webpage includes a clear page title, since the webpage title is what will be displayed when your site shows up on a search engine. The title of each page ought to be both descriptive and provocative. For example, "Widget Specifications" might be better titled "Acme Widgets Have Won Five International Awards for Quality." Think of the webpage title as a vital marketing tool. The description META tag is used by some search engines as the sentence or two they display below your webpage title. We limit this to about 200 characters or so. The keywords META tag will include a half dozen to several dozen words that someone might search on in order to find your website.
There's an important distinction between search engines and directories. A search engine is an automated indexing system that periodically sends a robotic "spider" to your website to "crawl" (that is, scan and index) some or all of your webpages. A directory, on the other hand, is a listing describing your website edited by humans. In order to get on the search engines' radar you need to register your webpages. Here are some of the most important North American search engines with which to register:
AOL Search, AltaVista, Excite, FAST Search, Google, HotBot, Inktomi, Lycos, MSN Search, NBCi, Netscape Search, and Northern Light. Each of these has a URL where you can suggest your own webpage, but nearly all the search engines can be submitted to at
Directories - Search engines are vital, but directories drive as much or more traffic to your site. And increasingly, search engines are combined with directories, so they'll include information from both human-edited directories and automatically indexed webpages. The most important directory by far in most countries of the world is Yahoo! (www.yahoo.com). But they are very picky about what they'll include in a listing and reject many listings because of shoddy websites. You'll also want a listing in the DMOZ Open Directory Project (www.dmoz.org) since its material is also used by HotBot, Lycos, and Netscape Search. A third important directory is LookSmart (www.looksmart.com) -- not because it gets a lot of hits itself, but because its listings feed into search results from MSN Search, Excite, AltaVista, iWon, and CCN.com. The other directory that you should submit to is Ask Jeeves! (www.ask.com).
Paying for Directory Inclusion - In the past year commercial sites have had to pay $199 for a listing to be considered for Yahoo! and another $199 for Express Submit at LookSmart. We recommend that you pay for a Yahoo! listing first, and then, if you can afford it, a LookSmart listing.
Sometimes you have pages on your website that you don't want the search engines to see - maybe they're not optimized yet, or maybe they're not quite relevant to your site's theme. In other cases, you want to get rid of some annoying search robot that's cluttering up your logs. Whatever your reason is for wanting to keep the spiders under control, the best way to do so, by far, is to use a "robots.txt" file on your website. Robots.txt is a simple text file that you upload to the root directory of your website. Spiders read this file first, and process it, before they crawl your site. The simplest robots.txt file possible is this:
User-agent: *
Disallow:
That's it! The first line identifies the user agent - an asterisk means that the following lines apply to all agents. The blank after the "Disallow" means that nothing is off limits. This robots.txt file doesn't do anything - it allows all user agents to see everything on the site.
Now, let's make it a little more complex - this time, we want to keep all spiders out of our /faq directory:
User-agent: *
Disallow: /faq/
See how simple it is? The trailing slash is necessary to indicate that this is a directory. We can also add more directories:
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /info/about/
That was easy, but what if we want to disallow access to only one file? It's simple:
User-agent: *
Disallow: /about.html
Disallow: /faq/faqs.html
Now let's get specific. So far, we've created rules that apply to all spiders, but what about an individual spider? Just use its name.
User-agent: Googlebot
Disallow: /faq/
Now, let's combine individual spider control with a catch-all:
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /faq/
This set of commands tells Googlebot to take a hike - the slash character ("/") by itself means that the entire site is disallowed. For all other user-agents, we've just kept them out of the /faq directory.
Each record in a robots.txt file consists of a user-agent line, followed by one or more Disallow directives. The blank line between the two user-agent records is necessary for the file to be processed properly.
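If you want to check how a set of rules will be read before you upload it, Python's standard library includes a robots.txt parser (urllib.robotparser). The sketch below feeds it the combined Googlebot and catch-all records from above and asks which paths each spider may fetch. The helper name can_fetch and the sample paths are illustrative, and a real spider may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two records discussed above, separated by the required blank line
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /faq/
"""


def can_fetch(agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` is allowed to fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent, path in [("Googlebot", "/index.html"),
                        ("Scooter", "/index.html"),
                        ("Scooter", "/faq/faqs.html")]:
        verdict = "allowed" if can_fetch(agent, path) else "blocked"
        print(f"{agent} {path}: {verdict}")
```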
If you'd like to add comments, use the "#" character like so:
# keep spiders out of the FAQ directory
User-agent: *
Disallow: /faq/
You can use any text editor that saves text in a web-friendly format. I like Notepad or Unixedit, both of which are free.
If you don't feel like using a text editor, or just don't want to deal with creating your own robots.txt by hand, click here <http://www.rietta.com/robogen/> for a program that will help you.
There's a nice robots.txt validator here <http://www.tardis.ed.ac.uk/%7Esxw/robots/check/> - use this site after you've uploaded your robots.txt file to make sure that it will really work.
The following is a listing of the four major search engine spiders and their associated user-agents.
# Altavista (Altavista search engine only)
User-agent: Scooter
# FAST/AllTheWeb (AllTheWeb search engine)
User-agent: fast
# Google (Google Search Engine)
User-agent: Googlebot
# Inktomi (Anzwers, AOL, Canada.com, Hotbot, MSN, etc.)
User-agent: slurp
My policy is to only exclude search engine spiders from pages that may contain words that aren't relevant to my site's theme. More often, we use robots.txt to keep away all of the annoying little spiders that aren't from the search engines, but that's a story for another day!
About the author
Dan Thies has been helping his clients (and friends) promote their websites since 1996. His latest book, "Search Engine Optimization Fast Start" <http://www.cannedbooks.com/> , offers a simple, step by step plan to increase your website's search engine traffic.
SelfPromotion.com is the net's leading resource for do-it-yourself Web Promotion. Here you will find all the information and automatic submission tools you need to do the job quickly, efficiently, and most of all, properly! If you invest a little time into reading and using this resource, you'll not only do a much better job of promoting your site, but save yourself a lot of time and effort in the process.
Get your site working properly. It doesn't matter how many people you attract to your website if, once they get there, they immediately get turned off by an unattractive presentation or a half-built website.
Choose keywords and tweak your site for the search engines. Once your site looks good to humans, the next step is to try to make it look good to the search engines, so you get the coveted high ranking. This involves choosing the right keywords and adjusting your page title, meta tags and first paragraph to showcase them. This is where most webmasters mess up. They choose the wrong keywords because they don't spend enough time thinking about how people are going to try to find them.
Submit to the major search engines. Now that your site is all ready, you next submit to all the major search engines. One of the key components of SelfPromotion.com is an extremely powerful and comprehensive automated submission tool that can properly promote your site to all the major search engines and indexes. The good news is that it'll only take you about 10 minutes to create an account and promote to the search engines. The bad news is that at present, the major search engines are taking months (that's months, plural!) to add new listings. The most important search engine, Google, updates on a 4 week cycle, and it usually takes 2-3 cycles to get in.
Submit to the major indexes. While SelfPromotion.com will submit your site to a ton of places, it does not autosubmit to the major indexes such as Yahoo and Open Directory. The reason is that listings in these indexes are sufficiently valuable that a hand-done, optimized submission is worth taking the time to craft.
Yahoo is the most important place to have your site listed on the Internet, yet most Yahoo listings are awful. Once you understand how to craft a proper submission to Yahoo, you'll not only greatly increase your chances of getting in, but you'll get many more hits than you would otherwise. If you are already in Yahoo, don't despair; my initial listing in Yahoo was awful, but I managed to double the number of clickthroughs I get from them by successfully requesting a change to my listing. Writing the site description you submit to Yahoo is the single most important step you will ever take during site promotion, so spending some extra time on this step is highly recommended.
Submit to the general indexes. There are many "2nd-tier" indexes that are worth submitting to, though not worth crafting a specially optimized listing for (although the advice in the previous step is still valid). I've broken these down into a variety of categories (including general indexes, British and Canadian-specific indexes and search engines, international indexes, and special-purpose indexes) so that you can submit to them in small chunks as time permits. SelfPromotion.com can autosubmit to over 50 of these, and provides manual links to several hundred more (mostly the special-purpose and international/foreign-language ones).
The effects of the dot-com crash are quite interesting. Since the spring of 2000, about 40 2nd-tier indexes and search engines (and several of the big guys) that I used to auto-submit to have curled up and died. But there's still plenty to submit to.
Consider paying for hits. The good news about listing in the search engines and indexes is that it's free. The bad news is that you don't have much control. While it is certainly worthwhile to tweak your pages in search of high rankings, it's not always possible to get the ones you want. There are, however, several places that can provide you with well-targeted traffic for pennies a visitor. My favorite is Overture.com.
Frequently Asked Question
What's the difference between a search engine and an index?
A search engine (eg: Altavista) is a database of webpages. You give them your URL, and they read your page, extract relevant information from it, and store it in their database. Many search engines also run "spiders" that roam around the net looking for new pages.
An index (eg: Yahoo) is a database of web sites. Your listing in an index depends on what you tell them in your submission, not what is on your page.
Many people expect to instantly get a lot more hits after registering their site. This will not happen, and anyone who promises such results is what James (my 8 year-old) colorfully refers to as a "silly peanut-butter" (in other words, a liar).
Registering your site will get you more hits, but it will take weeks or months for the increased traffic to become noticeable. After you register your site with a particular search engine or index, it can take anything from 5 seconds to 5 months for them to actually list you, if in fact they actually do. Registering at a particular index doesn't guarantee that you'll actually be listed; many indexes are very selective and only list a small percentage of the submissions they get. And the major search engines? Well, they're all taking months to add new listings. Furthermore, just because you are listed by a particular site doesn't mean that your listing will appear on any particular search.
Submitting to Google
Google is the best engine for spidering through sites. It's only necessary to add your main URL and Google will eventually spider the rest of your site. You can check if Google has indexed your site by typing your URL into their search box. They currently take about one-to-two months to index and reindex pages. Representatives from Google have warned that they prefer to find sites and pages on their own, rather than you submitting them through their form. However, they also state that there is no penalty for submitting to them through their form. Google has no immediate plans to start a pay-for-inclusion program. Google's Add URL form can be found here <http://www.google.com/addurl.html> .
Submitting to Hotbot/Inktomi
At HotBot, you can type www.yoursite.com into their search box and see if it shows up. Since HotBot uses the Inktomi database for its search results, you can simply wait for the Inktomi spider to find your site. If you've been listed in the LookSmart directory, the Inktomi spider should easily find you in no time. You can also submit to them through any Inktomi partner site. As with most of the search engines, HotBot/Inktomi does not guarantee that they will list all sites that are submitted to them for free. For guaranteed inclusion in their database, you might want to use the Inktomi Paid Inclusion Program <http://www.inktomi.com/products/search/pagesubmission.html> . Basically, Inktomi has partnered with some submission companies that, for a fee, guarantee the Inktomi spider will crawl your submitted pages every 48 hours. This will also keep your URL in their database for a year. You won't get any boost in rankings, but you'll have peace of mind knowing your site is listed.
Submitting to AltaVista
AltaVista used to be one of the quickest to index new pages. However, those days are long gone, unless you use their paid inclusion program. In order to keep the amount of spam submittals in check, AV has instituted an interesting submittal procedure. When you submit a URL to AV's free "Add-URL form," you'll see a graphical image of some strange-looking letters. You are instructed to type those letters into a box in order to continue the submission. Once you do that, you can then submit up to five URLs. I think it's a neat idea to thwart automatic submission programs. Since I hand submit anyway, it's not a problem for me. As long as it means that my submissions get added in a relatively short time period, I'm all for it!
To see if your pages are in AltaVista, type in "www.yoursite.com" into their search box (substituting your own domain name for yoursite.com). Your indexed pages will show up. Compare the title and description AV lists, with the Title tag and Meta description tag on your actual pages. If AV shows your old title and/or description, then they haven't yet indexed your new page. If they show your new stuff, then you're all set! Links to AltaVista's various submission programs can be viewed here <http://www.altavista.com/sites/search/addurl> .
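The comparison described above can be partly automated once you have your page's HTML and the title and description copied from the search results. The sketch below uses Python's standard html.parser module to pull out the Title tag and Meta description. The class and function names are hypothetical, and fetching the page and reading the listing are left to you.

```python
from html.parser import HTMLParser


class HeadSnippet(HTMLParser):
    """Collect the <title> text and the description META tag from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_snippet(html_text: str) -> tuple[str, str]:
    """Return (title, description) as found in the page source."""
    parser = HeadSnippet()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.description.strip()


def listing_is_current(html_text: str, listed_title: str, listed_description: str) -> bool:
    """True if the listing matches the page's present title and description."""
    return page_snippet(html_text) == (listed_title.strip(), listed_description.strip())
```

If listing_is_current returns False, the engine is probably still showing an older copy of the page, which matches the rule of thumb in the paragraph above.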
Submitting to Lycos/FAST
Sites have been getting added fairly quickly and flawlessly to Lycos as of late. You can usually type your URL into their search box to see if it's in their database. Lycos has partnered with FAST, whose spider crawls through pages on a regular basis. Very often FAST will find your inner pages without you needing to submit them at all. Lycos/FAST also recently instituted a new pay-for-inclusion program called InSite Select, which will guarantee your pages regular 48-hour spidering. Lycos' add URL programs can be reached here <http://searchservices.lycos.com/searchservices/> .