Using Yahoo! or Google search bar instead of search.cgi

Wolfram Schneider wosch at FreeBSD.org
Tue Oct 11 13:28:32 PDT 2005


Tim Wilde wrote:

> (Apologies for breaking threading, just joined freebsd-www so I don't 
> have the appropriate messages for a References: header.)
>
> As I mentioned in my earlier post, I think an even bigger problem than 
> the one Murray mentioned can be observed by the fact that a search for 
> "kernel" returns no results at all.


My guess at what happens here: "kernel" is a very common word (believe it or 
not).
Google has 18,900 hits for the word "kernel" on www.freebsd.org.
Common words (e.g. "a", "the", "an", "www", "is") are usually
ignored by search engines to save space or to speed up searches.
These are known as "stop words." Even Google has stop words.

From memory, search.cgi has a dynamic stop word list:
words which hit the limit of 20,000 will be ignored.
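
To illustrate the idea, here is a minimal sketch of how a dynamic stop word
list could make a search for a very common word come back empty. This is not
search.cgi's actual code: the names build_index and search, the use of Python,
and counting every occurrence of a word across all documents are assumptions
made for the example. Only the limit of 20,000 comes from my recollection above.

```python
from collections import Counter

# Cutoff recalled from memory above; the real value and how search.cgi
# counts hits are not confirmed.
STOP_WORD_LIMIT = 20000


def build_index(documents, limit=STOP_WORD_LIMIT):
    """Return (index, stop_words) for a mapping of document name -> text.

    A word whose total occurrence count reaches `limit` is treated as a
    stop word and is left out of the index entirely.
    """
    counts = Counter()
    postings = {}
    for name, text in documents.items():
        for word in text.lower().split():
            counts[word] += 1
            postings.setdefault(word, set()).add(name)

    # The stop word list is derived from the data, not maintained by hand.
    stop_words = {w for w, n in counts.items() if n >= limit}
    index = {w: docs for w, docs in postings.items() if w not in stop_words}
    return index, stop_words


def search(index, word):
    """Return sorted names of matching documents; stop words match nothing."""
    return sorted(index.get(word.lower(), set()))
```

With a scheme like this, a word that is too frequent simply disappears from the
index, so the user sees "no results" rather than "too many results".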

-Wolfram


> At DynDNS, we recently started indexing our site using ht://Dig 
> (http://www.htdig.org/), and have been very happy with the flexibility 
> it provides for tuning search results to get the most relevant 
> matches.  It is also a true spider, crawling the website over HTTP 
> rather than searching on the filesystem as the current search.cgi 
> seems to do.

-- 

Wolfram Schneider <wosch at FreeBSD.org> http://wolfram.schneider.org


