Need a good Unix script that..

Chuck Swiger cswiger at mac.com
Fri Jul 29 15:22:34 GMT 2005


Michael Sharp wrote:
> I need a simple sh script that will daily (via cron) crawl a website
> looking for multiple keywords, then reporting those keyword results and
> URL to an email address.
> 
> Anyone know of a pre-written script that does this, or point me in the
> right direction in using the FreeBSD core commands that can accomplish
> this?

If you feed the webserver's access log into a log analyzer such as analog, it 
will report on the keywords people searched for when they followed a link into 
the site.  (This is not quite what you asked for, but I mention it because it 
may be closer to what you actually want to see... :-)
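
For a quick-and-dirty version of that report, something along these lines 
would pull the q= search terms out of combined-format referrer fields (the 
logfile path is just an example, and -o wants GNU grep):

    # histogram of search terms seen in referrer URLs
    grep -o 'q=[^&" ]*' /var/log/httpd-access.log \
        | sed 's/^q=//' | sort | uniq -c | sort -rn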

Anyway, if you do not own the site and do not have access to its logfiles, you 
ought to honor things like /robots.txt and the site's policies with regard to 
copyright and datamining, but you could easily use lynx, curl, or anything 
similar which supports a recursive/web-spider download capability, and then 
grep for keywords, build histograms, or whatever else you like with the 
content you download.
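
As a starting point, a sketch along these lines would do it.  The URL, keyword 
list, address, and filenames are all placeholders to fill in, and it assumes 
wget from ports (www/wget) for the spidering:

    #!/bin/sh
    # A rough sketch, not a drop-in script: URL, KEYWORDS, and RCPT
    # below are placeholders; wget comes from ports (www/wget).
    URL="http://www.example.com/"
    KEYWORDS="keyword1|keyword2|keyword3"
    RCPT="you@example.com"

    TMP=$(mktemp -d /tmp/sitegrep.XXXXXX) || exit 1

    # Spider the site two levels deep; when recursing, wget honors
    # /robots.txt by default.
    wget -q -r -l 2 -P "$TMP" "$URL"

    # Mail every matching line, prefixed with the file it came from.
    grep -rEi "$KEYWORDS" "$TMP" | mail -s "keyword report: $URL" "$RCPT"

    rm -rf "$TMP"

Dropped into your crontab(5), a line like the following would then run it 
every morning (again, the path is only an example):

    0 6 * * *    /home/you/bin/sitegrep.sh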

-- 
-Chuck



