Need a good Unix script that..
Chuck Swiger
cswiger at mac.com
Fri Jul 29 15:22:34 GMT 2005
Michael Sharp wrote:
> I need a simple sh script that will daily (via cron) crawl a website
> looking for multiple keywords, then reporting those keyword results and
> URL to an email address.
>
> Anyone know of a pre-written script that does this, or point me in the
> right direction in using the FreeBSD core commands that can accomplish
> this?
If you feed the webserver's access log into a log-analysis program like analog, it
will report the keywords people searched for when they followed links into the
site. (This is not quite what you asked for, but I mention it because the
suggestion might be closer to what you want to see... :-)
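For instance -- just a rough sketch, assuming a combined-format Apache access log
at a placeholder path and Google-style "q=" query parameters -- you can pull a
quick keyword histogram out of the Referer field with the standard tools:

    # Count "q=" search terms found in the Referer field of a
    # combined-format access log; the log path is only an example.
    awk -F'"' '{print $4}' /var/log/httpd-access.log | \
        grep -o 'q=[^&]*' | sort | uniq -c | sort -rn

Other search engines use different query parameter names, so adjust the pattern
to taste; analog and friends just do this sort of thing for you in bulk.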
Anyway, if you do not own the site and do not have access to its logfiles, you
ought to honor things like /robots.txt and the site's policies with regard to
copyright and datamining, but you could easily use lynx, curl, or anything
similar which supports a recursive/web-spider download capability, and then grep
for keywords, build histograms, or do whatever else you like with the content
you download.
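As a rough illustration rather than a finished solution, a cron-able sh script
along those lines might look like the sketch below. It assumes wget (e.g. from
ports) for the recursive download step, and the site URL, keyword list, and mail
recipient are all placeholders to replace with your own:

    #!/bin/sh
    # Nightly keyword crawl -- a rough sketch, not a finished script.
    # SITE, KEYWORDS, and RCPT are placeholders to adjust.
    SITE="http://www.example.com/"
    KEYWORDS="foo|bar|baz"        # extended-regex alternation of keywords
    RCPT="you@example.com"

    TMPDIR=`mktemp -d /tmp/crawl.XXXXXX` || exit 1

    # wget -r recurses through the site; -q keeps cron output quiet;
    # -P puts the mirror under the temporary directory.
    wget -r -q -P "$TMPDIR" "$SITE"

    # Grep the downloaded files for the keywords and mail any hits,
    # file path and matching line included.
    REPORT=`grep -rniE "$KEYWORDS" "$TMPDIR"`
    if [ -n "$REPORT" ]; then
        echo "$REPORT" | mail -s "keyword report for $SITE" "$RCPT"
    fi

    rm -rf "$TMPDIR"

Run it daily from cron with something like "0 4 * * * /usr/local/bin/crawl.sh"
in a crontab, and the keyword hits land in your mailbox each morning.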
--
-Chuck