Script help needed please

Jack L. Stone jackstone at sage-one.net
Thu Aug 14 06:49:49 PDT 2003


Server Version: Apache/1.3.27 (Unix) FrontPage/5.0.2.2510 PHP/4.3.1
The above is typical of the servers in use, and with csh shells employed,
plus IPFW.

My apologies for the length of this question, but the background seems
necessary. I have kept it as brief as I can while still letting the
question make sense.

The problem:
We have several servers that provide online reading of technical articles,
and each has several hundred MB to a GB of content.

When we started providing the articles 6-7 years ago, folks used browsers
to read the articles. Now the trend is toward a lazier approach, and
there is increasing use of download utilities that can be left
unattended to download entire web sites, taking several hours to do so.
Multiply this by a number of similar downloads and there goes the
bandwidth, denying normal online readers the speed needed for
loading and browsing in the manner intended. Several hundred people will be
reading at a time, and several thousand daily.

Further, those download utilities do not discriminate among the files
downloaded unless the user sets them to exclude file types not needed
for the articles. Most users don't bother to set the parameters. They
just turn the utilities loose and go about their day. The result is
essentially a DoS for normal readers, who notice the slowdown, although
there is no malice behind it.

This method downloads a tremendous amount of unnecessary content. Some
downloaders have been contacted and asked to stop (if we spot an email
address from a login). In response, they said they simply weren't aware of
the problems they were causing and agreed to at least spread downloads over
longer periods of time. I can live with that.

A possible solution?
Now, my question: Is it possible to write a script that constantly scans
the Apache logs for certain footprints of those downloaders, perhaps the
names, "HTTRACK" being one I see a lot? Whenever I see one of those
sessions, I have been able to abort it by adding a rule to the firewall to
deny the IP address access to the server. This aborts the downloading, but
I have seen the attempts continue for a day or two, confirming that the
downloads are unattended.
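
To make the log-scanning half of the idea concrete, here is a rough sketch
in Python. It assumes the access log is in Apache's "combined" format (so
the User-Agent is the last quoted field); the BAD_AGENTS list and the
find_offenders name are made up for illustration and are not part of any
existing tool.

```python
import re

# Hypothetical list of User-Agent substrings treated as bulk downloaders.
BAD_AGENTS = ("httrack",)

# Assumed Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def find_offenders(lines, bad_agents=BAD_AGENTS):
    """Return the set of client addresses whose User-Agent matches a pattern."""
    offenders = set()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        host, agent = match.group(1), match.group(2).lower()
        if any(pattern in agent for pattern in bad_agents):
            offenders.add(host)
    return offenders
```

Something like find_offenders(open(path)) could be run from cron every few
minutes, where path is wherever the access log lives on the server in
question.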

Thus, if the script could spot an "offender" and then perhaps make use of
the firewall to add a rule containing the offender's IP address and then
flush to reset the firewall, this would at least abort the download and
free up the bandwidth (I already have a script that restarts the firewall).
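
The firewall half might look roughly like the sketch below, again in
Python. The function names, the rule number 500 and the dry_run switch are
assumptions chosen for illustration; the rule number would have to fit the
existing ruleset. The address is validated first so that nothing but an
IPv4 address taken from the log ever reaches the ipfw command line.

```python
import ipaddress
import subprocess

def build_deny_command(ip, rule_number=500):
    """Return the argument list for an ipfw rule denying all traffic from ip."""
    # Raises ValueError if ip is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    return ["ipfw", "-q", "add", str(rule_number),
            "deny", "ip", "from", str(addr), "to", "any"]

def block(ip, rule_number=500, dry_run=True):
    """Print the command (dry run) or execute it; needs root when executed."""
    cmd = build_deny_command(ip, rule_number)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd
```

With dry_run left on, block("192.0.2.10") only prints the command it would
run, which makes it easy to check the rule before letting a script touch
the live firewall.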

Is this possible, and if so, how would I go about it?

Many thanks for any ideas on this!

Best regards,
Jack L. Stone,
Administrator

SageOne Net
http://www.sage-one.net
jackstone at sage-one.net


More information about the freebsd-questions mailing list