OT: wget bug

Joe R. Jah jjah at cloud.ccsf.cc.ca.us
Sat Jul 18 16:41:00 UTC 2009


On Sat, 18 Jul 2009, Andrew Brampton wrote:

> Date: Sat, 18 Jul 2009 12:52:07 +0100
> From: Andrew Brampton <brampton+freebsd at gmail.com>
> To: Joe R. Jah <jjah at cloud.ccsf.cc.ca.us>
> Cc: freebsd-questions at freebsd.org
> Subject: Re: OT: wget bug
>
> 2009/7/17 Joe R. Jah <jjah at cloud.ccsf.cc.ca.us>:
> >
> > Hello all,
> >
> > I want to wget a site at regular intervals and only get the updated pages,
> > so I use this wget command line:
> >
> > wget -b -m -nH http://host.domain/Directory/file.html
> >
> > It works fine on the first try, but it fails on subsequent tries with the
> > following error message:
> >
> > --8<--
> > Connecting to host.domain ... connected.
> > HTTP request sent, awaiting response... 401 Unauthorized
> > Authorization failed.
> > --8<--
>
> This looks to me like the remote server is replying with 401. Perhaps
> wget is sending the If-Modified-Since HTTP header, and the remote
> server does not support this. I would confirm this by running tcpdump
> (or wireshark) to sniff the traffic and see what the remote server is
> replying with.
>
> If the remote server is truly returning 401, then you might either
> need to use an alternative tool, or configure wget differently.
>
> Hope this helps
> Andrew

Thank you, Andrew.  Yes, the server is truly returning 401.  I have already
reconfigured wget to download everything regardless of timestamp, but
that is a waste of bandwidth, because most of the site is unchanged.

Do you know of any workaround in wget, or an alternative tool to download
ONLY newer files over HTTP?
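
For reference, the kind of conditional download being asked about could
look roughly like the sketch below.  It is only an illustration in Python
using the standard urllib module, not a tested replacement for wget: the
function names fetch_if_newer and if_modified_since are made up for this
example, it handles a single file rather than mirroring a site, and it
does nothing about the 401 itself.  If the server rejects requests that
carry If-Modified-Since, this would presumably fail the same way.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def if_modified_since(mtime):
    """Format a Unix timestamp as an HTTP-date, always expressed in GMT."""
    return formatdate(mtime, usegmt=True)


def fetch_if_newer(url, local_path):
    """Download url to local_path only if the server reports a newer copy.

    Returns True if the file was (re)downloaded and False if the server
    answered 304 Not Modified.  Hypothetical helper, for illustration only.
    """
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        # Send the local file's modification time as the condition.
        request.add_header("If-Modified-Since",
                           if_modified_since(os.path.getmtime(local_path)))
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # local copy is still current, nothing to do
        raise  # any other status, such as 401, is passed to the caller
    with open(local_path, "wb") as out:
        out.write(body)
    return True
```

It would be called as fetch_if_newer("http://host.domain/Directory/file.html",
"file.html"); on the first run the file does not exist yet, so no
condition is sent and the page is fetched unconditionally.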

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        jjah at cloud.ccsf.cc.ca.us

