OT: wget bug
Joe R. Jah
jjah at cloud.ccsf.cc.ca.us
Sat Jul 18 16:41:00 UTC 2009
On Sat, 18 Jul 2009, Andrew Brampton wrote:
> Date: Sat, 18 Jul 2009 12:52:07 +0100
> From: Andrew Brampton <brampton+freebsd at gmail.com>
> To: Joe R. Jah <jjah at cloud.ccsf.cc.ca.us>
> Cc: freebsd-questions at freebsd.org
> Subject: Re: OT: wget bug
> 2009/7/17 Joe R. Jah <jjah at cloud.ccsf.cc.ca.us>:
> > Hello all,
> > I want to wget a site at regular intervals and only get the updated pages,
> > so I use this wget command line:
> > wget -b -m -nH http://host.domain/Directory/file.html
> > It works fine on the first try, but it fails on subsequent tries with the
> > following error message:
> > --8<--
> > Connecting to host.domain ... connected.
> > HTTP request sent, awaiting response... 401 Unauthorized
> > Authorization failed.
> > --8<--
> This to me seems like the remote server is replying with 401. Perhaps
> wget is sending the If-Modified-Since HTTP header, and the remote
> server does not support this. I would confirm this by running tcpdump
> (or wireshark) to sniff the traffic and see what the remote server is
> replying with.
> If the remote server is truly returning 401, then you might either
> need to use an alternative tool, or configure wget differently.
> Hope this helps
Thank you, Andrew. Yes, the server is truly returning 401. I have already
reconfigured wget to download everything regardless of timestamp, but
that is a waste of bandwidth, because most of the site is unchanged.
Do you know of any workaround in wget, or an alternative tool to ONLY
download newer files by http?
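
To make concrete what I mean by ONLY downloading newer files: the client
sends an If-Modified-Since header built from the local copy's timestamp,
and a cooperating server answers 304 Not Modified for unchanged pages.
Below is a minimal, untested-against-this-server sketch in Python using
only the standard library. The function names are hypothetical, and it
assumes the local file's modification time is a fair stand-in for when
the page was last fetched. It would not help if the server rejects
conditional requests outright, as it seems to do here.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, local_path):
    """Return a Request that carries If-Modified-Since when local_path exists."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates are always expressed in GMT.
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request


def fetch_if_newer(url, local_path):
    """Download url to local_path only when the server reports a newer copy.

    Returns True if the file was (re)written, False on 304 Not Modified.
    """
    request = build_conditional_request(url, local_path)
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # unchanged on the server, nothing to download
        raise  # any other status, such as a 401, is passed to the caller
    with open(local_path, "wb") as out:
        out.write(body)
    return True
```

Calling fetch_if_newer("http://host.domain/Directory/file.html",
"file.html") from cron would cover a single page; mirroring a whole
site would still need link-following on top of this.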
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jjah at cloud.ccsf.cc.ca.us