OT: wget bug

Karl Vogel vogelke+unix at pobox.com
Sat Jul 18 23:35:26 UTC 2009


>> On Sat, 18 Jul 2009 09:41:00 -0700 (PDT), 
>> "Joe R. Jah" <jjah at cloud.ccsf.cc.ca.us> said:

J> Do you know of any workaround in wget, or an alternative tool to ONLY
J> download newer files by http?

   "curl" can help for things like this.  For example, if you're getting
   just a few files, fetch only the header and check the last-modified date:

      me% curl -I http://curl.haxx.se/docs/manual.html
      HTTP/1.1 200 OK
      Proxy-Connection: Keep-Alive
      Connection: Keep-Alive
      Date: Sat, 18 Jul 2009 23:24:24 GMT
      Server: Apache/2.2.3 (Debian) mod_python/3.2.10 Python/2.4.4
      Last-Modified: Mon, 20 Apr 2009 17:46:02 GMT
      ETag: "5d63c-b2c5-1a936a80"
      Accept-Ranges: bytes
      Content-Length: 45765
      Content-Type: text/html; charset=ISO-8859-1

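   If you'd rather script that check, a small sh sketch like this can pull
   the Last-Modified value out of the header dump (the helper name is made
   up; in real use you'd pipe "curl -sI <url>" into it):

```shell
#!/bin/sh
# Print the Last-Modified date from an HTTP header dump on stdin.
# (Hypothetical helper name; header names are case-insensitive.)
last_modified() {
    sed -n 's/^[Ll]ast-[Mm]odified: //p' | tr -d '\r'
}

# Live use would be:
#   curl -sI http://curl.haxx.se/docs/manual.html | last_modified
# Demo with the header text shown above:
printf 'HTTP/1.1 200 OK\r\nLast-Modified: Mon, 20 Apr 2009 17:46:02 GMT\r\n' |
    last_modified
# prints: Mon, 20 Apr 2009 17:46:02 GMT
```

   From there you can compare that date against your local file's mtime
   before deciding to fetch.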
   You can download a file only if the remote copy is newer than your
   local one:

      me% curl -z local.html http://remote.server.com/remote.html

   Or download the file only if it has changed since Jan 12, 2009:

      me% curl -z "Jan 12 2009" http://remote.server.com/remote.html

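   Tying those two together: -z with a filename compares against that
   file's mtime, but curl errors out if the file doesn't exist yet, so a
   wrapper has to handle the first fetch separately.  A sketch (the helper
   name and URL are made up; it echoes the command instead of running it):

```shell
#!/bin/sh
# Echo the curl command for a conditional fetch: use -z against the
# local copy when it exists, otherwise fetch unconditionally.
# (Hypothetical helper; drop the echo for real use.)
cond_fetch() {
    # $1 = local file name, $2 = URL
    if [ -f "$1" ]; then
        echo curl -o "$1" -z "$1" "$2"
    else
        echo curl -o "$1" "$2"
    fi
}

cond_fetch remote.html http://remote.server.com/remote.html
```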
   Curl tries to reuse one persistent connection for all the URLs given
   on a single command line, so if you're mirroring a site, put as many
   URLs on the same line as you can.  I don't know of a way to make curl
   walk a directory tree for a recursive download the way wget does.
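   A sketch of building one such command line from a list of URLs so they
   all share a single connection (the URLs are made up; it echoes the
   command instead of running it):

```shell
#!/bin/sh
# Collect one -O per URL into a single curl invocation; -O saves each
# page under its remote file name.  (URLs made up for illustration.)
urls="http://remote.server.com/a.html
http://remote.server.com/b.html
http://remote.server.com/c.html"

set --
for u in $urls; do
    set -- "$@" -O "$u"
done

# Show the command instead of running it (drop the echo for real use):
echo curl "$@"
```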

   You can get the source at http://curl.haxx.se/download.html

-- 
Karl Vogel                      I don't speak for the USAF or my company

If lawyers are disbarred and clergymen defrocked, doesn't it follow
that electricians can be delighted, musicians denoted, cowboys deranged,
models deposed, tree surgeons debarked and dry cleaners depressed?
