script to make webpage snapshot

Polytropon freebsd at edvax.de
Fri Aug 12 08:04:41 UTC 2016


On Thu, 11 Aug 2016 17:28:17 -0500 (CDT), Valeri Galtsev wrote:
> Dear Experts,
> 
> Could someone recommend a script or utility one can run from command line
> on Linux or UNIX machine to make a snapshot image of webpage?

When you say "snapshot", what exactly do you mean? I'm not sure
I understand your description correctly. Is a snapshot

(a) a _visual_ snapshot (image format or PDF) of how the web page
    renders inside a web browser, or

(b) an exact local _copy_ (files and directories) on your disk?

For option (a), lang/phantomjs has been suggested. Check the
mailing list archives - I asked that kind of question some
years ago, but I cannot remember (or even find) the answers
I got. ;-)
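
A minimal sketch of what that could look like, assuming you use
the rasterize.js example script that ships with PhantomJS. The
path in RASTERIZE is only my guess at where the port installs it,
and the URL and output name are placeholders; adjust all three.
The helper just prints the command so you can check it first:

```shell
#!/bin/sh
# Sketch only: print a PhantomJS command that renders a page to a file.
# RASTERIZE is an ASSUMED location of the rasterize.js example script;
# check where your installation actually puts it.
RASTERIZE=${RASTERIZE:-/usr/local/share/examples/phantomjs/rasterize.js}

render_cmd() {
    # $1 = URL to render
    # $2 = output file; its suffix (.png, .pdf, ...) selects the format
    printf '%s\n' "phantomjs $RASTERIZE $1 $2"
}

# Placeholder URL and file name, for illustration only.
render_cmd "http://www.example.com/" "snapshot.png"
```

Copy the printed line into your shell once it looks right.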

For option (b), wget probably isn't bad, as long as you add some
options to avoid unneeded traffic, such as

	% wget -r -l 0 -k -nc <source>

If you are interested only in a specific sub-path, or subset of
file types (or want to reject them), use the -A or -R options.
Use -U to set the user agent string to a "real" web browser if
needed. See "man wget" for details.
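
As a rough sketch of how those options combine: the URL, the
suffix list and the user agent string below are placeholders I
made up, and -np (do not ascend to the parent directory) is an
extra option I added to stay inside a sub-path. The helper only
prints the command, it does not run it:

```shell
#!/bin/sh
# Sketch only: print a wget command for a filtered recursive copy.
# The suffix list and user agent string are made-up placeholders.

mirror_cmd() {
    # $1 = start URL
    # -r     recurse into links
    # -l 0   no depth limit
    # -k     convert links for local viewing
    # -nc    skip files that already exist locally
    # -np    never ascend to the parent directory
    # -A     accept only files with these suffixes
    # -U     user agent string sent to the server
    printf '%s\n' "wget -r -l 0 -k -nc -np -A html,css,png,jpg -U 'Mozilla/5.0' $1"
}

# Placeholder URL, for illustration only.
mirror_cmd "http://www.example.com/docs/"
```

To reject file types instead, swap -A for -R with the suffixes
you do not want.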

This set of options should let you repeat the "snapshot" without
fetching everything again: with -nc, things you already have on
your local disk won't be downloaded. Note that -nc does not check
whether a file has changed on the server; if you need that, use
-N (timestamping) instead.



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


More information about the freebsd-questions mailing list