Problem with dump over SSH: Operation timed out

Thu Aug 9 05:20:09 PDT 2007

On Thu, Aug 09, 2007 at 10:25:41AM +0200, Bram Schoenmakers wrote:

> Dear list,
> 
> There is a problem with performing a dump from our webserver at the data 
> centre to a backup machine at the office. Every time we try to perform a dump,
> the SSH tunnel dies:
> 
> # /sbin/dump -0uan -L -h 0 -f - / | /usr/bin/bzip2 | \
>         /usr/bin/ssh backup@office.example.com \
>         dd of=/backup/webserver/root.0.bz2
> 
>   DUMP: Date of this level 0 dump: Wed Aug  8 20:58:51 2007
>   DUMP: Date of last level 0 dump: the epoch
>   DUMP: Dumping snapshot of /dev/da0s1a (/) to standard output
>   DUMP: mapping (Pass I) [regular files]
>   DUMP: mapping (Pass II) [directories]
>   DUMP: estimated 60746 tape blocks.
>   DUMP: dumping (Pass III) [directories]
>   DUMP: dumping (Pass IV) [regular files]
> Read from remote host office.example.com: Operation timed out
>   DUMP: Broken pipe
>   DUMP: The ENTIRE dump is aborted.

Note:    I have been getting something that looks very similar when I
try to dump a large file system - actually not all that large, just
about 30 GB - over the net to a different machine.  The one with the
tape is running a quite old FreeBSD - around 4.9 I think - and can't
be upgraded at the moment.  The one I am attempting to dump is on 6.1 - 
which I want to move to 6.2, but have been stalling because I haven't 
been able to get a good dump.

I don't have anything to add to Bram's facts here, only that a
timeout like this is happening on another system too.

////jerry

> 
> Here are some facts about the situation:
> 
> * The client (where the dump takes place) runs FreeBSD 6.2-RELEASE-p4
> * The server (at the office) runs FreeBSD 6.1-RELEASE
> * Both hosts have ipf installed
> * Some IPF rules from the client:
> 
> 	pass out quick on bge0 proto tcp from any to any flags S keep state
> 	pass out quick on bge0 proto udp from any to any keep state
> 	pass out quick on bge0 proto icmp from any to any keep state
> 	pass out quick on bge0 proto gre from any to any keep state
> 	pass out quick on bge0 proto esp from any to any keep state
> 	pass out quick on bge0 proto ah from any to any keep state
> 
> 	block out quick on bge0 all
> 	pass in quick on bge0 proto tcp from any to any port = 22 flags S keep state
> 
> 	block return-rst in log quick on bge0 proto tcp from any to any
> 	block in quick on bge0 proto tcp all flags S
> 	block return-icmp-as-dest(port-unr) in log quick on bge0 proto udp from any to any
> 	block in log quick on bge0 all
> 
> * Some IPF rules from the server:
> 
> 	pass out quick on re0 proto tcp from any to any flags S keep state
> 	pass out quick on re0 proto udp from any to any keep state
> 	pass out quick on re0 proto icmp from any to any keep state
> 	pass out quick on re0 proto gre from any to any keep state
> 	pass out quick on re0 proto esp from any to any keep state
> 	pass out quick on re0 proto ah from any to any keep state
> 	
> 	pass out quick on re0 proto tcp from any to any port = 22 keep state
> 
> 	pass in quick on re0 proto tcp from any to any port = 22 flags S keep state
> 
> 	block return-rst in quick on re0 proto tcp all flags S
> 	block in quick on re0 proto tcp all flags S
> 	block return-icmp-as-dest(port-unr) in log quick on re0 proto udp from any to any
> 	block in log quick on re0 all
> 
> * I've tried with TCPKeepAlive off
> * Setting ClientAlive{Interval,CountMax} on the server did not improve things.
> * Setting ServerAlive{Interval,CountMax} on the client did not help either,
> although I got a different error:
> 
>   DUMP: Date of this level 0 dump: Wed Aug  8 21:05:26 2007
>   DUMP: Date of last level 0 dump: the epoch
>   DUMP: Dumping snapshot of /dev/da0s1f (/usr) to standard output
>   DUMP: mapping (Pass I) [regular files]
>   DUMP: mapping (Pass II) [directories]
>   DUMP: estimated 429177 tape blocks.
>   DUMP: dumping (Pass III) [directories]
>   DUMP: dumping (Pass IV) [regular files]
> Received disconnect from xxx.xxx.xxx.xxx: 2: Timeout, your session not responding.
>   DUMP: Broken pipe
>   DUMP: The ENTIRE dump is aborted.
> 
> * A dump from the client machine to another server works fine. The receiving
> host has an internet connection similar to the office's (cable).
> * Another webserver of ours, running FreeBSD 4.10-RELEASE in another data
> centre, dumps fine to the office with the same setup. This webserver uses
> IPFW.
> * Uploading a big file (200M) over SFTP to the 6.2 webserver causes no 
> problems.
> * Downloading the very same big file over SCP does cause problems; some SCP
> debug output follows. The connection drops shortly after it reaches a
> reasonable download speed.
> 
> 	Read from remote host office.example.com: Connection reset by peer
> 	debug1: Transferred: stdin 0, stdout 0, stderr 77 bytes in 103.3 seconds
> 	debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.7
> 	debug1: Exit status -1
> 	lost connection
> 
> * I thought the MTU might be the cause, but setting it to 1472 on both sides
> didn't improve the situation either.
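
A side note on the MTU test above: a minimal sketch, assuming IPv4
without options (20-byte IP header) and an 8-byte ICMP echo header, of
how an MTU maps to the largest ping payload that still fits in one
unfragmented packet. The helper names are hypothetical. The script only
prints FreeBSD ping commands (-D sets the Don't Fragment bit); it does
not run them.

```shell
#!/bin/sh
# Sketch only: header sizes assume IPv4 without options (20 bytes)
# plus an 8-byte ICMP echo header.

# $1 = MTU in bytes; prints the largest ICMP payload that fits.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print (do not run) a FreeBSD ping with the Don't Fragment bit set.
# $1 = MTU to probe, $2 = remote host
df_ping_cmd() {
    echo "ping -D -c 3 -s $(icmp_payload_for_mtu "$1") $2"
}

df_ping_cmd 1500 office.example.com
df_ping_cmd 1472 office.example.com
```

Under these assumptions an MTU of 1472 leaves room for a 1444-byte
payload, while 1472 is the payload size that matches an MTU of 1500.
If a printed ping fails at one size but works at a smaller one,
something on the path may have a smaller MTU.
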
> 
> So as you can see, I've tried a lot of things to make the dump work, but
> so far no luck. I'm probably missing something crucial. I think it has
> something to do with the state tables in the firewall, but I have not been
> able to confirm that suspicion.
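
One way to test the state-table suspicion would be to watch the ipf
state entries while a transfer is running. This is a sketch only:
state_lines_for is a hypothetical helper and 192.0.2.10 is a
placeholder for the peer's address. ipfstat -s prints state statistics
and ipfstat -sl lists the active state entries; both need root.

```shell
#!/bin/sh
# Sketch only; run as root on the host whose ipf state table matters.
# The peer address below is a placeholder, not taken from the thread.

# Read an "ipfstat -sl" listing on stdin and keep the lines that
# mention $1. grep -F matches the address literally, so the dots are
# not treated as wildcards; "|| true" keeps "no match" from being an
# error.
state_lines_for() {
    grep -F -- "$1" || true
}

# Only query the kernel where ipfstat is actually installed.
if command -v ipfstat >/dev/null 2>&1; then
    ipfstat -s
    ipfstat -sl | state_lines_for "192.0.2.10"
fi
```

Running this once during the transfer and again right after the
connection drops should show whether the entry for the SSH session is
still present on each side.
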
> 
> Any suggestion is very welcome.
> 
> Kind regards,
> 
> -- 
> Bram Schoenmakers
> 
> What is mind? No matter. What is matter? Never mind.
> (Punch, 1855)
> _______________________________________________
> freebsd-questions at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe at freebsd.org"