Robert Watson rwatson at
Wed Jun 9 04:08:00 GMT 2004

On Tue, 8 Jun 2004, Randy Bush wrote:

> i try to dump a live filesys with
>    /sbin/rdump 0Luaf remote.sys:/backup/foo/var.dump  /dev/twed0s1d
> the screen does
>   DUMP: Connection to remote.sys established.
>   DUMP: Date of this level 0 dump: Wed Jun  9 01:52:32 2004
>   DUMP: Date of last level 0 dump: the epoch
>   DUMP: Dumping snapshot of /dev/twed0s1d (/var) to /backup/foo/var.dump on host remote.sys
>   DUMP: mapping (Pass I) [regular files]
>   DUMP: mapping (Pass II) [directories]
>   DUMP: estimated 513616 tape blocks.
>   DUMP: dumping (Pass III) [directories]
>   DUMP: dumping (Pass IV) [regular files]
> and then just hangs forever
> there is nothing in /var/.snap

dump unlinks the snapshot as soon as it opens it so that it will be GC'd
when dump is done...

> i can
>     mount -u -o snapshot /var/.snap/snap1 /var

And does this hang, or doesn't it?

> this is horrifyingly reproducible and happens with other source
> filesystems. 

I'm using dump with -L regularly with a recent -CURRENT box at work for
backups.  On the whole, it's worked without problems, but I did once
observe dump hanging, in the sbwait state (suggesting it's blocked on
socket I/O?).  At the time, I was unable to do much diagnosis.  Could you
do a couple of things:

(1) Could you do a "ps awxl" and see what wait channel dump is blocked on?

(2) Could you break into DDB and generate a stack trace for dump?

(3) Could you run "show lockedvnods" in DDB and show the results?

(4) Could you run "show locks <pid>" on the dump process?
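
For (1), a rough sketch of a filter that pulls just the wait channel out of
"ps awxl"-style output. The function name wchan_of is made up for
illustration. It assumes the header row names the columns PID, COMMAND, and
either WCHAN or MWCHAN, and that no column is ever empty, so that splitting
on whitespace lines the fields up with the header.

```shell
#!/bin/sh
# wchan_of NAME (hypothetical helper): read "ps awxl"-style output on stdin
# and print "PID WCHAN COMMAND..." for each process whose command line
# contains NAME. Columns are located by header name, not fixed position,
# since the layout of "ps l" differs between systems.
wchan_of() {
    awk -v name="$1" '
        NR == 1 {
            # Find the columns of interest in the header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID") pid = i
                if ($i == "WCHAN" || $i == "MWCHAN") wchan = i
                if ($i == "COMMAND") cmd = i
            }
            next
        }
        pid && wchan && cmd {
            # COMMAND is the last column and may contain spaces; rejoin it.
            line = ""
            for (i = cmd; i <= NF; i++) line = line (i > cmd ? " " : "") $i
            if (index(line, name)) print $pid, $wchan, line
        }
    '
}
```

Something like "ps awxl | wchan_of dump" would then list the dump processes
with their wait channels. The awk process itself carries the name on its
command line, so it may show up in the list as well and can be ignored.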


