fsck_ufs locked in snaplk
Kris Kennaway
kris at obsecurity.org
Mon Apr 24 18:16:12 UTC 2006
On Mon, Apr 24, 2006 at 10:04:57PM +0400, Dmitry Morozovsky wrote:
> On Mon, 24 Apr 2006, Dmitry Morozovsky wrote:
>
> DM> KK> > one of my servers had to be rebooted uncleanly, and now I have a background
> DM> KK> > fsck locked for more than an hour in snaplk:
> DM> KK> >
> DM> KK> > 742 root 1 -4 4 1320K 688K snaplk 0:02 0.00% fsck_ufs
> DM> KK> >
> DM> KK> > File system in question is 200G gmirror on SATA. Usually making a snapshot
> DM> KK> > (e.g., for making dumps) consumes 3-4 minutes for that fs, so it seems to me
> DM> KK> > that filesystem is in a deadlock.
> DM> KK>
> DM> KK> Is the process performing I/O? Background fsck deliberately runs at a
> DM> KK> slow rate so it does not destroy I/O performance on the rest of the
> DM> KK> system.
> DM>
> DM> Nope. In that case, 50+ smbds had been locked in the 'ufs' state, so I was
> DM> forced to revive the machine by rebooting with bgfsck turned off.
> DM>
> DM> This night, dump -L locks in the same position on the same filesystem:
> DM>
> DM> 0 2887 2886 0 -4 0 1260 692 snaplk D ?? 0:01.28
> DM> /sbin/mksnap_ffs root 0.0 0.1 5:19AM
> DM>
> DM> It was started at 5:19am and it is now 9:20, with no disk activity.
> DM>
> DM>
> DM> For reference: it's a fresh RELENG_6_1/i386.
>
> Just rechecked it: did mksnap_ffs on an otherwise idle file system:
>
> marck@office:/> mksnap_ffs /st /st/.snap/test_snapshot
> load: 0.02 cmd: mksnap_ffs 4012 [biord] 0.00u 0.04s 0% 696k
> load: 0.04 cmd: mksnap_ffs 4012 [biord] 0.00u 0.44s 0% 696k
> load: 0.21 cmd: mksnap_ffs 4012 [snaprdb] 0.00u 1.17s 0% 696k
> load: 0.20 cmd: mksnap_ffs 4012 [snaprdb] 0.00u 1.23s 0% 696k
> load: 0.13 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k
> load: 0.08 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k
> load: 0.01 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k
>
> (I hit ^T several times)
>
> The biord phase takes about 1.5-2 minutes and the snaprdb phase about 30-40
> seconds, and then the process died. Most disk requests succeed; however,
> accessing /st/.snap locks the process in the ufs state forever.
>
> What bothers me most is that it is the only machine that reproducibly hangs
> in snapshots, and it did not hang before the RELENG_5 -> RELENG_6 upgrade.
> Other RELENG_6 machines do snapshot backups flawlessly (knock on wood!)
Are you quite certain it's running up-to-date RELENG_6_1? All known
snapshot deadlock issues were believed to have been fixed a few weeks
ago. If so, we might need you to enable extra debugging to track this
down.
Kris
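
For anyone collecting repeated ^T output like the status lines quoted above,
here is a minimal sketch that extracts the wait channel and system CPU time
from each line and reports when the last two samples show the same channel with
no CPU growth. The function names and the "stalled" heuristic are assumptions
made for illustration; they are not part of FreeBSD and do not by themselves
prove a deadlock.

```python
import re

# Matches one ^T (SIGINFO) status line in the format shown in the quoted session.
STATUS_RE = re.compile(
    r"load:\s+(?P<load>[\d.]+)\s+cmd:\s+(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<wchan>[^\]]+)\]\s+(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s"
)

def parse_status(line):
    """Return (wait_channel, system_cpu_seconds), or None if the line does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return m.group("wchan"), float(m.group("sys"))

def stalled_channel(lines):
    """Hypothetical heuristic: return the wait channel if the last two
    samples share it and system CPU time did not advance, else None."""
    samples = [s for s in map(parse_status, lines) if s is not None]
    if len(samples) < 2:
        return None
    (w1, t1), (w2, t2) = samples[-2], samples[-1]
    return w2 if (w1 == w2 and t1 == t2) else None

# Status lines copied from the quoted mksnap_ffs session.
SAMPLES = [
    "load: 0.02 cmd: mksnap_ffs 4012 [biord] 0.00u 0.04s 0% 696k",
    "load: 0.04 cmd: mksnap_ffs 4012 [biord] 0.00u 0.44s 0% 696k",
    "load: 0.21 cmd: mksnap_ffs 4012 [snaprdb] 0.00u 1.17s 0% 696k",
    "load: 0.20 cmd: mksnap_ffs 4012 [snaprdb] 0.00u 1.23s 0% 696k",
    "load: 0.13 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k",
    "load: 0.08 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k",
    "load: 0.01 cmd: mksnap_ffs 4012 [snaplk] 0.00u 1.30s 0% 696k",
]

if __name__ == "__main__":
    print(stalled_channel(SAMPLES))
```

The pattern is searched rather than anchored, so lines that still carry mail
quote prefixes ("> ") parse the same way.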