ZFS command can block the whole ZFS subsystem!

O. Hartmann ohartman at zedat.fu-berlin.de
Sat Jan 4 08:56:54 UTC 2014


On Fri, 3 Jan 2014 17:04:00 -0000
"Steven Hartland" <killing at multiplay.co.uk> wrote:

> ----- Original Message ----- 
> From: "O. Hartmann" <ohartman at zedat.fu-berlin.de>
> > On Fri, 3 Jan 2014 14:38:03 -0000
> > "Steven Hartland" <killing at multiplay.co.uk> wrote:
> > 
> > > 
> > > ----- Original Message ----- 
> > > From: "O. Hartmann" <ohartman at zedat.fu-berlin.de>
> > > > 
> > > > For security reasons, I dumped a large file via "dd" onto
> > > > a 3TB disk. The system is 11.0-CURRENT #1 r259667: Fri Dec 20
> > > > 22:43:56 CET 2013 amd64. The filesystem in question is a single ZFS
> > > > pool.
> > > > 
> > > > Issuing the command
> > > > 
> > > > "rm dumpfile.txt"
> > > > 
> > > > and then hitting Ctrl-Z to bring the rm command into the background
> > > > via "bg" (I use FreeBSD's csh in that console) locks up the
> > > > entire command and even worse - it seems to wind up the pool in
> > > > question for being exported!
> > > 
> > > I can't think of any reason why backgrounding a shell would export
> > > a pool.
> > 
> > I sent the "rm" job into the background, and I didn't say that implies
> > an export of the pool!
> > 
> > I said that the pool cannot be exported once the bg command has
> > been issued.
> 
> Sorry, I'm confused then, as you said "locks up the entire command and
> even worse - it seems to wind up the pool in question for being
> exported!"
> 
> Which to me read like you were saying the pool ended up being
> exported.

I'm not a native English speaker. My intention was, in short:

remove the dummy file. Having issued the command in the foreground of
the terminal, I decided a second after hitting return to send it to the
background by suspending the rm command and then issuing "bg".
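
Roughly, the sequence I had in mind was (illustrative only, the real
file name differs):

  rm dumpfile.txt      # started in the foreground
  ^Z                   # suspend the running rm (csh reports it as Stopped)
  bg                   # resume the stopped job in the background
  jobs                 # should list the rm job as Running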


> 
> > > > I expect the command to go into the background, as every other
> > > > UNIX command does when Ctrl-Z is sent in the console.
> > > > Obviously, ZFS-related stuff in FreeBSD doesn't comply.
> > > > 
> > > > The file has been removed from the pool but the console is still
> > > > stuck with "^Z fg" (as I typed this in). Process list tells me:
> > > > 
> > > > top
> > > > 17790 root             1  20    0  8228K  1788K STOP   10   0:05
> > > > 0.00% rm
> > > > 
> > > > for the particular "rm" command issued.
> > > 
> > > That's not backgrounded yet, otherwise it wouldn't be in the
> > > STOP state.
> > 
> > As I said - the job never backgrounded, locked up the terminal and
> > made the whole pool unresponsive.
> 
> Have you tried sending a continue signal to the process?

No, not intentionally. Since the operation started to slow down the
whole box and seemed to affect nearly every ZFS pool operation I
attempted (zpool status, zpool import of the faulty pool, zpool
export), I rebooted the machine.
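
For the record, a continue signal could have been sent to the stopped
rm (PID 17790 in the top output quoted above); I did not try that
before rebooting:

  kill -CONT 17790                    # resume the stopped process
  ps -o pid,state,command -p 17790    # check it has left the STOP state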

After the reboot, when ZFS came up, the drive started working like
crazy again and the system stalled while recognizing the ZFS pools.
I then did a hard reset, restarted in single-user mode, exported the
pool successfully, and rebooted. But the moment I did a zpool import
POOL, the heavy activity continued.
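
In hindsight, importing the pool read-only might have allowed me to
look at it without immediately re-triggering the heavy work - I did not
test this, so take it as a guess:

  zpool import -o readonly=on POOL00   # read-only import, nothing is written to the pool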

> 
> > > > Now, having the file deleted, I'd like to export the pool for
> > > > further maintenance
> > > 
> > > Are you sure the delete is complete? Also don't forget ZFS has
> > > TRIM by default, so depending on support of the underlying
> > > devices you could be seeing deletes occurring.
> > 
> > Quite sure it didn't! It has been taking hours (~8 now) and the drive
> > is still working, although I tried to stop it.
> 
> A delete of a file shouldn't take 8 hours, but you don't say how large
> the file actually is?

The drive has a capacity of ~2.7 TiB (Western Digital 3 TB drive). The
file I created was, please do not laugh, 2.7 TB :-( Given the COW
technique, and from what I have read about ZFS in this thread and
others, this seems to be the culprit. There is no space left to delete
the file safely.
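
For completeness, this is how I would check how much space the pool and
its datasets still have (pool name as used further below):

  zpool list POOL00              # overall pool size, allocated and free space
  zfs list -o space -r POOL00    # per-dataset used/available breakdown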

By the way - the box is still working at 100% on that drive :-( That's
now > 12 hours.
 

> 
> > > You can check that with gstat -d
> > 
> > The command reports 100% activity on the drive. I exported the pool in
> > question in single-user mode and am now trying to import it back in
> > multi-user mode.
> 
> Sorry you seem to be stating conflicting things:
> 1. The delete hasn't finished
> 2. The pool export hung
> 3. You have exported the pool
> 

Not conflicting, but in my non-expert terminology not as accurate
and precise as you may expect.

ad 1) I terminated the copy command (by the brute force of the mighty
RESET button). As far as I can see it hasn't finished the operation on
the pool, but what is in progress now might be some kind of recovery
mechanism, not the rm command anymore.

ad 2) Yes, first it hung, then I reset the box, then did the export in
single-user mode to avoid further interaction, then I tried to import
the pool again ...

ad 3) Yes, successfully, after the reset. Now I have imported the pool
again and the terminal in which I issued the command is stuck again
while the pool is under heavy load.

> What exactly is gstat -d reporting, can you paste the output please.

I think it is boring to look at 100% activity, but here it is ;-)


dT: 1.047s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada2
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| cd0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p2
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p4
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p5
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p6
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p7
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p8
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p9
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p10
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p11
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p12
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p13
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada0p14
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/boot
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gptid/c130298b-046a-11e0-b2d6-001d60a6fa74
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/root
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/swap
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gptid/fa3f37b1-046a-11e0-b2d6-001d60a6fa74
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/var
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/var.tmp
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/usr
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/usr.src
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/usr.obj
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/usr.ports
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/data
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/compat
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/var.mail
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| gpt/usr.local
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada1p1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada2p1
   10    114    114    455   85.3      0      0    0.0      0      0    0.0  100.0| ada3p1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| ada4p1
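
To spare everyone the full table next time, the output can be limited
to the busy provider (the argument is a regular expression matching the
device name):

  gstat -d -f ada3    # delete statistics, only for providers matching "ada3"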

> 
> > Shortly after issuing the command
> > 
> > zpool import POOL00
> > 
> > the terminal is stuck again, the drive has been working at 100% for
> > two hours now and it seems the great ZFS is deleting every block one
> > by one. Is this supposed to last days or a week?
> 
> What controller and what drive?

Hardware is as follows:
CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (3201.89-MHz K8-class CPU)
real memory  = 34359738368 (32768 MB)
avail memory = 33252507648 (31712 MB)
ahci1: <Intel Patsburg AHCI SATA controller> port 0xf090-0xf097,0xf080-0xf083,0xf070-0xf077,0xf060-0xf063,0xf020-0xf03f mem 0xfb520000-0xfb5207ff irq 20 at device 31.2 on pci0
ahci1: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich8: <AHCI channel> at channel 0 on ahci1
ahcich9: <AHCI channel> at channel 1 on ahci1
ahcich10: <AHCI channel> at channel 2 on ahci1
ahcich11: <AHCI channel> at channel 3 on ahci1
ahcich12: <AHCI channel> at channel 4 on ahci1
ahcich13: <AHCI channel> at channel 5 on ahci1
ahciem0: <AHCI enclosure management bridge> on ahci1


> 
> What does the following report:
> sysctl kstat.zfs.misc.zio_trim

sysctl kstat.zfs.misc.zio_trim
kstat.zfs.misc.zio_trim.bytes: 0
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.unsupported: 507
kstat.zfs.misc.zio_trim.failed: 0
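
The unsupported counter suggests the drive does not accept TRIM at all,
so TRIM is probably not the cause here. If it were, my understanding is
that it can be switched off at boot time via a loader tunable (name as
of 10.x/-CURRENT, please correct me if I am wrong):

  # /boot/loader.conf
  vfs.zfs.trim.enabled=0    # disable ZFS TRIM globally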

> 
> > > > but that doesn't work with
> > > > 
> > > > zpool export -f poolname
> > > > 
> > > > This command is now also stuck blocking the terminal and the
> > > > pool from further actions.
> > > 
> > > If the delete hasn't completed and is stuck in the kernel this is
> > > to be expected.
> > 
> > At this moment I do not want to imagine what will happen if I have
> > to delete several tens of terabytes. If the weird behaviour of the
> > current system can be extrapolated, then this is a no-go.
> 
> As I'm sure you'll appreciate, that depends on whether the file is simply
> being unlinked or each sector is being erased; the answers to the above
> questions should help determine that :)

You're correct about that. But sometimes I would appreciate having the choice.
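
What I had in mind is something like shrinking the file in steps before
unlinking it, so the frees are spread out instead of one huge delete
(an sh sketch only, step size made up - whether ZFS actually behaves
better this way I cannot say):

  # shrink the dump file by 100 GB per step, then remove it
  while [ "$(stat -f %z dumpfile.txt)" -gt 0 ]
  do
      truncate -s -100g dumpfile.txt   # reduce the size; clamps at zero
      sync
  done
  rm dumpfile.txt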

> 
>  Regards
>  Steve


Regards,

Oliver
