ATA 4K sector issues
Matthew Dillon
dillon at apollo.backplane.com
Wed Mar 17 21:50:38 UTC 2010
: There is a sysctl, md_compress, that I turned out in my tests, but not
:working as expected.
: Why does using gnop -S 4096 work well?
:
:Thiago
You are setting the sector size to 4K with gnop -S 4096 so presumably
ZFS will not do any fragmented writes smaller than that. I'm not
sure why that would matter except possibly for ZIL writes. In the
case of ZIL if ZFS is using sector-sized writes (I don't know what it
actually uses) then setting the sector size to 4K would be more
efficient as the drive would not have to issue a read-before-write
when the disk cache is flushed after the ZIL write.
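To make the read-before-write case concrete, here is a minimal sketch that counts how many physical sectors a given write only partially covers. It assumes a 4096-byte physical sector; the name partial_sectors and its parameters are made up for illustration and do not correspond to anything in ZFS or the drive firmware.

```python
PHYS_SECTOR = 4096  # assumed physical sector size of the drive, in bytes


def partial_sectors(offset, length, phys=PHYS_SECTOR):
    """Return how many physical sectors a write only partially covers.

    Each partially covered sector forces the drive to read the old
    contents before it can write the merged sector back.
    """
    if length <= 0:
        return 0
    end = offset + length
    first = offset // phys
    last = (end - 1) // phys
    head_partial = offset % phys != 0
    tail_partial = end % phys != 0
    count = 0
    if head_partial:
        count += 1
    # Count the tail unless it is the same sector already counted as the head.
    if tail_partial and not (head_partial and first == last):
        count += 1
    return count


print(partial_sectors(0, 512))      # 512-byte write: one partial sector
print(partial_sectors(0, 4096))     # aligned 4K write: no partial sectors
print(partial_sectors(2048, 4096))  # misaligned 4K write: two partial sectors
```

With 512-byte logical sectors any small or misaligned write lands in the first or third case; with the sector size forced to 4K every write the drive sees is in the second.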
One important aspect of having the filesystem use a larger logical
block size, such as 4K or 16K or 32K etc, is that the filesystem
itself knows whether any trailing data is garbage or not and will
avoid doing a read-before-write when writing small amounts of data.
Most of the time if the filesystem is allocating space from its blockmap
it knows the trailing data in the block is garbage and will zero it
instead of performing a read-before-write. Also, the buffer cache covers
hundreds of megabytes versus the hard drive cache, which is typically
only 8-64MB (though the OCZ Colossus has 128MB). Still, this means the
kernel will do a much better job write-combining than the drive.
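As a rough illustration of write-combining, the sketch below merges small adjacent writes into larger extents before they would be issued to the drive. It is not how any particular buffer cache is implemented; coalesce and the (offset, length) tuples are hypothetical names chosen for the example.

```python
def coalesce(writes):
    """Merge overlapping or adjacent (offset, length) writes into extents."""
    merged = []
    for off, length in sorted(writes):
        end = off + length
        if merged and off <= merged[-1][1]:
            # Touches or overlaps the previous extent: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([off, end])
    return [(start, end - start) for start, end in merged]


# Eight 512-byte writes arriving out of order, all within one 4K sector.
small = [(i * 512, 512) for i in (3, 0, 7, 1, 5, 2, 6, 4)]
combined = coalesce(small)
print(combined)
```

Here the eight small writes collapse into one aligned 4096-byte write. The more pending writes the cache can hold, the more often neighbouring pieces are still around to be merged, which is why the cache size matters.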
The drive has no knowledge of what is garbage and what is not, so the
moment this decision moves out of the drive and into the kernel you reap
rewards on these drives with larger physical sectors.
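The difference can be sketched as follows, assuming a short write at the start of a 4096-byte block. The names build_block, freshly_allocated and read_block are hypothetical and only stand in for what a filesystem knows from its blockmap; this is not code from any real filesystem.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size, in bytes


def build_block(data, freshly_allocated, read_block=None, block_size=BLOCK_SIZE):
    """Return a full block image for a short write at the start of a block."""
    if len(data) > block_size:
        raise ValueError("data does not fit in one block")
    if freshly_allocated:
        # The trailing bytes are known garbage: zero them, no read needed.
        return bytes(data) + bytes(block_size - len(data))
    # The block holds live data: it must be read and merged.
    old = read_block()
    return bytes(data) + old[len(data):]


reads = []


def fake_read():
    """Stand-in for a disk read; records that a read happened."""
    reads.append(1)
    return b"\xff" * BLOCK_SIZE


new_block = build_block(b"abc", freshly_allocated=True, read_block=fake_read)
old_block = build_block(b"abc", freshly_allocated=False, read_block=fake_read)
print(len(reads))  # only the second call had to read
```

A drive receiving the same 3 bytes as part of a sub-sector write always has to take the second path, because it cannot tell the two cases apart.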
-Matt