ZFS bug (was: creating ZIL ignores vfs.zfs.min_auto_ashift; should be: ZIL sets improper ashift with AHCI controllers)
Steven Hartland
killing at multiplay.co.uk
Thu Nov 6 20:26:26 UTC 2014
Something very strange is going on.
I have a boot pool (tank), and if I add ada1p3 (a disk with 512-byte
sectors, with vfs.zfs.min_auto_ashift set to 12) to it as a log device,
zdb reports its ashift as 9.
If I add the same device to another test pool (tpool) on the same
machine, it gets ashift 12.
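For reference, ashift is the base-2 logarithm of the allocation size in
bytes, so the two values seen here correspond to 512-byte and 4096-byte
blocks. A minimal shell sketch of that arithmetic follows; the function
names ashift_to_bytes and bytes_to_ashift are made up for illustration and
are not part of ZFS.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only (not ZFS commands).

# Block size in bytes for a given ashift: 2^ashift.
ashift_to_bytes() {
    echo $((1 << $1))
}

# Smallest ashift whose block size is at least the given byte count.
bytes_to_ashift() {
    bytes=$1
    shift_count=0
    while [ $((1 << shift_count)) -lt "$bytes" ]; do
        shift_count=$((shift_count + 1))
    done
    echo "$shift_count"
}

ashift_to_bytes 9      # 512
ashift_to_bytes 12     # 4096
bytes_to_ashift 4096   # 12
```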
The attached dtrace script traces the calls and shows that
vdev_ashift_optimize is correctly called and that, according to the final
vdev_config_generate call, the ashift of the vdev should be 12 in both cases.
More debugging is required.
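To compare pools quickly, the log vdevs can be picked out of zdb output
with a small filter. This is only a sketch: log_vdev_ashift and
sample_zdb_output are hypothetical helper names, and the sample text is
trimmed from the Example 4 output quoted below.

```shell
#!/bin/sh
# Hypothetical helper: read zdb-style config text on stdin and print
# "<path> <ashift>" for every child vdev that has is_log set to 1.
log_vdev_ashift() {
    awk '
        $1 ~ /^children\[/ { path = ""; ashift = "" }   # new child: reset
        $1 == "path:"      { path = $2; gsub("\047", "", path) }  # drop quotes
        $1 == "ashift:"    { ashift = $2 }
        $1 == "is_log:" && $2 == 1 { print path, ashift }
    '
}

# Sample input, trimmed from the Example 4 zdb output in this thread.
sample_zdb_output() {
    cat <<'EOF'
        children[0]:
            type: 'disk'
            path: '/dev/da2'
            ashift: 12
            is_log: 0
        children[2]:
            type: 'disk'
            path: '/dev/da1'
            ashift: 12
            is_log: 1
EOF
}

sample_zdb_output | log_vdev_ashift   # /dev/da1 12
```

On a live system the same filter could presumably be fed from zdb directly
(zdb | log_vdev_ashift), which would make the ashift 9 log device on the
AHCI machine stand out immediately.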
On 06/11/2014 14:58, Borja Marcos wrote:
> On Nov 6, 2014, at 2:26 PM, Steven Hartland wrote:
>
>> That's not relevant, as min, when set, should override the drive's params
> There is more to this than it seems. I just found more funny stuff.
>
> MY CONCLUSION IS: when creating a ZIL device, it behaves differently depending on the disk controller. It works with SAS,
> and it doesn't work with AHCI.
>
> When using an AHCI controller, ZIL ignores *both* the 4K block quirk and the min_auto_ashift variable. Ashift is fixed to 9. It only
> uses a different ashift when using a gnop device. For example, I have tried with a 4 KB gnop device and this time it used the correct ashift, 12.
>
> When using a SAS controller, ZIL works perfectly with both.
>
> Seems quite odd to me. I am not running exactly the same version on both machines (the one with AHCI controllers is running -STABLE
> from three days ago, and the one with the SAS controller is running 10.1-RC4). But the results should be the same.
>
> I've added the relevant quirk to ata_da.c and the SSD is now
> properly "quirked":
>
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <INTEL SSDSA2CT040G3 4PC10362> ATA-8 SATA 2.x device
> ada1: Serial Number PEPR408501DV040AGN
> ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 38166MB (78165360 512 byte sectors: 16H 63S/T 16383C)
> ada1: quirks=0x1<4K>
>
>
> But still something is wrong:
>
> EXAMPLE 1: AHCI controller, min_auto_ashift with the default value of 9.
>
> The log child has the wrong ashift, 9, regardless of the 4K quirk.
>
> children[1]:
>     type: 'disk'
>     id: 1
>     guid: 2447450905312007897
>     path: '/dev/ada1'
>     phys_path: '/dev/ada1'
>     whole_disk: 1
>     metaslab_array: 0
>     metaslab_shift: 0
>     ashift: 9
>     asize: 40015757312
>     is_log: 1
>     create_txg: 11741519
>
>
> EXAMPLE 2: AHCI controller, raise min_auto_ashift to 12
>
> # sysctl vfs.zfs.min_auto_ashift=12
> vfs.zfs.min_auto_ashift: 9 -> 12
>
> # zpool add rpool log ada1
>
> And our log child still has the wrong ashift.
>
> children[1]:
>     type: 'disk'
>     id: 1
>     guid: 17598938711972588792
>     path: '/dev/ada1'
>     phys_path: '/dev/ada1'
>     whole_disk: 1
>     metaslab_array: 0
>     metaslab_shift: 0
>     ashift: 9
>     asize: 40015757312
>     is_log: 1
>     create_txg: 11741560
>
>
>
> EXAMPLE 3: Same as Example 1, but using a SAS controller (mps).
> I haven't changed min_auto_ashift.
>
> da3: <ATA Samsung SSD 840 BB0Q> Fixed Direct Access SCSI-6 device
> da3: Serial Number S1D9NEADA08568E
> da3: 600.000MB/s transfers
> da3: Command Queueing enabled
> da3: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
> da3: quirks=0x8<4K>
> da1: <ATA Samsung SSD 840 BB0Q> Fixed Direct Access SCSI-6 device
> da1: Serial Number S1D9NEADA08549F
> da1: 600.000MB/s transfers
> da1: Command Queueing enabled
> da1: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
> da1: quirks=0x8<4K>
> da2: <ATA Samsung SSD 840 BB0Q> Fixed Direct Access SCSI-6 device
> da2: Serial Number S1D9NEADA08548T
> da2: 600.000MB/s transfers
> da2: Command Queueing enabled
> da2: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
> da2: quirks=0x8<4K>
>
>
> Now, we create a pool. I did this in two steps in order to reproduce my AHCI setup more accurately.
>
> # zpool create sample mirror da2 da3
>
> and add a log device
>
> # zpool add sample log da1
>
> And our log device uses the ashift...
>
> children[1]:
>     type: 'disk'
>     id: 1
>     guid: 1327562712929751294
>     path: '/dev/da1'
>     phys_path: '/dev/da1'
>     whole_disk: 1
>     metaslab_array: 38
>     metaslab_shift: 33
>     ashift: 12 <=============== BINGO! 12!!
>     asize: 1000199946240
>     is_log: 1
>     create_txg: 7
>
>
> EXAMPLE 4: Same hardware as before, but I have compiled a "dequirked" kernel. The Samsung 840 SSD is now
> detected with 512 byte sectors.
>
> # sysctl vfs.zfs.min_auto_ashift=12
>
> # zpool create sample da2 da3
>
> # zpool add sample log da1
>
> # zdb
>
> sample:
>     version: 5000
>     name: 'sample'
>     state: 0
>     txg: 10
>     pool_guid: 10244789911221894670
>     hostid: 1065071139
>     hostname: 'elibm'
>     vdev_children: 3
>     vdev_tree:
>         type: 'root'
>         id: 0
>         guid: 10244789911221894670
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 147759032286414284
>             path: '/dev/da2'
>             phys_path: '/dev/da2'
>             whole_disk: 1
>             metaslab_array: 37
>             metaslab_shift: 33
>             ashift: 12
>             asize: 1000199946240
>             is_log: 0
>             create_txg: 4
>         children[1]:
>             type: 'disk'
>             id: 1
>             guid: 2632519121370708463
>             path: '/dev/da3'
>             phys_path: '/dev/da3'
>             whole_disk: 1
>             metaslab_array: 34
>             metaslab_shift: 33
>             ashift: 12
>             asize: 1000199946240
>             is_log: 0
>             create_txg: 4
>         children[2]:
>             type: 'disk'
>             id: 2
>             guid: 10136980984141171426
>             path: '/dev/da1'
>             phys_path: '/dev/da1'
>             whole_disk: 1
>             metaslab_array: 39
>             metaslab_shift: 33
>             ashift: 12 <========= 12, ashift for the log device
>             asize: 1000199946240
>             is_log: 1
>             create_txg: 8
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data
> root at elibm:~ #
>
-------------- next part --------------
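#!/usr/sbin/dtrace -s

/* Report the ashift state of each vdev entering vdev_ashift_optimize. */
fbt::vdev_ashift_optimize:entry
{
    this->vd = (vdev_t *)arg0;
    printf("vdev: %s, ashift: %d, physical_ashift: %d, top: %d, min: %d",
        this->vd->vdev_path ? stringof(this->vd->vdev_path) : "n/a",
        this->vd->vdev_ashift,
        this->vd->vdev_physical_ashift,
        this->vd == this->vd->vdev_top,
        `zfs_min_auto_ashift);
}

/* Same report at config generation; here the vdev is taken from arg1. */
fbt::vdev_config_generate:entry
{
    this->vd = (vdev_t *)arg1;
    printf("vdev: %s, ashift: %d, physical_ashift: %d, top: %d, min: %d",
        this->vd->vdev_path ? stringof(this->vd->vdev_path) : "n/a",
        this->vd->vdev_ashift,
        this->vd->vdev_physical_ashift,
        this->vd == this->vd->vdev_top,
        `zfs_min_auto_ashift);
}

/* Mark each return. On fbt return probes arg0 is the offset of the
 * return instruction within the function, not a return value. */
fbt::vdev_ashift_optimize:return
{
    printf("%x", arg0);
}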