[Bug 212139] r298900 introduced a fatal failure case for >2TB disk size reporting bugs

Thu Aug 25 12:34:22 UTC 2016

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212139

            Bug ID: 212139
           Summary: r298900 introduced a fatal failure case for >2TB disk
                    size reporting bugs
           Product: Base System
           Version: 11.0-RC1
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: peter at FreeBSD.org

Created attachment 174052
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=174052&action=edit
Hack workaround

We have machines in the freebsd.org cluster that have 3TB SATA drives.

    ada0: 2861588MB (5860533168 512 byte sectors)
However, the BIOS reports them as:
    disk0:   BIOS drive C (1565565872 X 512):
i.e. the 3TB drive is reported to the loader as under 1TB (about 800GB).
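
As a sanity check on the two sector counts above: the BIOS figure equals the
real sector count with everything above the low 32 bits dropped
(5860533168 - 2^32 = 1565565872).  That the BIOS actually stores the count in
a 32-bit field is an assumption drawn from this arithmetic, not something
confirmed here.  The helper names below are illustrative only.

```c
#include <stdint.h>

/* Sector count the drive itself reports (from the ada0 line). */
#define ADA0_SECTORS	UINT64_C(5860533168)

/* Keep only the low 32 bits, as a 32-bit sector-count field would. */
uint64_t
truncate_to_32bit(uint64_t sectors)
{
	return (sectors & UINT64_C(0xFFFFFFFF));
}

/* Convert a count of 512-byte sectors to whole MB (2^20 bytes). */
uint64_t
sectors_to_mb(uint64_t sectors)
{
	return (sectors / 2048);
}
```

truncate_to_32bit(ADA0_SECTORS) gives 1565565872, the value the BIOS reports.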

Prior to r298900, this was harmless.  IO was issued relative to the metadata on
the disk.

r298900 changed it from working to a fatal error:
+    if (dblk >= BD(dev).bd_sectors) {
+       DEBUG("IO past disk end %llu", (unsigned long long)dblk);
+       return (EIO);
+    }
and the loader no longer even attempts the read.  Machines that used to work
(in spite of a BIOS reporting bug) now suddenly fail with an IO error.  This
was observed when booting from ZFS, but UFS shares this code and will fail the
same way if it tries to read data beyond the truncated size.
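
To make the failure concrete, here is a simplified model of that check.  The
struct and function names are hypothetical stand-ins, not the loader's real
definitions; only the comparison itself is taken from the diff above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the loader's per-disk state. */
struct bios_disk {
	uint64_t bd_sectors;	/* sector count as reported by the BIOS */
};

/*
 * Simplified model of the r298900 check: refuse any read whose starting
 * block is at or beyond the BIOS-reported size.
 */
int
bounds_check(const struct bios_disk *bd, uint64_t dblk)
{
	if (dblk >= bd->bd_sectors)
		return (EIO);
	return (0);
}
```

With bd_sectors set to 1565565872, any block in the upper part of the 3TB
drive (for example its last sector, 5860533167) is rejected with EIO even
though the data is physically there.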

I have attached a horrible hack that works for the affected machines in the
freebsd.org package build cluster.  It is not an ideal solution but people may
find it useful.

The patch is a hack to restrict attempted reads beyond the end of the disk to
one single sector rather than a hard fail.  This should make it behave the same
way as old versions of the bcache code.  If the BIOS generates an error, it
would do so the same as it did with the old code.  Using a single sector
prevents read-ahead from amplifying delays.
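
The idea can be sketched as follows.  This is not the attached patch; it is a
minimal illustration of the behaviour described above, and the function name
and signature are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Decide how many sectors to attempt for a read of nblks sectors at dblk.
 * A read that starts at or beyond the BIOS-reported end is not rejected;
 * it is cut down to one sector so that any BIOS error costs a single
 * short request instead of a full read-ahead.
 */
uint64_t
sectors_to_attempt(uint64_t bios_sectors, uint64_t dblk, uint64_t nblks)
{
	if (dblk >= bios_sectors && nblks > 1)
		return (1);
	return (nblks);
}
```

Reads that start inside the reported size are left untouched in this sketch;
how the real code treats reads that straddle the reported end is not covered
here.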

A better solution might be to have the file system / partition drivers instead
tell bcache what size to expect so that it can avoid doing read-ahead beyond
the end of a partition.  If a 3TB GPT is on a disk, that should be used for IO
and read-ahead clipping, not the historically unreliable BIOS sector count.
Differences could be reported to the user.
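
A rough sketch of what that could look like is below.  Nothing here exists in
the loader; both functions and their parameters are hypothetical, and a
table_sectors of 0 is assumed to mean "no usable partition table found".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the sector limit to use for IO.  A nonzero size taken from on-disk
 * metadata (e.g. a GPT) wins over the BIOS count; *mismatch is set so the
 * caller can warn the user when the two disagree.
 */
uint64_t
io_limit(uint64_t bios_sectors, uint64_t table_sectors, bool *mismatch)
{
	*mismatch = (table_sectors != 0 && table_sectors != bios_sectors);
	return (table_sectors != 0 ? table_sectors : bios_sectors);
}

/* Clip a read of nblks sectors starting at dblk so it ends at limit. */
uint64_t
clip_to_limit(uint64_t limit, uint64_t dblk, uint64_t nblks)
{
	if (dblk >= limit)
		return (0);
	if (nblks > limit - dblk)
		return (limit - dblk);
	return (nblks);
}
```

With the numbers from this report, io_limit(1565565872, 5860533168, &m)
returns the full 5860533168 and sets m, and a 64-sector read-ahead starting 8
sectors before the real end of the disk is clipped to 8 sectors.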

This problem is in 11.0-RC1, 11-stable and 12-current.
