Divide-by-zero in loader
mdf356 at gmail.com
Fri Jan 28 19:35:57 UTC 2011
On Fri, Jan 28, 2011 at 11:23 AM, John Baldwin <jhb at freebsd.org> wrote:
> On Friday, January 28, 2011 2:14:45 pm Matthew Fleming wrote:
>> On Fri, Jan 28, 2011 at 11:00 AM, John Baldwin <jhb at freebsd.org> wrote:
>> > On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
>> >> I spent a few days chasing down a bug and I'm wondering if a loader
>> >> change would be appropriate.
>> >> So we have these new front-panel LCDs, and like everything these days
>> >> it's a SoC. Normally it presents to FreeBSD as a USB communications
>> >> device (ucom), but when the SoC is sitting in its own boot loader, it
>> >> presents as storage (umass). If the box is rebooted in this state,
>> >> the reboot gets into /boot/loader and then reboots itself. (It took a
>> >> few days just to figure out I was getting into /boot/loader, since the
>> >> only prompt I could definitively stop at was boot2).
>> >> Anyways, I eventually debugged it to the device somehow presenting
>> >> itself to /boot/loader with a geometry of 1024/256/0, and since od_sec
>> >> is 0 that causes a divide-by-zero error in bd_io() while the loader is
>> >> trying to figure out if this is GPT or MBR formatted. We're still
>> >> trying to figure out why the loader sees this incorrect geometry.
>> >> But meanwhile, this patch fixes the issue, and I wonder if it would be
>> >> a useful safety-belt for other devices where an incorrect geometry can
>> >> be seen?
>> > That's probably fine. A sector count of zero is invalid for CHS. However,
>> > probably we should not even be using C/H/S at all if the device claims to
>> > support EDD. We already use raw LBAs if it supports EDD, and we should
>> > probably just ignore C/H/S altogether if it supports EDD.
>> This is all almost entirely outside my knowledge, but at the moment
>> bd_eddprobe() requres a geometry of 1023/255/63 before it attempts to
>> check if EDD can be used. Is that check incorrect?
> Well, it is very conservative in that it only uses EDD if it thinks it can't
> use C/H/S. It would be interesting to see if simply checking for a sector
> count of 0 there would avoid the divide-by-zero and let your device "work".
I'll need a few more pointers to try this.
As structured today, bd_io is what does the divide-by-zero, and it
always looks at od_sec to determine the maximum transfer size. It's
only later in bd_io that it checks the EDD mode to decide how to read.
* Play it safe and don't cross track boundaries.
* (XXX this is probably unnecessary)
sec = dblk % od->od_sec; /* offset into track */
x = min(od->od_sec - sec, resid);
if (maxfer > 0)
x = min(x, maxfer); /* fit bounce buffer */
However, I suppose this may be safe-ish to do as just:
x = resid;
if (maxfer > 0 && maxfer < resid)
x = maxfer; /* fit bounce buffer */
In which case the only uses of od_sec is now in bd_chs_io, and some printf's.
Though note that currently also the bd_eddprobe is only done in the
bd_open_mbr() step which is only if the GPT reads showed it isn't a
GPT device. I suppose that the EDD probe can be part of bd_getgeom,
I'll try this and see what happens on my device.
> However, it might actually be useful to always use EDD if possible, esp.
> EDD3 since that lets you not use bounce buffers down in 1MB.
>> In my specific case I know there's no bootable stuff on this disk; the
>> earlier layers bypassed it correctly without a problem.
> John Baldwin
More information about the freebsd-hackers