ATA large disks & EDD at boot
Valentin Nechayev
netch at ivb.nn.kiev.ua
Sun Mar 7 06:07:44 PST 2004
Hi,
it seems that the way the boot blocks (boot1 & mbr) currently choose
between the traditional (CHS) and the new (packet, aka EDD, aka Int13x)
disk read interface is obsolete and leads to unbootable systems.
The main thing that kills the old access method is the strange BIOS
translation used for disks larger than 32G. For my two home disks:
Model: IC35L040AVER07-0
LBA size: 80418240 blocks
BIOS geometry in "LBA" mode: 5005*255*63
BIOS geometry in "NORMAL" mode: 19710*16*255
BIOS geometry in "LARGE" mode: 1314*240*255
Model: SAMSUNG SP1203N
LBA size: 234493056 blocks
BIOS geometry in "LBA" mode: 14596*255*63
BIOS geometry in "NORMAL" mode: 57473*16*255
BIOS geometry in "LARGE" mode: 3831*240*255
At first it may seem specific to one BIOS version, but I've checked the
reported translations on a number of other motherboards and other
BIOSes and saw identical translation.
The culprit here is the 255 sectors per track. That value can't fit in the
6 bits allowed by the B-1302 interface (i.e. int 0x13 with AH=0x02, which
is used in all boot blocks when EDD is considered unnecessary) and isn't
correctly reported by the B-1308 interface (i.e. int 0x13 with AH=0x08),
which gets drive parameters.
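To make the 6-bit limit concrete, here is a minimal sketch in C of how a CHS
address is packed into the registers for the int 0x13, AH=0x02 call. The
struct and function names (chs_regs, pack_chs) are made up for illustration
and are not part of any boot block; only the register layout itself (CH, CL,
DH) comes from the BIOS interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register image for the classic CHS read (int 0x13, AH=0x02). */
struct chs_regs {
    uint8_t ch;  /* low 8 bits of the cylinder number */
    uint8_t cl;  /* bits 0-5: sector (1..63), bits 6-7: cylinder bits 8-9 */
    uint8_t dh;  /* head number */
};

/*
 * Hypothetical helper: pack a CHS address into the register image.
 * Returns 0 on success, -1 if the address cannot be encoded at all.
 */
static int pack_chs(unsigned cyl, unsigned head, unsigned sect,
                    struct chs_regs *out)
{
    assert(out != NULL);
    /* 10 bits of cylinder, 8 bits of head, 6 bits of sector. */
    if (cyl > 1023 || head > 255 || sect < 1 || sect > 63)
        return -1;
    out->ch = (uint8_t)(cyl & 0xff);
    out->cl = (uint8_t)((sect & 0x3f) | ((cyl >> 8) << 6));
    out->dh = (uint8_t)head;
    return 0;
}
```

With this layout pack_chs(2, 0, 64, &r) fails: sectors 64..255 of a
255-sector track simply have no encoding in the old interface.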
I checked the behaviour of this interface under MS-DOS with a program that
prints the reported drive parameters and then compares the checksum of a
block read through the old (CHS) interface with the checksums of blocks
read through LBA, assuming either 63 or 255 sectors per track. BIOS
translation was set to "NORMAL" (and the BIOS said that the geometry is
19710*16*255).
The testing code in question:
[...]
    for( head = 0; head <= 15 && head < nhead ; ++head ) {
        for( sect = 1; sect <= 63 && sect <= nsect; ++sect ) {
            unsigned sum1, sum2, sum3;
            sum1 = sum_chs( cyl, head, sect );
            sum2 = sum_lba( (unsigned long) cyl*nhead*63 +
                            (unsigned long) head*63 + (sect-1) );
            sum3 = sum_lba( (unsigned long) cyl*nhead*255 +
                            (unsigned long) head*255 + (sect-1) );
            printf( "%u:%u:%u -> 0x%04X 0x%04X 0x%04X\n",
                    cyl, head, sect, sum1, sum2, sum3 );
[...]
(sum_xxx() functions read the specified block and return its checksum.)
It reports:
B-1308: geometry 1024*16*63
2:0:1 -> 0x28D0 0x0646 0x28D0
2:0:2 -> 0xCE22 0x0128 0xCE22
2:0:3 -> 0xEBBD 0x3DD5 0xEBBD
2:0:4 -> 0x53A6 0xADBC 0x53A6
2:0:5 -> 0x04D3 0xD57B 0x04D3
2:0:6 -> 0xB0E6 0xD842 0xB0E6
2:0:7 -> 0xCE3B 0xA763 0xCE3B
2:0:8 -> 0xB988 0xE4F2 0xB988
2:0:9 -> 0x7370 0x3B86 0x7370
[... and so on...]
So sum1 == sum3 whatever head & sect are tested, and the same holds over
the whole tested block range; it means that the BIOS really assumes 255
sectors per logical track. But as seen above, B-1308 reports only 63
sectors :( The result is a translation error and incorrect data for any
block with absolute number >= 63.
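The mismatch can be worked through with a small C sketch: the loader splits
a block number using the reported 63 sectors per track, while the BIOS maps
the resulting CHS address back using 255. The function names here
(lba_to_chs, chs_to_lba, block_actually_read) are illustrative only, and the
model assumes the BIOS behaves exactly as the checksums above suggest.

```c
#include <assert.h>

struct chs {
    unsigned long cyl;
    unsigned head;
    unsigned sect;   /* 1-based, as in the BIOS interface */
};

/* Split an absolute block number under an assumed geometry. */
static struct chs lba_to_chs(unsigned long lba, unsigned nhead, unsigned nsect)
{
    struct chs r;

    assert(nhead > 0 && nsect > 0);
    r.sect = (unsigned)(lba % nsect) + 1;
    r.head = (unsigned)((lba / nsect) % nhead);
    r.cyl = lba / nsect / nhead;
    return r;
}

/* Map a CHS address back to an absolute block number. */
static unsigned long chs_to_lba(struct chs a, unsigned nhead, unsigned nsect)
{
    assert(a.sect >= 1);
    return (a.cyl * nhead + a.head) * nsect + (a.sect - 1);
}

/*
 * Block the BIOS really reads when the loader believes reported_nsect
 * sectors per track but the BIOS translates with real_nsect.
 */
static unsigned long block_actually_read(unsigned long lba, unsigned nhead,
                                         unsigned reported_nsect,
                                         unsigned real_nsect)
{
    struct chs a = lba_to_chs(lba, nhead, reported_nsect);

    return chs_to_lba(a, nhead, real_nsect);
}
```

With 16 heads, block_actually_read(62, 16, 63, 255) is still 62, but block 63
becomes 0:1:1 and is read from block 255, and block 2016 (2:0:1) is read from
block 8160.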
/boot/boot1 uses B-1308 BIOS call to determine whether EDD is needed
(boot1.s, read() function):
read: push %dx // Save
movb $0x8,%ah // BIOS: Get drive
int $0x13 // parameters
[... divide block number to get head and sector count... ]
pop %dx // Restore
cmpl $0x3ff,%eax // Cylinder number supportable?
sti // Enable interrupts
ja read.7 // No, try EDD
So it can't boot a system from a disk >32G with "NORMAL" or "LARGE" BIOS
translation. I confirmed this in practice: setting up a system in this mode
results in total boot failure :(
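As a rough paraphrase in C of the decision quoted above (not the real boot1
code, whose division steps are elided here), the choice depends only on the
cylinder number computed from the reported geometry. The names method and
boot1_choice are invented for this sketch.

```c
#include <assert.h>

enum method { USE_CHS, USE_EDD };

/*
 * Hypothetical model of the existing boot1 logic: EDD is tried only when
 * the cylinder computed from the B-1308 geometry does not fit in 10 bits.
 */
static enum method boot1_choice(unsigned long lba, unsigned nhead,
                                unsigned nsect)
{
    unsigned long cyl;

    assert(nhead > 0 && nsect > 0);
    cyl = lba / nsect / nhead;
    return cyl > 0x3ff ? USE_EDD : USE_CHS;
}
```

With the reported 16 heads and 63 sectors, every block below
1024*16*63 = 1032192 (the first 504 MiB of the disk) goes through CHS, which
is exactly the range where the translation is wrong.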
In contrast to "NORMAL" and "LARGE" modes, "LBA" seems to be free from
such problems: it's standardized with 63 sectors. See, e.g.,
http://www.firmware.com/support/bios/over4gb.htm. So, with LBA translation
in the BIOS, /boot/boot1 can boot the system (unless 128G is crossed); with
any other translation, that's possible only for disks smaller than 32G.
But /boot/mbr can fail even for disks smaller than 32G, because it decides
whether to use EDD calls based on rather strange criteria:
movb 0x1(%si),%dh # Load head
movw 0x2(%si),%cx # Load cylinder:sector
movw $LOAD,%bx # Transfer buffer
cmpb $0xff,%dh # Might we need to use LBA?
jnz main.7 # No.
cmpw $0xffff,%cx # Do we need to use LBA?
jnz main.7 # No.
So EDD calls are used only when the begin address of the slice is
1023:255:63 (head byte 0xff and cylinder:sector word 0xffff). I don't know
of any partition (slice) editor that sets such values. E.g., Linux fdisk
and the MS Windows fdisks I've seen write 1023:254:63 when the partition
begin or end doesn't fit within 1024 cylinders in the current geometry; the
boot1 default PT (used in dedicated mode) also now uses head 254, not 255.
This MBR will fail to read such blocks.
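The following C sketch mirrors the cmpb/cmpw pair quoted above against the
3-byte CHS field of a partition table entry. The names pt_chs, encode_pt_chs
and mbr_uses_edd are hypothetical; the byte layout (head, then sector with
the two high cylinder bits, then the low cylinder byte) is the standard MBR
partition entry format.

```c
#include <assert.h>
#include <stdint.h>

/* The three CHS bytes at offsets 1..3 of an MBR partition entry. */
struct pt_chs {
    uint8_t head;        /* offset 1, loaded into %dh */
    uint8_t sect_cylhi;  /* offset 2: sector in bits 0-5, cylinder bits 8-9 */
    uint8_t cyl_lo;      /* offset 3: cylinder bits 0-7 */
};

/* Encode a CHS triple the way partition editors store it. */
static struct pt_chs encode_pt_chs(unsigned cyl, unsigned head, unsigned sect)
{
    struct pt_chs e;

    assert(cyl <= 1023 && head <= 255 && sect >= 1 && sect <= 63);
    e.head = (uint8_t)head;
    e.sect_cylhi = (uint8_t)((sect & 0x3f) | ((cyl >> 8) << 6));
    e.cyl_lo = (uint8_t)(cyl & 0xff);
    return e;
}

/* Mirrors "cmpb $0xff,%dh" followed by "cmpw $0xffff,%cx". */
static int mbr_uses_edd(struct pt_chs e)
{
    /* movw 0x2(%si),%cx is little-endian: %cl = offset 2, %ch = offset 3. */
    uint16_t cx = (uint16_t)(e.sect_cylhi | (e.cyl_lo << 8));

    return e.head == 0xff && cx == 0xffff;
}
```

Under this model mbr_uses_edd(encode_pt_chs(1023, 254, 63)) is 0, so the
marker written by the usual fdisk programs never enables EDD; only head 255
does.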
/boot/boot0 is free of such problems because it has packet mode (i.e.
EDD calls) enabled by default.
For comparison with other boot loaders, Linux LILO & GRUB now use EDD by
default and don't try to fall back to CHS for the beginning of the disk.
My proposal to solve these problems is to replicate the /boot/boot0
approach: use an externally configured flag that says whether to use packet
mode (EDD calls), and set the default value of this flag to true; EDD is
used when it is available and returns correct data, otherwise the code
switches to CHS access.
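The proposed order of attempts can be sketched in C as follows. This is only
a model of the control flow, not the boot block code: disk_ops, read_block
and the stub functions are invented for illustration, and the FL_PACKET
value used here is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define FL_PACKET 0x80u  /* flag bit; the value is an assumption */

enum path { PATH_NONE, PATH_EDD, PATH_CHS };

/* Hypothetical wrappers around the three BIOS calls involved. */
struct disk_ops {
    int (*edd_present)(void);                       /* AH=0x41 check, nonzero if OK */
    int (*edd_read)(unsigned long lba, void *buf);  /* AH=0x42, 0 on success */
    int (*chs_read)(unsigned long lba, void *buf);  /* AH=0x02, 0 on success */
};

/* EDD first when the flag allows it; CHS as the fallback. */
static enum path read_block(unsigned flags, const struct disk_ops *ops,
                            unsigned long lba, void *buf)
{
    assert(ops != NULL);
    if ((flags & FL_PACKET) && ops->edd_present() &&
        ops->edd_read(lba, buf) == 0)
        return PATH_EDD;
    if (ops->chs_read(lba, buf) == 0)
        return PATH_CHS;
    return PATH_NONE;  /* both methods failed */
}

/* Stubs standing in for real BIOS behaviour. */
static int yes(void) { return 1; }
static int no(void) { return 0; }
static int read_ok(unsigned long lba, void *buf)
{
    (void)lba; (void)buf;
    return 0;
}
static int read_fail(unsigned long lba, void *buf)
{
    (void)lba; (void)buf;
    return -1;
}
```

For example, with struct disk_ops ops = { yes, read_ok, read_ok },
read_block(FL_PACKET, &ops, 0, NULL) takes the EDD path, while clearing the
flag or replacing yes with no makes the same call fall through to CHS.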
The patches below were tested on my home system.
Interestingly, the comment on read() in boot1.s says that it uses the same
logic, but in the actual code EDD usage is reduced to a corner case.
After the following patch, code and comment will agree.
I've checked the full CVS history for these boot blocks, but none of the
commits showed a real reason for any change in the read function selection
logic. Also, I couldn't find any essential information in the mailing list
archives. So this post is based only on the latest code state and my own
experiments.
==={{{
--- boot1.s.orig Sun Mar 7 11:18:42 2004
+++ boot1.s Sun Mar 7 11:24:04 2004
@@ -265,7 +265,9 @@
// %dl - byte - drive number
// stack - 10 bytes - EDD Packet
//
-read: push %dx // Save
+read: testb $FL_PACKET,%cs:MEM_REL+flags-start // LBA support enabled?
+ jnz read.7 // Yes, go to LBA code
+read.1: push %dx // Save
movb $0x8,%ah // BIOS: Get drive
int $0x13 // parameters
movb %dh,%ch // Max head number
@@ -288,7 +290,7 @@
pop %dx // Restore
cmpl $0x3ff,%eax // Cylinder number supportable?
sti // Enable interrupts
- ja read.7 // No, try EDD
+ ja ereturn // No, stop attempts
xchgb %al,%ah // Set up cylinder
rorb $0x2,%al // number
orb %ch,%al // Merge
@@ -326,21 +328,20 @@
sub %al,0x2(%bp) // block count
ja read // If not done
read.6: retw // To caller
-read.7: testb $FL_PACKET,%cs:MEM_REL+flags-start // LBA support enabled?
- jz ereturn // No, so return an error
- mov $0x55aa,%bx // Magic
+read.7: mov $0x55aa,%bx // Magic
push %dx // Save
movb $0x41,%ah // BIOS: Check
int $0x13 // extensions present
pop %dx // Restore
- jc return // If error, return an error
+ jc read.1 // if no, go to CHS read
cmp $0xaa55,%bx // Magic?
- jne ereturn // No, so return an error
+ jne read.1 // if no, go to CHS read
testb $0x1,%cl // Packet interface?
- jz ereturn // No, so return an error
+ jz read.1 // if no, go to CHS read
mov %bp,%si // Disk packet
movb $0x42,%ah // BIOS: Extended
int $0x13 // read
+ jc read.1 // last resort attempt
retw // To caller
// Messages
===}}}
The MBR patch is similar, but requires adding a flag byte and a
corresponding change in the Makefile:
==={{{
--- mbr.s.orig Sun Mar 7 11:31:12 2004
+++ mbr.s Sun Mar 7 11:43:33 2004
@@ -88,10 +88,8 @@
movb 0x1(%si),%dh # Load head
movw 0x2(%si),%cx # Load cylinder:sector
movw $LOAD,%bx # Transfer buffer
- cmpb $0xff,%dh # Might we need to use LBA?
- jnz main.7 # No.
- cmpw $0xffff,%cx # Do we need to use LBA?
- jnz main.7 # No.
+ testb $0x1,%cs:flags+EXEC-start # EDD is allowed?
+ jz main.6 # No.
pushw %cx # Save %cx
pushw %bx # Save %bx
movw $0x55aa,%bx # Magic
@@ -150,6 +148,7 @@
msg_pt: .asciz "Invalid partition table"
msg_rd: .asciz "Error loading operating system"
msg_os: .asciz "Missing operating system"
+flags: .byte MBRFLAGS
.org PT_OFF
--- Makefile.orig Sun Mar 7 11:37:25 2004
+++ Makefile Sun Mar 7 11:40:09 2004
@@ -6,6 +6,10 @@
BINDIR?= /boot
BINMODE= 444
+MBRFLAGS?= 0x01
+
+AFLAGS += --defsym MBRFLAGS=${MBRFLAGS}
+
ORG= 0x600
mbr: mbr.o
===}}}
All patches were based on 5.2-release code.
-netch-