Current gptzfsboot limitations

John Baldwin jhb at freebsd.org
Tue Nov 24 18:18:25 UTC 2009


On Monday 23 November 2009 5:04:30 pm Matt Reimer wrote:
> On Mon, Nov 23, 2009 at 7:18 AM, John Baldwin <jhb at freebsd.org> wrote:
> > On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote:
> >> I've been analyzing gptzfsboot to see what its limitations are. I
> >> think it should now work fine for a healthy pool with any number of
> >> disks, with any type of vdev, whether single disk, stripe, mirror,
> >> raidz or raidz2.
> >>
> >> But there are currently several limitations (likely in loader.zfs
> >> too), mostly due to the limited amount of memory available (< 640KB)
> >> and the simple memory allocators used (a simple malloc() and
> >> zfs_alloc_temp()).
> ...
> >>
> >> I think I've also hit a stack overflow a couple of times while debugging.
> >>
> >> I don't know enough about the gptzfsboot/loader.zfs environment to
> >> know whether the heap size could be easily enlarged, or whether there
> >> is room for a real malloc() with free(). loader(8) seems to use the
> >> malloc() in libstand. Can anyone shed some light on the memory
> >> limitations and possible solutions?
> >>
> >> I won't be able to spend much more time on this, but I wanted to pass
> >> on what I've learned in case someone else has the time and boot fu to
> >> take it the next step.
> >
> > One issue is that disk transfers need to happen in the lower 1MB due to BIOS
> > limitations.  The loader uses a bounce buffer (in biosdisk.c in libi386) to
> > make this work ok.  The loader uses memory > 1MB for malloc().  You could
> > probably change zfsboot to do that as well if not already.  Just note that
> > drvread() has to bounce buffer requests in that case.  The text + data + bss
> > + stack is all in the lower 640k and there's not much you can do about that.
> > The stack grows down from 640k, and the boot program text + data starts at
> > 64k with the bss following.
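[The memory layout John describes can be summarized as a set of constants; this is an illustrative sketch, not code from the FreeBSD sources, and the names are made up for clarity:]

```c
/*
 * Hypothetical sketch of the i386 boot memory map described above.
 * The values mirror the mail; the identifier names are illustrative,
 * not taken from the FreeBSD sources.
 */
#define BOOT_TEXT_START   0x10000UL   /* boot program text + data load at 64 KB */
#define STACK_TOP         0xA0000UL   /* stack grows down from 640 KB */
#define BIOS_DMA_LIMIT    0x100000UL  /* BIOS disk transfers must stay below 1 MB */
#define LOADER_COPY_DEST  0x100000UL  /* loader(8) gets copied up to 1 MB */
#define HEAP_NEXT_SUGG    (4UL * 1024 * 1024)   /* suggested heap start: 4 MB */
#define HEAP_END_SUGG     (64UL * 1024 * 1024)  /* suggested heap end: 64 MB */
```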
> 
> Ah, the stack growing down from 640k explains a problem I was seeing
> where a memcpy() to a temp buf would restart gptzfsboot--it must have
> been overwriting the stack.
> 
> > Hmm, drvread() might already be bounce buffering
> > since boot2 has to do so since it copies the loader up to memory > 1MB as
> > well.
> 
> Looks like it's already bounce buffering. All the I/O drvread does is
> to statically allocated char arrays, and the data is copied when
> necessary, e.g. in vdev_read():
> 
>                 if (drvread(dsk, dmadat->rdbuf, lba, nb))
>                         return -1;
>                 memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE);
> 
> 
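[The bounce-buffer pattern in the vdev_read() snippet above can be sketched as follows. This is a hedged illustration: the real drvread() issues a BIOS read into a buffer below 1 MB, so a stub that fills each sector with its LBA stands in for it here, and the names bounce_read, drvread_stub, and RDBUF_SECTORS are invented for the example:]

```c
#include <stdint.h>
#include <string.h>

#define DEV_BSIZE     512
#define RDBUF_SECTORS 8     /* size of the static low-memory buffer */

/* Stands in for dmadat->rdbuf, which lives below the 1 MB BIOS limit. */
static char rdbuf[RDBUF_SECTORS * DEV_BSIZE];

/* Stub for the BIOS read: tag each sector with its LBA so copies are checkable. */
static int
drvread_stub(void *buf, uint64_t lba, int nblk)
{
	for (int i = 0; i < nblk; i++)
		memset((char *)buf + i * DEV_BSIZE, (int)(lba + i) & 0xff, DEV_BSIZE);
	return (0);
}

/*
 * Read nblk sectors starting at lba into p, which may live anywhere in
 * memory, by bouncing through rdbuf in RDBUF_SECTORS-sized chunks.
 */
static int
bounce_read(void *p, uint64_t lba, int nblk)
{
	char *dst = p;

	while (nblk > 0) {
		int nb = nblk < RDBUF_SECTORS ? nblk : RDBUF_SECTORS;

		if (drvread_stub(rdbuf, lba, nb))
			return (-1);
		memcpy(dst, rdbuf, nb * DEV_BSIZE);
		dst += nb * DEV_BSIZE;
		lba += nb;
		nblk -= nb;
	}
	return (0);
}
```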
> > You might need to use memory > 2MB for zfsboot's malloc() so that the
> > loader can be copied up to 1MB.  It looks like you could patch malloc() in
> > zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end
> > (this assumes all machines that boot ZFS have at least 64MB of RAM, which is
> > probably safe).
> 
> So are the page tables etc. already configured such that RAM above 1MB
> is ready to use in gptzfsboot? (I'm not familiar with the details of
> how virtual memory is handled on i386.)
> 
> Thanks for your help John.

Paging is not enabled in the boot loader.  Instead, the loader runs in a 32-bit
flat mode (but with an offset of 0xa000).  Simply changing the constants for
heap_next and heap_end should be sufficient.
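[The allocator being discussed is a bump-pointer malloc(); a minimal sketch of the idea follows. Since paging is off and the loader runs in flat mode, moving the heap really is just changing the two bounds. Real physical addresses cannot be dereferenced from a hosted test, so a static array stands in for the 4 MB..64 MB window here (an assumption), and boot_malloc is an invented name, not the function in zfsboot.c:]

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE (64 * 1024)   /* stand-in for the 4 MB..64 MB region */

static char heap[HEAP_SIZE];
static uintptr_t heap_next = (uintptr_t)heap;
static uintptr_t heap_end;      /* set lazily on first call */

/* Bump allocator: no free(); just advance heap_next, NULL when exhausted. */
static void *
boot_malloc(size_t n)
{
	char *p = (char *)heap_next;

	if (heap_end == 0)
		heap_end = (uintptr_t)heap + HEAP_SIZE;
	/* Round up to 16 bytes, as simple boot allocators commonly do. */
	n = (n + 0xf) & ~(size_t)0xf;
	if (heap_next + n > heap_end)
		return (NULL);
	heap_next += n;
	return (p);
}
```

With a real boot heap, the two statics would simply be initialized to the chosen physical addresses (e.g. 4*1024*1024 and 64*1024*1024); nothing else in the allocator changes.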

-- 
John Baldwin
