Current gptzfsboot limitations
John Baldwin
jhb at freebsd.org
Tue Nov 24 18:18:25 UTC 2009
On Monday 23 November 2009 5:04:30 pm Matt Reimer wrote:
> On Mon, Nov 23, 2009 at 7:18 AM, John Baldwin <jhb at freebsd.org> wrote:
> > On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote:
> >> I've been analyzing gptzfsboot to see what its limitations are. I
> >> think it should now work fine for a healthy pool with any number of
> >> disks, with any type of vdev, whether single disk, stripe, mirror,
> >> raidz or raidz2.
> >>
> >> But there are currently several limitations (likely in loader.zfs
> >> too), mostly due to the limited amount of memory available (< 640KB)
> >> and the simple memory allocators used (a simple malloc() and
> >> zfs_alloc_temp()).
> ...
> >>
> >> I think I've also hit a stack overflow a couple of times while debugging.
> >>
> >> I don't know enough about the gptzfsboot/loader.zfs environment to
> >> know whether the heap size could be easily enlarged, or whether there
> >> is room for a real malloc() with free(). loader(8) seems to use the
> >> malloc() in libstand. Can anyone shed some light on the memory
> >> limitations and possible solutions?
> >>
> >> I won't be able to spend much more time on this, but I wanted to pass
> >> on what I've learned in case someone else has the time and boot fu to
> >> take it the next step.
> >
> > One issue is that disk transfers need to happen in the lower 1MB due to BIOS
> > limitations. The loader uses a bounce buffer (in biosdisk.c in libi386) to
> > make this work ok. The loader uses memory > 1MB for malloc(). You could
> > probably change zfsboot to do that as well if not already. Just note that
> > drvread() has to bounce buffer requests in that case. The text + data + bss
> > + stack is all in the lower 640k and there's not much you can do about that.
> > The stack grows down from 640k, and the boot program text + data starts at
> > 64k with the bss following.
>
> Ah, the stack growing down from 640k explains a problem I was seeing
> where a memcpy() to a temp buf would restart gptzfsboot--it must have
> been overwriting the stack.
>
> > Hmm, drvread() might already be bounce buffering
> > since boot2 has to do so since it copies the loader up to memory > 1MB as
> > well.
>
> Looks like it's already bounce buffering. All the I/O drvread does is
> to statically allocated char arrays, and the data is copied when
> necessary, e.g. in vdev_read():
>
> if (drvread(dsk, dmadat->rdbuf, lba, nb))
> return -1;
> memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE);
>
>
> > You might need to use memory > 2MB for zfsboot's malloc() so that the
> > loader can be copied up to 1MB. It looks like you could patch malloc() in
> > zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end
> > (this assumes all machines that boot ZFS have at least 64MB of RAM, which is
> > probably safe).
>
> So are the page tables etc. already configured such that RAM above 1MB
> is ready to use in gptzfsboot? (I'm not familiar with the details of
> how virtual memory is handled on i386.)
>
> Thanks for your help John.
Paging is not enabled in the boot loader. Instead, the loader runs in a 32-bit
flat mode (but with an offset of 0xa000), so memory above 1MB can be addressed
directly without setting up page tables. Simply changing the constants for
heap_next and heap_end should be sufficient.
--
John Baldwin