kern/87255: Large malloc-backed mfs crashes the system
Yar Tikhiy
yar at comp.chem.msu.su
Thu Jul 20 14:52:39 UTC 2006
On Wed, Jul 05, 2006 at 12:16:11PM +0100, Robert Watson wrote:
> On Wed, 26 Oct 2005, Yar Tikhiy wrote:
>
> >> In all cases it is a "don't do that then" class of problem.
> >
> >Yes, of course. The question is whether we consider it normal for root to
> >have ability to panic the system using standard tools. "cat /dev/zero >
> >/dev/mem" still is the ultimate way to. IMHO it is a key issue whether we
> >fall back at the academical/research stage where rough corners are OK and
> >the system is just a toy for eggheads, or we pretend our system is stable
> >and robust. I doubt if an admin can crash the Windows NT kernel from the
> >userland using conventional interfaces. I by no means expect this issue
> >to be resolved soon, but it's worth being reflected on at tea-time :-)
> >
> >Apropos, here's another reproducible crash induced by md:
> >
> > # mdconfig -a -t malloc -s 300m
> > md0
> > # dd if=/dev/urandom of=/dev/md0 bs=1
> > dd: /dev/md0: Input/output error
> > 79+0 records in
> > 78+9 records out
> > # reboot
> > panic: kmem_malloc(4096): kmem_map too small: 86224896 total allocated
> >
> >Apparently, it is not a fault of md, just our kernel memory allocator
> >allows other kernel parts to starve it to death.
>
> I'm not sure I entirely go along with this interpretation. The answer to
> the question "What to do when the kernel runs out of address space?" is not
> easily found. The "problem" is that md performs potentially unbounded
> allocation of a quite bounded resource -- remember that resource deadlocks
> are very real, sometimes it takes memory to release memory (abstractly,
> think of memory allocation as locking). UMA supports allocator-enforced
> resource limits, which can be requested by the consumer using
> uma_zone_set_max(). md(4) should probably be using that interface and
> requesting a resource limit.
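
For illustration, a per-consumer cap of the kind described above can be
sketched in userland C. This is not UMA code: toy_zone, toy_zone_set_max(),
toy_zone_alloc() and toy_zone_free() are hypothetical names, and the sketch
makes the simplifying assumption that an allocation at the limit returns
NULL instead of sleeping.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for a zone with an item limit. */
struct toy_zone {
	size_t item_size;	/* size of each item in bytes */
	size_t max_items;	/* 0 means "no limit set" */
	size_t cur_items;	/* items currently allocated */
};

/* Record the cap; it is enforced by toy_zone_alloc(). */
static void
toy_zone_set_max(struct toy_zone *z, size_t nitems)
{
	z->max_items = nitems;
}

/* Return a new item, or NULL if the cap is reached or malloc() fails. */
static void *
toy_zone_alloc(struct toy_zone *z)
{
	void *item;

	if (z->max_items != 0 && z->cur_items >= z->max_items)
		return (NULL);
	item = malloc(z->item_size);
	if (item != NULL)
		z->cur_items++;
	return (item);
}

/* Release an item and make room under the cap again. */
static void
toy_zone_free(struct toy_zone *z, void *item)
{
	if (item == NULL)
		return;
	free(item);
	z->cur_items--;
}
```

With a cap of N items, the consumer sees the failure on allocation N+1
and can react to it, while the rest of the system keeps whatever memory
lies beyond the cap.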
The panic doesn't seem to be on a critical path in the kernel; it's
in kmem_malloc(), which is essentially a utility routine. Could
the allocation attempt just fail for the caller to decide what to
do then? In fact, it can fail, but only in case of M_NOWAIT:
	if (vm_map_findspace(map, vm_map_min(map), size, &addr)) {
		vm_map_unlock(map);
		if ((flags & M_NOWAIT) == 0)
			panic("kmem_malloc(%ld): kmem_map too small: %ld total allocated",
				(long)size, (long)map->size);
		return (0);
	}
It looks like we have to panic there merely because malloc(9) promises
to succeed when waiting is OK, even though there is no chance of
success. Isn't that a design issue?
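
To make the alternative concrete, here is a minimal userland sketch in
which exhaustion is reported to the caller whatever the flags are. All
names in it (toy_map, toy_kmem_reserve(), toy_md_write_sector(), the
TOY_* flags) are hypothetical, and the flag is deliberately ignored; the
sketch says nothing about how a real allocator would sleep and retry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define TOY_NOWAIT	0x1
#define TOY_WAITOK	0x2

struct toy_map {
	size_t limit;	/* total bytes the map may hand out */
	size_t used;	/* bytes handed out so far */
};

/*
 * Reserve `size` bytes from the map.  Returns 0 on success and ENOMEM
 * when the map is exhausted, for any value of `flags`: exhaustion is
 * treated as an ordinary error rather than a reason to panic.
 */
static int
toy_kmem_reserve(struct toy_map *map, size_t size, int flags)
{
	(void)flags;	/* unused in this sketch */
	assert(map->used <= map->limit);
	if (size > map->limit - map->used)
		return (ENOMEM);
	map->used += size;
	return (0);
}

/* A caller in the position of md(4): turn the failure into an I/O error. */
static int
toy_md_write_sector(struct toy_map *map, size_t secsize)
{
	if (toy_kmem_reserve(map, secsize, TOY_WAITOK) != 0)
		return (EIO);
	return (0);
}
```

The caller then decides what the failure means; here the md-like writer
maps it to EIO, which matches the dd error in the transcript above
instead of a panic at reboot.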
> There is also a problem then regarding what happens when md(4) runs out of
> resources to allocate when it has already "promised" that it's a disk of a
> certain size up the stack. I.e., if the result isn't a panic, then how
> will md(4) handle failure? Most file systems will not be happy when they
> get EIO, so then perhaps the problem is that md(4) provides an abstraction
> for a non-sparse device up the storage stack, but is in fact
> over-committing. This suggests that the size of an md device should
> be strictly bounded if it is malloc-backed. Picking that maximum bound is
> also tricky. This is why, in practice, we recommend using swap-backed md
> devices, so that the pages associated with the md device can be swapped out
> under memory pressure, and that the swap system have enough memory to fully
> back the md device.
Perhaps md(4) shouldn't over-commit in malloc mode? That would tie up
precious physical memory, but that is what malloc mode is expected to
do. And one can't use swap-backed md when diskless.
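
A rough userland sketch of "no over-commit": take all the backing store
when the device is configured, so that a later write has nothing left to
allocate. The names (toy_md, toy_md_attach(), toy_md_write(),
toy_md_detach()) and the explicit `budget` argument are hypothetical and
only illustrate the idea; this is not how md(4) is written.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical malloc-backed disk with fully reserved storage. */
struct toy_md {
	size_t secsize;		/* bytes per sector */
	size_t nsectors;	/* number of sectors promised to users */
	unsigned char *store;	/* backing store, allocated at attach time */
};

/*
 * Configure the device.  All memory is taken here, so an oversized
 * request fails with ENOMEM now rather than on some later write.
 */
static int
toy_md_attach(struct toy_md *md, size_t nsectors, size_t secsize,
    size_t budget)
{
	if (nsectors == 0 || secsize == 0)
		return (EINVAL);
	/* Division avoids overflow in nsectors * secsize. */
	if (nsectors > budget / secsize)
		return (ENOMEM);
	md->store = calloc(nsectors, secsize);
	if (md->store == NULL)
		return (ENOMEM);
	md->secsize = secsize;
	md->nsectors = nsectors;
	return (0);
}

/* Write one sector; no allocation happens on this path. */
static int
toy_md_write(struct toy_md *md, size_t sector, const void *buf)
{
	if (sector >= md->nsectors)
		return (EINVAL);
	memcpy(md->store + sector * md->secsize, buf, md->secsize);
	return (0);
}

/* Release the backing store. */
static void
toy_md_detach(struct toy_md *md)
{
	free(md->store);
	md->store = NULL;
	md->nsectors = 0;
}
```

The price is the one mentioned above: the memory is gone as soon as the
device exists, whether or not anything is ever written to it.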
--
Yar