kern/163076: It is not possible to read in chunks from
linprocfs and procfs.
Dag-Erling Smørgrav
des at des.no
Sun Dec 11 17:30:06 UTC 2011
The following reply was made to PR kern/163076; it has been noted by GNATS.
From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <des at des.no>
To: "Poul-Henning Kamp" <phk at phk.freebsd.dk>
Cc: Jaakko Heinonen <jh at FreeBSD.org>, Petr Salinger <Petr.Salinger at seznam.cz>, bug-followup at FreeBSD.org, mdf at FreeBSD.org
Subject: Re: kern/163076: It is not possible to read in chunks from linprocfs and procfs.
Date: Sun, 11 Dec 2011 18:28:59 +0100
"Poul-Henning Kamp" <phk at phk.freebsd.dk> writes:
> Jaakko Heinonen <jh at FreeBSD.org> writes:
> > One problem is the different malloc() semantics. The kernel version uses
> > M_WAITOK allocations while user space malloc(3) can fail.
> Yes, that's Dag-Erlings and my point: The semantics are too different.
There is another, more important issue.
Synthetic (pseudofs-based) filesystems use sbufs extensively. If a
program seeks to a position far into a file in such a filesystem and
then reads some data from it, we have to allocate an sbuf large enough
to contain the entire file *up to and including* the data the program
actually asked for, unless the underlying filler function can somehow
skip the part the program didn't ask for (or a large part of it), which
is not always possible. This basically allows the userland to trick
pseudofs into allocating and dirtying arbitrary amounts of kernel
memory, which is obviously not a good thing.
The way sbuf_*printf() was originally implemented, it would have been
trivial to extend the API to have it allocate no more memory than
required to hold the amount of data the program requested, and discard
the first N bytes written to it. This is not possible with the current
code: the userland *printf() implementation is too different from the
kernel *printf() implementation, so they had to tear sbuf_*printf()
apart.
So what did we do instead? We added checks in pseudofs to prevent the
scenario I described. This effectively sets an arbitrary upper limit on
the size of sbuf-backed files in pseudofs-based filesystems. I was very
unhappy about it, but the only way around it would have been to hide
this functionality behind ifdefs so it was only available in the kernel,
and turn the sbuf implementation into an even bigger mess than it
already is - a mess which is entirely ascribed to the boneheaded
decision to use the same code in userland as in the kernel.
Before you point out that this *can* in fact be implemented using
sbuf_drain(): sbuf_drain() has no place in the sbuf API, and should be
removed. It breaks the sbuf semantics by a) providing access to the
contents of an sbuf at a time when the sbuf is, by definition, in an
inconsistent state, and b) emptying the sbuf at irregular intervals and
when sbuf_finish() is called, preventing reuse of its contents. It's
bad enough that I had to extend sbufs to support binary data; let's
please not compound that mistake by deviating even further from their
original purpose.
DES
--=20
Dag-Erling Sm=C3=B8rgrav - des at des.no
More information about the freebsd-bugs
mailing list