kern/163076: It is not possible to read in chunks from linprocfs
and procfs.
Bruce Evans
brde at optusnet.com.au
Mon Dec 5 15:07:05 UTC 2011
On Mon, 5 Dec 2011, Petr Salinger wrote:
>> Description:
> It is not possible to read in chunks from linprocfs and procfs.
> It is a regression against stable-8.
> I suspect it is due to changes of sbuf implementation between 8 and 9.
>
> Some files are rather big (over 4KB) and it is really standard to read them in blocks.
>> How-To-Repeat:
> "dd if=$FILE bs=1", with FILE any file in procfs or linprocfs
> The result is empty output.
I don't remember this ever working. The correct way to fix it is
unclear (start by not claiming that the highly irregular files in
procfs are regular), but empty output is unnecessarily bad - I
would expect to get at least 1 byte. Under FreeBSD-~5.2, I get
the following file sizes:
file      dd (1 byte)  dd (10k)  dd (1m)  wc | cut...  wc -c   stat
--------  -----------  --------  -------  -----------  ------  ------
cmdline   0            6         EIO      6            0       0
ctl       EBADF        EBADF     EBADF    EBADF        ctl     0
dbregs    hangs        hangs     hangs    hangs        0       0
etype     0            14        EIO      14           0       0
file@     575712       575712    575712   575712       575712  575712
fpregs    hangs        hangs     hangs    hangs        0       0
map       0            1150      EIO      1150         0       0
mem       EBADF        EBADF     EBADF    EBADF        0       0
note      EBADF        EBADF     EBADF    EBADF        0       0
notepg    EBADF        EBADF     EBADF    EBADF        0       0
regs      hangs        hangs     hangs    hangs        0       0
rlimit    0            65        EIO      65           0       0
status    0            94        EIO      94           0       0
The irregularity is so large that it confuses wc -c into not working,
while plain wc works. This is apparently because wc -c believes the
claim that the file is regular, so it stats the file to get its size
and finds 0, while plain wc reads the whole file using block size 64K.
(md5 is another utility that is broken on such files, but it
is broken even for files that don't claim to be regular. E.g.,
md5 on /dev/zero (or any device file that you can open) gives
the same result as md5 on /dev/null, because it just stats the
file, although this is completely wrong for device files. md5
is unbroken on pipes, so you can apply it to device files using
the apparent beginner's pessimization "cat /dev/foo | md5".
This method works for the irregular regular files in procfs
too. You would have to use dd instead of cat to control the
block size, and choose a size that is large enough to work and
small enough to avoid EIO.)
The *regs files don't block doing the read(), but just loop endlessly
trying to read an infinite amount. This is because the uio offset is
reset to 0 after each read. ISTR this being done for some other file
types. This is a different feeble attempt to fix the problem in this
PR. The basic problem is that seeking is not implemented for many
files, so there is no way to continue reading from the previous uio
offset, so the new offset must be either infinity (for most files) or
0 (for regs files).
I can now explain more of the above irregularities:
- for tiny files, seeking is easy to implement by sprintf()ing the
whole file and using an offset in the string. The string constant
should be either invariant or the previously generated string must
be saved across reads (saving the string is only reasonable if it
is tiny). This (except possibly for sufficient invariance/saving)
is done. But some bug breaks reads of size 1. Perhaps this is fixed
in -current, or was fixed and has been broken again. dd seems to
work with block sizes between 2 and 128k inclusive in cases where it
works with a block size of 10k in the above. The 128k limit would
be explained by the misimplementation of attempting to malloc() the
user-specified read size instead of the tiny size actually needed.
The user must not be allowed to malloc() large sizes and there is
an arbitrary limit of 128k.
- the regs files are small although not tiny. But they are highly
variable so they should be read atomically using read() syscalls.
Thus seeking in them is not useful. This should probably be enforced
by only allowing the uio offset to be 0 or EOF. Instead, it is only
partially enforced by resetting the offset to 0 after each read (I
think applications can mess this up by lseek()ing between reads), so
callers don't need to do an lseek() for this. This API was invented
before pread() existed. pread() should be used now. This API results
in casual observers reading the same data endlessly. I sometimes
look at these files using hd and would prefer that EOF worked normally
for them.
Bruce
More information about the freebsd-bugs
mailing list