remaining FreeBSD 4.11-RC3 bugs

Matthias Andree matthias.andree at gmx.de
Tue Jan 18 03:35:19 PST 2005


Ken Smith <kensmith at cse.Buffalo.EDU> writes:

> On Mon, Jan 17, 2005 at 10:28:53AM +0100, Matthias Andree wrote:
>
>> critical:
>> kern/60313 (silent data corruption on block devices)
>>            still open, may be FreeBSD 4 specific after the GEOM flurry
>>            in FreeBSD 5 and 6.
>
> I took a look at that one.  The patch provided doesn't map exactly into
> RELENG_4 but I gave this a try:

Some similar attempts are in the mailing list archives, but none of them
resulted in a complete fix; they all stopped halfway for various reasons,
usually lack of time or interest. I know too little about kernel
internals to be of any help here, besides testing.

> With that in place the kernel won't boot because the stuff dealing with
> disk labels in at least two places on my machine didn't set the offset
> properly in the buf structures it used.  Once I fixed that the machine
> booted but *some* executables wouldn't start because spec_getpages()
> was doing reads with an offset of 256 (DEV_BSIZE is 512).

Some of these executables may well have been reading the wrong data
since the block devices became unbuffered: if the seek offset is
unaligned, it is silently rounded down to the previous block boundary.
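
To illustrate the arithmetic only: a minimal sketch of what "rounded
down to the previous block boundary" means for the offset of 256 quoted
above. The helper names are hypothetical and not taken from the kernel;
only DEV_BSIZE = 512 comes from this thread, and non-negative offsets
are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSIZE 512   /* block size quoted in this thread */

/* Hypothetical helper: nonzero if the offset sits on a block boundary. */
static int is_block_aligned(int64_t offset)
{
    assert(offset >= 0);            /* sketch assumes non-negative offsets */
    return offset % DEV_BSIZE == 0;
}

/* Hypothetical helper: start of the block that contains the offset. */
static int64_t block_floor(int64_t offset)
{
    assert(offset >= 0);
    return offset - (offset % DEV_BSIZE);
}
```

With these helpers, block_floor(256) is 0, so a read meant to start at
byte 256 would start at byte 0 instead, which is how a caller can end
up with the wrong data without any error being reported.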

> I'll see if I can follow up on this and eventually get RELENG_4 fixed
> (and perhaps even an Errata) but there is no way I'd be comfortable
> with a fix for this going into 4.11 unless an expert worked on it
> and we did an RC4.

I can understand that.

>> serious:
>> bin/71453 (tcpdump ipv6 crash, trivial fix -- MFC sufficient) still open
>
> It looks like the right way to fix this is with a fresh vendor import
> if I'm understanding things correctly.  Again something that would
> require more time than we have and might be best handled as an Errata
> item after the release.

If the FreeBSD 4.11-RC tcpdump is essentially the same as the FreeBSD
tcpdump around 5.3, the import of a single fix may be sufficient.

The only problem is that the tcpdump executable does The Wrong Thing
when it encounters PPP packets that negotiate IPv6: it faults, rather
than reporting an unknown packet and moving on. The patch quoted in the
PR fixes only that bogus abort.
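
To make "reporting an unknown packet and moving on" concrete, here is a
rough sketch of a protocol-name lookup with a safe fallback. This is
not tcpdump's code and not the patch from the PR; the table and the
function name are hypothetical. The protocol numbers are the standard
PPP assignments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of a few PPP protocol numbers and their names. */
struct ppp_proto {
    uint16_t    number;
    const char *name;
};

static const struct ppp_proto known_protos[] = {
    { 0x0021, "IP"     },
    { 0x0057, "IPv6"   },
    { 0x8021, "IPCP"   },
    { 0x8057, "IPV6CP" },   /* IPv6 control protocol (negotiation) */
    { 0xc021, "LCP"    },
};

/* Return a printable name; never fail on a number we do not know. */
static const char *ppp_proto_name(uint16_t number)
{
    size_t i;

    for (i = 0; i < sizeof(known_protos) / sizeof(known_protos[0]); i++) {
        if (known_protos[i].number == number)
            return known_protos[i].name;
    }
    return "unknown";       /* fall through instead of aborting */
}
```

The point of the design is the last line: a decoder that hits a
protocol it does not handle returns a placeholder and lets the caller
continue with the next packet.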

>> bin/46866 (false data from getpwent, easy to fix) still open
>
> There has been ongoing disagreements about how best to handle this one,
> as far as I can tell the disagreements are still ongoing.

If FreeBSD prefers non-deterministic results, with all the non-repeatable
and hard-to-debug consequences such as bouncing mail, users being denied
login and so on, well.

The only reasons brought forth in support of the current behavior were
the measly "we've always done it this way" and perhaps "we don't know
what will break if we change it". Neither is a valid justification for
keeping the bug in.

Note that this change need only happen for functions with insufficient
interfaces, i.e. those that cannot report the difference between a
temporary and a permanent error, such as getpwent(), and AFAICT the
lookups happen in user-space, so SIGINT would work.
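
For contrast, a sketch of what a sufficient interface looks like, using
POSIX getpwnam_r(), which is not mentioned above and is not being
proposed as the fix for the PR. Per POSIX it returns 0 with a NULL
result for "no such user" and a non-zero error number for a failed
lookup, so the caller can tell the two apart. The enum and helper names
are hypothetical, and the fixed 4096-byte buffer is an assumption made
to keep the example short.

```c
#include <errno.h>      /* error numbers such as EIO */
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

enum lookup_status { LOOKUP_FOUND, LOOKUP_NOT_FOUND, LOOKUP_FAILED };

/* Map the (return code, result pointer) pair onto three outcomes. */
static enum lookup_status
classify_lookup(int rc, const struct passwd *result)
{
    if (rc != 0)
        return LOOKUP_FAILED;       /* lookup itself went wrong */
    return result != NULL ? LOOKUP_FOUND : LOOKUP_NOT_FOUND;
}

/* Look up a user by name; store the uid only if the user was found. */
static enum lookup_status
lookup_user(const char *name, uid_t *uid_out)
{
    struct passwd pw, *result = NULL;
    char buf[4096];                 /* assumed large enough for one entry */
    int rc = getpwnam_r(name, &pw, buf, sizeof(buf), &result);
    enum lookup_status st = classify_lookup(rc, result);

    if (st == LOOKUP_FOUND && uid_out != NULL)
        *uid_out = result->pw_uid;
    return st;
}
```

A caller such as a mail server can then defer on LOOKUP_FAILED and
reject only on LOOKUP_NOT_FOUND. A bare NULL from getpwent() gives the
caller much less to go on.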

-- 
Matthias Andree

