Serious issue with serial console in 5.4

Eirik Øverby ltning at anduin.net
Thu Jul 21 11:02:33 GMT 2005


On Jul 21, 2005, at 12:16 PM, Robert Watson wrote:

>
> On Thu, 21 Jul 2005, Eirik Øverby wrote:
>
>
>>>> The above panic will show up occasionally when logging out from a
>>>> serial console (i.e. ctrl-D, logout, exit, whatever). This is
>>>> EXTREMELY BAD, as it will crash an otherwise perfectly healthy box
>>>> at random - and renders the serial console useless.
>>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>>>
>>> You might have to wait until 6.0-R since fixing it seems to  
>>> require infrastructure changes that cannot easily be backported  
>>> to 5.x.
>>>
>>
>> With all due respect - if this is (and I'm assuming it is, because  
>> it happens on all the servers I'm serial-controlling) an  
>> omnipresent problem on 5.x, I daresay it should warrant some more  
>> attention. Having unsafe serial terminal support that can bring
>> down your system like that defeats much of the point of having
>> serial terminal support in the first place.
>>
>> However, since I seem to be the only one who has noticed this,  
>> perhaps I'm the last person on earth to routinely use serial  
>> terminal switches instead of KVM switches to do my admin work?
>>
>
> The concern about the 5.x backport is that it will break parts of  
> the device driver ABI, and is a significant change that involves a  
> lot of risk.
>
> Regarding the general prevalence of the problem -- I've seen a  
> small number of people reporting it's a big problem.  Since I know  
> of a great many people running with serial consoles (other than a  
> workstation, I never run FreeBSD boxes any other way), this leads  
> me to believe it's something that shows up in fairly specific  
> conditions -- perhaps relating to precise timing of a race  
> condition.  This means that if we introduce a generally  
> destabilizing change, it may impact more people than the problem as  
> it exists (a nasty trade-off).
>
> I've only seen the issue when logging out of a serial console  
> session, and had previously hypothesized that it had to do with the  
> simultaneous timing of a console message from syslog and the  
> opening/closing of the console's tty due to logging out and getty  
> restarting, resulting in a reference count improperly hitting zero.

I did indeed make some changes to my syslog configuration after
getting the serial consoles online. Your theory might not be entirely off.
Let me know if I should post my syslog.conf file or anything else  
here or elsewhere...

Thanks,
/Eirik


> I thought Doug White had come up with a work-around patch that  
> prevented the reference count from being allowed to hit 0 for the  
> console by artificially elevating it, which would prevent the  
> panic, so either (a) the work-around wasn't committed, or (b) it  
> didn't work.
>
> I can attempt to take another look at this problem in a week or so,  
> but have a number of things I need to finish up for FreeBSD 6.0  
> before then that will be occupying my time.
>
> Robert N M Watson
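
For readers following the reference-count discussion above, here is a
minimal toy sketch in C of why artificially elevating the count would
avoid the teardown. It is not FreeBSD kernel code and not Doug White's
patch: the struct and every toy_* name are hypothetical, and it models
only the window between close and reopen, not the actual race with
syslog.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reference-counted console tty (hypothetical names). */
struct toy_tty {
    int  refs;       /* current holders: shell, getty, ... */
    bool torn_down;  /* set once the last reference is dropped */
};

static void toy_tty_init(struct toy_tty *t)
{
    t->refs = 0;
    t->torn_down = false;
}

/* Take a reference, as an open() of the device would. */
static void toy_tty_hold(struct toy_tty *t)
{
    t->refs++;
}

/* Drop a reference, as a close() would; the last drop tears down. */
static void toy_tty_release(struct toy_tty *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0)
        t->torn_down = true;
}

/* Work-around idea: one extra reference that is never released. */
static void toy_tty_pin(struct toy_tty *t)
{
    toy_tty_hold(t);
}

/*
 * Logout: the shell closes the tty, then getty reopens it.
 * Returns true if the tty was still alive in the gap between the
 * two, which is where a console message would need it.
 */
static bool toy_logout_cycle(struct toy_tty *t)
{
    bool survived;

    toy_tty_release(t);          /* shell exits */
    survived = !t->torn_down;    /* state seen by a console write here */
    toy_tty_hold(t);             /* getty reopens */
    return survived;
}
```

Without the pin, a single holder logging out drops the count to zero
and the toy device is torn down before getty reopens it; with
toy_tty_pin() called once at setup, the count never goes below one.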


