FYI: various 11.0-CURRENT -r293227 (and older) hangs on arm (rpi2): a description of sorts

Mark Millard markmi at dsl-only.net
Thu Jan 7 22:04:46 UTC 2016


On 2016-Jan-7, at 1:31 PM, Hans Petter Selasky <hps at selasky.org> wrote:
> 
> On 01/07/16 22:26, Hans Petter Selasky wrote:
>> On 01/07/16 21:20, Mark Millard wrote:
>>> 
>>> On 2016-Jan-7, at 12:04 PM, Hans Petter Selasky <hps at selasky.org>
>>> wrote:
>>>> 
>>>> On 01/07/16 20:48, Ian Lepore wrote:
>>>>> If the filesystems and swap space are on a usb drive, then maybe it's
>>>>> the usb subsystem that's hanging.  The wait states you showed for those
>>>>> processes are consistent with what I've seen when all buffers get
>>>>> backed up in a queue on one non-responsive or slow device.  It may be
>>>>> that there's a way to get the system deadlocked when it's low on
>>>>> buffers and there is memory pressure causing the swap to be used (I
>>>>> generally run arm systems without any swap configured).
>>>>> 
>>>>> Running gstat in another window while this is going on may give you
>>>>> some insight into the situation.  Beyond that I don't know what to look
>>>>> at, especially since you generally can't launch any new tools once the
>>>>> system gets into this kind of state.
>>>>> 
>>>>> -- Ian
>>>> 
>>>> Hi,
>>>> 
>>>> All USB transfers towards disk devices have timeouts, so if something
>>>> is hanging at USB level, you'll get a printout eventually.
>>> 
>>> What sort of timescale after deadlock/live-lock is observed to
>>> apparently have started does one have to wait in order to conclude
>>> that the timeouts would have happened and so they do not apply to the
>>> deadlock/live-lock?
>>> 
>>>> The USB kernel processes needed for doing I/O transfers are not
>>>> pinned to RAM. Can it happen if a USB process is swapped to disk,
>>>> that the system cannot wake up a swapped-out process to get more swap?
>>>> 
>>>> --HPS
>>> 
>> 
>> Hi,
>> 
>>> Wow. Could I use ddb to somehow check on the "USB kernel processes"
>>> swap status when the overall context is deadlocked/live-locked?
>> 
>> Are you able to run something like:
>> 
>> ps auxwwH | grep usb
>> 
>>> If yes, how? Otherwise something in top or some such display that I'd
>>> left running over the serial console would have to present useful
>>> information on the subject. Is there anything that would?
>> 
> 
> Are you able to SSH into the box or ping it?
> 
> --HPS

Once the live-lock condition is reached, no new processes can be created as far as I can tell: any process that attempts the creation hangs.

I'd need "ps auxwwH" to be internally repeating to even get that much: I'd have to start it before the live-lock happened, and it would have to still be running when the hang occurs, with no ongoing process creation involved.

I'm not so sure that two communicating processes (ps and grep over a pipe) would work, but so far I cannot start even one new process.
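
A minimal sketch of the "start it before the hang" idea, assuming a plain sh loop is acceptable: it repeats the suggested ps/grep pipeline at a fixed interval and timestamps each sample. The function name and its two parameters are hypothetical. Every iteration still creates new processes (date, ps, grep, sleep), so the loop itself would be expected to stall once the live-lock sets in; at best the last sample printed shows the usb threads shortly before the hang.

```sh
# Hypothetical helper, not part of any FreeBSD tool.
# usage: sample_usb_threads COUNT INTERVAL_SECONDS
sample_usb_threads() {
    count=$1
    interval=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Timestamp header so the last sample before a stall can be dated.
        date '+=== %Y-%m-%d %H:%M:%S'
        # [u]sb keeps grep from matching its own command line.
        # "|| true" ignores grep's non-zero status when nothing matches.
        ps auxwwH | grep '[u]sb' || true
        i=$((i + 1))
        sleep "$interval"
    done
}
```

It would be started on the serial console before the portmaster run, for example as "sample_usb_threads 100000 5"; the count and interval there are arbitrary.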

ssh sessions also hang: input and output generally stop for them. (Sometimes ^t still works but shows no progress in what it reports.) No new ssh connections are possible: "Operation timed out".

ping does respond normally: it is more of a live-lock status than a true deadlock one overall.

The serial console still outputs what it was already running if that process does nothing that locks up. Changing what it is doing generally locks it up too.

Unplugging or plugging in a usb keyboard or mouse does show the expected messages via the console, which again points to a live-lock rather than a true deadlock.

I can get to ddb after the hang. But I do not know what I'd do with it to find any useful information.


As noted in another message, I used gstat instead of top on the serial console:

> gstat shows everything zero during a hang, even the L(q) column. (Length of queue?)
> 
> I used:
> 
> gstat -cod
> 
> and had it running over the serial console port during the attempted portmaster activity.
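
One way to record when the gstat counters drop to zero, sketched under assumptions: the batch options (-b, or -B to keep printing) and the -I interval syntax are taken from gstat(8) and should be checked against the installed version, and stamp_lines is a hypothetical helper that prefixes each line read from standard input with the current time.

```sh
# Hypothetical helper: prefix every input line with a wall-clock timestamp.
stamp_lines() {
    while IFS= read -r line; do
        # date is a separate process per line, so this stalls too once
        # process creation hangs; earlier lines are already on the console.
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Assumed usage on FreeBSD (not run here):
#   gstat -cod -B -I 5s | stamp_lines
```

Because the timestamping depends on creating a date process, the output would be expected to stop at the hang; the time on the last line printed then bounds when the live-lock began.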


===
Mark Millard
markmi at dsl-only.net





