stopping amd causes a freeze

Sun Jul 28 08:25:44 UTC 2013

On 28/07/2013 08:24, Konstantin Belousov wrote:
> On Sat, Jul 27, 2013 at 10:33:18AM +0200, Dominic Fandrey wrote:
>> On 26/07/2013 19:10, Dominic Fandrey wrote:
>>> On 25/07/2013 12:00, Konstantin Belousov wrote:
>>>> On Thu, Jul 25, 2013 at 09:56:59AM +0200, Dominic Fandrey wrote:
>>>>> On 22/07/2013 12:07, Konstantin Belousov wrote:
>>>>>> On Mon, Jul 22, 2013 at 11:50:24AM +0200, Dominic Fandrey wrote:
>>>>>>> ...
>>>>>>>
>>>>>>> I run amd through sysutils/automounter, which is a scripting solution
>>>>>>> that generates an amd.map file based on encountered devices and devd
>>>>>>> events. The SIGHUP it sends to amd to tell it the map file was updated
>>>>>>> does not cause problems, only a -SIGKILL- SIGTERM may cause the freeze.
>>>>>>>
>>>>>>> Nothing was mounted (by amd) during the last freeze.
>>>>>>>
>>>>>>> ...
>>>>>>
>>>>>> Are you sure that the machine did not panic? Do you have a serial console?
>>>>>>
>>>>>> The amd(8) locks itself into memory, most likely due to the fear of
>>>>>> deadlock. There are some known issues with user wirings in stable/9.
>>>>>> If the problem you see is indeed due to wiring, you might try to apply
>>>>>> r253187-r253191.
>>>>>
>>>>> I tried that. Applying the diff was straightforward enough. But the
>>>>> resulting kernel panicked as soon as it tried to mount the root fs.
>>>> You did provided a useful info to diagnose the issue.
>>>>
>>>> Patch should keep KBI compatible, but, just in case, if you have any
>>>> third-party module, rebuild it.
>>>>
>>>>>
>>>>> So I'll wait for the MFC from someone who knows what he/she is doing.
>>>>
>>>> The patch below booted for me, and I ran some sanity check tests for
>>>> mlockall(2), which also did not result in misbehaviour.
>>>>
>>>
>>> Your patch applied cleanly and the system booted with the resulting
>>> kernel.
>>>
>>> Amd exhibits several very strange behaviours. ...
>>
>> I can verify the whole thing with a clean world and kernel.
>>
>> This time I'll concentrate on the first instance of amd:
>>
>> # tail -n3 /var/log/messages
>> Jul 27 10:08:56 mobileKamikaze kernel: newnfs server pid5868 at mobileKamikaze:/var/run/automounter.amd.mnt: not responding
>> Jul 27 10:09:41 mobileKamikaze kernel: newnfs server pid5868 at mobileKamikaze:/var/run/automounter.amd.mnt: not responding
>> Jul 27 10:11:41 mobileKamikaze last message repeated 3 times
>>
>> The process, it turns out, simply doesn't exist. There is another
>> process, though:
>> # ps auxww | grep -F sbin/amd
>> root       5869   0.0  0.1  12036   8020 ??  S    10:08am   0:00.01 /usr/sbin/amd -r -p -a /var/run/automounter.amd -c 4 -w 2 /var/run/automounter.amd.mnt /var/run/automounter.amd.map
>>
>> # cat /var/run/automounter.amd.pid
>> 5868
>>
>> Here is what I think happens: amd forks a subprocess and the main
>> process silently dies after it wrote its pidfile.
> Nothing dies silently.  Either the process was killed by a signal, or it
> exited with an explicit call to exit(2).  In the first case, the default
> kernel setting of kern.logsigexit should make a record in the syslog.
> machdep.uprintf_signal might also be useful, but not for daemons.

Well, after I reverted your patch I got some things in the syslog.
Sometimes amd works as expected, sometimes it dies right after starting:
Jul 28 10:19:42 mobileKamikaze kernel: pid 24217 (amd), uid 0: exited on signal 11 (core dumped)
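
For anyone who wants to repeat the pidfile check from my previous mail
without comparing ps output by hand, here is a minimal sketch. The function
name pid_alive is made up for illustration and is not part of amd or
sysutils/automounter. It only tests whether the PID recorded in a pidfile
still belongs to a live process.

```shell
# pid_alive FILE (hypothetical helper, not part of amd or automounter)
# Returns 0 if the PID recorded in FILE belongs to a live process,
# 1 if no such process can be signalled, 2 if FILE is unreadable or
# does not contain a plain number.
pid_alive() {
    pid=$(cat "$1" 2>/dev/null) || return 2
    case $pid in
        ''|*[!0-9]*) return 2 ;;
    esac
    # Signal 0 only performs the existence and permission check.
    # Run as root, otherwise a process owned by another user also fails here.
    kill -0 "$pid" 2>/dev/null
}
```

With the pidfile from above that would be
pid_alive /var/run/automounter.amd.pid; echo $?
and "sysctl kern.logsigexit" shows whether exits on a signal get logged
at all.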

This is all just confusing.

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?