How to diagnose system freezes?

Dieter BSD dieterbsd at engineer.com
Mon Aug 27 20:04:40 UTC 2012


Yuri writes:

> Anything else I can try?
>
> One thing of importance here is that there is an older graphics card,
> a 9400 GT, in this system, and the current nvidia-driver-295.71 has an
> issue with the 9400 GT: it makes graphics malfunction (unpainted windows,
> long delays switching to terminal mode) or freezes Xorg (but not the OS).
> So I run the older nvidia-driver-285.05.09, which appears to work.
> That's why I think the nvidia driver is probably to blame for these
> periodic OS freezes. Also, the latest driver version must obviously be
> working for most people because (I think) they mostly have nvidia cards
> newer than mine.

Have you found a way to trigger the bug on demand?

Since you suspect nvidia-driver-285.05.09, try a different driver,
do whatever triggers the bug, and see whether the freeze still occurs.

> So maybe I should also just get the newer nvidia card
> and shut up, not sure.

If you can demonstrate that the various nvidia drivers are broken
in various ways, submit a problem report to whoever wrote the drivers
(Nvidia presumably). If Nvidia supports their products, then they
will fix their drivers. If they don't support their products,
why would you want to reward them by buying another nvidia card?

But the question I have is: why are device drivers allowed to
freeze the entire machine?

I have at *least* 4 drivers that freeze the machine long enough for
data to be lost. My theory is that:

1) An interrupt comes in.
2) ALL INTERRUPTS ARE BLOCKED!
3) The device driver's handler takes too long to finish.
4) Eventually interrupts are turned back on.

If the device driver gets stuck in an infinite loop, the machine
hangs forever.

Assuming my theory is correct (anyone disagree?), then WHY are ALL
interrupts blocked? Why can't we just block interrupts for that
particular device?
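
To make the theory concrete, here is a toy simulation. It is only a
sketch under stated assumptions, not a description of how any real kernel
or driver behaves. The two devices are hypothetical: a "slow" device whose
handler runs for several ticks, and a "serial" device that delivers one
byte per tick into a one-byte hardware buffer. The function name
simulate() and all of its parameters are made up for illustration. The
serial handler is assumed to take negligible time.

```python
def simulate(per_device_masking, total_ticks=20,
             slow_irq_tick=5, slow_handler_ticks=8):
    """Return (received, lost) byte counts for the hypothetical serial device.

    per_device_masking=False models the theory above: while the slow
    handler runs, ALL interrupts are blocked, so the serial buffer is
    not drained. per_device_masking=True models blocking only the slow
    device's own interrupt, so the serial interrupt is still serviced.
    """
    received = 0
    lost = 0
    buffer_full = False  # one-byte hardware buffer, no FIFO

    for tick in range(total_ticks):
        # A new byte arrives every tick. If the previous byte was never
        # read, it is overwritten and lost (an overrun).
        if buffer_full:
            lost += 1
        buffer_full = True

        # The slow handler occupies the CPU for a fixed window of ticks.
        slow_running = (slow_irq_tick <= tick
                        < slow_irq_tick + slow_handler_ticks)

        # The serial interrupt is delivered unless everything is blocked.
        if per_device_masking or not slow_running:
            received += 1
            buffer_full = False

    return received, lost


print(simulate(per_device_masking=False))  # (12, 8)
print(simulate(per_device_masking=True))   # (20, 0)
# A handler stuck in an "infinite" loop: nothing is received after tick 4.
print(simulate(per_device_masking=False, slow_handler_ticks=10**9))  # (5, 14)
```

In this model, the number of lost bytes with everything blocked equals
the number of ticks the slow handler runs, and a handler that never
returns loses everything from then on. With only the slow device's
interrupt blocked, nothing is lost.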


More information about the freebsd-hackers mailing list