Warner Losh imp at
Sun Jun 26 05:51:24 UTC 2011

On Jun 25, 2011, at 8:49 AM, Andriy Gapon wrote:
> Does anybody actually use kern.sync_on_panic tunable/sysctl?
> If yes, then in what circumstances do you need it?
> That is, why any other alternative doesn't work for you?
> Like:
> 1. remounting filesystems R/O before panic if you knowingly provoke it for testing
> 2. using netboot for your test system
> 3. using su+j, gjournal or a different filesystem altogether
> 4. using fsck after reboot
> It seems to me that syncing filesystems in panic context is an adventure.  And it
> may become even more of an adventure if we introduce code that completely stops
> scheduler in and after panic.

I've used it in the past when I was developing a device driver that was in the late stages of maturing.  Since every panic on that system came from the driver dereferencing NULL, sync was safe: all the data structures were sane except those of the aforementioned driver.

(1) It was a production system, and everything that could be was already mounted r/o.  However, some small, but very critical, amount of data was still r/w and it was very important to not lose this data.  Production here likely should be in quotes, because it was in the late stages of testing/validation.  The problem was that, without this, the saved state of the GPS receiver and other hardware would sometimes wind up as zero-length files, which meant that we'd have to do a cold start, which cost us a few hours of time.  At the time I was doing this, we saw zero-length files a couple of times a day without this turned on.
(2) netbooting wasn't an option since we were qualifying a non-netbooting system.
(3) these weren't available at the time, but the goal was to prevent data loss, not necessarily to avoid fsck on boot.
(4) Data loss without it.

Now, I'll be the first to admit this has been a few years, and I haven't done a fresh evaluation to see if things are still safe.  I'll also be the first to admit that this was a useful debugging setting late in development, and not in production.  I'm also the first to admit this isn't what I'd call a very widespread case.  But it did come in very handy when chasing a few bugs to be able to do 10 panic/reboot cycles an hour rather than 2 a day.


More information about the freebsd-current mailing list