Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle

Kevin Day toasty at dragondata.com
Wed Apr 10 20:03:01 UTC 2013


I'm working in an environment where a system (using journaled soft updates) is notified that it will lose power shortly and needs to shut down daemons and flush everything to disk. It doesn't actually shut down, though, because the "power down now" command may get cancelled, in which case we need to bring things back up. My understanding was that we could call sync(8) and then just wait for the power to drop.

The problem is that we were frequently losing the last 30-60 seconds' worth of filesystem changes made before the shutdown. For example, newly created directories would disappear, or fsck would reclaim them and put them in lost+found.

I confirmed that there is no caching disk controller and that write caching is disabled on the drives themselves, but the problem continued.

On a whim, after running sync(8) once and waiting 10 seconds, I did "mount -u -o ro -f /" to force the filesystem into read-only mode. It took about 8 seconds to finish, gstat showed a lot of write activity, and SIGINFO on the mount command showed:

load: 0.01  cmd: mount 15775 [biowr] 3.62r 0.00u 0.55s 5% 1644k
load: 0.03  cmd: mount 15775 [runnable] 4.41r 0.00u 0.65s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 5.00r 0.00u 0.72s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 5.70r 0.00u 0.80s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 6.03r 0.00u 0.84s 6% 1644k
load: 0.03  cmd: mount 15775 [running] 6.27r 0.00u 0.87s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 6.51r 0.00u 0.90s 7% 1644k
load: 0.03  cmd: mount 15775 [biowr] 6.69r 0.00u 0.92s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 6.90r 0.00u 0.94s 6% 1644k
load: 0.03  cmd: mount 15775 [biowr] 7.04r 0.00u 0.96s 7% 1644k
load: 0.03  cmd: mount 15775 [biowr] 7.20r 0.00u 0.98s 7% 1644k

If sync's man page is accurate ("force completion of pending disk writes (flush cache)") and there is zero filesystem activity occurring, shouldn't that be enough to ensure no corruption after a power cycle? If sync really is flushing everything, what is all the write activity that happens when downgrading from rw to ro?

Is there a better way to get things into a stable state on disk without fully shutting down, so that we can recover if the shutdown order is cancelled?


For me, this is easily reproducible with:

mkdir /root/test
sync
sleep 10
(hit reset button)

The problem doesn't happen with:

mkdir /root/test
mount -u -o ro -f /
(hit reset button)
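
For reference, here is a minimal sketch of how the second sequence could be scripted for the "may get cancelled" case. It is only an illustration of the workaround, not a confirmed fix. The function names, the RUN hook (set RUN=echo to print the commands instead of running them), and the FS variable are all made up for this example. The resume step assumes that "mount -u -w" brings the filesystem back to read-write once the power-down is cancelled; I have not verified that path under load.

```shell
#!/bin/sh
# Sketch only: quiesce a filesystem before an expected power loss and
# undo it if the power-down is cancelled.
# RUN and FS are hypothetical knobs for this example, not standard variables.
RUN="${RUN:-}"      # set RUN=echo for a dry run that only prints the commands
FS="${FS:-/}"       # filesystem to quiesce; defaults to the root filesystem

quiesce() {
    # Request a flush of pending writes; in my tests this alone was not enough.
    $RUN sync
    # Forced downgrade to read-only; this is the step after which
    # the new directory survived the reset.
    $RUN mount -u -o ro -f "$FS"
}

resume() {
    # Assumption: upgrading back to read-write is sufficient after a cancel.
    $RUN mount -u -w "$FS"
}

# Optional dispatch when run as a script: "quiesce" or "resume".
case "${1:-}" in
    quiesce) quiesce ;;
    resume)  resume ;;
esac
```

Running it with RUN=echo and the argument "quiesce" just prints the two commands, which is a cheap way to check the sequence before trying it on a real box.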


It's great that we're not ending up in an inconsistent state, but I was expecting sync to prevent this.

-- Kevin



More information about the freebsd-fs mailing list