truncate vs. unlink, unexpected results

Chris Peiffer bsdlists at cabstand.com
Thu Dec 30 20:19:17 UTC 2010


As I stated in another question about vnodes, I have a backend server
running 8.2-PRERELEASE that processes a lot of independent files that
grow randomly and then go to zero (like a POP server).

The disks are Intel SSD drives with default newfs settings-- 16k
blocks, 2k frags, soft updates.

In trying to improve performance we made an application-level change:
instead of deleting files once they are popped, each file is kept and
truncated to zero length. The thinking was that this would save the
directory update and its associated metadata write (metadata writes
are expensive, in that they force a semi-synchronous write operation
in noasync+SU mode), at the expense of more inodes in use and bigger
directories.
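
To make the change concrete, here is a minimal sketch of the two code
paths as I think of them; the helper names are mine and the
read/processing step is elided:

#include <unistd.h>

/* Old behavior: consume the file, then remove it.  This dirties the
 * directory and the directory's inode, and frees the file's inode and
 * blocks. */
static int finish_by_unlink(const char *path)
{
    /* ... contents already read and processed ... */
    return unlink(path);
}

/* New behavior: consume the file, then cut it back to zero length.
 * The directory entry and the inode survive; only the file's inode is
 * modified and its blocks are freed. */
static int finish_by_truncate(const char *path)
{
    /* ... contents already read and processed ... */
    return truncate(path, 0);
}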

The server architecture consists of two machines running the same set
of operations on the same dataset via Erlang mirroring, so we were
able to change the code on one and observe the differences.

On the truncate machine, the number of write operations is now indeed
lower, as expected; it's usually 10-15% lower over long (100-second)
intervals. But the total data written went up by about 30%, svc_t
(average service time, as reported by iostat) went up by 100-200%,
and %b (percent busy) was slightly higher.

Why does truncate result in more data written and worse performance
despite fewer write operations? Most of the files are small, under 10k,
but some are up to 1M. The blocks have to be reclaimed either way,
right? My understanding of what happens in each case:

unlink: dir modified, dir inode modified, file inode freed, blocks freed
truncate: file inode modified, blocks freed
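
If it helps to look at this in isolation from the application, here is
a rough micro-benchmark sketch of the access pattern (the file name,
payload size, and iteration count below are made up for illustration):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define ITERATIONS 10000
#define PAYLOAD    (8 * 1024)   /* roughly a typical small file */

/* Grow a file to PAYLOAD bytes, then reclaim it either by unlinking it
 * or by truncating it to zero length. */
static void cycle(const char *path, int use_truncate)
{
    static char buf[PAYLOAD];
    int fd;

    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return;
    write(fd, buf, sizeof(buf));
    close(fd);

    if (use_truncate)
        truncate(path, 0);
    else
        unlink(path);
}

int main(int argc, char **argv)
{
    int use_truncate = (argc > 1 && strcmp(argv[1], "truncate") == 0);
    int i;

    for (i = 0; i < ITERATIONS; i++)
        cycle("testfile", use_truncate);
    return 0;
}

Running it once in each mode on an otherwise idle filesystem while
watching iostat should show whether the difference reproduces outside
our application.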

Is it possible I'm seeing an SSD-related issue: that truncate forces
more expensive overwrites of existing blocks, whereas unlink lets the
filesystem write new data into empty space (for a while, anyway)?

I hope I'm making a simple mistake in my thinking about the fs that
someone can correct me on. Thanks.  

