misc/185487: Corrupted files with vfs.unmapped_buf_allowed=1

Sun Jan 5 11:40:00 UTC 2014

>Number:         185487
>Category:       misc
>Synopsis:       Corrupted files with vfs.unmapped_buf_allowed=1
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 05 11:40:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Anders Berggren
>Release:        10.0-RC4
>Organization:
Halon Security
>Environment:
FreeBSD sp-build10-i386 10.0-RC4 FreeBSD 10.0-RC4 #0 r260130: Tue Dec 31 20:44:17 UTC 2013     root at snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
We ran into strange problems with the FreeBSD 10 release candidate: builds failed and PostgreSQL databases became corrupted. We were able to reduce the problem to the following script:

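i=0
while true
do
	i=`expr $i + 1`
	echo RUN $i
	# Write 1 MB of random data and record its MD5 digest.
	dd if=/dev/random of=disktest bs=1m count=1
	orig=`md5 -q disktest`
	# Copy the file and check the copy's digest against the original's.
	cp disktest disktest2
	md5 -c "$orig" disktest2
	[ $? -ne 0 ] && echo fail && exit
done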

The script failed for us at unpredictable intervals: sometimes after 10 iterations, sometimes after 10000, sometimes never. The issue disappeared when we disabled unmapped buffers (vfs.unmapped_buf_allowed=0). We have tested this on several i386 machines under both VMware and Hyper-V, but not on bare metal. When we examined the corrupted data, it appeared "repeated": a pattern of previously read or written data blocks showed up again in the file.
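
To make the "repeated" pattern easier to examine, the following is a minimal sketch (not part of our original test) that lists which blocks of two files differ. The function name diff_blocks and the default block size of 4096 bytes are assumptions chosen for illustration; pick a block size that matches the file system under test.

```sh
#!/bin/sh
# diff_blocks FILE1 FILE2 [BLOCKSIZE]
# Hypothetical helper: print the zero-based index of every
# BLOCKSIZE-sized block in which the two files differ.
# BLOCKSIZE defaults to 4096, which is only an assumption.
diff_blocks() {
	bs=${3:-4096}
	# cmp -l prints one line per differing byte; the first field
	# is the 1-based byte offset. Map each offset to a block index
	# and print every index once.
	cmp -l "$1" "$2" 2>/dev/null | awk -v bs="$bs" '
		{
			blk = int(($1 - 1) / bs)
			if (!(blk in seen)) {
				seen[blk] = 1
				print blk
			}
		}'
}
```

Running diff_blocks disktest disktest2 on a failing pair prints one block index per line. The affected blocks can then be dumped with dd and hexdump and compared with data written earlier.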
>How-To-Repeat:
Run the example script above; for us it failed after anywhere from 10 to 100000 iterations. We were also able to reproduce the problem on real-world, high-traffic PostgreSQL databases.
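
For unattended runs, a bounded variant of the reproduction may be more convenient. The sketch below is an assumption-laden rewrite, not the script we actually ran: the function name repro_loop is hypothetical, it compares the files byte for byte with cmp instead of comparing MD5 digests, it reads /dev/urandom, and it spells the block size out in bytes so that it does not depend on the dd size suffix. On a mismatch it stops and leaves both files in place for inspection.

```sh
#!/bin/sh
# repro_loop MAX [DIR]
# Hypothetical helper: run up to MAX copy-and-compare iterations in DIR
# (default: current directory).
# Returns 0 if every copy matched, 1 on the first mismatch (both files
# are kept), 2 if dd or cp itself failed.
repro_loop() {
	max=$1
	dir=${2:-.}
	i=0
	while [ "$i" -lt "$max" ]; do
		i=$((i + 1))
		# 1048576 bytes = 1 MB of random data per iteration.
		dd if=/dev/urandom of="$dir/disktest" bs=1048576 count=1 2>/dev/null || return 2
		cp "$dir/disktest" "$dir/disktest2" || return 2
		# cmp -s is silent and exits non-zero if the files differ.
		if ! cmp -s "$dir/disktest" "$dir/disktest2"; then
			echo "RUN $i: mismatch, keeping $dir/disktest and $dir/disktest2"
			return 1
		fi
	done
	echo "no mismatch in $max runs"
	return 0
}
```

For example, repro_loop 100000 /var/tmp runs at most 100000 iterations. Because our failures were intermittent, a clean run does not prove that the system is unaffected.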
>Fix:
Workaround: set vfs.unmapped_buf_allowed="0" in /boot/loader.conf and reboot.
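
As a quick sanity check that the workaround is configured, something like the following sketch could be used. The function name loader_conf_has_workaround is hypothetical, and the pattern assumes the setting sits on a line of its own, with or without quotes around the value.

```sh
#!/bin/sh
# loader_conf_has_workaround [FILE]
# Hypothetical helper: succeed (exit 0) if FILE contains an active,
# uncommented line that sets vfs.unmapped_buf_allowed to 0.
# FILE defaults to /boot/loader.conf.
loader_conf_has_workaround() {
	conf=${1:-/boot/loader.conf}
	grep -Eq '^[[:space:]]*vfs\.unmapped_buf_allowed="?0"?[[:space:]]*(#.*)?$' "$conf" 2>/dev/null
}
```

After rebooting, the value in effect can be read with sysctl -n vfs.unmapped_buf_allowed, which should print 0 when the workaround is active.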

>Release-Note:
>Audit-Trail:
>Unformatted: