[Bug 228087] F_SETLK randomly fails on NFS4 in threaded operation in MySQL
bugzilla-noreply at freebsd.org
Wed May 9 04:25:40 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228087
Bug ID: 228087
Summary: F_SETLK randomly fails on NFS4 in threaded operation
in MySQL
Product: Base System
Version: 11.1-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Many People
Priority: ---
Component: kern
Assignee: bugs at FreeBSD.org
Reporter: barry.boes at acciodata.com
Reproduced with 10.4, 11.1-RELEASE, 11.1-STABLE, and 11.2-PRERELEASE on both
client and server. Client and server currently both run 11.2-PRERELEASE.
ktrace shows the following:
66181 mysqld CALL close(0x30)
66181 mysqld RET openat 48/0x30
66181 mysqld CALL fcntl(0x30,F_SETLK,0x7fffdd3e5cc0)
66181 mysqld RET close 0
66181 mysqld RET fcntl -1 errno 13 Permission denied
Examining a full trace, the files being locked are never locked twice by MySQL,
nor are they locked by another process. The file closed in the first line is a
different file from the one opened in the second line. MySQL performs this same
sequence tens or hundreds of thousands of times successfully, then fails on
one. From all of the trace data that I have been able to gather, the fcntl
succeeds 100% of the time if the close returns before another thread calls open
and F_SETLK, and fails 100% of the time when the F_SETLK completes before the
close returns in another thread.
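To make the interleaving in the excerpt above easier to spot in a longer trace,
here is a rough Python sketch that scans kdump-style lines and flags every
F_SETLK that was issued while a close() had been called but had not yet
returned. The function name, the regular expression, and the first-in,
first-out pairing of CALL and RET records are my own assumptions. The excerpt
carries no thread IDs, so the pairing is only approximate, and "issued during
a close" is a looser test than the "completes before the close returns"
condition described above.

```python
import re

# Matches kdump-style lines such as
#   "66181 mysqld CALL close(0x30)"
#   "66181 mysqld RET fcntl -1 errno 13 Permission denied"
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|RET)\s+(\w+)(.*)$")

# The five lines quoted in the report, used here as sample input.
SAMPLE_TRACE = [
    "66181 mysqld CALL close(0x30)",
    "66181 mysqld RET openat 48/0x30",
    "66181 mysqld CALL fcntl(0x30,F_SETLK,0x7fffdd3e5cc0)",
    "66181 mysqld RET close 0",
    "66181 mysqld RET fcntl -1 errno 13 Permission denied",
]


def find_overlapping_locks(lines):
    """Return one (overlapped, failed) pair per F_SETLK call in the trace.

    overlapped: a close() was in flight when the fcntl CALL was logged.
    failed:     the matching RET record reported -1.
    """
    closes_in_flight = 0
    pending = []  # FIFO of (is_setlk, overlapped) for fcntl calls awaiting RET
    results = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a CALL/RET record
        _pid, _comm, kind, name, rest = match.groups()
        if name == "close":
            if kind == "CALL":
                closes_in_flight += 1
            elif closes_in_flight > 0:
                closes_in_flight -= 1
        elif name == "fcntl":
            if kind == "CALL":
                pending.append(("F_SETLK" in rest, closes_in_flight > 0))
            elif pending:
                is_setlk, overlapped = pending.pop(0)
                if is_setlk:
                    failed = rest.strip().startswith("-1")
                    results.append((overlapped, failed))
    return results
```

Calling find_overlapping_locks(SAMPLE_TRACE) on the quoted excerpt yields a
single pair marking the lock as both overlapped and failed. For a real
multi-threaded trace it would be safer to pair records per thread, for example
by dumping with kdump -H and keying on the thread ID column.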
Observation affects the results: the failure occurs tens to hundreds of times
more rapidly when the process is not being traced.
The higher the network latency, the more likely it is to happen. With a
latency of 200 us, it happens within seconds on a loaded server. With a latency
of 100 us, it happens within tens of seconds. With a latency of 20 us it
happens rarely, and below 15 us I have yet to see this failure.
No kernel messages are logged. I have duplicated the problem on a variety of
hardware, from 28-core Supermicro motherboards with ECC memory and E5-2XXX v4
CPUs to laptops with i3, i5, or i7 CPUs.
The filesystem setup is as follows:
Server: ZFS on 11.2-PRERELEASE configured for very low latency (optimized SSDs
and persistent write caches, or sync=disabled).
The filesystem is either a base ZFS filesystem or a clone of a snapshot (for
easy testing; it happens on either).
The client mounts the server filesystem via NFS4 and also runs 11.2-PRERELEASE.
Tested with 100 Mb, gigabit, 50-gigabit, and 100-gigabit NICs.
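For anyone who wants to exercise the same access pattern without MySQL, here
is a minimal Python sketch. It is not the test used for this report and I make
no claim that it reproduces the failure. Each thread repeatedly opens its own
file, takes a non-blocking exclusive lock (Python's fcntl.lockf issues
fcntl(F_SETLK) for a non-blocking request), and closes the descriptor, so that
close() in one thread can overlap open plus F_SETLK in another. The function
names, file names, thread count, and iteration count are all made up for
illustration.

```python
import fcntl
import os
import tempfile
import threading


def lock_worker(directory, worker_id, iterations, failures):
    """Open, lock, and close a per-thread file, recording any lock errors."""
    path = os.path.join(directory, "lockfile-%d" % worker_id)
    for _ in range(iterations):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Non-blocking exclusive lock on the whole file (F_SETLK).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # No other thread or process touches this file, so any error
            # here is unexpected; errno 13 is the one seen in the report.
            failures.append((worker_id, exc.errno))
        finally:
            # Closing the descriptor also drops the lock.
            os.close(fd)


def run_lock_stress(directory=None, workers=8, iterations=1000):
    """Run the workers concurrently and return a list of (worker, errno).

    With directory=None a local temporary directory is used, which only
    serves as a smoke test; pass a path on the NFS4 mount to exercise the
    scenario described in the report.
    """
    if directory is None:
        with tempfile.TemporaryDirectory() as tmp:
            return run_lock_stress(tmp, workers, iterations)
    failures = []
    threads = [
        threading.Thread(target=lock_worker,
                         args=(directory, i, iterations, failures))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return failures
```

Calling run_lock_stress() with a directory on the NFS4 mount and then checking
whether the returned list is empty would be the intended use. A C program
using pthreads would track mysqld's behaviour more closely, since the Python
interpreter adds its own scheduling between system calls.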