Problem with fclose() returning error (EBADF)

Robert Blayzor rblayzor at inoc.net
Thu Sep 23 06:51:59 PDT 2004


I have a multithreaded application running on FreeBSD 4.9, 4.10 and 
-STABLE that I'm having an issue with.

The application writes large numbers of small files over an NFS mount, 
and at random we're seeing fclose() return a failure code (-1) with 
errno set to EBADF.

We have no idea what may be causing the problem.  The NFS server appears 
to be functioning fine, with no errors at all, and it works perfectly for 
plenty of other clients.

At first we thought maybe that the fd was getting munged somehow, but 
here is the weird part.

If the code is changed to do an fflush() on the stream immediately before 
we issue an fclose(), fflush() NEVER returns an error and always completes 
successfully.  However, completely randomly, fclose() will return an 
error condition with errno set to EBADF.
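
To make the flush-then-close sequence above concrete, here is a minimal
sketch of a wrapper that records which of the two steps failed and on which
descriptor.  The name checked_fclose() and the label argument are
hypothetical, not taken from the application; the sketch only assumes
standard stdio and POSIX fileno().

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict -std modes */
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the application): flush and close a stream,
 * logging which step failed.  Returns 0 on success, or the errno value of
 * the first step that failed.
 */
static int checked_fclose(FILE *fp, const char *label)
{
    int fd = fileno(fp);    /* read the descriptor before fclose() invalidates fp */
    int first_err = 0;

    if (fflush(fp) == EOF) {
        first_err = errno;  /* save errno before fprintf() can change it */
        fprintf(stderr, "%s: fflush failed on fd %d: %s\n",
                label, fd, strerror(first_err));
    }

    if (fclose(fp) == EOF) {
        int close_err = errno;
        fprintf(stderr, "%s: fclose failed on fd %d: %s\n",
                label, fd, strerror(close_err));
        if (first_err == 0)
            first_err = close_err;
    }

    return first_err;
}
```

Logging the descriptor number next to the error makes it possible to
correlate a failing fclose() with other per-descriptor debugging output.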

There are hundreds of gigs and inodes available on the NFS server, and 
writes work fine from all other NFS clients at the time.  (This is a 
six-server mail cluster.)

We've double-checked the compile flags and I've gone through all the 
libc calls I can think of.  I've also linked my own debugging into the 
libc_r close function, and it's not showing 'any' closes occurring between 
the fopen and the fclose that fails.

We've also checked the flags in the FILE *f structure; they are still 
correct, so it has not been munged by anything.
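
For comparison, this is what the symptom looks like when a descriptor
really is closed behind the stream's back: once the buffer is empty,
fflush() has nothing to write and succeeds, while fclose() still has to
call close() and gets EBADF.  This is only an illustration of that one
scenario, not a claim that it is what happens in the application; the
function name stale_fd_demo() is made up for the sketch.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict -std modes */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Illustration only: close the descriptor underneath an open stream, then
 * flush and close the stream.  *flush_rc receives the fflush() result.
 * Returns the errno left by a failing fclose(), 0 if fclose() succeeded,
 * or -1 if the temporary file could not be created.
 */
static int stale_fd_demo(int *flush_rc)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs("payload", fp);
    fflush(fp);                /* the stdio buffer is empty from here on */
    close(fileno(fp));         /* simulated stray close from elsewhere */

    *flush_rc = fflush(fp);    /* nothing buffered, so no write() is attempted */

    errno = 0;
    if (fclose(fp) == EOF)     /* fclose() still calls close() on the dead fd */
        return errno;
    return 0;
}
```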

There are lots of conditions where the error EBADF is returned by the
kernel etc., and I suspect one of them is not really a sign of a bad 
file handle but means something else.  I don't know of any way to find 
out what is really occurring, and whether it is serious or just a faulty 
return code.
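
One way to narrow this down is to ask the kernel whether the descriptor is
still open immediately before the fclose().  The sketch below uses
fcntl(fd, F_GETFD), which fails with EBADF only when the descriptor is not
open in the process.  The names fd_is_open() and fclose_with_probe() are
hypothetical.  In a multithreaded program there is still a small window
between the probe and the close, so the result is a hint rather than proof.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict -std modes */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Returns 1 if fd is an open descriptor in this process, 0 otherwise. */
static int fd_is_open(int fd)
{
    if (fd < 0)
        return 0;
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/*
 * Hypothetical wrapper: probe the descriptor, then close the stream.
 * Returns 0 on success or the errno left by fclose().  On failure it logs
 * whether the descriptor looked valid just before the close.
 */
static int fclose_with_probe(FILE *fp)
{
    int fd = fileno(fp);
    int open_before = fd_is_open(fd);

    if (fclose(fp) == 0)
        return 0;

    int err = errno;   /* save before fprintf() can overwrite it */
    fprintf(stderr, "fclose failed: fd=%d open_before=%d errno=%d\n",
            fd, open_before, err);
    return err;
}
```

If open_before is 1 and fclose() still reports EBADF, the descriptor itself
was probably fine and the error more likely originates further down, in
whatever close() does for that file.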

Running ktrace on this may be the only option, but the problem is that 
the application is SO busy, and the failure so random, that it would be 
nearly impossible to catch it when it happens; i.e., thousands and 
thousands of files can be written successfully before we actually see a 
failed one.
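
A cheaper alternative to tracing everything is to keep a small in-memory
ring of recent stream events and dump it only when an fclose() fails.  The
following is a rough sketch under that assumption; TRACE_SLOTS,
trace_record() and trace_dump() are all hypothetical names, and the
application would have to call trace_record() itself around its fopen(),
fflush() and fclose() calls.

```c
#include <pthread.h>
#include <stdio.h>

#define TRACE_SLOTS 256   /* hypothetical size; tune to the workload */

struct trace_event {
    int  fd;      /* descriptor the event refers to */
    int  err;     /* errno observed, or 0 */
    char op[8];   /* short tag such as "open" or "close" */
};

static struct trace_event trace_ring[TRACE_SLOTS];
static unsigned long trace_count;   /* total events ever recorded */
static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record one event, overwriting the oldest slot once the ring is full. */
static void trace_record(const char *op, int fd, int err)
{
    pthread_mutex_lock(&trace_lock);
    struct trace_event *ev = &trace_ring[trace_count % TRACE_SLOTS];
    ev->fd = fd;
    ev->err = err;
    snprintf(ev->op, sizeof ev->op, "%s", op);   /* truncates long tags */
    trace_count++;
    pthread_mutex_unlock(&trace_lock);
}

/* Print the retained events, oldest first; returns how many were printed. */
static unsigned long trace_dump(FILE *out)
{
    pthread_mutex_lock(&trace_lock);
    unsigned long n = trace_count < TRACE_SLOTS ? trace_count : TRACE_SLOTS;
    unsigned long start = trace_count - n;
    for (unsigned long i = 0; i < n; i++) {
        const struct trace_event *ev = &trace_ring[(start + i) % TRACE_SLOTS];
        fprintf(out, "%lu %s fd=%d errno=%d\n",
                start + i, ev->op, ev->fd, ev->err);
    }
    pthread_mutex_unlock(&trace_lock);
    return n;
}
```

Calling trace_dump(stderr) from the fclose() error path would show the last
few hundred events leading up to the failure without paying the cost of a
full trace on every successful write.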

Any help or guidance would be greatly appreciated.

TIA

-- 
Robert Blayzor, BOFH
INOC, LLC
rblayzor at inoc.net
PGP: http://www.inoc.net/~dev/
Key fingerprint = 1E02 DABE F989 BC03 3DF5  0E93 8D02 9D0B CB1A A7B0

Quality assurance: A way to ensure you never deliver shoddy goods 
accidentally.

