open() and ESTALE error
John
jwd at bsdwins.com
Fri Jun 20 16:02:54 PDT 2003
----- Terry Lambert's Original Message -----
> Specifically, see the underline part of:
>
> > > + if (error == ESTALE && stale++ == 0)
> ---------------
>
> ...he exits it after retrying it fails, and falls into the
> standard ESTALE return case.
>
> If this gets committed (which I think it shouldn't because I
> can see a genuinely bad handle getting converted to a good one
> in a couple of cases), that line should probably be rewritten
> to be more obvious (e.g. move the "stale++" before the "if"
> statement and adjust the compare to compensate for the difference
> so no one else reads it the way we did).
hi folks,
After looking at his original patch, I suggested modifying
it for clarity to be of the form:
    error = vn_open(&nd, flags, cmode);
    if (error == ESTALE)
        error = vn_open(&nd, flags, cmode); /* single retry */
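To show that the two spellings behave the same way, here is a small
userland sketch. It is not the kernel code: fake_open() is a
hypothetical stand-in for vn_open(), and the "goto retry" shape of
the counter form is my assumption about how the quoted line was
used. Both forms call the open routine at most twice.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for vn_open(): returns ESTALE for the first
 * *stale_left calls, then succeeds. *calls counts every invocation.
 */
int
fake_open(int *stale_left, int *calls)
{
	(*calls)++;
	if (*stale_left > 0) {
		(*stale_left)--;
		return (ESTALE);
	}
	return (0);
}

/* Counter form, built around the quoted "stale++ == 0" test (assumed shape). */
int
open_counter_form(int *stale_left, int *calls)
{
	int error, stale = 0;

retry:
	error = fake_open(stale_left, calls);
	if (error == ESTALE && stale++ == 0)
		goto retry;		/* taken at most once */
	return (error);
}

/* Explicit form: one open, one retry, no counter to misread. */
int
open_explicit_form(int *stale_left, int *calls)
{
	int error;

	error = fake_open(stale_left, calls);
	if (error == ESTALE)
		error = fake_open(stale_left, calls);	/* single retry */
	return (error);
}
```

If the handle is still stale on the second attempt, both forms
return ESTALE to the caller unchanged.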
While I understand a number of you have reservations about
this change, I think it is worth serious consideration. Unless
someone is willing to go into each of the individual fs layers
and deal with ESTALE there, this appears to be a relatively
straightforward and easy to understand approach.
Most of the main applications I run on clusters have had
their open routines recoded along the lines of the following
(this is from ftpd):
    int try = 0;

    /* Retry up to 3 times while fopen() keeps failing with ESTALE. */
    while ((fin = fopen(name, "r")) == NULL && errno == ESTALE && try++ < 3) {
        if (logging > 1)
            syslog(LOG_INFO, "fopen(\"%s\"): %m: attempting retry", name);
    }
    if (fin == NULL && logging > 1)
        syslog(LOG_INFO, "get fopen(\"%s\"): %m", name);
This is a real problem when using FreeBSD in high load / high
throughput situations where highly sequenced operations are
performed on a common set of data files from multiple machines. An
example of this environment can be seen here:
http://www.freebsd.org/~jwd/images/cluster.jpg
If no one has any patches which provide a better solution
for handling ESTALE, I would like to see Andrey's patch given
a chance.
Of course, if we don't want to do this, then I think it is
high time we documented that open(2) can return ESTALE and provided
a library routine that wraps open() with a retry :-)
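A minimal sketch of what such a wrapper could look like. The name
open_estale_retry() and its max_retries parameter are hypothetical;
nothing like this exists in libc today, and the right retry count is
a guess that would depend on the workload.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical wrapper: call open(2) and, if it fails with ESTALE,
 * try again up to max_retries more times. Any other error, or an
 * ESTALE that persists, is returned to the caller with errno left
 * as set by the last open() call.
 */
int
open_estale_retry(const char *path, int flags, mode_t mode, int max_retries)
{
	int fd;
	int tries = 0;

	do {
		fd = open(path, flags, mode);
	} while (fd == -1 && errno == ESTALE && tries++ < max_retries);

	return (fd);
}
```

A caller would use it exactly like open(), e.g.
open_estale_retry(name, O_RDONLY, 0, 3), and still has to handle
ESTALE itself if every attempt fails.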
-John