open() and ESTALE error

John jwd at bsdwins.com
Fri Jun 20 16:02:54 PDT 2003


----- Terry Lambert's Original Message -----
> Specifically, see the underlined part of:
> 
> > > +             if (error == ESTALE && stale++ == 0)
>                                       ---------------
> 
> ...he exits it after retrying it fails, and falls into the
> standard ESTALE return case.
> 
> If this gets committed (which I think it shouldn't because I
> can see a genuinely bad handle getting converted to a good one
> in a couple of cases), that line should probably be rewritten
> to be more obvious (e.g. move the "stale++" before the "if"
> statement and adjust the compare to compensate for the difference
> so no one else reads it the way we did).

hi folks,

   After looking at Andrey's original patch, I suggested modifying
it for clarity to be of the form:

   error = vn_open(&nd, flags, cmode);
   if (error == ESTALE)
      error = vn_open(&nd, flags, cmode); /* single retry */


   While I understand a number of you have reservations about
this change, I think it is worth serious consideration. Unless
someone is willing to go into each of the individual fs layers
and deal with ESTALE there, this appears to be a relatively
straightforward and easy to understand approach.

   Most of the main applications I run on clusters have had
their open routines recoded along the lines of the following
(this is from ftpd):

   int try = 0;
   /* retry on ESTALE only; try++ bounds the loop at 3 retries */
   while ((fin = fopen(name,"r")) == NULL && errno == ESTALE && try++ < 3 ) {
      if (logging > 1)
         syslog(LOG_INFO,"fopen(\"%s\"): %m: attempting retry",name);
   }
   if (fin == NULL && logging > 1)
      syslog(LOG_INFO,"get fopen(\"%s\"): %m",name);


   This is a real problem when using FreeBSD in high load / high
throughput situations where highly sequenced operations are
performed on a common set of data files from multiple machines. An
example of this environment can be seen here:

http://www.freebsd.org/~jwd/images/cluster.jpg

   If no one has any patches which can provide a better solution
for handling ESTALE, I would like to see Andrey's patch given
a chance.

   Of course, if we don't want to do this, then I think it is
high time we documented that open(2) can return ESTALE and provided
a library routine that wraps open() with a retry :-)
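
   For the sake of discussion, here is a minimal sketch of what such
a userland wrapper might look like. The name open_retry_estale() and
its max_retries parameter are made up for illustration; nothing like
this exists in libc today, and the right retry count is an open
question.

```c
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper (not an existing libc routine): call open(2) and
 * retry up to max_retries times, but only when the failure is ESTALE.
 * Any other error, or success, is returned to the caller immediately.
 * On failure, errno is left as set by the last open(2) attempt.
 */
int
open_retry_estale(const char *path, int flags, mode_t mode, int max_retries)
{
	int fd;
	int tries = 0;

	for (;;) {
		fd = open(path, flags, mode);
		if (fd >= 0 || errno != ESTALE || tries >= max_retries)
			return (fd);
		tries++;	/* stale handle: look the name up again */
	}
}
```

   A caller would use it just like open(), e.g.
open_retry_estale(name, O_RDONLY, 0, 3), and still has to cope with
ESTALE if every attempt fails.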

-John


