watchdogd stat location

Ian Lepore ian at freebsd.org
Sat Sep 28 19:30:41 UTC 2019


On Fri, 2019-09-27 at 15:31 -0600, Warner Losh wrote:
> On Fri, Sep 27, 2019 at 2:30 PM mike tancsa <mike at sentex.net> wrote:
> 
> > On 9/27/2019 3:53 PM, Warner Losh wrote:
> > > > 
> > > 
> > >     I am all for that too. Just something other than /etc or /var,
> > >     which are often mounted on a ramdisk.
> > > 
> > > 
> > > I think that / is too special to cause disk IO to ever happen.
> > > Other dirs will sometimes not be in the cache.... The notion here,
> > > perhaps bogus, is that we want to check the root FS is sane. The
> > > stat(2) is a cheap way to do this that will eventually fail if /
> > > goes wonky enough. It's weak.
> > > 
> > > 
> > 
> > Would something like this buy any extra sanity, or is it not worth
> > it? I guess fancier checks belong in a passed-in test program.
> > 
> > 
> > # diff -u watchdogd.c.orig watchdogd.c
> > --- watchdogd.c.orig    2019-09-27 16:27:14.456973000 -0400
> > +++ watchdogd.c 2019-09-27 16:27:18.904885000 -0400
> > @@ -364,9 +364,23 @@
> > 
> >                 if (test_cmd != NULL)
> >                         failed = system(test_cmd);
> > -               else
> > -                       failed = stat("/etc", &sb);
> > -
> > +               else {
> > +
> > +                       srand(time(NULL));
> > +                       switch(rand() % 4) {
> > +                               case 0:
> > +                                       failed = stat("/", &sb);
> > +                                       break;
> > +                               case 1:
> > +                                       failed = stat("/bin", &sb);
> > +                                       break;
> > +                               case 2:
> > +                                       failed = stat("/sbin", &sb);
> > +                                       break;
> > +                               default:
> > +                                       failed = stat("/usr", &sb);
> > +                       }
> > +               }
> >                 error = watchdog_getuptime(&ts_end);
> >                 if (error) {
> >                         end_program = 1;
> > 
> 
> I don't think the rand helps at all. I think you'd really rather do
> things sequentially. And this introduces more assumptions about the
> underlying filesystem(s).
> 
> Warner
> 
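
For reference, the sequential variant Warner describes might look roughly
like the sketch below. This is only an illustration: the helper name
stat_next_path and the check_paths table are made up here (the
directories are the ones from Mike's patch), and none of it exists in
watchdogd.c.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical path table; the directories are taken from the quoted patch. */
static const char *const check_paths[] = { "/", "/bin", "/sbin", "/usr" };
#define NPATHS (sizeof(check_paths) / sizeof(check_paths[0]))

/*
 * stat(2) the path at *cursor, then advance the cursor to the next
 * entry, wrapping around at the end of the table.
 * Returns 0 on success and -1 on failure, the same as stat(2).
 */
static int
stat_next_path(size_t *cursor)
{
	struct stat sb;
	const char *path = check_paths[*cursor];

	*cursor = (*cursor + 1) % NPATHS;
	return (stat(path, &sb));
}
```

The caller would keep the cursor across iterations of the main loop, so
each pass checks the next directory in turn and no srand()/rand() call
is needed.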

If we want to be sure to force physical IO, how about
dd if=/ of=/dev/null count=1?

But I question the premise of forcing physical IO as being somehow a
better indicator of a non-hung system.  I think it's just a better
indicator of the sdcard problem that Mike is experiencing.  For anyone
else, forcing periodic physical IO is going to do annoying things like
spin up idle drives.
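
To make the comparison concrete, a rough C equivalent of that dd
invocation is sketched below. The helper name read_one_block is
hypothetical and not part of watchdogd. Two caveats apply: a read(2) can
be satisfied from the buffer cache, so this does not by itself guarantee
physical IO, and on many systems read(2) on a directory fails with
EISDIR, so the path may need to be a regular file or a device node.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open the given path read-only and try to read one 512-byte block,
 * discarding the data (roughly "dd if=path of=/dev/null count=1").
 * Returns 0 if open(2) and read(2) both succeed, -1 otherwise.
 * A short read or end-of-file still counts as success.
 */
static int
read_one_block(const char *path)
{
	char buf[512];
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	n = read(fd, buf, sizeof(buf));
	close(fd);
	return (n < 0 ? -1 : 0);
}
```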

-- Ian



More information about the freebsd-embedded mailing list