file locking.

Thu Aug 16 16:20:33 PDT 2007

On Thursday 16 August 2007 06:31:13 pm Jeff Roberson wrote:
> On Thu, 16 Aug 2007, John Baldwin wrote:
> 
> > On Thursday 16 August 2007 04:18:51 pm Jeff Roberson wrote:
> >> On Thu, 16 Aug 2007, John Baldwin wrote:
> >>
> >>> On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote:
> >>>> I have been looking at file locking for 8.0 and have come up with a way
> > to
> >>>> completely eliminate the file lock, reduce the size of struct file by
> >>>> ~4 pointers (22%), remove the global list of files and associated lock,
> >>>> and restrict the scope of unp_gc while removing several race 
conditions.
> >>>
> >>
> >> Thanks for the review.
> >>
> >>> I like finit().  I would maybe change socketpair() to pass so1 and so2 
to
> >>> finit() rather than setting f_data twice.
> >>
> >> I'm not sure what you mean?
> >
> > In socketpair() the new code does this:
> >
> > 	fp1->f_data = so1;
> > 	...
> > 	fp2->f_data = so2;
> > 	...
> > 	finit(fp1, ..., fp1->f_data, ...);
> > 	finit(fp2, ..., fp2->f_data, ...);
> >
> > It might be cleaner to do this:
> >
> > 	...
> > 	...
> > 	finit(fp1, ..., so1, ...);
> > 	finit(fp2, ..., so2, ...);
> 
> I did not want to have one file pointing at another without an initialized 
> f_data field.  However, I guess the underlying sockets are already setup 
> so this may not be important.  The code did go to some effort to setup 
> f_data early before as well so I didn't want to change that.

Until f_ops is set, f_data is irrelevant as badfileops ignores f_data.

> > So, in the old code there's a really big comment about how it makes sure 
to
> > only do closef() (via unp_discard()) once but does a sorflush() for each
> > f_msgcount.  Was that comment no longer true?
> 
> The comment actually says:
> 
>           *
>           * It is incorrect to simply unp_discard each entry for f_msgcount
>           * times
> 
> What we do is grab an extra ref to each struct file that is dead and then 
> explicitly sorflush() them.  This closes all of the references held by 
> that socket, which would free any unreferenced non-unp descriptors. 
> However, we want to prevent the algorithm from recursing in on itself so 
> we hold the extra file ref for unp sockets that would be closed.  Then 
> when we loop releasing this one last ref at the end the actually fo_close 
> will be called.
> 
> This portion of the algorithm is not significantly different from before. 
> I just introduced an extra flag so I could remove the race from dropping 
> the lock inbetween operations and get an accurate count of how big the 
> array needs to be.

Ok.

> >> Do we have an official stance on libkvm?  Now that we have sysctl for
> >> run-time it's only useful for crashdump debugging.  Really in most cases
> >> it could be replaced with a reasonable set of gcc scripts.
> >
> > s/gcc/gdb/.  At work we do mostly post-mortem analysis, so having working
> > libkvm is still very important for us.  xref the way I just fixed netstat 
to
> > work again on coredumps recently.  Breaking fstat on coredumps would 
probably
> > be very annoying.  libkvm can always use the same algo as the sysctl if
> > necessary though.
> 
> Yes, I'll do that.

Cool, thanks!

-- 
John Baldwin