OpenAFS on FreeBSD 8.1

Benjamin Kaduk kaduk at MIT.EDU
Thu Jul 29 03:36:35 UTC 2010


Hi Jan,

Sorry for the long delay in responding -- mail piled up a bit during a 
busy week.

On Fri, 23 Jul 2010, Jan Henrik Sylvester wrote:

> On 07/23/2010 12:30, Jan Henrik Sylvester wrote:
>> I listed a few directories without the long blocks I saw in my last round
>> of testing. Good. Copying a huge file from AFS was terribly slow (even for
>> my DSL connection), but it progressed steadily and I was able to abort it
>> without deadlocking or crashing. Copying a 16MB file to AFS blocked a
>> parallel "ls -l" on the same directory I was copying to,

I'm pretty sure that we're holding an exclusive vnode lock when we're not 
supposed to, but I haven't looked into why the lock diagnostics don't 
complain about it.

>> but it eventually finished. The file was not corrupted. Great.
>
> I did more testing from the university against both of the AFS cells I had 
> been testing before. Copying a few MB from AFS and copying a 16MB file to 
> AFS were both fine (showing 6MB/s while copying).
>
> Trying to copy a 512MB file to AFS locked up all of AFS after two seconds, 
> during which it showed copy rates of 40MB/s (while the network is only 
> 100Mbit/s). After increasing the AFS cache size to 512MB, almost all of the 
> file got copied before AFS locked up. With a cache of 1GB, the file got 
> copied without a deadlock or corruption. (All this is on MP; I have not 
> tried disabling all but one core.)

Do you remember if this was with the git-based port or the 1.5.75 linked 
from the status report?  The latter has an extra patch which band-aids 
around a reference-counting bug when we need to reclaim used vnodes due to 
a space crunch.

>
> Rebooting the machine after having done nothing but the successful copy of 
> the 512MB file, I got:
> Fatal trap 12: page fault while in kernel mode

Hm, hard to do much about that without a backtrace.  I've seen occasional 
errors when shutting down afsd (various manifestations), but I'd say it 
completes successfully at least half the time (umount -f, that is).

>
> Overall, the only problems I ran into during my tests were copying files 
> larger than the cache size and shutting down afsd. So far, AFS seems to be 
> becoming usable for me (even on MP).

Glad to hear things are getting better.
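
Until the large-file deadlock is understood, one rough workaround is to check a file against the configured cache size before copying it into AFS. The sketch below assumes the cacheinfo file has the usual "mountpoint:cachedir:size-in-1K-blocks" form; the default path in CACHEINFO and the function names are made up for illustration, and the "strictly smaller than the cache" rule is only a heuristic drawn from the report above, not a guarantee.

```shell
#!/bin/sh
# Assumption: this is where the port installs cacheinfo; adjust as needed.
CACHEINFO="${CACHEINFO:-/usr/local/etc/openafs/cacheinfo}"

# Print the configured cache size in kilobytes (third colon-separated field).
cache_size_kb() {
    awk -F: 'NR == 1 { print $3 }' "$1"
}

# Succeed only if the file, rounded up to whole kilobytes, is strictly
# smaller than the cache. Returns 2 if either file cannot be read.
fits_in_cache() {
    file=$1
    cacheinfo=${2:-$CACHEINFO}
    [ -r "$file" ] || return 2
    # Some wc implementations pad the count with spaces; strip them.
    file_bytes=$(wc -c < "$file" | tr -d ' ')
    file_kb=$(( (file_bytes + 1023) / 1024 ))
    cache_kb=$(cache_size_kb "$cacheinfo") || return 2
    [ "$file_kb" -lt "$cache_kb" ]
}
```

Something like `fits_in_cache bigfile.iso && cp bigfile.iso /afs/...` would then skip copies that are likely to hit the problem.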



On Fri, 23 Jul 2010, Jan Henrik Sylvester wrote:

> 
> I did not expect my problems to have vanished, but I wanted to try again.
> 
> Should I use the git based port 
> http://stuff.mit.edu/afs/sipb.mit.edu/user/kaduk/freebsd/openafs/openafs-devel.shar.txt 
> you pointed me to earlier for testing? Or should I always use 
> http://web.mit.edu/freebsd/openafs/openafs.shar that you posted to the 
> Quarterly Status Report?

I would probably stick to the git-based port, as that will give more 
useful reports when things break (such as the one you mention below).  As 
I mentioned above, there is one patch in the latter shar which is not in 
git; it's http://gerrit.openafs.org/2321 .  You can add it to the 
git-based port by stopping after the 'make patch' stage, going into the 
work directory and running:
    git pull git://git.openafs.org/openafs refs/changes/21/2321/1
and then proceeding with the configure, build, and install stages.
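
Put together, the whole sequence might look roughly like the sketch below. PORTDIR is a placeholder path, and using `make -V WRKSRC` to locate the work directory is an assumption about how the port is laid out; the make targets are the standard ports ones. Setting DRY_RUN=1 only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: apply Gerrit change 2321 on top of the git-based port.
# PORTDIR is hypothetical; point it at wherever the shar was unpacked.
PORTDIR="${PORTDIR:-/usr/ports/net/openafs-devel}"
CHANGE_REF="refs/changes/21/2321/1"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_gerrit_change() {
    run cd "$PORTDIR" || return 1
    # Stop after the patch stage so the extra change lands before configure.
    run make patch || return 1
    # Assumption: WRKSRC is the git checkout the port builds from.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        wrksrc='$(make -V WRKSRC)'
    else
        wrksrc=$(make -V WRKSRC) || return 1
    fi
    run cd "$wrksrc" || return 1
    run git pull git://git.openafs.org/openafs "$CHANGE_REF" || return 1
    run cd "$PORTDIR" || return 1
    run make configure build install
}
```

Calling `apply_gerrit_change` with DRY_RUN=1 set first is a cheap way to check the paths before letting it touch anything.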

> 
> With both, I run into the same problem compiling on FreeBSD 8.1. 
> http://svn.freebsd.org/viewvc/base?view=revision&revision=209524 changed 
> the definition of ifa_ifwithnet. In rx/rx_kernel.h, FreeBSD 8.1 needs 
> the same definition of rx_ifaddr_withnet as AFS_OBSD46_ENV (while 
> FreeBSD 8.0 needs the generic one). Should FreeBSD 8.0 still be supported?
>

I'll try to get that fix in this weekend (if not sooner).  I only have 
9-current test boxes, and I think Derrick only has 8.0, so 8.1-specific 
things would otherwise rely on me noticing relevant changes in the commit 
emails that go by; this doesn't work very well when I don't have much time 
to read them :)

> With the git based port, I get an error on "kldload libafs": "can't load 
> libafs: Exec format error" (missing symbol?) -- openafs-1.5.75 (the 
> other port) does not seem to have this problem.
>

Sounds like someone introduced a regression since then; thanks for the 
report.

> Starting afsd, I realized that I had not updated my CellServDB and thus 
> tried to shut down afsd, which complained about afs still being mounted. 
> Trying to umount /afs, I got a segfault in the kernel. (I had not 
> actually accessed /afs before doing that.) I guess restarting afsd is 
> not possible for now. (No big deal.)
>

It ... should be possible, though it is not fully reliable.  Be sure to 
unload and reload the kernel module between unmounting /afs and restarting 
afsd, though.
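
As a rough sketch, the restart order described above could be scripted like this. The way afsd gets started (AFSD_START) is an assumption and should be replaced with however it is normally launched on your system, with its usual arguments; DRY_RUN=1 only prints the commands. Given the unreliability mentioned above, the function stops at the first step that fails.

```shell
#!/bin/sh
# Sketch: restart the AFS client, reloading the kernel module in between.
# Assumption: a bare "afsd" is how the client is started; override as needed.
AFSD_START="${AFSD_START:-afsd}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restart_afs() {
    run umount -f /afs || return 1
    # Unload and reload the module so afsd starts against fresh kernel state.
    run kldunload libafs || return 1
    run kldload libafs || return 1
    # Intentionally unquoted so AFSD_START can carry arguments.
    run $AFSD_START
}
```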


-Ben Kaduk


> 
> pagsh does not immediately crash anymore -- another improvement, even if 
> it is minor compared to FreeBSD not crashing anymore using AFS.
> 
> BTW: Thanks for all your work!
> 
> Cheers,
> Jan Henrik

