bad NFS/UDP performance
Danny Braniss
danny at cs.huji.ac.il
Sat Oct 4 06:40:48 UTC 2008
>
> On Fri, 3 Oct 2008, Danny Braniss wrote:
>
> >> On Fri, 3 Oct 2008, Danny Braniss wrote:
> >>
> >>> gladly, but I have no idea how to do LOCK_PROFILING, so some pointers
> >>> would be helpful.
> >>
> >> The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that
> >> the defaults work fine most of the time, so just use them. Turn the enable
> >> sysctl on just before you begin a run, and turn it off immediately
> >> afterwards. Make sure to reset between reruns (rebooting to a new kernel
> >> is fine too!).
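For concreteness, the procedure described above amounts to roughly the following (a sketch only: sysctl names are per LOCK_PROFILING(9), the kernel is assumed to be built with `options LOCK_PROFILING`, and `./benchmark` is a placeholder for the actual test):

```shell
# Rough sketch of a LOCK_PROFILING run; sysctl names per LOCK_PROFILING(9),
# ./benchmark stands in for the real workload.
sysctl debug.lock.prof.reset=1             # clear stats from earlier runs
sysctl debug.lock.prof.enable=1            # start collecting
./benchmark
sysctl debug.lock.prof.enable=0            # stop collecting
sysctl debug.lock.prof.stats > lock.prof   # save the per-lock statistics
```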
> >
> > in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof
> > there are 3 files:
> > 7.1-100 host connected at 100 running -prerelease
> > 7.1-1000 same but connected at 1000
> > 7.0-1000 -stable with your 'patch'
> > at 100, my benchmark didn't suffer from the profiling; the average was
> > about 9. at 1000 the benchmark got really hit: the average was around 12
> > for the patched kernel, and 4 for the unpatched (less than at 100).
>
> Interesting. A bit of post-processing:
>
> robert at fledge:/tmp> cat 7.1-1000 | awk -F' ' '{print $3" "$9}' | sort -n |
> tail -10
> 2413283 /r+d/7/sys/kern/kern_mutex.c:141
> 2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
> 2676282 /r+d/7/sys/net/route.c:293
> 2754866 /r+d/7/sys/kern/vfs_bio.c:1468
> 3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
> 3318742 /r+d/7/sys/net/route.c:1584
> 3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
> 3753518 /r+d/7/sys/net/if_ethersubr.c:405
> 3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
> 10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
> robert at fledge:/tmp> cat 7.0-1000 | awk -F' ' '{print $3" "$9}' | sort -n |
> tail -10
> 468631 /r+d/hunt/src/sys/nfsclient/nfs_nfsiod.c:286
> 501989 /r+d/hunt/src/sys/nfsclient/nfs_vnops.c:1148
> 631587 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1198
> 701155 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1258
> 718211 /r+d/hunt/src/sys/kern/kern_mutex.c:141
> 1118711 /r+d/hunt/src/sys/nfsclient/nfs_bio.c:1664
> 1169125 /r+d/hunt/src/sys/nfsclient/nfs_subs.c:1066
> 1222867 /r+d/hunt/src/sys/kern/vfs_bio.c:1468
> 3876072 /r+d/hunt/src/sys/netinet/udp_usrreq.c:545
> 5198927 /r+d/hunt/src/sys/netinet/udp_usrreq.c:864
>
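The column extraction in the pipelines quoted above can be checked against stand-in data (the `f1`..`f8` fields below are placeholders; per the pipeline, field 3 is the figure being sorted on and field 9 is the file:line of the lock acquisition point):

```shell
# Reproduce the post-processing step on two stand-in stats lines.
printf '%s\n' \
  'f1 f2 10688531 f4 f5 f6 f7 f8 /r+d/7/sys/dev/bge/if_bge.c:3726' \
  'f1 f2 2413283 f4 f5 f6 f7 f8 /r+d/7/sys/kern/kern_mutex.c:141' \
  | awk '{print $3" "$9}' | sort -n | tail -10
# -> 2413283 /r+d/7/sys/kern/kern_mutex.c:141
#    10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
```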
> The first set above is with the unmodified 7-STABLE tree, the second with a
> reversion of read locking on the UDP inpcb. The big blinking sign of interest
> is that the bge interface lock is massively contended in the first set of
> output, and basically doesn't appear in the second. There are various reasons
> bge could stand out quite so much -- one possibility is that previously, the
> UDP lock serialized all access to the interface from the send code, preventing
> the send and receive paths from contending.
>
> A few things to try:
>
> - Let's compare the context-switch rates of the two benchmarks. Could
> you run vmstat and look at the cpu cs line during the benchmarks and see how
> similar the two are as the benchmarks run? You'll want to run it with
> vmstat -w 1 and collect several samples per benchmark, since we're really
> interested in the distribution rather than an individual sample.
>
> - Is there any chance you could drop an if_em card into the same box and run
> the identical benchmarks with and without LOCK_PROFILING to see whether it
> behaves differently than bge when the patch is applied? if_em's interrupt
> handling is quite different, and may significantly affect lock use, and
> hence contention.
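The first suggestion can be scripted without hard-coding a column position: locate the cs field by name in the vmstat header row, since the layout varies between vmstat versions. A sketch, where the two `printf` lines stand in for live `vmstat -w 1` output:

```shell
# Find the cs (context switches) column from the header row and print it
# for each sample; the printf lines stand in for `vmstat -w 1` output.
printf '%s\n' \
  ' r b w   avm   fre  flt re pi po fr sr  in  sy   cs us sy id' \
  ' 1 0 0 10000 20000  10  0  0  0  0  0 500 800 1234  5  3 92' \
  | awk '!col { for (i = 1; i <= NF; i++) if ($i == "cs") col = i; next }
         { print $col }'
# -> 1234
```

Matching the header by the presence of `cs` (rather than by line number) also skips vmstat's leading category line on real output.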
at the moment, the best I can do is run it on different hardware that has if_em;
the results are in
ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em
the benchmark ran better with the Intel NIC: it averaged 54MB/s for UDP and
53MB/s for TCP (I get the same numbers with an older kernel).
danny
More information about the freebsd-stable
mailing list