Throughput problems with dummynet and high delays

Randall Stewart rrs at cisco.com
Tue Oct 31 15:25:01 UTC 2006


Andy:

Hi, you must be working with Injong :-)

At one point I was playing with dummynet in satellite networks.
One of the KEY problems I found is that the number of
packets you can have in the queue, which is limited to 100, was not
nearly enough. This was (at the time) a hardcoded parameter
inside both the ipfw code as well as the dummynet code.

What I did was change this to a #define inside the
ip_dummynet.h file.. That then allowed the value to be set
a bit higher... I used 1200 for my 550ms+
sat network...
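
Roughly, the change looks like the sketch below. This is from memory
of the 6.x sources, and the DN_MAX_QUEUE_SLOTS name is just what I'd
call it (not a stock symbol), so treat it as a sketch rather than a
drop-in patch:

  /* sys/netinet/ip_dummynet.h -- add a tunable cap (name is mine) */
  #define DN_MAX_QUEUE_SLOTS  2000        /* stock code hardwires 100 */

  /*
   * sys/netinet/ip_dummynet.c -- the flow-set setup (set_fs_parms() /
   * config_pipe(), depending on the version) clamps the slot count
   * roughly like this; swap the magic numbers for the #define:
   */
  if (fs->flags_fs & DN_QSIZE_IS_BYTES) {
          if (fs->qsize > 1024 * 1024)
                  fs->qsize = 1024 * 1024;        /* byte queues capped at 1MB */
  } else {
          if (fs->qsize == 0)
                  fs->qsize = 50;                 /* default slot count */
          if (fs->qsize > DN_MAX_QUEUE_SLOTS)     /* was: > 100 */
                  fs->qsize = 50;
  }

  /*
   * sbin/ipfw/ipfw2.c does its own sanity check on "queue <slots>",
   * so the same literal needs to be bumped there as well.
   */

Once both sides agree, something like "ipfw pipe 1 config delay 100
queue 2000" should be accepted instead of being clamped back down.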

With just rough math you need about 89,478 1500-byte
packets per second to get a gigabit.

So that's about 8,948 per 100ms.. but I would want more than that I am
thinking... Of course you might be able to reduce that
with 9000-byte MTUs...  but even at a 9000-byte MTU,
100 packets will NOT cut it...
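
Spelling that rough math out (treating a gig as 2^30 bits, which is
where the 89,478 comes from):

  2^30 bits/s / 8 bits/byte     = 134,217,728 bytes/s
  134,217,728 / 1500-byte pkts ~=  89,478 packets/s  ->  ~8,948 per 100ms
  134,217,728 / 9000-byte pkts ~=  14,913 packets/s  ->  ~1,491 per 100ms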

My old patches were from way back in the 4.10 era.. I could
dig around and see if I can find something.. but I would
only be able to give you something for:

6.1 or 7.0... I don't have a 6.0 machine around anywhere ..
and soon I will be moving my 6.1 -> 6.2 :-)

Let me know if you want me to poke around.. or of course
you could too.. it's not a hard change to make :-)

R



Andy Jones wrote:
>  Hi,
> 
> I'm a researcher at the University of North Carolina trying to simulate
> certain link characteristics using dummynet pipes in ipfw. Our end goal is
> to thoroughly test high speed TCP variants in our experimental network in a
> wide range of situations (which includes varying the delay from 1ms to
> 200ms).
> 
> I have two Dell PowerEdge 2850 servers connected to each other using an
> Intel Gigabit ethernet card (although I'm not sure of the exact model).
> They both run FreeBSD 6.0. I'm using iperf to push as many bits through
> the wire as possible.
> 
> Without dummynet, sustained throughput is as expected, close to 1Gbps
> [  3]  0.0-180.0 sec  19.2 GBytes    918 Mbits/sec
> 
> When dummynet is used to add delay (100ms in my case) to the network, the
> machines have problems sustaining high throughput.
> 
> Here is the setup on the receiver end:
> % sysctl kern.ipc.maxsockbuf=16777216      # 16MB
> % sysctl net.inet.tcp.recvspace=12582912   # 12MB
> % iperf -s
> 
> and on the sender end
> % sysctl kern.ipc.maxsockbuf=16777216      # 16MB
> % sysctl net.inet.tcp.sendspace=12582912   # 12MB
> % ipfw pipe 1 config delay 100
> % ipfw add 10 pipe 1 ip from any to any out
> % iperf -c <machine> [args ...]
> 
> kern.ipc.nmbclusters has also been tuned to 65536 at boot time. Our kernel
> also has HZ=1000. The ipfw rule is added such that it is the first rule
> in the chain. 12MB is about the right send buffer size for the
> bandwidth-delay product (1Gbps * 0.1s RTT / 8 bits/byte ~ 12.5MB). We're
> also using an MTU of roughly 9000 bytes.
> 
> What happens is as the TCP window grows larger (about 3-4MB), the sender
> spends most of its time processing interrupts (80-90% as reported by top)
> and throughput peaks at about 300Mbps. I've dug into the dummynet code and
> I've found that a large amount of time is spent in the routine
> transmit_event(struct dn_pipe *p) which dequeues packets from a pipe and
> calls ip_output. It appears that ip_output is the culprit, but what it is
> doing with its time, I'm not sure. Packets are not being dropped according
> to TCP or dummynet. I suspect either the pfil_run_hooks(...) or the
> (*ifp->if_output)(...) calls in ip_output are taking too much time, but I'm
> not sure.
> 
> Any suggestions on what could be happening would be appreciated!
> 
>  -Andy Jones
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
> 


-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)

