tcp_output starving -- is due to mbuf get delay?
Jin Guojun [DSD]
j_guojun at lbl.gov
Wed Apr 9 16:12:08 PDT 2003
Some details were left out --
The machine is 2 GHz Intel P4 with 1 GB memory, so the delay is not from
either CPU or lack of memory.
-Jin
"Jin Guojun [DSD]" wrote:
> When testing a GigE path that has a 67 ms RTT, the maximum TCP throughput is
> limited to 250 Mb/s. Tracing the problem, I found that tcp_output() is
> starving while snd_wnd and snd_cwnd are fully open. The snd_cc never fills
> beyond 4.05MB even though the snd_hiwat is 10MB and snd_sbmax is 8MB. That is,
> sosend() never stops at sbwait(). So the only place that can slow things down
> is the mbuf allocation in sosend(). The attached trace file shows that each
> MGET and MCLGET takes significant time: around 8 us during slow start, then
> gradually increasing after that into a range of 18 to 648 us.
> Each packet Tx on GigE takes 12 us. If the average mbuf allocation takes
> 18 us, then the performance will be reduced to 40%. In fact it is down to
> 25%, which means a higher average delay.
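The 40% figure in the paragraph above can be checked with a small calculation. This is only a sketch, and it assumes the simplest model: each packet costs its wire time plus one allocation delay, strictly serialized with no overlap. The function names are made up for illustration and are not from the kernel.

```c
#include <assert.h>

/*
 * Fraction of line rate achieved when every packet costs tx_us on the
 * wire plus alloc_us of allocation delay, assuming the two never overlap.
 */
double line_rate_fraction(double tx_us, double alloc_us)
{
    assert(tx_us > 0.0 && alloc_us >= 0.0);
    return tx_us / (tx_us + alloc_us);
}

/*
 * Inverse of the above: the per-packet delay implied by an observed
 * fraction of line rate, under the same no-overlap assumption.
 */
double implied_delay_us(double tx_us, double fraction)
{
    assert(tx_us > 0.0 && fraction > 0.0 && fraction <= 1.0);
    return tx_us / fraction - tx_us;
}
```

With tx_us = 12 and alloc_us = 18, line_rate_fraction() gives 12 / 30 = 0.4. Going the other way, implied_delay_us(12, 0.25) gives 36, so under this model the observed 25% corresponds to an average delay of about 36 us per packet.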
>
> I have changed NMBCLUSTER from 2446 to 6566 and then to 10240, and nothing improved.
>
> Can anyone tell what factors would cause MGET / MCLGET to wait?
> Is there any way to make MGET/MCLGET not wait?
>
> -Jin
>
> ----------- system info -------------
>
> kern.ipc.maxsockbuf: 10485760
> net.inet.tcp.sendspace: 8388608
> kern.ipc.nmbclusters: 10240
> kern.ipc.mbuf_wait: 32
> kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> kern.ipc.nmbufs: 40960
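For comparison with the buffer sizes listed above, the bandwidth-delay product of the path can be worked out from the figures in the message (1 Gb/s link, 67 ms RTT). This is a minimal sketch; bdp_bytes() is a hypothetical helper, and it ignores protocol header overhead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth-delay product in bytes: how much data must be in flight to
 * keep a path of the given rate and round-trip time full.
 * rate is in bits per second, rtt in milliseconds.
 */
uint64_t bdp_bytes(uint64_t rate_bits_per_sec, uint64_t rtt_ms)
{
    assert(rate_bits_per_sec > 0);
    /* bits/s * ms / 1000 = bits; divide by 8 for bytes */
    return rate_bits_per_sec * rtt_ms / 8000;
}
```

bdp_bytes(1000000000, 67) is 8375000 bytes, just under the net.inet.tcp.sendspace value of 8388608. At the observed 250 Mb/s, the same calculation gives 2093750 bytes in flight.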
>
> -------------- code trace and explanation ----------
>
> sosend()
> {
>     ...
>     if (space < resid + clen &&
>         (atomic || space < so->so_snd.sb_lowat || space < clen)) {
>         if (so->so_state & SS_NBIO)
>             snderr(EWOULDBLOCK);
>         sbunlock(&so->so_snd);
>         error = sbwait(&so->so_snd);  /***** never come down to here ****/
>         splx(s);
>         if (error)
>             goto out;
>         goto restart;
>     }
>     splx(s);
>     mp = &top;
>     space -= clen;
>     do {
>         if (uio == NULL) {
>             /*
>              * Data is prepackaged in "top".
>              */
>             resid = 0;
>             if (flags & MSG_EOR)
>                 top->m_flags |= M_EOR;
>         } else do {
>             if (top == 0) {
>                 microtime(&t1);
>                 MGETHDR(m, M_WAIT, MT_DATA);
>                 if (m == NULL) {
>                     error = ENOBUFS;
>                     goto release;
>                 }
>                 mlen = MHLEN;
>                 m->m_pkthdr.len = 0;
>                 m->m_pkthdr.rcvif = (struct ifnet *)0;
>             } else {
>                 MGET(m, M_WAIT, MT_DATA);
>                 if (m == NULL) {
>                     error = ENOBUFS;
>                     goto release;
>                 }
>                 mlen = MLEN;
>             }
>             if (resid >= MINCLSIZE) {
>                 MCLGET(m, M_WAIT);
>                 if ((m->m_flags & M_EXT) == 0)
>                     goto nopages;
>                 mlen = MCLBYTES;
>                 len = min(min(mlen, resid), space);
>             } else {
> nopages:
>                 len = min(min(mlen, resid), space);
>                 /*
>                  * For datagram protocols, leave room
>                  * for protocol headers in first mbuf.
>                  */
>                 if (atomic && top == 0 && len < mlen)
>                     MH_ALIGN(m, len);
>             }
>             microtime(&t2);
>             td = time_diff(&t2, &t1);
>             if ((td > 5 && (++tcnt & 31) == 0) || td > 50)
>                 log( ... "td %d %d\n", td, tcnt);
>
>             ...
>
> } /* end of sosend */
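The trace code above calls time_diff(), which is not shown in the message. The following is a user-space sketch of what such a helper and the logging rule might look like. Both time_diff_us() and should_log() are hypothetical names, and the microsecond unit is an assumption based on the "us" figures quoted earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical stand-in for the time_diff() helper in the trace:
 * returns (*later - *earlier) in microseconds.
 */
long time_diff_us(const struct timeval *later, const struct timeval *earlier)
{
    assert(later != NULL && earlier != NULL);
    return (long)(later->tv_sec - earlier->tv_sec) * 1000000L
         + (long)(later->tv_usec - earlier->tv_usec);
}

/*
 * Same sampling rule as the trace: always log a delay above 50 us, and
 * log every 32nd delay above 5 us. As in the original expression, the
 * counter is incremented only when td > 5, because && short-circuits.
 */
int should_log(long td, unsigned *tcnt)
{
    return (td > 5 && (++*tcnt & 31) == 0) || td > 50;
}
```

Sampling this way keeps log() from flooding the console during a high-rate transfer while still catching every large outlier.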