tcp_output starving -- is due to mbuf get delay?

Petri Helenius pete at he.iki.fi
Thu Apr 10 13:50:47 PDT 2003


There was a discussion on mballoc performance on freebsd-net about a month ago
but it has since died without conclusion.

Pete

----- Original Message ----- 
From: "Jin Guojun [DSD]" <j_guojun at lbl.gov>
To: <freebsd-hackers at freebsd.org>; <freebsd-performance at freebsd.org>
Sent: Thursday, April 10, 2003 2:12 AM
Subject: Re: tcp_output starving -- is due to mbuf get delay?


> Some details were left out --
> 
>     The machine is a 2 GHz Intel P4 with 1 GB of memory, so the delay is not
> from either the CPU or a lack of memory.
> 
>     -Jin
> 
> "Jin Guojun [DSD]" wrote:
> 
> > When testing a GigE path that has a 67 ms RTT, the maximum TCP throughput is
> > limited to 250 Mb/s. Tracing the problem, I found that tcp_output() is
> > starving while snd_wnd and snd_cwnd are fully open. The snd_cc never fills
> > beyond 4.05 MB even though snd_hiwat is 10 MB and snd_sbmax is 8 MB. That is,
> > sosend() never stops at sbwait(), so the only place that can slow things down
> > is the mbuf allocation in sosend(). The attached trace file shows that each
> > MGET and MCLGET takes significant time -- around 8 us during slow start,
> > gradually increasing after that to a range of 18 to 648 us.
> > Each packet Tx on GigE takes 12 us. If the average mbuf allocation takes 18 us,
> > then the performance will be reduced to 40%; in fact it is down to 25%, which
> > means a higher average delay.
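
The 40% estimate above follows from treating the per-packet mbuf allocation and the wire transmission as strictly serial costs: 12 / (12 + 18) = 0.4. A minimal sketch of that arithmetic is below. The function names are illustrative, not from the original message, and the serial model itself is an assumption (in practice allocation and transmission may overlap).

```c
#include <assert.h>

/*
 * Fraction of line rate achieved if every packet pays alloc_us of
 * allocation delay in addition to wire_us of transmit time.
 * Assumes the two costs never overlap.
 */
static double
serial_efficiency(double wire_us, double alloc_us)
{
	assert(wire_us > 0.0 && alloc_us >= 0.0);
	return wire_us / (wire_us + alloc_us);
}

/*
 * Average per-packet delay implied by an observed efficiency,
 * under the same serial model.
 */
static double
implied_delay_us(double wire_us, double efficiency)
{
	assert(wire_us > 0.0 && efficiency > 0.0 && efficiency <= 1.0);
	return wire_us / efficiency - wire_us;
}
```

Under this model, serial_efficiency(12.0, 18.0) evaluates to 0.4, and implied_delay_us(12.0, 0.25) suggests that the observed 25% would correspond to roughly 36 us of average delay per packet.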
> >
> > I have changed NMBCLUSTERS from 2446 to 6566 to 10240, and nothing improved.
> >
> > Can anyone tell what factors would cause MGET/MCLGET to wait?
> > Is there any way to make MGET/MCLGET not wait?
> >
> >     -Jin
> >
> > ----------- system info -------------
> >
> > kern.ipc.maxsockbuf: 10485760
> > net.inet.tcp.sendspace: 8388608
> > kern.ipc.nmbclusters: 10240
> > kern.ipc.mbuf_wait: 32
> > kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > kern.ipc.nmbufs: 40960
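
For context on these buffer settings, the bandwidth-delay product of the path described above (GigE, 67 ms RTT) can be computed directly. The sketch below is illustrative only: the function name is made up, the link rate is assumed to be a decimal 1 Gb/s, and protocol header overhead is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that must be in flight to keep a path full for one round trip:
 * rate (bits/s) * RTT (us) / (8 bits per byte * 1e6 us per second).
 * The intermediate product can overflow for extremely large inputs;
 * this is fine for the GigE-scale numbers discussed here.
 */
static uint64_t
bdp_bytes(uint64_t rate_bps, uint64_t rtt_us)
{
	assert(rate_bps > 0 && rtt_us > 0);
	return rate_bps * rtt_us / 8000000ULL;
}
```

With these assumptions, bdp_bytes(1000000000ULL, 67000ULL) gives 8375000 bytes, just under the 8388608-byte net.inet.tcp.sendspace listed above, while the observed 250 Mb/s corresponds to about 2.1 MB in flight.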
> >
> > -------------- code trace and explanation ----------
> >
> > sosend()
> > {
> > ...
> >                 if (space < resid + clen &&
> >                     (atomic || space < so->so_snd.sb_lowat || space < clen)) {
> >                         if (so->so_state & SS_NBIO)
> >                                 snderr(EWOULDBLOCK);
> >                         sbunlock(&so->so_snd);
> >                         error = sbwait(&so->so_snd);        /***** never reached *****/
> >                         splx(s);
> >                         if (error)
> >                                 goto out;
> >                         goto restart;
> >                 }
> >                 splx(s);
> >                 mp = &top;
> >                 space -= clen;
> >                 do {
> >                     if (uio == NULL) {
> >                         /*
> >                          * Data is prepackaged in "top".
> >                          */
> >                         resid = 0;
> >                         if (flags & MSG_EOR)
> >                                 top->m_flags |= M_EOR;
> >                     } else do {
> >                         if (top == 0) {
> > microtime(&t1);        /* tracing: start timestamp, taken on this MGETHDR path */
> >                                 MGETHDR(m, M_WAIT, MT_DATA);
> >                                 if (m == NULL) {
> >                                         error = ENOBUFS;
> >                                         goto release;
> >                                 }
> >                                 mlen = MHLEN;
> >                                 m->m_pkthdr.len = 0;
> >                                 m->m_pkthdr.rcvif = (struct ifnet *)0;
> >                         } else {
> >                                 MGET(m, M_WAIT, MT_DATA);
> >                                 if (m == NULL) {
> >                                         error = ENOBUFS;
> >                                         goto release;
> >                                 }
> >                                 mlen = MLEN;
> >                         }
> >                         if (resid >= MINCLSIZE) {
> >                                 MCLGET(m, M_WAIT);
> >                                 if ((m->m_flags & M_EXT) == 0)
> >                                         goto nopages;
> >                                 mlen = MCLBYTES;
> >                                 len = min(min(mlen, resid), space);
> >                         } else {
> > nopages:
> >                                 len = min(min(mlen, resid), space);
> >                                 /*
> >                                  * For datagram protocols, leave room
> >                                  * for protocol headers in first mbuf.
> >                                  */
> >                                 if (atomic && top == 0 && len < mlen)
> >                                         MH_ALIGN(m, len);
> >                         }
> > microtime(&t2);        /* tracing: end timestamp */
> > td = time_diff(&t2, &t1);
> > /* log every 32nd sample above 5, and every sample above 50 */
> > if ((td > 5 && (++tcnt & 31) == 0) || td > 50)
> >     log( ... "td %d %d\n", td, tcnt);
> >
> > ...
> >
> > } /* end of sosend */
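
The trace code above calls time_diff(), whose definition is not included in the message. A plausible userland sketch of such a helper, together with the sampling condition from the trace, is shown below. Both function bodies are assumptions for illustration, not the author's code; in particular, the microsecond unit is inferred from the figures quoted in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical stand-in for the unshown time_diff(): elapsed
 * microseconds from *start to *end.
 */
static long
time_diff(const struct timeval *end, const struct timeval *start)
{
	return (long)(end->tv_sec - start->tv_sec) * 1000000L +
	    (long)(end->tv_usec - start->tv_usec);
}

/*
 * Same filter as the trace: report every 32nd sample above 5 us and
 * every sample above 50 us. *tcnt is incremented only when td > 5,
 * matching the short-circuit evaluation of the original condition.
 */
static int
should_log(long td, unsigned *tcnt)
{
	assert(tcnt != NULL);
	return (td > 5 && (++*tcnt & 31) == 0) || td > 50;
}
```

Because the counter only advances for samples above 5 us, the logged tcnt value counts slow allocations rather than all allocations.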
> 


