Re: cvs commit: src/sys/conf options src/sys/netinet ip_output.c

From: Luigi Rizzo <rizzo_at_icir.org>
Date: Thu, 27 Mar 2003 15:33:52 -0800
[about coalescing long chains of mbufs...]

On Thu, Mar 27, 2003 at 11:56:47AM -0800, Sam Leffler wrote:

[have read the rest of the thread too]

> I thought about this some more.  If the purpose of m_defrag* is to handle
...
> driver unless you make a second pass.  Given that this problem should happen
> infrequently and that you're just trying to avoid discarding the packet I
> think you're best off doing a best effort job where you coalesce only mbufs
> and leave clusters/ext's alone.  Then if the subsequent bus_dma_load_mbuf
> call fails you discard the packet.  Other than the read/write requirements

the problem is that the code path leading to these situation behave
sistematically like this -- ie. you are likely to keep dropping the packet
again and again, so dropping may not be an option.

Anyways, one path where this came out was

	tcp_usr_send() --> sbappend() --> sbcompress()

and i think if the code decides to call sbcompress(),
then the latter should try and do its job as good as it can,
including allocating a cluster when the caller does a
pathological number of short writes, or merging segments
trying to reduce the waste factor to something reasonable

E.g. note that sbcompress() does the following:

	while (m) {
		...
                if (n && (n->m_flags & M_EOR) == 0 &&
                    M_WRITABLE(n) &&
                    m->m_len <= MCLBYTES / 4 && /* XXX: Don't copy too much */
                    m->m_len <= M_TRAILINGSPACE(n) &&
                    n->m_type == m->m_type) {
			... do the copy and free m
			continue;
		}
		...
	}

so individual writes of 513+ bytes will result in wasting up to 75%
of the socket buffer space. At the very least, i would drop the
'm->m_len <= MCLBYTES / 4' check to reduce the waste.

	cheers
	luigi
Received on Thu Mar 27 2003 - 15:33:58 UTC