CFT: re(4)
Pyun YongHyeon
pyunyh at gmail.com
Tue May 29 12:18:45 UTC 2007
Dear all,
I've committed a fix for a bus_dma(9) bug that resulted in poor Tx
performance with the TSO-enabled re(4) driver. With the fix and a
revised re(4) I get much saner performance. Because so much hardware
relies on re(4), I'd like to hear success or failure reports before
the revised re(4) hits the tree.
PCIe hardware users: it would be great if you could submit
performance numbers for both the stock re(4) and the revised one. The
revised re(4) can be found at the following URL.
http://people.freebsd.org/~yongari/re/re.HEAD.patch
Note that you need the latest kernel to get correct performance numbers.
Changes:
o For 8169 GigE controllers, increased the number of Rx/Tx descriptors
to 256 because it's hard to push the hardware to its limit with the
default 64 descriptors. TSO requires a large number of Tx descriptors
to pass a full-sized TCP segment (a 65535-byte IP packet) to the
hardware. Previously, assuming an MCLBYTES DMA segment size, sending
such a segment consumed 32 Tx descriptors, which means re(4) couldn't
queue more than two full-sized IP packets.
The 8139C+ still uses 64 Rx/Tx descriptors due to its hardware
limitations. With this change there is a (very) small waste of
memory for 8139C+ users, but I don't think it will affect them in
most cases.
o Various bus_dma(9) fixes.
- The hardware supports DAC, so allow 64-bit DMA operations.
- Removed the BUS_DMA_ALLOCNOW flag. Use of the flag is almost
always a bug.
- Increased the DMA segment size from MCLBYTES to 4096, as TSO
consumes too many descriptors with an MCLBYTES DMA segment size.
- Added bus_dmamap_load_mbuf_sg(9) support on both the Tx and Rx
sides. With these changes the code is more readable than before and
performs (slightly) better, as it no longer needs to pass/decode
arguments to/from a callback function.
- Removed the unnecessary callback function re_dmamap_desc() and
nuked the rl_dmaload_arg structure that was used in the callback.
- Added protection against DMA map load failure. On failure, reuse
the current map instead of returning a bogus DMA map.
- For maximum performance, deferred DMA map unload/sync operations
until we really need to load a new DMA map. If we happen to reuse
the current map (e.g. on an input error) there is no need to
sync/unload/load again.
- The number of allowable Tx DMA segments for an mbuf chain is now
32 instead of a magic nseg value. If there are too few available
Tx descriptors to send a highly fragmented mbuf chain, an optimized
re_defrag() is called to collapse the chain; this is supposed to be
much faster than m_defrag(9).
re_defrag() was borrowed from ath(4).
- Split the common DMA tag into separate Rx and Tx DMA tags so that
the Rx DMA tag correctly uses DMA maps created with the DMA
alignment limitation (64-bit alignment). The Tx DMA tag has no such
alignment limitation.
- Added additional sanity checks for DMA ring map load failure.
- Added an additional spare Rx DMA map for graceful handling of Rx
DMA map load failure.
- Fixed misuse of bus_dmamap_sync(9) and added missing
bus_dmamap_sync(9) calls in re_encap()/re_txeof()/re_rxeof().
o Don't touch the DMA address of a Tx descriptor in re_txeof(). It's
not needed.
o Fixed an incorrect update of the if_ierrors counter. On an Rx buffer
shortage it should update if_qdrops instead, as the buffer is reused.
o Added checks for unsupported H/W revisions; re(4) now returns ENXIO
for such hardware. This is required to make re_probe() free of
resource allocation, as other drivers' device probe routines are.
o Modified the descriptor index manipulation macros, as it's now
possible to have different numbers of descriptors for Rx and Tx.
o In re_start, to save a lock operation, use IFQ_DRV_IS_EMPTY before
trying to invoke IFQ_DRV_DEQUEUE. Also don't blindly call re_encap
since we already know the number of available Tx descriptors in
advance.
o Removed RL_TX_DESC_THLD, which was used to reserve RL_TX_DESC_THLD
descriptors in the Tx path. No such limitation is mentioned in the
8139C+/8169/8110/8168/8101/8111 datasheets, and the hardware seems to
work OK without reserving RL_TX_DESC_THLD descriptors.
o Fixed a comment for RL_GTXSTART. It is an 8-bit register.
o Added comments for 8169/8139C+ hardware restrictions on descriptors.
o Removed the forward declaration of "struct rl_softc"; it's not needed.
o Added a new structure rl_txdesc for Tx descriptor management and
a structure rl_rxdesc for Rx descriptor management.
o Removed the unused member variable rl_intlock from the driver softc.
There are still several unused member variables that are supposed to
be used to access hardware statistics counters, but it seems that
access to the hardware counters has not been implemented yet.
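For anyone who wants to check the descriptor arithmetic in the first
item under Changes, here is a minimal sketch in plain userland C. It
is not code from the patch: the function names descs_needed() and
pkts_per_ring() are made up for illustration, MCLBYTES is assumed to
be 2048, and it assumes the best case where every DMA segment is
completely filled (a fragmented mbuf chain would need more
descriptors).

```c
#include <assert.h>

/*
 * Illustrative only, not taken from re(4).
 * Best case: every DMA segment is filled up to segsize bytes.
 */

/* Tx descriptors needed for one packet of pktlen bytes (rounded up). */
static int
descs_needed(int pktlen, int segsize)
{
	assert(pktlen > 0 && segsize > 0);
	return ((pktlen + segsize - 1) / segsize);
}

/* Full-sized packets that fit in a Tx ring of ring_size descriptors. */
static int
pkts_per_ring(int ring_size, int pktlen, int segsize)
{
	return (ring_size / descs_needed(pktlen, segsize));
}
```

With the assumed values, descs_needed(65535, 2048) is 32 and
pkts_per_ring(64, 65535, 2048) is 2, which matches the old behaviour
described above. With 4096-byte segments and 256 descriptors,
descs_needed(65535, 4096) is 16 and pkts_per_ring(256, 65535, 4096)
is 16.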
Thanks.
--
Regards,
Pyun YongHyeon