HEADS UP! : sf(4)

Pyun YongHyeon pyunyh at gmail.com
Sun Jan 20 23:02:59 PST 2008


Dear all,

I've committed an overhauled sf(4) to HEAD. The overhauled sf(4) has
many performance improvements over the previous version and supports
Rx/Tx checksum offload with the help of firmware. Since I've changed
very fundamental things such as descriptor formats in the driver,
there could be unnoticed bugs in it. If you happen to encounter any
issues related to sf(4), please let me know.
I've included the commit message below.

Overhaul sf(4) to make it run on all architectures and implement
checksum offload by downloading AIC-6915 firmware. Changes are
 o Header file cleanup.
 o Simplified probe logic.
 o s/u_int{8,16,32}_t/uint{8,16,32}_t/g
 o K&R -> ANSI C.
 o In register access function, added support for both memory mapped
   and I/O space register access. The function dynamically detects
   which method is to be used.
 o sf_setperf() was modified to support strict-alignment
   architectures.
 o Use SF_MII_DATAPORT instead of hardcoded value 0xffff.
 o Added a task queue to handle link state/speed and duplex changes.
   The task queue is also responsible for flow control settings.
 o Always honor link up/down state reported by mii layer. The link
   state information is used in sf_start() to determine whether we
   got a valid link.
 o Added experimental flow-control setup. It is commented out but
   will be activated once we have flow-control infrastructure in mii
   layer.
 o Simplify IFF_UP/IFCAP_POLLING and IFF_PROMISC handling logic. Rx
   filter always honors promiscuous mode.
 o Implemented suspend/resume methods.
 o Reorganized Rx filter routine so promiscuous mode changes don't
   require interface re-initialization.
 o Reimplemented driver probe routine such that it looks for a matching
   device in the supported hardware table. This change will help to
   add newer hardware revisions to the driver.
 o Use ETHER_ADDR_LEN instead of hardcoded value.
 o Prefer memory space register mapping over I/O space as the hardware
   requires lots of register access to get various consumer/producer
   index. Failing to get memory space mapping, sf(4) falls back to I/O
   space mapping. Use of memory space register mapping requires
   somewhat large memory space(512K), though.
 o Switch to simpler bus_{read,write}_{1,2,4}.
 o Use PCIR_BAR macro to get BARs.
 o Program PCI cache line size if the cache line size was set to 0
   and enable PCI MWI.
 o Add a new sysctl node 'dev.sf.N.stats' that shows various MAC
   counters for Rx/Tx statistics.
 o Add a sysctl node to configure interrupt moderation timer. The
   timer defers interrupt generation until the time specified in the
   timer control register expires. The value in the timer register is in
   units of 102.4us. The allowable range for the timer is 0 - 31
   (0 ~ 3.174ms).
   The default value is 1(102.4us). Users can change the timer value
   with dev.sf.N.int_mod sysctl(8) variable/loader(8) tunable.
 o bus_dma(9) conversion
    - Enable 64bit DMA addressing.
    - Enable 64bit descriptor format support.
    - Apply descriptor ring alignment requirements(256 bytes alignment).
    - Apply Rx buffer address alignment requirements(4 bytes alignment).
    - Apply 4GB boundary restrictions(Tx/Rx ring and its completion ring
      should live in the same 4GB address space.)
    - Set the maximum number of DMA segments to 16. In fact, AIC-6915
      doesn't have a limit on the number of DMA segments but it would
      be a waste of Tx descriptor resources if we allowed more than 16.
    - Rx/Tx side bus_dmamap_load_mbuf_sg(9) support.
    - Added alignment fixup code for strict-alignment architectures.
    - Added endianness support code in Tx/Rx descriptor access.
    With these changes sf(4) should work on all platforms.
 o Don't set if_mtu in device attach, it's handled in ether_ifattach.
 o Use our own callout to drive watchdog timer.
 o Enable VLAN oversized frames and announce sf(4)'s VLAN capability
   to upper layer.
 o In sf_detach(), remove mtx_initialized KASSERT as it's not possible
   to get there without initializing the mutex. Also mark that we're
   about to detach so active bpf listeners do not panic the system.
 o To reduce PCI register access cycles, Rx completion ring is
   directly scanned instead of reading consumer/producer index
   registers. In theory, Tx completion ring also can be directly
   scanned. However the completion ring is composed of two types of
   completion (one for Tx done and one for DMA done). So reading the
   producer index via register access would be a safer way to detect
   the ring wrap-around.
 o In sf_rxeof(), don't use m_devget(9) to align received frames. The
   alignment is required only for strict-alignment architectures and
   now the alignment is handled by sf_fixup_rx() if required. The
   removal of the copy operation in fast path should increase Rx
   performance a lot on non-strict-alignment architectures such as
   i386 and amd64.
 o In sf_newbuf(), don't set descriptor valid bit as sf(4) is
   programmed to run with normal mode. In normal mode, the valid bit
   has no meaning. The valid bit should be used only when the
   hardware uses polling(prefetch) mode. The end of descriptor queue
   bit could be used if needed, but sf(4) relies on auto-wrapping of
   hardware on 256 descriptor queue entries so both valid and
   descriptor end bits are not used anymore.
 o Don't disable generation of Tx DMA completion as said in datasheet
   and use the Tx DMA completion entry instead of relying on Tx done
   completion entry. Also added additional Tx completion entry type
   check in Tx completion handler.
 o Don't blindly reset watchdog timer in sf_txeof(). sf(4) now disarms
   the watchdog only if there are no active Tx descriptors in Tx
   queue.
 o Don't manually update various counters in driver, instead, use
   built-in MAC statistic registers to update them. The statistic
   registers are updated every second.
 o Modified Tx underrun handlers to increase the threshold value
   in units of 256 bytes. Previously it used to increase 16 bytes
   at a time which seems to take too long to stabilize whenever Tx
   underrun occurs.
 o In interrupt handler, additional check for the interrupt is
   performed such that only interrupts for this device are allowed to
   process descriptor rings. Because reading SF_ISR register clears
   all interrupts, nuke writing to the SF_ISR register.
 o Tx underrun is an abnormal condition and SF_ISR_ABNORMALINTR
   includes the interrupt. So there is no need to inspect the Tx
   underrun again in main interrupt loop.
 o Don't blindly reinitialize hardware for abnormal interrupt
   condition. sf(4) reinitializes the hardware only when it encounters
   DMA error which requires an explicit hardware reinitialization.
 o Fix a long standing bug that incorrectly clears MAC statistic
   registers in sf_init_locked.
 o Added strict-alignment safe way of ethernet address reprogramming
   as IF_LLADDR may return unaligned address.
 o Move sf_reset() to sf_init_locked in order to always reset the
   hardware to a known state prior to configuring hardware.
 o Set default Rx DMA, Tx DMA parameters as shown in datasheet.
 o Enable PCI busmaster logic and autopadding for VLAN frames.
 o Rework sf_encap.
     - Previously sf(4) used type 0 Tx descriptors with padding
       enabled to store driver private data. Embedding private data
       structures into descriptors is a bad idea as the structure size
       would be different between 64bit and 32bit architectures. The
       type 0 descriptor allows a fixed number of DMA segments in
       a descriptor format and provides a relatively simple interface
       to manage multi-fragmented frames.
       However, it wastes lots of Tx descriptors as not all frames are
       fragmented into as many segments as a descriptor allows.
     - To overcome the limitation of type 0 descriptor, switch to type
       2 descriptor which allows 64bit DMA addressing and can handle
       an unlimited number of fragmented DMA segments. The drawback of
       type 2 descriptor is its complexity in managing descriptors
       as the driver should handle the end of Tx ring manually.
     - Manually set Tx descriptor queue end mark and record number of
       used descriptors to reclaim used descriptors in sf_txeof().
 o Rework sf_start.
     - Honor link up/down state before attempting transmission.
     - Because sf(4) uses only one of two Tx queues, use low priority
       queue instead of high one. This will remove one shift operation
       in each Tx kick command.
     - Cache last producer index into softc such that subsequent Tx
       operation doesn't need to access producer index register.
 o Rewrote sf_stats_update to include all available MAC statistic
   counters.
 o Employ AIC-6915 firmware from Adaptec and implement firmware
   download routine and TCP/UDP checksum offload.
   Partial checksum offload support was commented out due to the
   possibility of firmware bug in RxGFP.
   The firmware can strip VLAN tag in Rx path but the lack of firmware
   assistance of VLAN tag insertion in transmit side made it useless
   on FreeBSD. Unlike checksum offload, FreeBSD requires both Tx/Rx
   hardware VLAN assistance capability. The firmware may also detect
   wakeup frame and can wake system up from states other than D0.
   However, the lack of wakeup support from D3cold state keeps me from
   adding WOL capability. Also detecting WOL frame requires firmware
   support but it's not yet known to me whether the firmware can
   process the WOL frame.
 o Changed *_ADDR_HIADDR to *_ADDR_HI to match other definitions of
   registers.
 o Added definitions for interrupt moderation related constants.
 o Redefined SF_INTRS to include Tx DMA done and DMA errors. Removed
   Tx done as it's not needed anymore.
 o Added definition for Rx/Tx DMA high priority threshold.
 o Nuked unused macros SF_IDX_LO, SF_IDX_HI.
 o Added complete MAC statistic register definition.
 o Modified sf_stats structure to hold all MAC statistic registers.
 o Nuke various driver private padding data in Tx/Rx descriptor
   definition. sf(4) no longer requires private padding. Also remove
   unused padding related definitions. This greatly simplifies
   descriptor manipulation on 64bit architectures.
 o Because we no longer pad driver private data into descriptor,
   remove deprecated/not-applicable comments for padding.
 o Redefine Rx/Tx descriptor status. sf(4) doesn't use bit fields
   anymore to support endianness.
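
As a rough illustration of the interrupt moderation item above, the
conversion between a dev.sf.N.int_mod setting and the resulting
interrupt delay could be sketched as follows. This is a stand-alone
example and not code from the driver: the macro and function names
are made up for the illustration, and only the 102.4us unit, the
0 - 31 range and the default of 1 are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values follow the description in the text. */
#define INT_MOD_MIN      0
#define INT_MOD_MAX      31
#define INT_MOD_DEFAULT  1
#define INT_MOD_UNIT_NS  102400   /* 102.4us expressed in nanoseconds */

/* Return 1 if val is an acceptable timer setting, 0 otherwise. */
static int
int_mod_valid(int val)
{
	return (val >= INT_MOD_MIN && val <= INT_MOD_MAX);
}

/* Convert a timer setting to the interrupt delay in nanoseconds. */
static uint32_t
int_mod_to_ns(int val)
{
	assert(int_mod_valid(val));
	return ((uint32_t)val * INT_MOD_UNIT_NS);
}
```

A sysctl handler would presumably reject values for which
int_mod_valid() returns 0 instead of clamping them silently, but that
is a design choice and not something the text specifies.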

Tested by:	bruffer (initial version)
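
For readers wondering what a "strict-alignment safe way of ethernet
address reprogramming" can look like, here is a minimal user-space
sketch. It is not the code in sf(4): the function name and the
layout of the three 16-bit words are hypothetical and do not describe
the AIC-6915 registers. The point is only that the address is copied
and assembled byte by byte, so the source pointer is never
dereferenced as a wider type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6	/* same value as the kernel macro */

/*
 * Split a station address into three 16-bit words. The source may
 * have any alignment. The word layout is an assumption made for
 * this example only.
 */
static void
eaddr_to_words(const void *lladdr, uint16_t words[3])
{
	uint8_t eaddr[ETHER_ADDR_LEN];
	int i;

	/* A byte copy is safe regardless of the source alignment. */
	memcpy(eaddr, lladdr, ETHER_ADDR_LEN);

	/* Build each word from two bytes instead of casting to uint16_t *. */
	for (i = 0; i < 3; i++)
		words[i] = (uint16_t)(eaddr[2 * i] << 8 | eaddr[2 * i + 1]);
}
```

Casting the address to a uint16_t pointer and reading through it
would do the same job on i386 or amd64, but can fault on
strict-alignment machines when the address starts on an odd byte.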


Thanks.
-- 
Regards,
Pyun YongHyeon


More information about the freebsd-current mailing list