kern/86306: [patch] if_em.c locks up while trying to send a highly fragmented packet

Dmitrij Tejblum tejblum at yandex-team.ru
Sun Oct 2 13:30:18 PDT 2005


The following reply was made to PR kern/86306; it has been noted by GNATS.

From: Dmitrij Tejblum <tejblum at yandex-team.ru>
To: Ruslan Ermilov <ru at freebsd.org>
Cc: FreeBSD-gnats-submit at freebsd.org
Subject: Re: kern/86306: [patch] if_em.c locks up while trying to send a
 highly fragmented packet
Date: Mon, 3 Oct 2005 00:29:28 +0400 (MSD)

 Sorry for long delay.
 
 Ruslan Ermilov wrote:
 > On Mon, Sep 19, 2005 at 11:38:47AM +0400, Dmitrij Tejblum wrote:
 >
 >>   Ruslan Ermilov wrote:
 >>
 >> Hi Dmitrij,
 >>
 >> On Sun, Sep 18, 2005 at 11:25:35PM +0400, Dmitrij Tejblum wrote:
 >>
 >>
 >> When em_encap() tries to send a very long mbuf chain (i.e. more than
 >> EM_MAX_SCATTER == 64 mbufs), bus_dmamap_load_mbuf_sg() may fail with EFBIG.
 >> Then em_encap() fail, the packet is not sent and left in the output queue,
 >> and thus no futher transmission is possible.
 >>
 >> Some other driver handle similar condition with m_defrag(9) function
 >> (which is intended for this purpose).
 >>
 >>
 >>
 >> Can you please modify your patch as follows:
 >>
 >> 1) Count how much fragments are in the packet in em_encap() first, and
 >>    do m_defrag() if it exceeeds EM_MAX_SCATTER, like in if_dc.c.  If it
 >>    is still EFBIG after that and bus_dmamap_load_mbuf_sg(), then free it
 >>    as you do to prevent re-enqueue.
 >>
 >>
 >>   So you think that if_dc.c is a better example than e.g. if_fxp.c and
 >>   if_xl.c? Why? I also suspect that your suggestion would be a
 >>   pessimisation.
 >>
 >
 > Checking for a chain length shouldn't be much of pessimization, it's
 > very fast, and I thought it'd save code duplication.  But I don't
 > insist on this one.  If you would like to stay with try-and-fail
 > approach, I don't mind.
 >
 >
 >> 2) Put BPF processing back to em_start_locked().
 >>
 >>
 >>   As I wrote, I think that if e.g. we were unable to defragment a packet it
 >>   is better to drop it and try to send the next, rather than stop and set
 >>   OACTIVE (since the code that clear OACTIVE assume that it is about TX
 >>   descriptors). (You didn't suggest to change that).  I moved BPF processing
 >>   since it would not be a good idea to pass NULL to BPF.
 >>
 >
 > I don't understand how moving BPF processing in em_encap() helps this,
 > and how we can pass a NULL pointer if BPF processing stayed where it was,
 > in start().  Also, in your patch, if either DMA loading of mbuf fails or
 > mbuf chain defragmentation fails, NULL will be returned and no next
 > packet will be processed (though I agree that in the latter case it
 > would be logical to try the next packet).
 
 em_encap() returns an error code (though the value is only checked for
 zero and otherwise unused). If em_encap() returns a (non-zero) error,
 em_start_locked() breaks out of the loop, and if em_encap() returns
 "success" (zero), it continues the loop and, with the code in CVS,
 send the mbuf to BPF. When I drop packet I want to continue the loop
 and return 0, so I have to deal with BPF handling somehow.
 
 Below is updated patch. It is generated against -current, but also
 applicable to RELENG_5, since the driver is the same in the places I
 touch. (Frankly I only tried to run it with RELENG_5 kernel.)
 Code duplication is decreased. I also moved DMA setup before VLAN
 processing, since I noticed that the VLAN processing might change the mbuf
 chain. Also I changed the VLAN processing to return 0 in case of mbuf
 failures for abovementioned reasons.
 
 Index: sys/dev/em/if_em.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/dev/em/if_em.c,v
 retrieving revision 1.73
 diff -u -p -r1.73 if_em.c
 --- sys/dev/em/if_em.c	29 Sep 2005 13:23:34 -0000	1.73
 +++ sys/dev/em/if_em.c	2 Oct 2005 20:03:21 -0000
 @@ -623,13 +623,6 @@ em_start_locked(struct ifnet *ifp)
  			break;
                  }
 
 -		/* Send a copy of the frame to the BPF listener */
 -#if __FreeBSD_version < 500000
 -                if (ifp->if_bpf)
 -                        bpf_mtap(ifp, m_head);
 -#else
 -		BPF_MTAP(ifp, m_head);
 -#endif
 
                  /* Set timeout in case hardware has problems transmitting */
                  ifp->if_timer = EM_TX_TIMEOUT;
 @@ -1209,36 +1202,6 @@ em_encap(struct adapter *adapter, struct
                  }
          }
 
 -        /*
 -         * Map the packet for DMA.
 -         */
 -        if (bus_dmamap_create(adapter->txtag, BUS_DMA_NOWAIT, &map)) {
 -                adapter->no_tx_map_avail++;
 -                return (ENOMEM);
 -        }
 -        error = bus_dmamap_load_mbuf_sg(adapter->txtag, map, m_head, segs,
 -					&nsegs, BUS_DMA_NOWAIT);
 -        if (error != 0) {
 -                adapter->no_tx_dma_setup++;
 -                bus_dmamap_destroy(adapter->txtag, map);
 -                return (error);
 -        }
 -        KASSERT(nsegs != 0, ("em_encap: empty packet"));
 -
 -        if (nsegs > adapter->num_tx_desc_avail) {
 -                adapter->no_tx_desc_avail2++;
 -                bus_dmamap_destroy(adapter->txtag, map);
 -                return (ENOBUFS);
 -        }
 -
 -
 -        if (ifp->if_hwassist > 0) {
 -                em_transmit_checksum_setup(adapter,  m_head,
 -                                           &txd_upper, &txd_lower);
 -        } else
 -                txd_upper = txd_lower = 0;
 -
 -
          /* Find out if we are in vlan mode */
  #if __FreeBSD_version < 500000
          if ((m_head->m_flags & (M_PROTO1|M_PKTHDR)) == (M_PROTO1|M_PKTHDR) &&
 @@ -1261,22 +1224,19 @@ em_encap(struct adapter *adapter, struct
 
  		m_head = m_pullup(m_head, sizeof(eh));
  		if (m_head == NULL) {
 -			*m_headp = NULL;
 -                	bus_dmamap_destroy(adapter->txtag, map);
 -			return (ENOBUFS);
 +			adapter->mbuf_alloc_failed++;
 +			goto drop2;
  		}
  		eh = *mtod(m_head, struct ether_header *);
  		M_PREPEND(m_head, sizeof(*evl), M_DONTWAIT);
  		if (m_head == NULL) {
 -			*m_headp = NULL;
 -                	bus_dmamap_destroy(adapter->txtag, map);
 -			return (ENOBUFS);
 +			adapter->mbuf_alloc_failed++;
 +			goto drop2;
  		}
  		m_head = m_pullup(m_head, sizeof(*evl));
  		if (m_head == NULL) {
 -			*m_headp = NULL;
 -                	bus_dmamap_destroy(adapter->txtag, map);
 -			return (ENOBUFS);
 +			adapter->mbuf_alloc_failed++;
 +			goto drop2;
  		}
  		evl = mtod(m_head, struct ether_vlan_header *);
  		bcopy(&eh, evl, sizeof(*evl));
 @@ -1288,6 +1248,54 @@ em_encap(struct adapter *adapter, struct
  		*m_headp = m_head;
  	}
 
 +        /*
 +         * Map the packet for DMA.
 +         */
 +        if (bus_dmamap_create(adapter->txtag, BUS_DMA_NOWAIT, &map)) {
 +                adapter->no_tx_map_avail++;
 +                return (ENOMEM);
 +        }
 +        error = bus_dmamap_load_mbuf_sg(adapter->txtag, map, m_head, segs,
 +					&nsegs, BUS_DMA_NOWAIT);
 +        if (error != 0 && error != EFBIG) {
 +                printf("em%d: can't map mbuf (error %d)\n",
 +                        adapter->unit, error);
 +                adapter->no_tx_dma_setup++;
 +                goto drop;
 +        } else if (error == EFBIG) {
 +                struct mbuf *mn;
 +                mn = m_defrag(m_head, M_DONTWAIT);
 +                if (mn == NULL) {
 +                        printf("em%d: can't defrag mbuf\n", adapter->unit);
 +                        adapter->mbuf_alloc_failed++;
 +                        goto drop;
 +                }
 +                m_head = mn;
 +                error = bus_dmamap_load_mbuf_sg(adapter->txtag, map, m_head,
 +                                                segs, &nsegs, BUS_DMA_NOWAIT);
 +                if (error != 0) {
 +                        printf("em%d: can't map mbuf2 (error %d)\n",
 +                                adapter->unit, error);
 +                        adapter->no_tx_dma_setup++;
 +                        goto drop;
 +                }
 +        }
 +        KASSERT(nsegs != 0, ("em_encap: empty packet"));
 +
 +        if (nsegs > adapter->num_tx_desc_avail) {
 +                adapter->no_tx_desc_avail2++;
 +                bus_dmamap_destroy(adapter->txtag, map);
 +                return (ENOBUFS);
 +        }
 +
 +
 +        if (ifp->if_hwassist > 0) {
 +                em_transmit_checksum_setup(adapter,  m_head,
 +                                           &txd_upper, &txd_lower);
 +        } else
 +                txd_upper = txd_lower = 0;
 +
 +
          i = adapter->next_avail_tx_desc;
  	if (adapter->pcix_82544) {
  		txd_saved = i;
 @@ -1389,7 +1397,21 @@ em_encap(struct adapter *adapter, struct
                  }
          }
 
 +        /* Send a copy of the frame to the BPF listener */
 +#if __FreeBSD_version < 500000
 +        if (ifp->if_bpf)
 +               bpf_mtap(ifp, m_head);
 +#else
 +        BPF_MTAP(ifp, m_head);
 +#endif
          return(0);
 +drop:
 +        m_freem(m_head);
 +drop2:
 +        *m_headp = NULL;
 +        bus_dmamap_destroy(adapter->txtag, map);
 +        return (0);
 +
  }
 
  /*********************************************************************
 @@ -3264,7 +3286,8 @@ em_update_stats_counters(struct adapter
  	adapter->stats.mpc + adapter->stats.cexterr;
 
  	/* Tx Errors */
 -	ifp->if_oerrors = adapter->stats.ecol + adapter->stats.latecol;
 +	ifp->if_oerrors = adapter->stats.ecol + adapter->stats.latecol +
 +		adapter->no_tx_dma_setup + adapter->no_tx_map_avail;
 
  }
 
 
 >
 >
 > Cheers,
 
 


More information about the freebsd-bugs mailing list