Possible transmit/stats problem in igb driver.

Sreekanth Rupavatharam rupavath at juniper.net
Thu Jun 9 05:04:05 UTC 2016


Well, that wasn't the issue. However, there are some other details. The device is a
DH8900CC (0x8086:0x43a) quad NIC with a SerDes interface. The issue happens when the device is used in passthrough mode inside a VM. The guest OS is FreeBSD 10.1 and the host is Linux. There is no easy way to run this test on bare metal. Another point I confirmed is that the descriptor is consumed by the hardware (I get igb_txeof calls for the packets). The issue does not happen with the earlier unified em driver (from before the igb driver was created).
Thanks,

-Sreekanth

On Jun 3, 2016, at 1:22 PM, Jack Vogel <jfvogel at gmail.com> wrote:

That's an interesting theory. You could add a check to the tx path looking for a zero m_len and see. It seems unlikely, though :)

Jack
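
A rough sketch of the kind of check suggested above, written so it can be compiled outside the kernel. The struct below is a hypothetical stand-in for struct mbuf that models only m_next and m_len, and count_empty_segments is an illustrative name, not an existing driver function. In the driver the same loop would walk the real mbuf chain just before it is mapped for DMA.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct mbuf. Only the two
 * fields this check needs are modeled here.
 */
struct seg {
    struct seg *m_next;  /* next segment of the same packet */
    int         m_len;   /* bytes of data in this segment */
};

/*
 * Walk a packet's segment chain and count the segments whose length
 * is zero or negative. A non-zero result would flag the packet as
 * suspicious before it reaches the hardware.
 */
static int
count_empty_segments(const struct seg *m)
{
    int empty = 0;

    for (; m != NULL; m = m->m_next) {
        if (m->m_len <= 0)
            empty++;
    }
    return empty;
}
```

If the count is ever non-zero for the failing ARP replies, logging the chain at that point would confirm or rule out the m_len theory.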



On Fri, Jun 3, 2016 at 1:15 PM, Sreekanth Rupavatharam <rupavath at juniper.net> wrote:
I wonder if this can happen when mbuf->m_len is somehow incorrect (e.g., 0), causing the DMA to fail silently. The problem occurs only when the ARP request is larger than 64 bytes and the ARP response code reuses the packet to send a 64-byte response.

Thanks,

-Sreekanth


On 6/2/16, 2:41 PM, "hiren panchasara" <hiren at strugglingcoder.info> wrote:

>+ Sean, Eric
>
>On 06/02/16 at 09:11P, Sreekanth Rupavatharam wrote:
>> Inline
>>
>> >Apart from stats, do you see anything else going wrong? i.e. do you
>> >actually see fewer packets (arp replies??) than expected?
>>
>> [SR] The packets are not going out on the wire. The tool doesn't receive the packets. That's how I started noticing the issue.
>>
>> >Taking your example, tx_packets is something we count in the drivers and
>> >total_pkts_txd is calculated in the card and we just read it off of it
>> >to report (E1000_TPT).
>>
>> [SR] Correct. My main question is: under what circumstances would a packet handed off to the hardware *not* be transmitted? Especially considering that there are no transmit errors and no pause frames received. There are no DMA tx failures either. That's the baffling part. I tried another exercise where I sent pings of various sizes, but that doesn't seem to trigger the problem.
>>
>>
>> >To understand your setup better, ixia is the sender and your box with
>> >igb(4) is the receiver and you are sending arp requests to it.
>>
>> Yes, correct.
>>
>> >Can you post following for working (size <= 64bytes) and non-working
>> >(size > 64bytes) cases for before/after?
>> >
>> >sysctl dev.igb | grep tx_packets
>> >sysctl dev.igb | grep total_pkts_txd
>> >sysctl dev.igb | grep rx_packets
>> >sysctl dev.igb | grep total_pkts_recvd
>>
>>
>> Before (not working):
>> dev.igb.1.queue0.tx_packets: 24907933
>> dev.igb.1.queue0.rx_packets: 18086575
>> dev.igb.1.mac_stats.total_pkts_recvd: 25057359
>> dev.igb.1.mac_stats.total_pkts_txd: 16647169
>>
>> After (not working):
>> dev.igb.1.queue0.tx_packets: 24913324
>> dev.igb.1.queue0.rx_packets: 18091832
>> dev.igb.1.mac_stats.total_pkts_recvd: 25062618
>> dev.igb.1.mac_stats.total_pkts_txd: 16647545
>> >netstat -sp arp
>>
>> The difference is 5391 for queue0.tx_packets, but only 376 for mac_stats.total_pkts_txd.
>> Everything else is matching up.
>>
>> Before (working):
>> dev.igb.1.queue0.tx_packets: 25359165
>> dev.igb.1.queue0.rx_packets: 18526094
>> dev.igb.1.mac_stats.total_pkts_recvd: 25508763
>> dev.igb.1.mac_stats.total_pkts_txd: 16831587
>>
>>
>> After (working):
>> dev.igb.1.queue0.tx_packets: 25364597
>> dev.igb.1.queue0.rx_packets: 18531398
>> dev.igb.1.mac_stats.total_pkts_recvd: 25514009
>> dev.igb.1.mac_stats.total_pkts_txd: 16836833
>>
>>
>> Another interesting stat is
>> before_notworking:dev.igb.1.interrupts.tx_queue_empty: 16646890
>> after_notworking:dev.igb.1.interrupts.tx_queue_empty: 16647266
>>
>> The difference here is exactly 376, which is the number of packets that the device actually claims to have transmitted. It's as though it didn't see the other packets enqueued in the descriptor ring.
>>
>
>Very interesting. Do you tune defaults at all? What does sysctl hw.igb
>say? Not sure if bumping up txd would help.
>
>Adding Sean and Eric to throw some light.
>
>>
>> I can't do netstat just for arp, as these are coming in through a tunnel (the packets don't show up as arp on the interface). However, I did see that the packet rate was about 500 packets/sec.
>>
>
>Cheers,
>Hiren
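
For reference, the before/after comparison quoted above can be written down as a small helper. This is only a sketch: struct igb_sample and tx_shortfall are made-up names for illustration, and the two fields simply hold the values read from the dev.igb.1.queue0.tx_packets and dev.igb.1.mac_stats.total_pkts_txd sysctls at each sampling point.

```c
#include <stdint.h>

/* One reading of the two transmit counters (hypothetical container). */
struct igb_sample {
    uint64_t tx_packets;      /* queue0.tx_packets: counted by the driver */
    uint64_t total_pkts_txd;  /* mac_stats.total_pkts_txd: read from the card */
};

/*
 * Number of packets the driver queued during the interval that the
 * MAC did not report as transmitted.
 */
static int64_t
tx_shortfall(struct igb_sample before, struct igb_sample after)
{
    uint64_t queued = after.tx_packets - before.tx_packets;
    uint64_t sent = after.total_pkts_txd - before.total_pkts_txd;

    return (int64_t)queued - (int64_t)sent;
}
```

With the non-working samples the driver queued 5391 packets and the MAC reported 376, a shortfall of 5015. With the working samples the figures are 5432 and 5246, a shortfall of 186.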



