pcap woes (was: FreeBSD qemu-devel 0.12.0-rc2 port update available for testing)

Sat Dec 26 21:08:50 UTC 2009

On Mon, Dec 14, 2009 at 11:06:48PM +0100, Juergen Lock wrote:
> Hi!
> 
>  I updated my git head snapshot qemu-devel port update to 0.12.0-rc2
> today (that was just announced:
> 	http://lists.gnu.org/archive/html/qemu-devel/2009-12/msg01514.html
> - the Subject says rc1 but in fact it's rc2) so people can test that
> version on FreeBSD more easily:
> 	http://people.freebsd.org/~nox/qemu/qemu-devel-0.12.0-rc2.patch
> resp.
> 	http://people.freebsd.org/~nox/qemu/qemu-devel-0.12.0-rc2.shar
> 
>  (As mentioned before 0.11 was the last qemu branch that supported kqemu
> so this is probably only interesting for those FreeBSD users that want
> to emulate non-x86 guests or when performance doesn't matter.  But the
> others are probably already moving to virtualbox now anyway... :)
> 
>  I have updated the FreeBSD pcap patch just enough so that it still runs
> (it probably will never be committed upstream anyway since Linux pcap
> doesn't have BIOCFEEDBACK i.e. can't talk to the host, only to other
> machines on the network) and I still see this weird issue here that
> packets `sometimes' are only processed with a delay (when the nic is
> otherwise idle?), i.e. pinging the host or another box on the lan with e.g.
> -i5 from the guest sees many packets with >5000ms roundtrip time.  Can
> anyone else reproduce this or is that `just me'?  This is on stable/8
> now (amd64) but it also happened with earlier versions, tested with
> 	qemu 8.0-RELEASE-i386-dvd1.iso -m 256 -net nic,model=e1000 -net pcap,ifname=em0
> via fixit->cdrom.  And this is most likely pcap related, I don't see
> anything like it with tap networking.

I found another issue:  pcap_send (actually its callback, pcap_callback())
sometimes receives packets that are too large (or multiple packets in one
go?), causing a Linux guest's e1000 driver to panic as shown below when
wget'ing a file from the host.  This hack applied on top of the pcap patch
(it simply truncates the packets) makes the panics go away, but still
doesn't fix the delays:

--- a/net.c
+++ b/net.c
@@ -841,11 +841,20 @@ static ssize_t pcap_receive(VLANClientSt
     return pcap_inject(s->handle, (u_char*)buf, size);
 }
 
+#define MAX_ETH_FRAME_SIZE 1514
+
 static void pcap_callback(u_char *user, struct pcap_pkthdr *phdr, u_char *pdata)
 {
     VLANClientState *vc = (VLANClientState *)user;
+    int len = phdr->len;
 
-    qemu_send_packet(vc, pdata, phdr->len);
+    if (len > MAX_ETH_FRAME_SIZE) {
+        fprintf(stderr,
+            "pcap_send: packet size > %d (%d), truncating\n",
+            MAX_ETH_FRAME_SIZE, len);
+        len = MAX_ETH_FRAME_SIZE;
+    }
+    qemu_send_packet(vc, pdata, len);
 }
 
 static void pcap_send(void *opaque)
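
 For anyone who wants to play with the length check outside of qemu, it
can be sketched as a standalone helper.  This is only a sketch and not part
of the patch: clamp_frame_len() is a hypothetical name, and as an extra
assumption it also bounds the length by caplen (the number of bytes libpcap
actually captured, as opposed to len, the length on the wire), which the
hack above does not do.

```c
#include <stddef.h>

/* Same limit as in the hack above: 1500 bytes payload + 14 bytes Ethernet header. */
#define MAX_ETH_FRAME_SIZE 1514

/*
 * Hypothetical helper (not in qemu or in the pcap patch).
 * caplen:   bytes actually present in the capture buffer
 * wire_len: length of the packet as seen on the wire
 * Returns the number of bytes that may safely be handed to the guest nic.
 */
static size_t clamp_frame_len(size_t caplen, size_t wire_len)
{
    size_t len = wire_len;

    /* Never read past what was actually captured. */
    if (len > caplen) {
        len = caplen;
    }
    /* Never hand the guest more than one standard Ethernet frame. */
    if (len > MAX_ETH_FRAME_SIZE) {
        len = MAX_ETH_FRAME_SIZE;
    }
    return len;
}
```

 With the 1770 byte packet from the panic below, clamp_frame_len(1770, 1770)
gives 1514, while normal sized packets pass through unchanged.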

 Here comes the panic; apparently it's in this line:
	http://fxr.watson.org/fxr/source/net/core/skbuff.c?v=linux-2.6#L1015
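
 The numbers in the skb_over_panic line of the log below can be checked by
hand.  A small sketch of that arithmetic, using only the values printed
there (the macro and function names are made up for illustration):

```c
#include <stdint.h>

/* Values copied verbatim from the skb_over_panic line in the log. */
#define SKB_HEAD 0xdd761800u
#define SKB_DATA 0xdd761822u
#define SKB_TAIL 0xdd761f0cu
#define SKB_END  0xdd761e20u
#define PUT_LEN  1770u

/* Bytes reserved in front of the packet data. */
static uint32_t skb_headroom_bytes(uint32_t head, uint32_t data)
{
    return data - head;
}

/* Bytes available between the start of the data and the end of the buffer. */
static uint32_t skb_room_bytes(uint32_t data, uint32_t end)
{
    return end - data;
}

/* Bytes by which the new tail would run past the end of the buffer. */
static uint32_t skb_overrun_bytes(uint32_t tail, uint32_t end)
{
    return tail - end;
}
```

 With the values from the log, data + 1770 equals tail, the room after data
is 1534 bytes and the overrun is 236 bytes (1770 - 1534).  A frame truncated
to 1514 bytes fits into those 1534 bytes, which is consistent with the hack
above making the panic go away.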

--------snip-------------
skb_over_panic: text:e08e2da8 len:1770 put:1770 head:dd761800 data:dd761822 tail:0xdd761f0c end:0xdd761e20 dev:eth0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:05.0/host0/target0:0:0/0:0:0:0/block/sda/size
Modules linked in: ipv6 ppdev lp cpufreq_stats cpufreq_powersave cpufreq_performance cpufreq_ondemand freq_table cpufreq_conservative af_packet virtio_balloon rtc_cmos psmouse evdev rtc_core pcspkr serio_raw i2c_piix4 rtc_lib i2c_core parport_pc parport processor button aufs squashfs loop nls_utf8 isofs nls_base dm_mod sg sd_mod sr_mod cdrom ata_generic pata_acpi virtio_net ata_piix libata sym53c8xx scsi_transport_spi virtio_pci e1000 floppy scsi_mod thermal fan [last unloaded: scsi_wait_scan]

Pid: 0, comm: swapper Not tainted (2.6.31-6.slh.1-sidux-686 #1) 
EIP: 0060:[<c03254da>] EFLAGS: 00000292 CPU: 0
EIP is at skb_put+0x8a/0x90
EAX: 0000007a EBX: dd761f0c ECX: 00000086 EDX: 00ee5000
ESI: 000006ea EDI: dd749580 EBP: ddc71a80 ESP: c0493da8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c0492000 task=c04993a0 task.ti=c0492000)
Stack:
 c0467e4c e08e2da8 000006ea 000006ea dd761800 dd761822 dd761f0c dd761e20
<0> df02b000 000006ea 000006ee e08e2da8 df0254cc c1409bc0 c0493eac dd7495c0
<0> df02b320 00000007 df02b388 df02b610 c03cbfc0 df02b508 df02b000 df04f800
Call Trace:
 [<e08e2da8>] ? e1000_clean_rx_irq+0x268/0x4b0 [e1000]
 [<e08e2da8>] ? e1000_clean_rx_irq+0x268/0x4b0 [e1000]
 [<e08e5a7c>] ? e1000_clean+0x1ec/0x550 [e1000]
 [<c012ae4f>] ? enqueue_entity+0x11f/0x190
 [<e0979039>] ? sym_interrupt+0x29/0x750 [sym53c8xx]
 [<c032e546>] ? net_rx_action+0x126/0x230
 [<c013e10f>] ? __do_softirq+0xcf/0x1d0
 [<c011a3ea>] ? ack_apic_level+0x7a/0x270
 [<c013e24d>] ? do_softirq+0x3d/0x40
 [<c013e425>] ? irq_exit+0x65/0x80
 [<c0105a60>] ? do_IRQ+0x50/0xc0
 [<c0104089>] ? common_interrupt+0x29/0x30
 [<c011fe72>] ? native_safe_halt+0x2/0x10
 [<c010b4a8>] ? default_idle+0x68/0x130
 [<c010b94c>] ? c1e_idle+0x4c/0x110
 [<c0102a02>] ? cpu_idle+0x52/0x90
 [<c04c788a>] ? start_kernel+0x2f8/0x35b
 [<c04c736d>] ? unknown_bootoption+0x0/0x1d7
Code: 24 14 8b 81 b0 00 00 00 89 74 24 0c 89 44 24 10 8b 41 50 c7 04 24 4c 7e 46 c0 89 44 24 08 8b 44 24 2c 89 44 24 04 e8 d1 c0 09 00 <0f> 0b eb fe 66 90 55 89 c5 57 56 53 83 ec 14 89 54 24 04 89 0c 
EIP: [<c03254da>] skb_put+0x8a/0x90 SS:ESP 0068:c0493da8
---[ end trace a7dfb09442504800 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G      D    2.6.31-6.slh.1-sidux-686 #1
Call Trace:
 [<c03c14fb>] ? panic+0x4d/0xfd
 [<c010767c>] ? oops_end+0xbc/0xd0
 [<c0104ad0>] ? do_invalid_op+0x0/0x90
 [<c0104b4f>] ? do_invalid_op+0x7f/0x90
 [<c03254da>] ? skb_put+0x8a/0x90
 [<c02f6880>] ? serial8250_console_write+0x0/0x130
 [<c01536b1>] ? up+0x11/0x40
 [<c0138dd1>] ? release_console_sem+0x191/0x1e0
 [<c033b15a>] ? sk_filter+0x9a/0xb0
 [<c03c42d6>] ? error_code+0x66/0x6c
 [<c0104ad0>] ? do_invalid_op+0x0/0x90
 [<c03254da>] ? skb_put+0x8a/0x90
 [<e08e2da8>] ? e1000_clean_rx_irq+0x268/0x4b0 [e1000]
 [<e08e2da8>] ? e1000_clean_rx_irq+0x268/0x4b0 [e1000]
 [<e08e5a7c>] ? e1000_clean+0x1ec/0x550 [e1000]
 [<c012ae4f>] ? enqueue_entity+0x11f/0x190
 [<e0979039>] ? sym_interrupt+0x29/0x750 [sym53c8xx]
 [<c032e546>] ? net_rx_action+0x126/0x230
 [<c013e10f>] ? __do_softirq+0xcf/0x1d0
 [<c011a3ea>] ? ack_apic_level+0x7a/0x270
 [<c013e24d>] ? do_softirq+0x3d/0x40
 [<c013e425>] ? irq_exit+0x65/0x80
 [<c0105a60>] ? do_IRQ+0x50/0xc0
 [<c0104089>] ? common_interrupt+0x29/0x30
 [<c011fe72>] ? native_safe_halt+0x2/0x10
 [<c010b4a8>] ? default_idle+0x68/0x130
 [<c010b94c>] ? c1e_idle+0x4c/0x110
 [<c0102a02>] ? cpu_idle+0x52/0x90
 [<c04c788a>] ? start_kernel+0x2f8/0x35b
 [<c04c736d>] ? unknown_bootoption+0x0/0x1d7
QEMU 0.11.1 monitor - type 'help' for more information
(qemu) q