From nobody Fri Nov 19 18:19:39 2021
X-Original-To: net@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id BAAEB18A7DBB
	for ; Fri, 19 Nov 2021 18:19:42 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (4096 bits)
	server-digest SHA256 client-signature RSA-PSS (4096 bits)
	client-digest SHA256)
	(Client CN "smtp.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4HwlLk4hm7z3PbB;
	Fri, 19 Nov 2021 18:19:42 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Received: from [192.168.0.88] (unknown [195.64.148.76])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client did not present a certificate)
	(Authenticated sender: avg/mail)
	by smtp.freebsd.org (Postfix) with ESMTPSA id 0A3C661CF;
	Fri, 19 Nov 2021 18:19:41 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Message-ID: <0dbe63d0-3219-846d-4c58-0bf219f41634@FreeBSD.org>
Date: Fri, 19 Nov 2021 20:19:39 +0200
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-net@freebsd.org
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:91.0) Gecko/20100101 Firefox/91.0 Thunderbird/91.3.0
Content-Language: en-US
To: "net@FreeBSD.org"
Cc: Mark Johnston, Patrick Kelsey
From: Andriy Gapon
Subject: vmxnet3: possible bug in vmxnet3_isc_rxd_pkt_get
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-ThisMailContainsUnwantedMimeParts: N

We have infrequent but persistent crashes at work where a part of an iflib_rxq_t object seems to be overwritten by a sequence of if_rxd_frag objects.
Those if_rxd_frag elements look like a continuation of the ifr_frags array in an adjacent iflib_rxq_t object. This happens only on VMware and only with the vmxnet3 driver.

Based on code analysis, I think the only place where the overwrite / overrun can happen is vmxnet3_isc_rxd_pkt_get(). It writes to the ifr_frags array (passed in via if_rxd_info_t::iri_frags), and there is no sanity check that nfrags is below IFLIB_MAX_RX_SEGS. I guess that the driver relies on a packet never having more than IFLIB_MAX_RX_SEGS fragments / segments in a descriptor ring because that value is programmed into nrxsg_max. So, I am not sure if this is a problem in the driver or a quirk in VMware.

Here is some data to demonstrate the issue:

$1 = (iflib_rxq_t) 0xfffffe00ea9f6200
(kgdb) p $1->ifr_frags[0]
$2 = {irf_flid = 0 '\000', irf_idx = 1799, irf_len = 118}
(kgdb) p $1->ifr_frags[1]
$3 = {irf_flid = 1 '\001', irf_idx = 674, irf_len = 0}
(kgdb) p $1->ifr_frags[2]
$4 = {irf_flid = 1 '\001', irf_idx = 675, irf_len = 0}
... elements 3..62 follow the same pattern ...
(kgdb) p $1->ifr_frags[63]
$6 = {irf_flid = 1 '\001', irf_idx = 736, irf_len = 0}

and then...

(kgdb) p $1->ifr_frags[64]
$7 = {irf_flid = 1 '\001', irf_idx = 737, irf_len = 0}
(kgdb) p $1->ifr_frags[65]
$8 = {irf_flid = 1 '\001', irf_idx = 738, irf_len = 0}
... the pattern continues ...
(kgdb) p $1->ifr_frags[70]
$10 = {irf_flid = 1 '\001', irf_idx = 743, irf_len = 0}

It seems that a start-of-packet completion descriptor referenced a descriptor in command ring zero (and apparently it did not have the end-of-packet bit set). It was followed by another 70 zero-length completions referencing ring one, up to the end-of-packet. So, in total, 71 fragments were recorded.

Or it is possible that those zero-length fragments came from the penultimate pkt_get call and ifr_frags[0] was obtained after that... I am not sure how that could happen.

I am thinking about adding a sanity check for the number of fragments.
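To make the idea concrete, here is a rough, standalone sketch of such a check. It is not the driver code: the sketch_* types and function are hypothetical stand-ins, SKETCH_MAX_RX_SEGS = 64 is only an assumed value for IFLIB_MAX_RX_SEGS, and the completion descriptor is reduced to the few fields needed here. The loop stops writing once the fragment array is full but keeps consuming completions up to end-of-packet, so the caller could drop the packet instead of overrunning the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed capacity of the fragment array; stands in for IFLIB_MAX_RX_SEGS. */
#define SKETCH_MAX_RX_SEGS 64

/* Hypothetical, simplified stand-in for a receive completion descriptor. */
struct sketch_rxcd {
	uint8_t  ring;	/* command ring the completion refers to (0 or 1) */
	uint16_t idx;	/* descriptor index within that ring */
	uint16_t len;	/* bytes received into the buffer */
	uint8_t  eop;	/* nonzero on the last completion of a packet */
};

/* Mirrors the three fields visible in the kgdb output above. */
struct sketch_frag {
	uint8_t  irf_flid;
	uint16_t irf_idx;
	uint16_t irf_len;
};

/*
 * Walk the completions of one packet, starting at comps[0].
 * At most max_frags fragments are stored.  Once the array is full, the
 * remaining completions are still consumed up to end-of-packet, but
 * nothing more is written and *overflow is set to 1.
 * Returns the number of completions consumed; *nfrags receives the
 * number of fragments actually stored.
 */
static size_t
sketch_pkt_get(const struct sketch_rxcd *comps, size_t ncomps,
    struct sketch_frag *frags, size_t max_frags, size_t *nfrags,
    int *overflow)
{
	size_t consumed = 0, stored = 0;

	assert(comps != NULL && frags != NULL);
	assert(nfrags != NULL && overflow != NULL);

	*overflow = 0;
	while (consumed < ncomps) {
		const struct sketch_rxcd *cd = &comps[consumed++];

		if (stored < max_frags) {
			frags[stored].irf_flid = cd->ring;
			frags[stored].irf_idx = cd->idx;
			frags[stored].irf_len = cd->len;
			stored++;
		} else {
			/* The sanity check: never write past the array. */
			*overflow = 1;
		}
		if (cd->eop)
			break;
	}
	*nfrags = stored;
	return (consumed);
}
```

Fed with one 118-byte completion on ring zero followed by 70 zero-length completions on ring one, this would store 64 fragments, consume all 71 completions and report the overflow. Whether draining and dropping is acceptable to iflib, as opposed to panicking or resetting the queue, is exactly the part I have not worked out.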
Not sure yet what options there are for handling the overflow besides panicking.

Also, some data from vmxnet3's side of things:

(kgdb) p $15.vmx_rxq[6]
$18 = {vxrxq_sc = 0xfffff80002d9b800, vxrxq_id = 6, vxrxq_intr_idx = 6,
  vxrxq_irq = {ii_res = 0xfffff80002f23e00, ii_rid = 7,
    ii_tag = 0xfffff80002f23d80},
  vxrxq_cmd_ring = {{vxrxr_rxd = 0xfffffe00ead3c000, vxrxr_ndesc = 2048,
      vxrxr_gen = 0, vxrxr_paddr = 57917440, vxrxr_desc_skips = 1114,
      vxrxr_refill_start = 1799},
    {vxrxr_rxd = 0xfffffe00ead44000, vxrxr_ndesc = 2048,
      vxrxr_gen = 0, vxrxr_paddr = 57950208, vxrxr_desc_skips = 121,
      vxrxr_refill_start = 743}},
  vxrxq_comp_ring = {vxcr_u = {txcd = 0xfffffe00ead2c000,
      rxcd = 0xfffffe00ead2c000}, vxcr_next = 0, vxcr_ndesc = 4096,
    vxcr_gen = 1, vxcr_paddr = 57851904, vxcr_zero_length = 1044,
    vxcr_pkt_errors = 128},
  vxrxq_rs = 0xfffff80002d78e00, vxrxq_sysctl = 0xfffff80004308080,
  vxrxq_name = "vmx0-rx6\000\000\000\000\000\000\000"}

The vxrxr_refill_start values are consistent with what is seen in ifr_frags[]. vxcr_zero_length and vxcr_pkt_errors are both non-zero, so maybe something got the driver into a confused state, or the emulated hardware became confused.

-- 
Andriy Gapon