From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 246886] [sendfile] Nginx + NFS or FUSE causes VM stall
Date: Fri, 17 Jun 2022 16:52:41 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 12.1-RELEASE
X-Bugzilla-Who: firk@cantconnect.ru

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246886

--- Comment #70 from firk@cantconnect.ru ---
So, the source of the problem seems to be base r337165. I didn't test the
revisions around it, but I did test a rollback of this specific commit, setting
f_iosize back to PAGE_SIZE=4096, and the problem is gone. But this is not the
bug itself, it is just a trigger for another problem(s).

As already noted, the backtrace for the deadlock is:

vm_page_grab_pages+0x3f2
allocbuf+0x371 (vfs_vmio_extend inlined inside)
getblkx+0x5be
breadn_flags+0x3d
vfs_bio_getpages+0x403
fuse_vnop_getpages+0x46
VOP_GETPAGES_APV+0x7b
vop_stdgetpages_async+0x49
VOP_GETPAGES_ASYNC_APV+0x7b
vnode_pager_getpages_async+0x7d
vn_sendfile+0xdf2 (sendfile_swapin inlined inside)
sendfile+0x12b
amd64_syscall+0x387
fast_syscall_common+0xf8

What happens:
1) sendfile_swapin() grabs and exclusively busies a bunch of pages via
vm_page_grab_pages();
2) it then scans them sequentially, unbusies the already loaded ones, and calls
vm_pager_get_pages_async() for the not yet loaded ones, which should load them
and call the sendfile_iodone() callback; that is how it worked in 11.x;
3) vm_pager_get_pages_async() calls some other nested functions, and now we are
in vfs_bio_getpages(). Note: despite the "async" name, all of this is done
synchronously;
4) vfs_bio_getpages() still has the vm_page[] array and its size as arguments,
passed unchanged straight from sendfile_swapin(); it downgrades the
exclusive-busy state to shared-busy for the given page range;
5) the next step (bread_gb -> breadn_flags) is done using the block index and
size obtained from the fusefs driver via the get_lblkno() and get_blksize()
callbacks, and the new block size is 65536 by default. Going through
getblkx() -> allocbuf() -> vfs_vmio_extend(), the last one calls
vm_page_grab_pages() again, but the range is not the requested one; it is the
one that matches the fusefs block size, effectively aligned to a 16-page
boundary (65536 = 16*4096).
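The range arithmetic in item 5 can be sketched in user space as follows. This is only a rough model, not kernel code: block_page_range() and conflicting_pages() are hypothetical helper names, and the scenario (16 grabbed pages, page 3 being the first one not yet loaded) is an assumption made for illustration. Only the two sizes come from the description above.

```python
PAGE_SIZE = 4096    # page size mentioned in this comment
BLOCK_SIZE = 65536  # fusefs block size mentioned in this comment


def block_page_range(page_index, page_size=PAGE_SIZE, block_size=BLOCK_SIZE):
    """Return the half-open page range [first, last) of the block holding page_index."""
    pages_per_block = block_size // page_size
    first = (page_index // pages_per_block) * pages_per_block
    return first, first + pages_per_block


def conflicting_pages(requested, xbusy_pages, block_size=BLOCK_SIZE):
    """List pages of the enclosing block, other than the requested one,
    that are still marked exclusively busy in this model."""
    first, last = block_page_range(requested, block_size=block_size)
    return [p for p in range(first, last) if p != requested and p in xbusy_pages]


# Assumed scenario: pages 0..15 were grabbed, pages 0..2 were already loaded
# and unbusied, page 3 is being read, pages 4..15 have not been scanned yet.
still_xbusy = set(range(4, 16))

print(block_page_range(3))                                     # whole 64 KiB block
print(conflicting_pages(3, still_xbusy))                       # pages the second grab waits on
print(conflicting_pages(3, still_xbusy, block_size=PAGE_SIZE)) # with 4 KiB blocks
```

With a block size equal to PAGE_SIZE the enclosing range shrinks to the requested page alone, which matches the observation that rolling f_iosize back makes the problem disappear.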
This leads to a deadlock, because the pages after the currently requested one
are still exclusively busy (see item 2).

What could be fixed:
1) easiest: roll back iosize to PAGE_SIZE, but this will reduce I/O speed
again;
2) rework sendfile_swapin() to first scan the entire range for loaded and not
yet loaded pages and only then issue the queued vm_pager_get_pages_async()
calls; I don't think this is good, because everything already works when
fusefs/nfs is not used;
3) make the "async" functions really async (see item 3) for fusefs; I don't
know if it is easy or not. This will resolve the deadlock too, because
vfs_bio_getpages() will no longer block the sequential scan of the requested
pages by sendfile_swapin();
4) prevent partially loaded filesystem f_iosize blocks from happening; again, I
don't know whether this is easy or even desirable.

PS: I don't know whether any of this works in 13.x.

-- 
You are receiving this mail because:
You are the assignee for the bug.