From: Zhenlei Huang
Date: Fri, 6 Jan 2023 18:14:06 +0800
Subject: Re: Propose a new stage `vnet_shutdown` before `vnet_destroy`
To: James Gritton
Cc: freebsd-jail@freebsd.org, freebsd-net
In-Reply-To: <1c9dbf6d26b9525243dd6b3ffafa23cb@freebsd.org>
References: <1c9dbf6d26b9525243dd6b3ffafa23cb@freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-net
List-Archive: https://lists.freebsd.org/archives/freebsd-jail

> On Dec 19, 2022, at 1:44 AM, James Gritton wrote:
>
> On 2022-12-18 00:01, Zhenlei Huang wrote:
>> I'm currently working on a route nexthop caching feature for tunneling
>> interfaces such as if_gif, if_gre, if_vxlan, and potentially if_wg.
>> I encountered a nasty bug related to the VNET life cycle.
>>
>> More precisely, I'd like to call `rib_unsubscribe()` to unsubscribe
>> from route events when the interface tunnel is deleted
>> (gif_delete_tunnel). During VNET shutdown, however, the VNET
>> SYSUNINITs are called and the routing vnet subsystem is destroyed
>> before the interface goes down, which causes a page fault. I do not
>> want to check the `vnet.vnet_shutdown` state, as it looks messed up.
>>
>> I have recently been reviewing the life cycle of the prison and got
>> some inspiration.
>>
>> When the jail / prison is submitted for destruction (by the
>> jail_remove syscall), SIGKILL is sent to the prison's processes. I
>> think that is the correct order in which to destroy a jail / prison.
>> To summarize, the life cycle of a jail / prison is:
>>
>> on jail create: PRISON_STATE_INVALID -> create VNET ->
>> PRISON_STATE_ALIVE -> set up network resources, ifnet, if addresses,
>> routing, etc. -> create / attach (network) processes
>>
>> on jail destroy: jexec kill processes (1) by user -> mark it as
>> PRISON_STATE_DYING -> send SIGKILL to processes by kernel (2) ->
>> destroy VNET (if the prison's pr_ref drops to the last one) -> dead
>>
>> Step (2) is a cleanup by the kernel, since (1) may not have been done
>> by the user.
>>
>> This leads to the idea about the life cycle of VNET.
>>
>> On jail destroy, the network resources are cleaned up by vnet_destroy
>> (SYSUNINIT). The ordering of the SYSUNINITs of network components then
>> becomes hacky, because circular network resource dependencies are
>> possible. For example, routing table entries (nhop) hold references to
>> ifnet, and ifnet holds references to route nhops (the cache), as I
>> encountered.
>>
>> Just like the process cleanup done by the kernel, we can introduce a
>> new stage `vnet_shutdown` that cleans up network resources. When a
>> jail / prison is going to die, after the kernel has cleaned up its
>> processes it calls `vnet_shutdown` to clean up network resources;
>> vnet_destroy will then go smoothly, as no circular network resource
>> dependency remains.
>>
>> The life cycle of the prison becomes:
>>
>> on jail create: PRISON_STATE_INVALID -> create VNET ->
>> PRISON_STATE_ALIVE -> set up network resources, ifnet, if addresses,
>> routing, etc. -> create / attach (network) processes
>>
>> on jail destroy: jexec kill processes (1) by user -> mark it as
>> PRISON_STATE_DYING -> send SIGKILL to processes by kernel (2) ->
>> vnet_shutdown cleans up network resources -> destroy VNET (if the
>> prison's pr_ref drops to the last one) -> dead
>>
>> This idea is still immature and I hope to hear more voices about it.
>
> This is absolutely the direction things need to go. Vnet isn't the
> only thing that can have these problems, though it's been the biggest
> offender. There could also be cycles that involve more than one
> subsystem, which could be helped by broad application of this idea.
>
> There's a function in kern_jail.c ready for this: prison_cleanup.
> It's called in the "mark PRISON_STATE_DYING" stage of things. That's
> before the "send SIGKILL" part of your sequence, but otherwise fits.
>

Submitted to Phabricator for review:
https://reviews.freebsd.org/D37956
https://reviews.freebsd.org/D37957

> - Jamie

Best regards,
Zhenlei
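
For illustration only, the ordering discussed in this thread (a shutdown stage that drops cross-subsystem references before the destroy stage runs) can be modeled in user space roughly as follows. This is a toy Python sketch, not kernel code and not the content of the reviews linked above: the classes `Nhop`, `Ifnet` and `Vnet`, their fields, and the method names `add_tunnel`, `shutdown` and `destroy` are all hypothetical names chosen for the example.

```python
class Nhop:
    """Toy stand-in for a routing nexthop; it references an interface."""

    def __init__(self, ifp):
        self.ifp = ifp  # nexthop -> ifnet reference


class Ifnet:
    """Toy stand-in for a tunnel interface that caches a nexthop."""

    def __init__(self, name):
        self.name = name
        self.cached_nhop = None  # ifnet -> nexthop reference (the cache)


class Vnet:
    """Toy model of a vnet with a two-stage teardown (hypothetical API)."""

    def __init__(self):
        self.ifnets = []
        self.nhops = []
        self.routing_alive = True
        self.log = []

    def add_tunnel(self, name):
        # Creating a tunnel sets up the circular reference described above:
        # the nexthop points at the interface and the interface caches it.
        ifp = Ifnet(name)
        nh = Nhop(ifp)
        ifp.cached_nhop = nh
        self.ifnets.append(ifp)
        self.nhops.append(nh)
        return ifp

    def shutdown(self):
        # Stage 1: drop cross-subsystem references while routing still exists.
        for ifp in self.ifnets:
            assert self.routing_alive, "routing must outlive the shutdown stage"
            ifp.cached_nhop = None
            self.log.append(f"unsubscribe {ifp.name}")

    def destroy(self):
        # Stage 2: tear down routing first, then the interfaces. An interface
        # that still holds a cached nexthop would now point at freed state.
        self.routing_alive = False
        self.nhops.clear()
        for ifp in self.ifnets:
            if ifp.cached_nhop is not None:
                raise RuntimeError(f"{ifp.name} still references routing state")
        self.ifnets.clear()
        self.log.append("destroyed")
```

In this model, calling `shutdown()` and then `destroy()` completes cleanly, while calling `destroy()` alone raises `RuntimeError`, which stands in for the page fault described in the quoted message.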