From: "Edward Sanford Sutton, III"
To: Andrea Venturoli, freebsd-questions@freebsd.org
Subject: Re: 13.3 troubles under load
Date: Tue, 2 Apr 2024 00:56:38 -0700
In-Reply-To: <1ca17a7a-025d-4403-a7f3-2892408ad628@netfence.it>
References: <1ca17a7a-025d-4403-a7f3-2892408ad628@netfence.it>
List-Id: User questions
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

On 4/2/24 00:20, Andrea Venturoli wrote:
> Hello.
>
> Now that 13.3 is out, and given the relatively short overlap support
> window, I started upgrading my 13.2 machines as soon as I had the chance.
>
> However, I'm experiencing some troubles under load (in cases where every
> version up to 13.2 has always worked without troubles).
>
> Scenario 1:
>
> Box A is ZFS/SSD based, but has a UFS HD (with only specific data)
> which is exported via NFSv4.
> Box B mounts that NFSv4 share and backs it up to a UFS/USB disk via rsync.
> This has always worked fine until I upgraded box A to 13.3.
> Now, while rsync does its job, box A starts crawling: Nagios reports
> several failures (either daemons which die or daemons which are no
> longer able to answer timely) and logging in via SSH becomes almost
> impossible (with already open sessions almost unusable).
>
> System is on ZFS so it should not be affected by the load on the UFS HD;
> besides, a single UFS HD should not be able to provide so much load to
> halt an 8 core system with 32GiB of RAM.
> Is it possible that such not so high network traffic (lagg with two em
> cards) brings this box to an almost halt?
> Unfortunately, so far I don't have any useful logs.
>
> Scenario 2:
>
> A box is running with several services (including two clamd instances in
> two different jails). Once a week, it connects to a NAS via Bacula and
> copies ~1TB of data to an external UFS HD.
> As in the previous example, after I upgraded to 13.3 this simple
> operation (which has worked for several years) has started to be
> problematic, as daemons are killed all through it:
>> Apr 1 20:01:31 xxxxxxx kernel: pid 11753 (clamd), jid 3, uid 26, was killed: a thread waited too long to allocate a page
>> Apr 1 20:02:18 xxxxxxx kernel: pid 11720 (clamd), jid 5, uid 26, was killed: a thread waited too long to allocate a page
>> Apr 1 20:03:16 xxxxxxx kernel: pid 3707 (squid), jid 3, uid 100, was killed: a thread waited too long to allocate a page
>> Apr 1 20:03:54 xxxxxxx kernel: pid 7400 (zeek), jid 7, uid 782, was killed: a thread waited too long to allocate a page
>> Apr 1 20:04:25 xxxxxxx kernel: pid 1813 (snort), jid 0, uid 0, was killed: a thread waited too long to allocate a page
>> Apr 1 20:05:59 xxxxxxx kernel: pid 7399 (zeek), jid 7, uid 782, was killed: a thread waited too long to allocate a page
>> Apr 1 20:05:59 xxxxxxx kernel: pid 1820 (snort), jid 0, uid 0, was killed: a thread waited too long to allocate a page
>> Apr 1 20:06:48 xxxxxxx kernel: pid 44493 (perl), jid 5, uid 26, was killed: a thread waited too long to allocate a page
>> Apr 1 20:07:22 xxxxxxx kernel: pid 44512 (perl), jid 5, uid 26, was killed: a thread waited too long to allocate a page
>> Apr 1 20:09:23 xxxxxxx kernel: pid 7254 (zeek), jid 7, uid 782, was killed: a thread waited too long to allocate a page
>> Apr 1 20:10:17 xxxxxxx kernel: pid 14462 (mysqld), jid 11, uid 88, was killed: a thread waited too long to allocate a page
>> Apr 1 20:10:17 xxxxxxx kernel: pid 83231 (smbd), jid 8, uid 0, was killed: a thread waited too long to allocate a page
>> Apr 1 20:10:17 xxxxxxx kernel: pid 28868 (smbd), jid 8, uid 0, was killed: a thread waited too long to allocate a page
>> Apr 1 20:10:17 xxxxxxx kernel: pid 92611 (smbd), jid 8, uid 0, was killed: a thread waited too long to allocate a page
>> Apr 1 20:12:20 xxxxxxx kernel: pid 77438 (clamd), jid 3, uid 26, was killed: a thread waited too long to allocate a page
>> Apr 1 20:13:47 xxxxxxx kernel: pid 77473 (clamd), jid 5, uid 26, was killed: a thread waited too long to allocate a page
>
> Again, system/swap is on an SSD ZFS RAID pool, so disk load on the UFS
> USB HD shouldn't hamper its throughput.
> This time network is still a lagg, but with igb cards (so a similar
> driver).
>
> Any hint what to look for?

Look for kernel threads in arc_prune using a lot of CPU; launch top and
press S and then H to display system processes and threads. If that is
the issue, consider reverting to 13.2, upgrading to 14, or applying the
relevant patches for 13.3 as mentioned in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

> Is there some known problem with LAGG, if_em/if_igb, USB, UFS, other?
>
> bye & Thanks
> av.
>
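
To get a quick picture of which daemons and jails are hit by the "waited too long to allocate a page" kills quoted above, something like the following Python sketch may help. It is only an illustration: the function name count_page_wait_kills is made up here, and the regular expression assumes the kernel message has exactly the wording shown in the quoted log.

```python
import re
from collections import Counter

# Assumed message shape, copied from the quoted log lines, e.g.:
#   pid 11753 (clamd), jid 3, uid 26, was killed: a thread waited too long to allocate a page
KILL_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<comm>[^)]+)\), jid (?P<jid>\d+), uid (?P<uid>\d+), "
    r"was killed: a thread waited too long to allocate a page"
)


def count_page_wait_kills(log_text):
    """Return a Counter mapping (command, jid) to the number of kills found."""
    # Mail clients wrap long log lines, so join the lines back together and
    # drop any leading quote markers (">", ">>") before matching.
    flat = re.sub(r"\s*\n\s*(?:>+\s*)?", " ", log_text)
    return Counter((m["comm"], int(m["jid"])) for m in KILL_RE.finditer(flat))
```

Feeding it the contents of the system log (for example the text of /var/log/messages, if that is where kernel messages end up on the box) and printing the result of .most_common() lists the most frequently killed command/jail pairs first.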