From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 271819] FREEBSD 13.0 machine becomes unresponsive after some days.
Date: Sun, 04 Jun 2023 14:21:28 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271819

            Bug ID: 271819
           Summary: FREEBSD 13.0 machine becomes unresponsive after some days.
           Product: Base System
           Version: Unspecified
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: anwarcse47us@gmail.com

Hi,

We recently upgraded our virtual machines from FreeBSD 10.4 to FreeBSD 13.0.
Since the upgrade, the machines become unresponsive after running for some days.
Sometimes this happens after 2 days, and sometimes it takes up to 2-3 months.

VM configuration:
number of CPUs: 2
RAM: 6 GB

We have observed that, just before a machine becomes unresponsive, access to
one of its directories hangs. For example, say I have a directory /x/y/z/outbox.
Any command run from the shell that tries to access the outbox directory goes
into uninterruptible sleep, and the shell hangs.

du processes that access the outbox directory are run at regular intervals, and
they get stuck in the "ufs" state, as seen in top:

34877 root 1 20 0 28M 2952K ufs 0 0:01 0.00% du
75703 root 1 20 0 28M 2628K ufs 1 0:01 0.00% du
 6753 root 1 20 0 28M 2980K ufs 0 0:01 0.00% du
87132 root 1 20 0 28M 2980K ufs 0 0:01 0.00% du
63429 root 1 20 0 28M 2972K ufs 1 0:01 0.00% du
18308 root 1 20 0 28M 2652K ufs 1 0:01 0.00% du
18074 root 1 20 0 28M 2580K ufs 0 0:01 0.00% du
88042 root 1 20 0 27M 2992K ufs 0 0:01 0.00% du
82363 root 1 20 0 27M 2996K ufs 0 0:01 0.00% du

There is also a Java process that reads data from the outbox directory. It has
gone into the STOP state, and we are unable to kill it.

In other occurrences of this issue, a custom Python process got stuck consuming
100% CPU, and its input directory was repaired during the reboot.

The problem goes away after a reboot. During the reboot we can see fsck being
run to correct that directory.

Reboot logs:

Sometimes a directory becomes unreferenced (UNREF):

kernel: DIR I=2964556 CONNECTED. PARENT WAS I=2964003
kernel:
kernel: UNREF DIR I=2964499 OWNER=root MODE=40755
kernel: SIZE=2048 MTIME=May 26 13:50 2023
kernel:
kernel: RECONNECT? yes
....
....
kernel: ***** FILE SYSTEM STILL DIRTY *****
kernel: ***** FILE SYSTEM WAS MODIFIED *****
***** PLEASE RERUN FSCK *****
...
...

In another occurrence, the parent directory had a wrong link count.
Logs during reboot:

kernel: /dev/da0p9: LINK COUNT DIR I=700908 OWNER=root MODE=40777
kernel: /dev/da0p9: SIZE=512 MTIME=Jun 2 09:05 2023 COUNT 3 SHOULD BE 2 (ADJUSTED)

These symptoms are similar to FreeBSD bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224292

So we tried applying this patch:
https://cgit.freebsd.org/src/commit/sys/ufs/ffs/ffs_softdep.c?id=50acaaef54b4d7811393eb8c05a398d7a1882418

We also added logic to run sync, as mentioned in bug #224292. Neither change
helped.

We have observed that this happens only on machines with 2 CPU cores and 6 GB
of RAM. It does not happen on machines with more CPU cores and more RAM.

Thanks in advance...
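
For anyone trying to catch the hang before the machine stops responding, here
is a minimal Python sketch that scans top output for processes sitting in the
"ufs" wait state, like the du rows quoted above. It is only an illustration,
not something from the original setup: the names TopRow, parse_top_line and
stuck_in are hypothetical, and the column order (PID USERNAME THR PRI NICE SIZE
RES STATE C TIME WCPU COMMAND) is assumed from the rows in this report. Other
top layouts (for example a single-CPU system without the C column) would need
a different parser.

```python
from typing import List, NamedTuple, Optional


class TopRow(NamedTuple):
    pid: int
    user: str
    state: str
    command: str


def parse_top_line(line: str) -> Optional[TopRow]:
    """Parse one process row of top output.

    Assumed column order (taken from the rows in the report):
    PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    """
    fields = line.split()
    if len(fields) < 12 or not fields[0].isdigit():
        return None  # header, blank line, or a layout this sketch does not know
    return TopRow(int(fields[0]), fields[1], fields[7], " ".join(fields[11:]))


def stuck_in(top_text: str, wait_state: str = "ufs") -> List[TopRow]:
    """Return the rows whose STATE column equals wait_state."""
    rows = (parse_top_line(line) for line in top_text.splitlines())
    return [row for row in rows if row is not None and row.state == wait_state]


# Three of the rows from the report, used as sample input.
SAMPLE = """\
34877 root 1 20 0 28M 2952K ufs 0 0:01 0.00% du
75703 root 1 20 0 28M 2628K ufs 1 0:01 0.00% du
 6753 root 1 20 0 28M 2980K ufs 0 0:01 0.00% du
"""

if __name__ == "__main__":
    for row in stuck_in(SAMPLE):
        print(row.pid, row.user, row.state, row.command)
```

Feeding it periodic batch-mode top output and alerting when the same PIDs stay
in "ufs" across several samples would be one way to notice the stuck directory
early; the sampling interval and alert threshold are left open here.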