From: Paul Floyd
To: FreeBSD Hackers
Subject: Re: Hang ast / pipelk / piperd
Date: Sat, 28 May 2022 09:02:23 +0200
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
In-Reply-To: <84015bf9-8504-1c3c-0ba5-58d0d7824843@gmail.com>

On 5/27/22 22:13, Paul Floyd wrote:
>
> Hi
>
> I'm debugging two issues with Valgrind on FreeBSD 13.1 and 14, one on
> amd64 and one on i386.
>
> ...
>
> Both hangs seem quite sensitive to timing - in both cases adding or
> changing nanosleep times seem to make them no longer hang.
>
> Adding debug statements to Valgrind can also change the behaviour
> (and is also unsafe when not holding the scheduler lock). Does this
> look like a kernel bug?

One important detail I missed out. Why is Valgrind releasing the scheduler lock?

To make a client syscall.
This needs to be done in "client-like" circumstances: specifically, with the client signal mask rather than the Valgrind mask, which blocks all signals so that Valgrind has full control.

Two things can happen with a client syscall.

1/ it succeeds, and Valgrind re-acquires the lock and continues.
2/ it gets interrupted, and Valgrind re-acquires the lock, then does a load of stuff to fix up the guest state and take the appropriate action (restart, return EINTR, save carry etc.).

I did think that 2/ might be prone to getting into an infinite loop, especially with restart. But I don't see anything like that in the Valgrind logs.

PJF thread 14 making a client nanosleep syscall
SYSCALL[5379,14](240) sys_nanosleep ( 0x200890, 0x0 ) --> [async] ...

PJF thread 14 releases the scheduler lock
--5379--   SCHED[14]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys

PJF thread 2 acquires the scheduler lock
--5379--   SCHED[2]:  acquired lock (VG_(client_syscall)[async])

PJF thread 2 return from nanosleep
SYSCALL[5379,2](240) ... [async] --> Success(0x0)

PJF thread 2 making a client write syscall
SYSCALL[5379,2](  4) sys_write ( 1, 0x4ea9000, 48 ) --> [async] ...

PJF thread 2 releases the scheduler lock
--5379--   SCHED[2]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys

PJF this is the thread 2 printf from syscall write
tls_ptr: case "race" has mismatch: *ip=8 here=4

PJF thread 2 acquires the scheduler lock
--5379--   SCHED[2]:  acquired lock (VG_(client_syscall)[async])

PJF thread 2 return from write (0x30, i.e. 48, bytes written)
SYSCALL[5379,2](  4) ... [async] --> Success(0x30)

PJF thread 2 making a client nanosleep syscall
SYSCALL[5379,2](240) sys_nanosleep ( 0x200890, 0x0 ) --> [async] ...

PJF thread 2 releases the scheduler lock
--5379--   SCHED[2]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys

And that's it, it hangs making the client nanosleep syscall.

A+
Paul
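
PS: for anyone not familiar with the pattern, here is a rough model of the release / syscall / re-acquire sequence described above. It is only a sketch in Python, not Valgrind code: `sched_lock`, `client_syscall` and the `do_syscall` callable are hypothetical names made up for illustration, and the real implementation does the mask switch much closer to the syscall instruction than this.

```python
import signal
import threading

# Hypothetical stand-in for the scheduler lock (not Valgrind code).
sched_lock = threading.Lock()


def client_syscall(do_syscall, client_mask):
    """Run do_syscall() without the lock, under the client's signal mask.

    The caller must hold sched_lock. Returns ("success", result) or
    ("interrupted", None). The lock is held again on return.
    """
    sched_lock.release()  # let other threads run while we block
    # Swap in the client mask; the call returns the previous (tool) mask.
    tool_mask = signal.pthread_sigmask(signal.SIG_SETMASK, client_mask)
    try:
        outcome = ("success", do_syscall())  # case 1/
    except InterruptedError:  # EINTR, case 2/
        outcome = ("interrupted", None)
    finally:
        # Back to the tool's mask, then back under the lock.
        signal.pthread_sigmask(signal.SIG_SETMASK, tool_mask)
        sched_lock.acquire()
    return outcome
```

In this model the caller gets the lock back in both cases, and for "interrupted" it would then decide between restarting the call and returning EINTR to the client. The hang in the log corresponds to the step between the release and the re-acquire never completing.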