[Bug 269823] rand_harvest produces 100%CPU on 1 CPU with virtio_random.ko loaded

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 02 Mar 2023 00:17:37 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269823

--- Comment #5 from Mina Galić <freebsd@igalic.co> ---
I've been debugging this on and off for the past 2 days.
At first I thought that the output of

dtrace -n 'kinst::virtqueue_poll:'

was not very useful, until I realized that it constantly repeats:

  1  73355               virtqueue_poll:112 
  1  73356               virtqueue_poll:115 
  1  73357               virtqueue_poll:118 
  1  73358               virtqueue_poll:124 
  1  73359               virtqueue_poll:129 
  1  73360               virtqueue_poll:133 
  1  73361               virtqueue_poll:135 
  1  73362               virtqueue_poll:137 
  1  73363               virtqueue_poll:141 
  1  73364               virtqueue_poll:144 
  1  73365               virtqueue_poll:151 
  1  73366               virtqueue_poll:155 
  1  73367               virtqueue_poll:158 
  1  73355               virtqueue_poll:112 
  1  73356               virtqueue_poll:115 
  1  73357               virtqueue_poll:118 
  1  73358               virtqueue_poll:124 
  1  73359               virtqueue_poll:129 
  1  73360               virtqueue_poll:133 
  1  73361               virtqueue_poll:135 
  1  73362               virtqueue_poll:137 
  1  73363               virtqueue_poll:141 
  1  73364               virtqueue_poll:144 
  1  73365               virtqueue_poll:151 
  1  73366               virtqueue_poll:155 
  1  73367               virtqueue_poll:158 


etc…
This function never returns, we don't even get (back) into vtrnd_read:

dtrace -n 'kinst::vtrnd_read:'
dtrace: description 'kinst::vtrnd_read:' matched 88 probes
… # this never prints anything on this machine

So let's look at those instructions:

0xffffffff80a113e0 <+112>:   48 89 df                mov    %rbx,%rdi
0xffffffff80a113e3 <+115>:   ff 50 08                call   *0x8(%rax)
0xffffffff80a113e6 <+118>:   41 0f b7 4c 24 56       movzwl 0x56(%r12),%ecx
0xffffffff80a113ec <+124>:   49 8b 44 24 48          mov    0x48(%r12),%rax
0xffffffff80a113f1 <+129>:   66 3b 48 02             cmp    0x2(%rax),%cx
0xffffffff80a113f5 <+133>:   75 32                   jne    0xffffffff80a11429 <virtqueue_poll+185>
0xffffffff80a113f7 <+135>:   f3 90                   pause
0xffffffff80a113f9 <+137>:   49 8b 1c 24             mov    (%r12),%rbx
0xffffffff80a113fd <+141>:   48 8b 0b                mov    (%rbx),%rcx
0xffffffff80a11400 <+144>:   0f b6 15 59 c0 0c 01    movzbl 0x10cc059(%rip),%edx        # 0xffffffff81add460 <virtio_bus_poll_desc>
0xffffffff80a11407 <+151>:   48 8b 04 d1             mov    (%rcx,%rdx,8),%rax
0xffffffff80a1140b <+155>:   4c 39 38                cmp    %r15,(%rax)
0xffffffff80a1140e <+158>:   74 d0                   je     0xffffffff80a113e0 <virtqueue_poll+112>

in other, C words:

void *
virtqueue_poll(struct virtqueue *vq, uint32_t *len)
{
        void *cookie;

        VIRTIO_BUS_POLL(vq->vq_dev);
        while ((cookie = virtqueue_dequeue(vq, len)) == NULL) {
                cpu_spinwait();
                VIRTIO_BUS_POLL(vq->vq_dev);
        }

        return (cookie);
}


(if *0x8(%rax) is an obfuscated call to virtqueue_dequeue() then…)
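
Another possible reading, offered only as a sketch: the reference to
virtio_bus_poll_desc at +144 suggests that +137 through +158 plus the call at
+115 may be a cached method dispatch for VIRTIO_BUS_POLL() rather than a call
to virtqueue_dequeue(). The userspace model below assumes a kobj-style method
cache (an array of pointers indexed by the descriptor id, each entry holding a
descriptor pointer at offset 0 and a function pointer at offset 8). Every
kobjm_* name is hypothetical and none of this is the kernel's actual code.

```c
#include <stddef.h>

#define KOBJM_CACHE_SIZE 256    /* assumed power-of-two cache size */

typedef int (*kobjm_op_t)(void *);

/* Hypothetical method descriptor; only its address and id matter here. */
struct kobjm_desc {
        unsigned id;
};

/* Hypothetical cache entry: desc first, func right after it. */
struct kobjm_method {
        const struct kobjm_desc *desc;  /* compared like cmp %r15,(%rax) */
        kobjm_op_t func;                /* called like call *0x8(%rax) */
};

struct kobjm_ops {
        struct kobjm_method *cache[KOBJM_CACHE_SIZE];
};

struct kobjm_dev {
        struct kobjm_ops *ops;          /* loaded like mov (%rbx),%rcx */
};

static int kobjm_poll_calls;

/* Stand-in for a bus poll method; it only counts invocations. */
static int
kobjm_bus_poll(void *dev)
{
        (void)dev;
        return (++kobjm_poll_calls);
}

/*
 * Fast path only: index the cache by descriptor id, check that the cached
 * entry belongs to this descriptor, then call through the function pointer.
 * The slow-path lookup on a cache miss is omitted and reported as -1.
 */
static int
kobjm_dispatch(struct kobjm_dev *dev, const struct kobjm_desc *desc)
{
        struct kobjm_method *ce;

        ce = dev->ops->cache[desc->id & (KOBJM_CACHE_SIZE - 1)];
        if (ce == NULL || ce->desc != desc)
                return (-1);
        return (ce->func(dev));
}
```

If that reading holds, the indirect call is the bus poll method, and the
dequeue check itself would be the inlined compare at +118 through +133.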

This is literally just the inner loop here.

That means: virtqueue_dequeue() never returns anything other than NULL.
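
To make the "never returns anything other than NULL" case concrete, here is a
small userspace model. It assumes that the dequeue step reports "nothing yet"
whenever the driver's consumed index equals the index the device publishes in
the used ring, which is what the 16-bit compare at +129 looks like. The vqm_*
names and the spin limit are hypothetical additions so that the example
terminates; the real loop has no such limit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified used ring: idx sits at offset 2. */
struct vqm_used {
        uint16_t flags;
        uint16_t idx;           /* advanced by the device */
};

/* Hypothetical, simplified virtqueue. */
struct vqm_vq {
        struct vqm_used *used;
        uint16_t used_cons_idx; /* advanced by the driver */
        void *cookie;           /* what a successful dequeue hands back */
};

/* Return NULL while the device has published nothing new. */
static void *
vqm_dequeue(struct vqm_vq *vq)
{
        if (vq->used_cons_idx == vq->used->idx)
                return (NULL);
        vq->used_cons_idx++;
        return (vq->cookie);
}

/*
 * Same shape as the polling loop, but gives up after max_spins empty
 * dequeues so the model cannot hang. *spins reports how often it spun.
 */
static void *
vqm_poll(struct vqm_vq *vq, unsigned max_spins, unsigned *spins)
{
        void *cookie;

        *spins = 0;
        while ((cookie = vqm_dequeue(vq)) == NULL) {
                if (++*spins >= max_spins)
                        return (NULL);
        }
        return (cookie);
}
```

In this model, if used->idx never moves, vqm_poll() burns every spin it is
given; without the limit that is a 100% CPU loop, which matches the symptom.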

Why, I don't know yet. There are several virtqueue_dequeue()s running on the
system, according to dtrace, but none of them seem to be running on CPU1, where
our virtqueue_poll() is running.

My assumption here is:

rand_harvest running on CPU1 (true) → virtqueue_poll() running on CPU1 (true)
Therefore, our virtqueue_dequeue() should be running on CPU1 as well.

But I'm not seeing it on CPU1; in fact, it's running on all other CPUs
except CPU1. Either that, or I don't know how to filter for it, but this
doesn't work:

root@freebsd:~ # dtrace -n 'kinst::virtqueue_dequeue: /cpu == 1/'
dtrace: invalid probe specifier kinst::virtqueue_dequeue: /cpu == 1/: syntax
error near end of input
