RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)

From: Souradeep Chakrabarti <schakrabarti_at_microsoft.com>
Date: Mon, 08 May 2023 12:08:19 UTC


>-----Original Message-----
>From: Kyle Evans <kevans@freebsd.org>
>Sent: Thursday, April 27, 2023 5:54 AM
>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org; dev-commits-src-
>all@freebsd.org; dev-commits-src-main@freebsd.org
>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>enablement for ARM64 in Hyper-V (Part 3, final)
>
>On Wed, Apr 26, 2023 at 4:37 PM Souradeep Chakrabarti
><schakrabarti@microsoft.com> wrote:
>>
>>
>>
>>
>> >-----Original Message-----
>> >From: Souradeep Chakrabarti
>> >Sent: Thursday, April 27, 2023 2:01 AM
>> >To: 'Kyle Evans' <kevans@freebsd.org>
>> >Cc: 'Wei Hu' <whu@freebsd.org>; 'src-committers@freebsd.org' <src-
>> >committers@freebsd.org>; 'dev-commits-src-all@freebsd.org'
>> ><dev-commits-src- all@freebsd.org>;
>> >'dev-commits-src-main@freebsd.org' <dev-commits-src-
>> >main@freebsd.org>
>> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >enablement for ARM64 in Hyper-V (Part 3, final)
>> >
>> >
>> >
>> >
>> >>-----Original Message-----
>> >>From: Souradeep Chakrabarti
>> >>Sent: Wednesday, April 26, 2023 7:26 PM
>> >>To: Kyle Evans <kevans@freebsd.org>
>> >>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org;
>> >>dev-commits-src- all@freebsd.org; dev-commits-src-main@freebsd.org
>> >>Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>enablement for ARM64 in Hyper-V (Part 3, final)
>> >>
>> >>
>> >>
>> >>
>> >>>-----Original Message-----
>> >>>From: Kyle Evans <kevans@freebsd.org>
>> >>>Sent: Wednesday, April 26, 2023 3:39 AM
>> >>>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
>> >>>Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>; src-
>> >>>committers@freebsd.org; dev-commits-src-all@freebsd.org;
>> >>>dev-commits-src- main@freebsd.org
>> >>>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>>enablement for ARM64 in Hyper-V (Part 3, final)
>> [... snip ...]
>> >>>Hi,
>> >>>
>> >>>That seems odd. What happens if you bump the SYSINIT up to
>> >>>SI_SUB_SMP + 1, SI_ORDER_FIRST? We don't know for a fact that all
>> >>>APs are ready for scheduling until after smp_after_idle_runnable(),
>> >>>which is also at SI_ORDER_ANY -- maybe there's just something going
>> >>>horribly wrong.
>> >>>That would perhaps explain why it's fine on a single processor
>> >>>system, which won't do anything useful (at least in later parts of
>> >>>SI_SUB_SMP).
>> >>[Souradeep]
>> >>On ARM64 SMP (a VM with two CPUs), storvsc attach happens twice
>> >>for a single SCSI controller.
>> >>But on a similar Intel VM (two CPUs), it happens once.
>> >>For the dummy/fake storvsc on arm64, we get stuck in device_attach.
>> >>
>> >>Details:
>> >>
>> >>vmbus_scan_done() is not invoked because vmbus_add_child() has
>> >>not completed for channel 15, so vmbus_devtq still has
>> >>one task pending.
>> >>
>> >>After sending an NMI to the hung system and examining all threads:
>> >>
>> >>sched_switch() at sched_switch+0x4dc
>> >>mi_switch() at mi_switch+0x194
>> >>sleepq_switch() at sleepq_switch+0xfc
>> >>_cv_wait() at _cv_wait+0x160
>> >>_sema_wait() at _sema_wait+0x50
>> >>storvsc_attach() at storvsc_attach+0x610
>> >>device_attach() at device_attach+0x3f8
>> >>device_probe_and_attach() at device_probe_and_attach+0x7c
>> >>vmbus_add_child() at vmbus_add_child+0x64
>> >>
>> >>It is stuck in sema_wait() on request->synch_sema in
>> >>hv_storvsc_channel_init() because the sema_post() on
>> >>request->synch_sema, which would unblock it, is never invoked.
>> >>This is because we are waiting on synch_sema in
>> >>hv_storvsc_channel_init() for storvsc1, but there is no storvsc1
>> >>device, so no callback is ever called for storvsc1.
>> >>
>> >>From the ARM64 debug log:
>> >>At line 545 the SCSI device is detected again.
>> >>
>> >>      Line  370: storvsc0: Enlightened SCSI device detected
>> >>      Line  371: storvsc0: <Hyper-V SCSI> on vmbus0
>> >>      Line  406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6
>> >>      Line  421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
>> >>      Line  436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
>> >>      Line  443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
>> >>      Line  447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
>> >>      Line  545: storvsc1: Enlightened SCSI device detected
>> >>      Line  547: storvsc1: Enlightened SCSI device detected
>> >>      Line  549: storvsc1: <Hyper-V
>> >>SCSI>hv_storvsc_on_channel_callback is called
>> >>
>> >>From Log:
>> >>
>> >>unknown: device_add_child for chan15
>> >>storvsc1: Enlightened SCSI device detected
>> >>storvsc1: Enlightened SCSI device detected
>> >>storvsc1: <Hyper-V SCSI> on vmbus0
>> >>storvsc ringbuffer size: 262144, max_io: 512
>> >>storvsc1: chan15 assigned to cpu1 [vcpu1]
>> >>hn0: link state changed to UP
>> >>vmbus0: vmbus_chanmsg_handle type 0xa
>> >>storvsc1: gpadl_conn(chan15) succeeded
>> >>vmbus0: vmbus_chanmsg_handle type 0x6
>> >>storvsc1: chan15 opened
>> >>waiting on sema wait synch_sema hv_storvsc_channel_init
>> >[Souradeep] The fix is working. The test bed had an issue; after
>> >fixing that, the fix works.
>> >I will share the fix this week.
>> [Souradeep] Small update, the problem happens only if there is an
>> extra SCSI controller on the system. Then it fails to attach storvsc for that SCSI.
>>
>
>Excellent! Knowing now what configuration causes it; does it reproduce on x86 as
>well with an extra SCSI controller? I'd expect so, but maybe
[Souradeep] On x86 this problem is not seen.
After more detailed debugging, it looks like interrupts targeted at CPU1
are either not being handled, or CPU1 is not receiving the interrupt on IRQ 18, which
Hyper-V uses to notify the guest of an incoming message on any channel.
I checked vmstat -i, and vmbus appears to be using gic0,p2 on arm64.

It looks to me as though the vmbus interrupt handler is not called on CPU1
when IRQ 18 is delivered to CPU1.

Do we have anything in FreeBSD arm64 to enable an IRQ on every CPU?

# vmstat -i
interrupt                                             total       rate
gic0,p2: vmbus0                                        3820         49
gic0,p4:-ric_timer0                                    2643         34
gic0,s1: uart0                                         2990         38
cpu0:ast                                                  1          0
cpu1:ast                                                  4          0
cpu0:preempt                                           3913         50
cpu1:preempt                                           4890         62
cpu0:rendezvous                                           2          0
cpu1:rendezvous                                           4          0
cpu0:hardclock                                            1          0
Total                                                 18268        232
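
To illustrate the hypothesis above, here is a toy userspace model, not
FreeBSD or Hyper-V code. It assumes the vmbus interrupt behaves like a
per-CPU line that must be enabled separately on each CPU, and that the
attach path waits on a semaphore that only the channel callback posts.
The names ToyPerCpuIrq and attach_channel are made up for this sketch,
and the timeout stands in for what would be an indefinite sema_wait().

```python
import threading


class ToyPerCpuIrq:
    """Toy model of a per-CPU interrupt line (hypothetical, not kernel code)."""

    def __init__(self, ncpus):
        self.enabled = [False] * ncpus  # per-CPU enable state
        self.handled = [0] * ncpus      # per-CPU handler invocation count

    def enable_on(self, cpu):
        self.enabled[cpu] = True

    def raise_on(self, cpu, handler):
        # The handler runs only if the line is enabled on the target CPU.
        if not self.enabled[cpu]:
            return False
        self.handled[cpu] += 1
        handler()
        return True


def attach_channel(irq, cpu, timeout=0.1):
    """Model an attach path that waits for the channel callback to post."""
    done = threading.Semaphore(0)
    irq.raise_on(cpu, done.release)       # host signals the channel's CPU
    return done.acquire(timeout=timeout)  # False models the hung wait


irq = ToyPerCpuIrq(ncpus=2)
irq.enable_on(0)                          # line enabled on the boot CPU only
print("chan on cpu0 attached:", attach_channel(irq, cpu=0))
print("chan on cpu1 attached:", attach_channel(irq, cpu=1))
irq.enable_on(1)                          # enable the line on cpu1 as well
print("chan on cpu1 attached:", attach_channel(irq, cpu=1))
```

In this model a channel assigned to cpu1 (like chan15 in the log) only
completes its attach once the line is also enabled on cpu1. Whether the
real arm64 path fails for this reason is still to be confirmed.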
>not-
>
>Thanks,
>
>Kyle Evans