Ryzen lockup on bhyve was (Re: new Ryzen lockup issue ?)
Mike Tancsa
mike at sentex.net
Fri Feb 23 20:22:34 UTC 2018
Actually, I can confirm the same sort of hard lockup happens on my Epyc
board with RELENG11. It also happens in current. I will file a PR and
post on freebsd-current in case someone has any suggestions on how to
try and figure out what's going on.
I upgraded the box to
12.0-CURRENT #0 r329866
in order to see if it could avoid the lockup, but same deal. The vmm
driver does seem different when loaded, but the same lockup happens under load:
CPU: AMD Ryzen 5 1600X Six-Core Processor (3593.35-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x800f11 Family=0x17 Model=0x1 Stepping=1
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x7<CLZERO,IRPerf,XSaveErPtr>
SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
TSC: P-state invariant, performance statistics
AMD-Vi: IVRS Info VAsize = 64 PAsize = 48 GVAsize = 2 flags:0
driver bug: Unable to set devclass (class: ppc devname: (unknown))
ivhd0: <AMD-Vi/IOMMU ivhd with EFR> on acpi0
ivhd0: Flag:b0<IotlbSup,Coherent>
ivhd0: Features(type:0x11) MsiNumPPR = 0 PNBanks= 2 PNCounters= 0
ivhd0: Extended features[31:0]:22294ada<PPRSup,NXSup,GTSup,IASup> HATS = 0x2 GATS = 0x0 GLXSup = 0x1 SmiFSup = 0x1 SmiFRC = 0x2 GAMSup = 0x1 DualPortLogSup = 0x2 DualEventLogSup = 0x2
ivhd0: Extended features[62:32]:f77ef<USSup> Max PASID: 0x2f DevTblSegSup = 0x3 MarcSup = 0x1
ivhd0: supported paging level:7, will use only: 4
ivhd0: device range: 0x0 - 0xffff
ivhd0: PCI cap 0x190b640f at 0x40 feature:19<IOTLB,EFR,CapExt>
On 2/23/2018 12:35 PM, Nimrod Levy wrote:
> Now that is a fascinating data point. My machine that I've been having
> issues with has been running a bhyve vm from the beginning. I never
> made the connection. I'll try throwing some network traffic at the VM
> and see if I can make it lock up.
>
> On Fri, Feb 23, 2018 at 10:14 AM, Mike Tancsa <mike at sentex.net> wrote:
>
> On 2/22/2018 3:41 PM, Mike Tancsa wrote:
> > On 2/21/2018 3:04 PM, Mike Tancsa wrote:
> >> Not sure if I have found another issue specific to Ryzen, or a bug that
> >> manifests itself on Ryzen systems easier. I installed the latest
> >> virtualbox from the ports and was doing some network performance tests
> >> between a vm and the hypervisor using iperf3. The guest is just a
> >> RELENG11 image and the network is an em nic bridged to epair1b
> >
> > This looks possibly related to VirtualBox. Doing the same tests and more
> > using bhyve, I don't get any lockup. Not to mention, network IO is MUCH
> > faster.
>
>
> Actually, it just took a little bit longer to lock up the box with bhyve
> on RELENG_11 as the hypervisor. It would be great if anyone could confirm
> that this locks up their Ryzen boxes. I tried 2 different boxes to eliminate
> a hardware issue. I also tried a similar test on Ubuntu, and I can spin up
> 4 instances and run without lockups.
>
> Just grab a copy of
>
> https://download.freebsd.org/ftp/releases/VM-IMAGES/11.1-RELEASE/amd64/Latest/FreeBSD-11.1-RELEASE-amd64.raw.xz
>
> and make 2 copies, tmp.raw and tmp2.raw
>
>
> kldload vmm
> ifconfig tap0 create
> ifconfig tap1 create
> ifconfig tap1 up
> ifconfig tap0 up
> ifconfig bridge0 create addm tap0 addm tap1
> ifconfig bridge0 192.168.99.1/24
>
> screen -d -m sh /usr/share/examples/bhyve/vmrun.sh -c 4 -m 6144M -t tap0 -d tmp.raw BSD11a
> screen -d -m sh /usr/share/examples/bhyve/vmrun.sh -c 4 -m 6144M -t tap1 -d tmp2.raw BSD11b
>
> Install netperf on the 2 vms and give the vtnet interface
> 192.168.99.2/24 and 192.168.99.3/24
>
> In both VMs pkg install iperf3 and start it up as
> iperf3 -s
>
> In the hypervisor,
> iperf3 -t 10000 -R -c 192.168.99.2
> iperf3 -t 10000 -c 192.168.99.3
>
>
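For anyone who wants to try reproducing this, the host-side steps quoted above can be gathered into one script. This is only a sketch under stated assumptions: the `run` wrapper, the `DRY_RUN` variable, and the function names are illustrative additions and not part of the original steps. By default it only prints the commands so they can be reviewed first. The guest-side steps (addresses on vtnet, iperf3) are not covered.

```shell
#!/bin/sh
# Host-side reproduction steps collected in one place.
# run(), DRY_RUN, setup_host and start_guests are illustrative names,
# not part of the original steps.
# With DRY_RUN=1 (the default) each command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Load vmm and build a bridge with one tap interface per guest.
setup_host() {
    run kldload vmm
    run ifconfig tap0 create
    run ifconfig tap1 create
    run ifconfig tap1 up
    run ifconfig tap0 up
    run ifconfig bridge0 create addm tap0 addm tap1
    run ifconfig bridge0 192.168.99.1/24
}

# Start the two guests detached under screen, 4 vCPUs and 6144M each.
start_guests() {
    run screen -d -m sh /usr/share/examples/bhyve/vmrun.sh -c 4 -m 6144M -t tap0 -d tmp.raw BSD11a
    run screen -d -m sh /usr/share/examples/bhyve/vmrun.sh -c 4 -m 6144M -t tap1 -d tmp2.raw BSD11b
}

setup_host
start_guests
```

Running it with `DRY_RUN=0` as root on the hypervisor would execute the commands instead of printing them; it assumes tmp.raw and tmp2.raw are in the current directory.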
> The box locks up solid after 5-20 min. The same hardware with Ubuntu and
> VirtualBox and 4 instances works fine, with no lockups after a day, so I am
> not sure what's up, but it seems to be something with the Ryzen CPU running as
> a hypervisor or with some type of load :(
>
> Prior to lockup I had a stream of netstat -m writing to a file every 5
> seconds. The last entry was below. It doesn't seem to be a leak.
>
> Thu Feb 22 17:14:28 EST 2018
> 8694/10281/18975 mbufs in use (current/cache/total)
> 8225/5211/13436/2038424 mbuf clusters in use (current/cache/total/max)
> 8225/5184 mbuf+clusters out of packet secondary zone in use (current/cache)
> 461/3747/4208/1019211 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/301988 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/169868 16k jumbo clusters in use (current/cache/total/max)
> 20467K/27980K/48447K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 sendfile syscalls
> 0 sendfile syscalls completed without I/O request
> 0 requests for I/O initiated by sendfile
> 0 pages read by sendfile as part of a request
> 0 pages were valid at time of a sendfile request
> 0 pages were requested for read ahead by applications
> 0 pages were read ahead by sendfile
> 0 times sendfile encountered an already busy page
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
>
>
>
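The periodic `netstat -m` capture described above could look roughly like the following. This is a minimal sketch, not the exact command that was used: `log_snapshots` and its arguments are hypothetical names, and the log path in the usage note is only an example.

```shell
#!/bin/sh
# Append a timestamped snapshot of a command's output to a log file
# at a fixed interval, so the last entry before a lockup is on disk.
# log_snapshots and its arguments are illustrative names.
#
# usage: log_snapshots LOGFILE INTERVAL_SECONDS COUNT COMMAND [ARGS...]
#        COUNT=0 means run until interrupted.

log_snapshots() {
    logfile=$1
    interval=$2
    count=$3
    shift 3
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        # Write the timestamp first, then the command output.
        { date; "$@"; } >> "$logfile" 2>&1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `log_snapshots /var/tmp/netstat-m.log 5 0 netstat -m` would append a dated `netstat -m` snapshot every 5 seconds until interrupted.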
> ---Mike
>
>
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, mike at sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>
>
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike at sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada