Server doesn't boot when 3 PCIe slots are populated

Valeri Galtsev galtsev at kicp.uchicago.edu
Mon Jan 15 16:09:56 UTC 2018


On Mon, January 15, 2018 12:44 am, Grzegorz Junka wrote:
>
> On 15/01/2018 06:18, Warner Losh wrote:
>>
>>
>> On Jan 14, 2018 11:05 PM, "Grzegorz Junka" <list1 at gjunka.com> wrote:
>>
>>
>>     On 14/01/2018 16:18, Mehmet Erol Sanliturk wrote:
>>
>>
>>
>>         On Sun, Jan 14, 2018 at 5:46 PM, Grzegorz Junka
>>         <list1 at gjunka.com> wrote:
>>
>>
>>             On 13/01/2018 17:56, Mehmet Erol Sanliturk wrote:
>>
>>
>>
>>                 On Sat, Jan 13, 2018 at 7:21 PM, Grzegorz Junka
>>                 <list1 at gjunka.com> wrote:
>>
>>                     Hello,
>>
>>                     I am installing a FreeBSD server based on Supermicro
>>                     H8SML-iF. There are three PCIe slots, in which I
>>                     installed 2 NVMe drives and one Intel I350-T4 network
>>                     card (with 4 Ethernet ports).
>>
>>                     I am observing strange behavior: the system doesn't
>>                     boot if all three PCIe slots are populated. It shows
>>                     this message:
>>
>>                     nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
>>                     nvme0: controller ready did not become 1 within 30000 ms
>>                     nvme0: did not complete shutdown within 5 seconds of notification
>>
>>                     Then I see a kernel panic/dump and the system reboots
>>                     after 15 seconds.
>>
>>                     If I remove one card, either one of the NVMe drives
>>                     or the network card, the system boots fine. Also, if
>>                     I set PnP OS to YES in the BIOS, then it sometimes
>>                     boots (but not always). If I set PnP OS to NO and all
>>                     three cards are installed, the system never boots.
>>
>>                     When the system boots OK I can see that the network
>>                     card is reported as 4 separate devices on one of the
>>                     PCIe slots. I tried different NVMe drives as well as
>>                     changing which device is installed in which slot, but
>>                     the result seems to be the same in every case.
>>
>>                     What may be the issue? The amount of power drawn by
>>                     the hardware? Too many devices for the motherboard to
>>                     support? Too many interrupts for the FreeBSD kernel
>>                     to handle?
>>
>>                     Any help would be greatly appreciated.
>>
>>                     GregJ
>>
>>
>>
>>
>>
>>
>>                 From my experience with other branded mainboards, one
>>                 thing to do is to check the manual of your server board
>>                 for rules about the use of these slots. Sometimes
>>                 differently shaped slots are wired to the same ports: if
>>                 one slot is occupied, the other must be left empty. There
>>                 may also be rules against inserting certain kinds of
>>                 devices, for example graphics cards, into a given slot.
>>
>>
>>                 Mehmet Erol Sanliturk
>>
>>
>>             I checked the manual but couldn't find any restrictions
>>             regarding PCIe ports. It only says how many lanes are
>>             available in each slot. Would there be any obvious BIOS
>>             setting that could cause this issue? I tried after resetting
>>             the BIOS to default settings, but maybe something is set
>>             incorrectly by default?
>>
>>             GregJ
>>
>>
>>
>>
>>
>>         http://www.supermicro.com/Aplus/motherboard/Opteron3000/SR56x0/H8SML-iF.cfm
>>         H8SML-iF
>>
>>
>>         On the above page , click "OS Compatibility"
>>
>>
>>         On the following page , click "SR5650"
>>
>>         http://www.supermicro.com/Aplus/support/resources/OS/OS_Comp_SR5650.cfm
>>         OS Compatibility Chart
>>
>>
>>         In the third column, for
>>
>>         H8SML-7F
>>         H8SML-7
>>         H8SML-iF
>>         H8SML-i
>>
>>         only these are listed:
>>
>>         FreeBSD 8.0
>>         FreeBSD 9.1
>>
>>         From this list, it appears that this motherboard is old and
>>         that OS versions newer than the ones listed have not been
>>         tested on it.
>>
>>
>>         To check the interaction between the operating system and your
>>         Supermicro H8SML-iF, pick one of the operating systems tested
>>         on this board (Unix-class OSes are more suitable) and try to
>>         boot it with all your components installed. If it boots
>>         successfully, it means that there is an incompatibility between
>>         your FreeBSD and the main board. If none of them boots, then
>>         you may conclude that there is a problem in your settings.
>>
>>
>>         BIOS settings are important because the OS communicates with
>>         the main board through these settings.
>>
>>
>>         In the manual (downloaded from the above page: Manual Revision
>>         1.0c, release date March 12, 2014), page 4-9 describes "PCI/PnP
>>         Configuration". If YES is selected for PnP, the OS adjusts some
>>         device settings. If NO is selected, the BIOS adjusts them. When
>>         the settings adjusted by the BIOS do not conform to the OS
>>         parameters, the result will be "FAIL".
>>
>>         Therefore, the more suitable selection is YES.
>>
>>
>>         Another point is that there are many more BIOS-selectable
>>         parameters and jumpers for the PCI slots and other components.
>>         There are some BIOS settings for the PCI slots:
>>
>>         PCI X4 Slot 6 ( page 4-9 )
>>         PCI x8 Slot 7 ( page 4-10 )
>>
>>
>>
>>         Please review these BIOS settings in your manual and set them
>>         according to your requirements.
>>
>>
>>     Thanks Mehmet for looking into this. It's an old motherboard but
>>     my point is that it boots fine when either: one NVMe and the
>>     network card, or both NVMe are installed, but not when all three
>>     are installed. How would that be related to FreeBSD compatibility?
>>     The chipset and all devices that I am trying to install are
>>     supported by FreeBSD 11.x.
>>
>>     I just tried booting into a Debian live system and it also didn't
>>     enumerate NVMe drives properly. This means that it's not FreeBSD
>>     related and is no longer relevant for this list. I will try to
>>     play with BIOS settings to see if I can make it work that way.
>>     Thanks for all the help.
>>
>>
>>
>> NVMe drives are weird about power. I distrust the power estimate of
>> 5-9 W earlier in the thread... given the oddity with Debian, it's not
>> too crazy to think that. How far does FreeBSD boot though?
>>
>
> I tried with a different power supply but the outcome was exactly the
> same.

I assume the different power supply is rated for some 30% more power.
Otherwise this test is inconclusive.

> Sometimes FreeBSD boots fine but one of the NVMe drives is not
> visible (i.e. dmesg grep shows only one NVMe). When it doesn't work it
> boots up to the point of enumerating drives (SATA, USB, NVMe). Then it
> stops at the first NVMe and reboots.

It sounds like the two NVMe cards cause this problem whenever both are
plugged in, regardless of the third card. Am I right? If that is not so,
i.e. if the two NVMe cards do work when the third card is absent, then it
may have something to do with the usage of PCI address space and an
inability to allocate it for whatever reason.
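
To compare address ranges between a boot that works and one that doesn't,
something like the following could pull the memory window of each NVMe
controller out of saved dmesg output. This is only a rough sketch: it
assumes the attach lines look like the nvme0 line quoted earlier in this
thread, and the function name nvme_windows and the sample text are made up
for illustration.

```python
import re

# Matches FreeBSD attach lines of the form quoted earlier in this thread:
#   nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
# (Assumption: other systems or drivers may format this line differently.)
ATTACH_RE = re.compile(
    r"^(nvme\d+): <[^>]*> mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+) irq \d+"
)


def nvme_windows(dmesg_text):
    """Return {controller: (start, end, size_in_bytes)} for NVMe attach lines."""
    found = {}
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line.strip())
        if m:
            start = int(m.group(2), 16)
            end = int(m.group(3), 16)
            # The range is inclusive, so the window size is end - start + 1.
            found[m.group(1)] = (start, end, end - start + 1)
    return found


# Sample input built from the two lines quoted in the original report.
sample = (
    "nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 "
    "at device 0.0 on pci1\n"
    "nvme0: controller ready did not become 1 within 30000 ms\n"
)

for name, (start, end, size) in nvme_windows(sample).items():
    print(f"{name}: {start:#x}-{end:#x} ({size // 1024} KiB)")
```

If a controller is missing from the result on a failed boot, or two boots
show different ranges, that would at least hint at where the allocation
goes wrong.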

Anyway, at this point I would experiment more by booting live CDs/DVDs of
different systems in the configuration in which FreeBSD has the problem:
Linux (Debian, CentOS, Fedora, Ubuntu, Knoppix; any one of them will
probably speak for all of them, as they share the Linux kernel), OpenBSD,
NetBSD, MS Windows (you don't have to install the last, or have a license
or register it; just test whether you can boot it off an installation or
recovery disk).

Good luck!

Valeri

>
> The funny thing is that very often it's enough to pull out one of the
> cards and put it back in. Then the system boots fine with all three
> cards. I had that a few times. Once it's booted it works; I can restart
> the system and it boots every time. As soon as I power off, unplug from
> the mains, wait a few minutes and power it on again, the issue
> comes back: it can't boot because the NVMe drives can't be enumerated.
>
> I thought it might be caused by the hardware being too cold. I once left
> the server on overnight but it didn't boot up; it kept trying and
> restarting the whole night.
>
> GregJ
>
>
> _______________________________________________
> freebsd-questions at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe at freebsd.org"
>


++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++

