PERC5 (LSI MegaSAS) Patrol Read crashes

Simon simon at optinet.com
Wed Nov 14 04:22:10 PST 2007


Do you guys perform consistency checks on your RAID5
or other redundant arrays?

-Simon


On Wed, 14 Nov 2007 10:09:55 +0000, Jason Thomson wrote:

>We have ~20 Dell servers with 6.x and Perc5.

>Automatic PR is enabled on all of them, and happens once a week
>without any problems (so far - since July).

>Obviously, from the thread, manual Patrol Reads cause a definite
>problem, but it's not clear that the initial problem had the same
>symptoms - just that there was a correlation between the automatic
>Patrol Reads and the server lockup?

>I guess it may be that under heavy load (exacerbated by PR) something
>else gets screwed up and causes the machine lockup?






>Simon wrote:

>> So everyone that uses Dell servers with Perc5 and 6.x disables automatic PR?
>> 
>> -Simon
>> 
>> --Original Message Text---
>> From: Benjie Chen
>> Date: Tue, 13 Nov 2007 17:50:41 -0500
>> 
>> If you disable PR, the problem will go away. Below are some useful
>> commands. Sean mentioned that you could disable it, then set the delay
>> so that an automatic PR starts 1 hour after you re-enable automatic PR.
>> Automatic PR does not definitely crash the system, but manual PR does.
>> So during your downtime, you could try an automatic PR...
>> 
>> # Disables patrol reads on all adapters
>> megacli -AdpPR -Dsbl -aALL
>> 
>> # Enables automatic patrol reads
>> megacli -AdpPR -EnblAuto -aALL
>> 
>> # Sets the interval for automatic reads to 1 hour - it only
>> # accepts whole numbers so that's the lowest you can go
>> megacli -AdpPR -SetDelay 1 -aALL
>> 
>> 
>> Some other useful ones:
>> 
>> # Patrol read settings and information
>> megacli -AdpPR -Info -aALL
>> 
>> # Extended information
>> megacli -AdpAllInfo -aALL
>> 
>> # Export controller's event log to file
>> megacli -AdpEventLog -IncludeDeleted -f <fileName> -aALL
>> 
>> # More logging
>> megacli -FwTermLog -Dsply -aALL
>> 
>> 
>> On 11/13/07, Simon <simon at optinet.com> wrote:
>>
>> Hello,
>> 
>> I'm just wondering, was this ever resolved? I was about to start using
>> a new 2950 with a Perc5 in it, but now I'm afraid to, as I cannot afford
>> downtime. Why this is still a Linux hack is beyond me. The way Dell
>> is doing, they ought to have a port specifically for FreeBSD.
>>
>> If I disable PR altogether (I'm not sure yet whether this is possible,
>> although I don't see why it wouldn't be), would the mentioned problem
>> go away?
>> 
>> Thank you,
>> Simon
>> 
>> 
>> On Mon, 01 Oct 2007 17:26:29 -0400, Sean McAfee wrote:
>> 
>> 
>>>John Baldwin wrote:
>>>
>>>>On Saturday 29 September 2007 09:18:17 pm Benjie Chen wrote:
>>>>
>>>>Hmm, I haven't tried with megacli, but an internal tool at work is able to 
>>>>start manual patrol reads w/o causing a crash, and I've also seen production
>>>>boxes running automatic patrol reads w/o causing crashes.  Do you have to
>>>>have a certain load before it will crash? 
>> 
>> 
>>>The crashes that we've seen in production have occurred while patrol
>>>reads kick off under moderate-to-high load, but in testing, an automatic
>>>read will complete fine.  Even with maxed-out I/O*, we haven't been able
>>>to come up with a reliable testing scenario to trigger crashes on
>>>automatic patrol reads.
>> 
>> 
>> 
>>>(*My base testing scenario involved running a pretty heavy stress [as in
>>>the program available in ports], while repeatedly copying ports & src
>>>from an NFS mount to another local mountpoint and SCPing a large file in
>>>a loop from another machine.)
>> 
>> 
>> 
>>>Sean McAfee
>>>Collaborative Fusion, Inc.
>>>smcafee at collaborativefusion.com
>>>412-422-3463 x 4025
>>>1710 Murray Avenue, Suite 320
>>>Pittsburgh, PA 15217
>> 
>> 
>> 
>> 
>> 
>>>_______________________________________________
>>>freebsd-hardware at freebsd.org mailing list
>>>http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
>>>To unsubscribe, send any mail to "freebsd-hardware-unsubscribe at freebsd.org"
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
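
The three-step sequence quoted above (disable PR, re-enable automatic
PR, set the delay) could be wrapped in a small script along these lines.
This is only a sketch: the megacli invocations are copied from the
thread, while the MEGACLI and DRY_RUN variables, the function names, and
the idea of running the steps back to back are assumptions made for
illustration. By default it only prints the commands, so nothing touches
the controller until DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: reset the patrol-read schedule using only the megacli
# invocations quoted in this thread.
# Assumptions (not from the thread): the MEGACLI and DRY_RUN variables,
# the helper names, and chaining the three steps together.

MEGACLI="${MEGACLI:-megacli}"   # binary name as written in the thread
DRY_RUN="${DRY_RUN:-1}"         # 1 = print the commands only (default)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: reset_patrol_read <delay-in-hours>
reset_patrol_read() {
    delay_hours="$1"
    # The thread notes that the delay only accepts whole numbers.
    case "$delay_hours" in
        ''|*[!0-9]*)
            echo "delay must be a whole number of hours" >&2
            return 1
            ;;
    esac
    run "$MEGACLI" -AdpPR -Dsbl -aALL &&
    run "$MEGACLI" -AdpPR -EnblAuto -aALL &&
    run "$MEGACLI" -AdpPR -SetDelay "$delay_hours" -aALL
}
```

Sourcing the file and calling "reset_patrol_read 1" prints the three
commands in order; running it with DRY_RUN=0 during a maintenance window
would execute them for real.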




