Recommendations for servers running SATA drives [hot-swap]

Miroslav Lachman 000.fbsd at quip.cz
Fri Oct 17 07:08:46 PDT 2008


Jeremy Chadwick wrote:
> On Fri, Oct 17, 2008 at 01:50:38PM +0200, Miroslav Lachman wrote:
> 
>>Jeremy Chadwick wrote:
>>
>>>On Thu, Oct 16, 2008 at 09:30:20PM +0200, Miroslav Lachman wrote:
>>>
>>>
>>>>Today I was replacing a disk in a Sun Fire X2100 M2, so I tried
>>>>hot-swapping. It was as you said: atacontrol detach ata3, replace the
>>>>HDD, atacontrol attach ata3, and the new disk is in the system. I tried
>>>>it 3 times to be sure that it was not a coincidence - no panic was
>>>>produced ;o)
>>>>So in this case, hot-swapping on a Sun Fire X2100 M2 with FreeBSD 7.0
>>>>i386 works.
>>>
>>>
>>>That's excellent news.  So it seems possibly the problem I was seeing
>>>was with "reinit" causing some sort of chaos.  I'll have to check things
>>>on my testbox here at home to see how I caused the panic last time.
>>>
>>>Thanks for providing feedback, as usual!  :-)
>>
>>Unfortunately there is one problem - I see a lot of interrupts after
>>swapping the disk (about 193k on atapci1)
>>
>>Interrupts
>>197k total
>>     ohci0 21
>>     ehci0 22
>>193k atapci1 23
>>2001 cpu0: time
>>   1 bge1 273
>>2001 cpu1: time
> 
> 
> Okay, so it looks like the interrupt rate on atapci1 after swapping is
> going crazy.  What you're showing there looks like heavily modified
> vmstat -i output.

The output shown is manually cropped from systat -vm; I'll try vmstat -i
next time. ;)

>>Full output of systat -vm 2 is attached.
>>
>>top showed 50% interrupt (CPU state) and a load of 1 until I
>>rebooted the machine (I can provide MRTG graphs). The system was not
>>under production load, but almost idle. (I will put it into production
>>tomorrow.) After the reboot, everything is OK.
> 
> 
> And this box is running the ATA patch Andrey provided, yes?

It is a clean install of FreeBSD 7.0-RELEASE-p5 amd64 without patches.

>>Can somebody test hot-swapping with SATA drives and confirm this
>>behavior? (I can't test it now, because the machine is in a datacenter)
> 
> 
> I can test it on my P4SCE box.
> 
> I'll check the interrupt rates after each step of the hot-swap to see
> if/when the problem starts.

I'll check the interrupts next time too and will post results to this 
thread.
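
A minimal sketch of how the per-step check could be scripted, in case it
helps anyone reproduce this. It assumes vmstat -i prints one line per
interrupt source with the cumulative total and the rate as the last two
columns. Because that rate column is averaged since boot, the sketch
samples the total twice and reports the difference per second instead.
The helper names irq_total and irq_delta_rate are made up for
illustration; atapci1 and ata3 are simply the names from this thread.

```shell
#!/bin/sh
# Sketch only: irq_total and irq_delta_rate are hypothetical helper names.

# Print the cumulative interrupt count for device $1, reading
# "vmstat -i"-style text on stdin (columns: name..., total, rate).
irq_total() {
    awk -v dev="$1" '
        # Match the device name as a whole field, ignoring the last two
        # (numeric) columns, then print the second-to-last column.
        { for (i = 1; i <= NF - 2; i++) if ($i == dev) { print $(NF - 1); exit } }
    '
}

# Print interrupts per second for device $1, measured over $2 seconds
# (default 5) by sampling "vmstat -i" twice.
irq_delta_rate() {
    dev=$1
    secs=${2:-5}
    before=$(vmstat -i | irq_total "$dev")
    sleep "$secs"
    after=$(vmstat -i | irq_total "$dev")
    # Treat a missing device as zero so the arithmetic stays valid.
    before=${before:-0}
    after=${after:-0}
    echo $(( (after - before) / secs ))
}

# Intended use around the swap described above (run as root):
#   irq_delta_rate atapci1      # baseline
#   atacontrol detach ata3
#   irq_delta_rate atapci1      # after detach
#   (replace the disk)
#   atacontrol attach ata3
#   irq_delta_rate atapci1      # after attach
```

Comparing the three numbers should show at which step the interrupt
rate on the controller jumps, if it does at all.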

Miroslav Lachman

