System hangs for several minutes (disk IO related)

Frank Leonhardt frank2 at fjl.co.uk
Wed Jul 31 13:28:42 UTC 2013


I don't know what kind of answer you're expecting unless it's for moral 
support or the obvious. I was thinking of buying one of these as they're 
very cheap at the moment, but decided against it because of reported 
compatibility problems. IIRC something in it was only supported up to 
FreeBSD 7.2 - the NIC, I think. If you get it working I'd be interested 
myself! I think they were commonly used for VMware but won't run version 
4.0 or later, and are therefore as desirable to that fraternity as a 
dead camel in reception.

However, I did once get the same symptoms you're reporting, and it 
turned out to be a HD that was on the way out even though it pretended 
it was fine on every test. I think it was just very slow to respond on a 
write. If the RAID is struggling to do a write I assume you'd see the 
same thing.
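
One rough way to test the slow-write theory is to time individual 
synced writes and look for outliers. The following is only a minimal 
sketch, assuming Python is available on the box; the function name 
time_synced_writes and its parameters are made up for illustration and 
are not part of any existing tool.

```python
import os
import tempfile
import time


def time_synced_writes(directory, count=20, size=1024 * 1024):
    """Write `count` blocks of `size` bytes into a temporary file in
    `directory`, calling fsync after each one, and return the list of
    per-write latencies in seconds."""
    block = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # ask the OS to push the data to the device
            latencies.append(time.perf_counter() - start)
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    return latencies
```

Calling time_synced_writes("/usr") (or whichever filesystem sits on the 
array) and printing min() and max() of the result should show whether a 
few writes take vastly longer than the rest. On a healthy disk I'd 
expect every value to be well under a second.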

If I were in your place I'd try to attach a SATA drive directly - does 
it have a SATA optical drive connection you could pinch?

Regards, Frank.

On 30/07/2013 18:19, Ewald Jenisch wrote:
> Hi,
>
> I'm seeing rather strange behavior on an HP DL585 G5 wrt. disk IO:
>
> When there's any disk io the machine completely freezes, i.e. no
> console input possible, no screen output - complete hang. After some
> minutes the box comes back to normal again - but sure enough with the
> next disk io it freezes again.
>
> To give you a typical example: While a "portsnap fetch extract" was
> running I did a "sync". Normally this should complete in a matter of
> milliseconds to seconds in the worst case - but dig this:
>
> # date;time sync;date
> Tue Jul 30 09:57:38 CEST 2013
> 0.000u 0.311s 9:54.69 0.0%      4+161k 0+1287io 0pf+0w
> Tue Jul 30 10:07:38 CEST 2013
> #
>
> No, this is not a typo - it really took nearly ten minutes (!) for the
> sync to complete. In the meantime every window and all activity
> (console, screen output etc.) is completely blocked. ('portsnap fetch
> extract' is only an example here - the lockup occurs whenever there
> is disk io, for example from tar.)
>
> We're speaking about a machine with decent hardware here, here's an
> excerpt from "dmesg":
>
> ------------------------------ < Cut here > ------------------------------
>
> FreeBSD 9.2-BETA2 #0 r253750: Mon Jul 29 11:07:04 CEST 2013
>      root at sniff-rz2:/usr/obj/usr/src/sys/GENERIC amd64
> gcc version 4.2.1 20070831 patched [FreeBSD]
> CPU: Quad-Core AMD Opteron(tm) Processor 8358 SE (2411.16-MHz K8-class CPU)
>    Origin = "AuthenticAMD"  Id = 0x100f23  Family = 0x10  Model = 0x2  Stepping = 3
>    Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>    Features2=0x802009<SSE3,MON,CX16,POPCNT>
>    AMD Features=0xee400800<SYSCALL,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
>    AMD Features2=0x7ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS>
>    TSC: P-state invariant
> real memory  = 137438953472 (131072 MB)
> avail memory = 132973432832 (126813 MB)
> Event timer "LAPIC" quality 400
> ACPI APIC Table: <HP     ProLiant>
> FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
> ...
> ciss0: <HP Smart Array P400> port 0x3000-0x30ff mem 0xd9e00000-0xd9efffff,0xd9df0000-0xd9df0fff irq 16 at device 0.0 on pci8
> ciss0: PERFORMANT Transport
> ...
> da0 at ciss0 bus 0 scbus2 target 0 lun 0
> da0: <COMPAQ RAID 1(1+0) OK> Fixed Direct Access SCSI-5 device
> da0: 135.168MB/s transfers
> da0: Command Queueing enabled
> da0: 139979MB (286677120 512 byte sectors: 255H 32S/T 35132C)
> da0: quirks=0x1<NO_SYNC_CACHE>
>
> ------------------------------ < Cut here > ------------------------------
>
> Kernel: Latest kernel as of yesterday (9.2Beta)
>
> BIOS: is at the latest level (Support pack as of Spring 2013)
> installed which updated BIOS, iLO etc. Aside from that I reset BIOS to
> default values just to be sure.
>
> SmartArray P400 - Firmware 7.24 (latest)
>
> Harddisks: Two 146GB HDs running in Raid1-mode.  Already tried
> hot-swapping the disks - didn't change anything.
>
> Needless to say - no error messages in either dmesg or
> /var/log/messages :-(
>
> To me it looks like this is some sort of timing problem - but where
> should I start looking?
>
> Thanks much in advance for any help,
> -ewald