vinum performance

Pete Carah pete at ns.altadena.net
Wed Apr 2 10:57:26 PST 2003


This whole thing takes me back to my old SGI days; we had an array
on one machine that was meant to stream uncompressed HDTV data (this
runs about 1 Gbit/sec in plain RGB; the SGI video adapters wanted
padding to 32 bits/pixel, so it comes out to around 1.2-1.4 Gbit/sec).
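The padded rate follows directly from the pixel-width ratio; here is a quick sketch of that arithmetic, taking the ~1 Gbit/sec base figure from the text above as given:

```python
# Quick check of the padded data rate.  The 1 Gbit/sec base figure for
# plain 24-bit RGB comes from the post; the rest is arithmetic.
base_rate = 1.0e9                    # ~1 Gbit/sec at 24 bits/pixel RGB
padded_rate = base_rate * 32 / 24    # pad each pixel out to 32 bits
# padded_rate is about 1.33 Gbit/sec, inside the 1.2-1.4 range quoted
```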

RAID 5 was not a consideration: with the controllers in question it
was faster to just telecine the film again than to do a parity
recovery (film is a *wonderful* storage medium!!).  The write-speed
demands were pretty strict too, even though the telecine fed a single
HIPPI channel and so ran a bit slower than the playout speed.  At
least it was a step (drum) telecine, so it didn't care about missing
the frame rate.

The array was 40 drives on 4 fiber-channel controllers.

The stripe parameters were chosen to match the size of a video frame
(about 150-160 MB for color) to the size of one stripe across the whole
array; a little padding was needed to make this come out even, since
stripe units must be multiples of 512 bytes.  (And, following some of
Greg's other hints: you get some seek independence, and save a lot of
other overhead such as OS DMA setup, by making the cross-controller
index vary fastest and the in-controller index slowest.)
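A minimal sketch of that stripe-sizing constraint, assuming the round numbers above; the function name and the 150 MB example frame size are mine, not any real tool's:

```python
# Hypothetical sketch: pick the smallest per-drive stripe unit (a
# multiple of the 512-byte sector) such that one full stripe across
# all drives holds an entire frame.
SECTOR = 512

def stripe_unit(frame_bytes: int, n_drives: int, sector: int = SECTOR) -> int:
    per_drive = -(-frame_bytes // n_drives)    # ceiling division
    return -(-per_drive // sector) * sector    # round up to a sector multiple

frame = 150_000_000                  # illustrative ~150 MB color frame
unit = stripe_unit(frame, 40)        # per-drive stripe unit, 40 drives
padding = unit * 40 - frame          # the "little padding" to come out even
```

With these numbers the unit lands at 3,750,400 bytes and the padding at 16,000 bytes; real frame sizes would change the figures but not the shape of the calculation.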

This stripe scheme is *very* particular to one kind of performance
optimization (BIG, specific-I/O-size streaming); it would be terrible
for usenet, for example.  You could take it as one extreme, with
transaction-database storage probably the other: there, reliability is
often judged more important than raw speed, and transactions generally
fit in one I/O request.  Also, the read part of the transaction can be
cached easily, so the write only involves steps 3 and 4 of the RAID-5
steps mentioned before.  Remember the 3-way tradeoff mentioned earlier
in this thread...
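For reference, here is a hedged sketch of the classic RAID-5 small-write ("read-modify-write") sequence those step numbers presumably refer to; the four-step ordering is my assumption, since the earlier message isn't quoted here:

```python
# Classic RAID-5 small write: (1) read old data, (2) read old parity,
# (3) write new data, (4) write new parity.  If the old data is already
# cached, as in the transaction case above, step 1 can be skipped.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def new_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    # new parity = old parity XOR old data XOR new data
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

# Toy 3-data-disk stripe: parity is the XOR of all data blocks.
d = [bytes([1, 2]), bytes([3, 4]), bytes([5, 6])]
p = xor_blocks(xor_blocks(d[0], d[1]), d[2])

nd = bytes([7, 8])               # overwrite the second data block
p2 = new_parity(d[1], nd, p)     # matches a full recompute of parity
```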

And at least as of 2 years ago, none of the major RAID cabinet vendors
made (stock) arrays optimized for this kind of streaming performance;
they all aimed at database customers.

This was on a CrayLink Challenge machine with a 2-3 Gbit/sec backplane
and memory, btw; the drive array was set up as JBOD with XFS software
raid.  Lucky I didn't have to pay for it :-)

(And you had to turn off XFS journaling and other such things that
could get you without your knowing quite why...)  Fortunately the SGI
graphics folk furnished scripts that normally got this right.  We often
needed to restripe the array for each transfer, and always newfs to get
the sequential write properties right.

-- Pete


More information about the freebsd-stable mailing list