HPC and ZFS
Michael Fuckner
michael at fuckner.net
Mon Feb 6 17:25:07 UTC 2012
On 02/06/2012 05:41 PM, Freddie Cash wrote:
Hi all,
> On Mon, Feb 6, 2012 at 8:22 AM, Jeremy Chadwick
> <freebsd at jdc.parodius.com> wrote:
>>> On Mon, Feb 06, 2012 at 04:52:11PM +0100, Peter Ankerstål wrote:
>>> I want to investigate if it is possible to create your own usable
>>> HPC storage using zfs and some network filesystem like nfs.
especially HPC sounds interesting to me, but for HPC you typically need
fast r/w access for all nodes in the cluster. That's why Lustre spreads
data across several storage servers for concurrent access over a fast
link (typically Infiniband).
Another thing to think about is CPU: you probably need weeks for a
rebuild of a single disk in a petabyte filesystem. I haven't tried this
with ZFS yet, but I'm really interested to hear whether anyone already has.
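A rough back-of-envelope check (my own numbers, not from the thread): a purely sequential pass over one 3 TB disk at an assumed ~100 MB/s sustained takes only about 8 hours, so a rebuild measured in weeks would have to come from the resilver being random-I/O bound rather than sequential:

```shell
# Hypothetical resilver estimate; 100 MB/s is an assumed sustained rate.
disk_bytes=$((3 * 1000 * 1000 * 1000 * 1000))  # 3 TB, decimal
rate_bytes_per_sec=$((100 * 1000 * 1000))      # ~100 MB/s
seconds=$((disk_bytes / rate_bytes_per_sec))
hours=$((seconds / 3600))
echo "best-case sequential rebuild: ${hours} hours"
```

In practice a ZFS resilver walks the block tree, so on a fragmented pool the effective rate can be far below the sequential number.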
The whole setup sounds a little bit like the system shown by Aberdeen:
http://www.aberdeeninc.com/abcatg/petabyte-storage.htm
schematics at Tom's Hardware:
http://www.tomshardware.de/fotoreportage/137-Aberdeen-petarack-petabyte-sas.html
The problem with Aberdeen is that they don't use a ZIL or L2ARC.
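For reference, adding a mirrored SLOG (separate ZIL) and L2ARC devices to an existing pool is one command each; the pool and device names below (tank, da0-da3) are made up for illustration:

```shell
# Hypothetical pool/device names; adjust to your hardware.
# Mirrored SLOG from two SSDs:
zpool add tank log mirror da0 da1
# L2ARC cache devices (cache vdevs cannot be mirrored; they are striped):
zpool add tank cache da2 da3
# Verify the log and cache vdevs appear:
zpool status tank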
>>> Just a thought experiment..
>>> A machine with two 6-core Xeons (3.46 GHz, 12 MB cache) and 192 GB of RAM (or more).
>>> In addition the machine will use 3-6 SSD drives for ZIL and 3-6 SSD
>>> drives for cache.
>>> Preferably in mirrors where applicable.
>>>
>>> Connected to this machine we will have about 410 3TB drives to give approx
>>> 1PB of usable storage in an 8+2 raidz configuration.
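The capacity figure roughly checks out: with 410 drives in 10-wide raidz2 (8 data + 2 parity) vdevs, only the 8 data disks per vdev count toward usable space. A sketch, ignoring metadata overhead and the TB/TiB distinction:

```shell
# Back-of-envelope usable capacity for the proposed layout.
drives=410
vdev_width=10        # 8 data + 2 parity (raidz2)
data_per_vdev=8
tb_per_drive=3
vdevs=$((drives / vdev_width))
usable_tb=$((vdevs * data_per_vdev * tb_per_drive))
echo "${vdevs} vdevs, ~${usable_tb} TB usable"
```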
I don't know what the situation is in the rest of the world, but 3TB
drives are currently still hard to buy in Europe/Germany.
>>> Connected to this will be an ~800-node HPC cluster that will
>>> access the storage in parallel
What is your typical load pattern?
>>> is this even possible or do we need to distribute the meta data load
>>> over many servers?
It is a good idea to have the metadata load distributed.
>>> If that is the case,
>>> does it exist any software for FreeBSD that could accomplish this
>>> distribution (pNFS doesn't seem to be
>>> anywhere close to usable in FreeBSD) or do I need to call NetApp or
>>> Panasas right away?
Not that I know of.
> SuperMicro H8DGi-F supports 256 GB of RAM using 16 GB modules (16 RAM
> slots). It's an AMD board, but there should be variants that support
> Intel CPUs. It's not uncommon to support 256 GB of RAM these days,
> although 128 GB boards are much more common.
Current Intel CPUs have 3 memory channels per socket.
With 2 sockets, 3 channels, and 2 DIMMs per channel, you get 12 DIMMs;
with cheap 16GB modules that is 192GB. 32GB modules are also available today ;-)
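The DIMM arithmetic above, spelled out (assuming the Westmere-era figure of 3 channels per socket):

```shell
# Maximum RAM from fully populated channels (assumed platform numbers).
sockets=2
channels_per_socket=3
dimms_per_channel=2
gb_per_dimm=16
dimms=$((sockets * channels_per_socket * dimms_per_channel))
total_gb=$((dimms * gb_per_dimm))
echo "${dimms} DIMMs x ${gb_per_dimm} GB = ${total_gb} GB"
```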
>> - How you plan on getting roughly 410 hard disks (or 422 assuming
>> an additional 12 SSDs) hooked up to a single machine
>
> In a "head node" + "JBOD" setup? Where the head node has a mobo that
> supports multiple PCIe x8 and PCIe x16 slots, and is stuffed full of
> 16-24 port multi-lane SAS/SATA controllers with external ports that
> are cabled up to external JBOD boxes. The SSDs would be connected to
> the mobo SAS/SATA ports.
>
> Each JBOD box contains nothing but power, SAS/SATA backplane, and
> harddrives. Possibly using SAS expanders.
If you use Supermicro, I would use the X8DTH-iF, some LSI HBAs (9200-8e, 2x
multilane external) and some JBOD chassis (like the Supermicro 847E16-RJBOD1).
Regards,
Michael!
More information about the freebsd-fs mailing list