Hard disk bottle neck.

Matthew Seaman m.seaman at infracaninophile.co.uk
Sun Sep 28 12:30:10 UTC 2008


Danny Do wrote:
> Hi guys,
>
> I have had this problem for years but haven't found a way to solve it.
> 
> I have a file server handling large files from 1MByte to 1GByte. 
> 
> Server Info:
> FreeBSD 6.2 
> Apache 2.2.9
> 
> DELL PowerEdge 1850
> 2GB RAM (only 184MB is active)
> 6x300MB SCSI 10K RPM RAID5
> Gigabit Ethernet Connection
> 
> My server can output NO MORE than 60Mbps (read only). 
> 
> The bottleneck is the hard disk. If I use ONE connection to download a
> file from my server, the speed can go up to about 400Mbps.
> 
> If I let visitors download using multiple connections, the server cannot
> output more than 60Mbps. 
> 
> My service is similar to rapidshare/megaupload, I am wondering how they
> configure their servers?
> 
> If I recall correctly, it doesn't cost much time to read the data from the
> disk but it does cost a lot of time to seek to the data. Correct me if I am
> wrong, but if I increase the read buffer size, there would be fewer disk
> seeks (disk accesses). Let's say the read buffer is 64K: if I increase it
> to 640K, the number of disk seeks would drop by 90%. Thus, more data can be
> read from the hard drive.
> 
> What should I do now?
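
As a rough check on the read-buffer reasoning quoted above, here is a
back-of-envelope sketch in Python.  It models a single spindle where every
read costs one seek plus the transfer time for the chunk.  The 8 ms seek
time and 60 MB/s sequential rate are illustrative assumptions, not
measurements from this server, and the model ignores RAID5 striping, the
OS buffer cache and request queueing.

```python
def random_read_throughput(chunk_bytes, seek_s=0.008,
                           transfer_bytes_per_s=60e6):
    """Bytes/second when each chunk read pays one seek plus transfer time.

    seek_s and transfer_bytes_per_s are assumed figures for illustration.
    """
    transfer_s = chunk_bytes / transfer_bytes_per_s
    return chunk_bytes / (seek_s + transfer_s)


# Compare the two buffer sizes mentioned in the question.
for kib in (64, 640):
    rate = random_read_throughput(kib * 1024)
    print(f"{kib:4d} KiB reads: {rate * 8 / 1e6:6.1f} Mbit/s")
```

Under these assumptions larger reads do help, but by less than a factor
of ten: the seek cost is amortised over more data while the transfer time
per read grows along with the buffer.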

Try some different webservers. Apache is great, but it is designed to
be maximally flexible and capable of doing anything you can imagine
rather than to be absolutely as fast as possible.

There are some light-weight servers which have put work into optimizing
delivery of static content -- usually spoken of in the context of serving 
images but any static files will be suitable material.  Personally, I 
really like nginx for this.  Lots of people go for lighttpd and there are
a number of other alternatives in ports.

Also, depending on exactly how much content you have to serve and whether
certain items are very much more popular than others, a reverse proxy /
memory cache (a.k.a. HTTP accelerator) may help.  Varnish is the obvious
candidate here, but you'll have to experiment a bit to see what the optimal
settings are and if it actually helps at all.

If your website runs using a scripting language such as PHP, then another
possibility is memcached -- although described as a cache for dynamically
generated pages, it can cache just about anything, but you will need some
sort of scripting language to interface to it from your web server.  There
are memcached APIs for all popular languages and probably a few you've 
never heard of...

The various caching strategies basically work because they keep recently
accessed files in RAM, avoiding an expensive round-trip to the HDD to
retrieve the data (memory access takes nanoseconds to microseconds; disk
accesses take milliseconds).  Of course, the OS itself also does exactly
the same thing in a general way, and FreeBSD is already very good in this
respect.  Caching software, however, gives you more control over what gets
cached and for how long, enabling you to tune this specific application
for maximum performance.
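
To make the idea concrete, here is a minimal sketch in Python of a
least-recently-used file cache held in RAM.  It is only an illustration of
the principle, not how varnish, memcached or the FreeBSD buffer cache are
implemented; the FileCache class and its max_bytes parameter are
hypothetical names made up for this example.

```python
from collections import OrderedDict


class FileCache:
    """Keep recently read files in memory, evicting the least recently used."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes    # memory budget for cached file data
        self.used = 0                 # bytes currently held
        self.entries = OrderedDict()  # path -> bytes, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.entries:
            # Cache hit: served from RAM, no disk access needed.
            self.entries.move_to_end(path)
            self.hits += 1
            return self.entries[path]

        # Cache miss: pay for the disk read once.
        self.misses += 1
        with open(path, "rb") as fh:
            data = fh.read()

        # Only cache files that fit within the budget at all.
        if len(data) <= self.max_bytes:
            self.entries[path] = data
            self.used += len(data)
            # Evict least recently used entries until back under budget.
            while self.used > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used -= len(evicted)
        return data
```

Calling get() repeatedly for a popular file touches the disk only on the
first request.  Whether that helps in practice depends on how much of the
traffic goes to files small and popular enough to stay within the budget.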

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                  Kent, CT11 9PW



More information about the freebsd-questions mailing list