Software Disk Cache Limit Issues

Trent George soundsampler at sbcglobal.net
Sun Jul 6 12:07:34 PDT 2003


Hi, 
 
I have a couple of questions. I have been experimenting with FreeBSD to provide functionality similar to NetWare's large write-back disk caching.
 
Why? I have an application that does frequent, large file saves: 40 MB or more every second. I have two new systems with the great i865 chipset and its fast new on-board gigabit chips (fast northbridge connection). My ttcp performance on these machines is 120 MB/s sustained, but any disk-related writes are slowed down to disk speed (vinum RAID0 with 2 drives, ~100 MB/s down to 50 MB/s). These machines have 2 GB of RAM, running 5.1-BETA, IDE in UDMA mode.
 
Goal:
Allow huge amounts of file data to be cached, and allow burst writes (returning instantly) from TCP or local sources without being blocked by storage device speed (if RAM allows).
 
The experiment:
I wrote a little program that writes to a large file and displays wall-clock time and vfs.runningbufspace after each 1 MB block, to see where and for how long write bottlenecks occur in applications.
 
test: sysctl vfs.hirunningspace=200000000 (200 MB)
results: the system would limit to 16 MB and the app would block on wswbuf
 
test: recompile with options NSWBUF_MIN=3200 (and tweak the % of RAM used for nbuf)
results: the application would return "instantly" after 180 MB of writing :-)
downside: a second run started before vfs.runningbufspace emptied would seem to block at random places, and read delays were large, usually until all of runningbufspace emptied. The blocking seemed to be in "getbuf"; it would occur below hirunningspace, and sometimes the process waited until runningbufspace was zero.
 
observations:
It seems "write-back caching of disk data" goes to nbuf, then nswbuf, then disk.
 
comments:
It seems nswbuf uses 64 KB blocks vs. the 16 KB blocks nbuf uses.
The hardcoded high limit of 256 nswbufs limits hirunningspace.
It seems the space allocated for nswbuf is in addition to nbuf, but no statistics show it.
It seems a write needs both nbuf space and nswbuf space to cache it (can't it use the memory once instead of twice?).
 
questions:
Couldn't the async write-back wait until no reads are in the queue?
Shouldn't the amount of space allocated for nswbuf be viewable in statistics, like vfs.maxbufspace is?
It seems vfs.numdirtybuffers has no meaning in this quest.
 
Is this all too hard? (After 10 experimental kernel compiles and tweaks, and 5 hours later...)
Wouldn't this be of interest for the new machines with growing RAM and TCP speeds?
 
Ideal would be:
1/ a tunable % of RAM to use for nbuf
2/ allow nbuf to fill up with write-back blocks and trickle them out when the device has no reads queued (or is idle), governed by vfs.hidirtybuffers
3/ use the smaller nswbufs for vfs.hidirtybuffers and vfs.lodirtybuffers
 
Unfortunately my skill set is limited and I don't understand the VM well enough; it seems
/usr/src/sys/kern/vfs_bio.c has all the relevant code.
 
Any tips would be appreciated. I don't know if this message should be directed more to -scsi or -hackers; there seems to be no VM discussion list.
 
Sorry for the long post, and thanks for your time.
 
Trent George
 
 
ps: the test program reads something like the following code:
 
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/sysctl.h>

int main(void)
{
    long vt[200], vs[200];   /* per-block write time (usec) and runningbufspace */
    long val;
    size_t len, vallen;
    struct timeval tv, oldtv;
    int i, f;
    char *buf;

    vallen = sizeof(val);
    len = 1 << 20;           /* 1 MB blocks */
    buf = malloc(len);
    f = open("test.dat", O_RDWR | O_CREAT, 0644);
    gettimeofday(&oldtv, 0);
    for (i = 0; i < 180; i++) {
        write(f, buf, len);
        gettimeofday(&tv, 0);
        sysctlbyname("vfs.runningbufspace", &val, &vallen, 0, 0);
        /* elapsed microseconds, correct across second boundaries */
        vt[i] = (tv.tv_sec - oldtv.tv_sec) * 1000000L
              + (tv.tv_usec - oldtv.tv_usec);
        vs[i] = val;
        oldtv = tv;
    }
    close(f);
    for (i = 0; i < 180; i++)
        printf("%2d:%4ld:%3ld ", i, vt[i] / 1000, vs[i] / 1024 / 1024);
    printf("\n");
    return 0;
}
 
 
 


More information about the freebsd-fs mailing list