i386/68719: [usb] USB 2.0 mobil rack+ fat32 performance problem

Dominic Marks dom at goodforbusiness.co.uk
Sat May 28 04:12:26 PDT 2005


On Saturday 28 May 2005 11:36, Bruce Evans wrote:
> On Fri, 27 May 2005, Dominic Marks wrote:
> > (Posted to freebsd-fs as the PR is assigned to freebsd-usb@, but it seems
> > to be more related to the msdos filesystem than the USB system so perhaps
> > it should be reassigned?)
>
> It should be.  It is even less i386-specific than usb-specific.
>
> > I've been evaluating the performance of some usb2 hard discs with FreeBSD
> > and I found this PR (68719). The submitter is correct that performance
> > with msdosfs is severely limited.
> >
> > I tested a 'LaCie' USB2 disc:
> > ...
> > In test 1 I could not achieve any better than 5.1 MB/s on an msdosfs
> > filesystem. Using UFS2 with soft updates, a transfer rate of 22-25 MB/s
> > was possible. Both test data sets were copied from the system's ATA-100
> > disc. In both tests gstat reported the device as 100% busy at these peaks.
>
> I use the following to improve transfer rates for msdosfs.  The patch is
> for an old version so it might not apply directly.
>
> %%%
> Index: msdosfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
> retrieving revision 1.147
> diff -u -2 -r1.147 msdosfs_vnops.c
> --- msdosfs_vnops.c	4 Feb 2004 21:52:53 -0000	1.147
> +++ msdosfs_vnops.c	22 Feb 2004 07:27:15 -0000
> @@ -608,4 +622,5 @@
>   	int error = 0;
>   	u_long count;
> +	int seqcount;
>   	daddr_t bn, lastcn;
>   	struct buf *bp;
> @@ -693,4 +714,5 @@
>   		lastcn = de_clcount(pmp, osize) - 1;
>
> +	seqcount = ioflag >> IO_SEQSHIFT;
>   	do {
>   		if (de_cluster(pmp, uio->uio_offset) > lastcn) {
> @@ -718,5 +740,5 @@
>   			 */
>   			bp = getblk(thisvp, bn, pmp->pm_bpcluster, 0, 0, 0);
> -			clrbuf(bp);
> +			vfs_bio_clrbuf(bp);
>   			/*
>   			 * Do the bmap now, since pcbmap needs buffers
> @@ -767,11 +789,19 @@
>   		 * without delay.  Otherwise do a delayed write because we
>   		 * may want to write somemore into the block later.
> +		 * XXX comment not updated with code.
>   		 */
> +		if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0)
> +			bp->b_flags |= B_CLUSTEROK;
>   		if (ioflag & IO_SYNC)
> -			(void) bwrite(bp);
> -		else if (n + croffset == pmp->pm_bpcluster)
> +			(void)bwrite(bp);
> +		else if (vm_page_count_severe() || buf_dirty_count_severe())
>   			bawrite(bp);
> -		else
> -			bdwrite(bp);
> +		else if (n + croffset == pmp->pm_bpcluster) {
> +			if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0)
> +				cluster_write(bp, dep->de_FileSize, seqcount);
> +			else
> +				bawrite(bp);
> +  		} else
> +  			bdwrite(bp);
>   		dep->de_flag |= DE_UPDATE;
>   	} while (error == 0 && uio->uio_resid > 0);
> %%%

Thanks! I'll try my three tests again with this patch.
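
For the record, my write test boils down to timing a large sequential
copy. A minimal standalone version looks something like the sketch below;
the 64k block size and 256MB total are arbitrary choices of mine, not
anything specified in the PR:

%%%
/* seqwrite.c - minimal sequential write throughput test */
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define	BUFSIZE	(64 * 1024)		/* bytes per write(2) call */
#define	TOTAL	(256 * 1024 * 1024)	/* total bytes written */

int
main(int argc, char **argv)
{
	static char buf[BUFSIZE];
	struct timeval t0, t1;
	double secs;
	ssize_t n;
	size_t done;
	int fd;

	if (argc != 2)
		errx(1, "usage: seqwrite file");
	if ((fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1)
		err(1, "open");
	memset(buf, 0xa5, sizeof(buf));
	gettimeofday(&t0, NULL);
	for (done = 0; done < TOTAL; done += n) {
		if ((n = write(fd, buf, sizeof(buf))) == -1)
			err(1, "write");
	}
	if (fsync(fd) == -1)		/* count the final flush too */
		err(1, "fsync");
	gettimeofday(&t1, NULL);
	close(fd);
	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%.1f MB/s\n", TOTAL / secs / (1024.0 * 1024.0));
	return (0);
}
%%%

Run once against the msdosfs mount and once against a UFS2 mount on the
same disc; gstat in another terminal shows whether the device itself is
the bottleneck.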

> Notes:
> - The xxx_count_severe() stuff doesn't work quite right and was observed
>    to work especially badly for msdosfs in some configurations.  IIRC,
>    only configurations with a tiny block size (e.g., 512 bytes) showed
>    the problem, and the problem is more likely to be with tiny block sizes
>    actually exercising the "severe" case than with msdosfs or with the
>    tiny block sizes themselves.  The behaviour was apparently that when
>    a severe page or buf shortage develops, the above handling makes the
>    problem worse by using bawrite() instead of cluster_write().  Falling
>    back to bawrite() may have made the resource shortage non-fatal, but
>    it made the resource shortage last much longer since bawrite() was much
>    slower, even on the reasonably fast ATA drive that I was testing on.
> - Using cluster_write() in the above is not essential.  bdwrite() works
>    almost as well as, or perhaps even better than, cluster_write(),
>    provided write clustering is enabled by setting B_CLUSTEROK, since
>    when this flag is set the delayed writes are clustered when they are
>    eventually written out physically.
>
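If I read the second note correctly, the minimal variant keeps the
B_CLUSTEROK hunk but drops cluster_write() entirely, so the tail of the
write loop would become something like this (my untested reading of the
note, not a tested patch):

%%%
	if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0)
		bp->b_flags |= B_CLUSTEROK;
	if (ioflag & IO_SYNC)
		(void)bwrite(bp);
	else if (vm_page_count_severe() || buf_dirty_count_severe())
		bawrite(bp);
	else
		bdwrite(bp);	/* clustered at write-back if B_CLUSTEROK */
%%%
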
> > I have not made any tests of read performance but from looking at the
> > results I do not expect that it will be significantly better than write
> > performance. I may do some when I get more time to investigate and follow
> > up if the results are unexpected.
>
> Try it.  I would expect read performance to be much better.  If not, don't
> bother trying the above patch.  msdosfs uses read-ahead for read(), and
> this seems to work well so I haven't even tried changing it to use read
> clustering (the above only changes it to use write clustering).  This may
> depend on the drive doing read caching and not handling small block sizes
> too badly.  I mostly use ATA drives that have these properties.  Writing
>    tinygrams tends to have a relatively higher cost because write caching
>    is not enabled, so clustering can only be done by the OS.

Ok, I still have all the test equipment so I might as well do this today. I 
have ATA write caching enabled on my systems.
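
For the read side I'll time sequential reads at a few block sizes, since
the handling of small block sizes seems to be the variable here. Roughly
like the sketch below (the sizes are arbitrary, and the file must be much
larger than RAM or the later passes only measure the buffer cache):

%%%
/* seqread.c - sequential read throughput at several block sizes */
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	static char buf[1024 * 1024];
	size_t sizes[] = { 512, 4096, 65536, 1024 * 1024 };
	struct timeval t0, t1;
	double secs;
	ssize_t n;
	off_t total;
	unsigned int i;
	int fd;

	if (argc != 2)
		errx(1, "usage: seqread file");
	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		if ((fd = open(argv[1], O_RDONLY)) == -1)
			err(1, "open");
		total = 0;
		gettimeofday(&t0, NULL);
		while ((n = read(fd, buf, sizes[i])) > 0)
			total += n;
		if (n == -1)
			err(1, "read");
		gettimeofday(&t1, NULL);
		close(fd);
		secs = (t1.tv_sec - t0.tv_sec) +
		    (t1.tv_usec - t0.tv_usec) / 1e6;
		printf("%7zu bytes/read: %.1f MB/s\n", sizes[i],
		    total / secs / (1024.0 * 1024.0));
	}
	return (0);
}
%%%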

> > Hopefully this will generate some interest in the problem; it is beyond
> > my time and expertise, but it would be very nice to be able to access
> > MS-DOS formatted filesystems at a reasonable speed!
>
> Some other changes are needed for general use at a reasonable speed:
> - use VMIO for metadata.
> - don't use pessimal block allocation.  The current allocator gives
>    large inter-file fragmentation by attempting to minimise intra-file
>    fragmentation, and when the file system becomes just 1/N full the
>    attempt backfires and gives intra-file fragmentation too (files with
>    more than N clusters are very likely to be fragmented).
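
The 1/N claim can be illustrated with a toy model: occupy a random 1/N of
the clusters and measure the free runs. If used clusters end up scattered
roughly uniformly (which the allocator's inter-file fragmentation
presumably approximates over time), the mean free run comes out near N-1
clusters, so a file needing many more than N contiguous clusters has
little chance of a contiguous allocation. A throwaway simulation of mine,
not anything derived from the allocator itself:

%%%
/* fragsim.c - toy model of free-run lengths at occupancy 1/N */
#include <stdio.h>
#include <stdlib.h>

#define	CLUSTERS	1000000

int
main(int argc, char **argv)
{
	static char used[CLUSTERS];
	long i, run, runs, maxrun, freecnt;
	int n;

	n = (argc > 1) ? atoi(argv[1]) : 4;	/* file system is 1/n full */
	if (n < 2)
		n = 2;
	srandom(42);
	for (i = 0; i < CLUSTERS; i++)
		used[i] = (random() % n == 0);
	run = runs = maxrun = freecnt = 0;
	for (i = 0; i <= CLUSTERS; i++) {
		if (i < CLUSTERS && !used[i]) {
			run++;
			freecnt++;
		} else if (run > 0) {		/* a free run just ended */
			runs++;
			if (run > maxrun)
				maxrun = run;
			run = 0;
		}
	}
	printf("1/%d full: mean free run %.1f clusters, longest %ld\n",
	    n, (double)freecnt / runs, maxrun);
	return (0);
}
%%%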

Is there anyone out there who is sufficiently talented, with a strong desire
to tackle this problem? I would be happy to make the first payment or
hardware donation into a development fund to see it fixed. My resources are
limited, though, so if there are others who would like this feature, perhaps
we could combine to get a volunteer some really nice kit?

> Bruce

Thanks very much,
-- 
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.

