easy way to determine if a stream or fd is seekable

Alexander Best arundel at freebsd.org
Sun Nov 20 15:36:55 UTC 2011


On Sat Nov 19 11, Tim Kientzle wrote:
> 
> On Nov 18, 2011, at 12:31 PM, Alexander Best wrote:
> 
> > On Fri Nov 18 11, Tim Kientzle wrote:
> >> 
> >> Take a look at 
> >> 
> >> http://libarchive.googlecode.com/svn/trunk/libarchive/archive_read_open_filename.c
> >> 
> >> Especially the comments about detecting "disk-like" devices.
> >> I rewrote a bunch of this code to introduce an explicit
> >> notion of "strategy" so that we could optimize access
> >> to a variety of different devices.
> >> 
> >> This code has a notion of "disk-like" file descriptors and
> >> some optimizations for such.  There are some comments
> >> in there outlining similar optimizations that could be made
> >> for "tape-like" or "socket-like" devices.
> > 
> > great you posted that file as reference. i believe most of the stuff done there
> > should actually be done within lseek().
> 
> Libarchive runs on a lot of systems other than FreeBSD.
> FreeBSD is not the only Unix-like system with this issue,
> so that code isn't going to go out of libarchive regardless.
> 
> If you think those same ideas can be used in dd or hd
> to speed them up, please send your patches.

i'd like to propose the follow-up patch for hexdump(1). basically,
the logic behind it is this:

1) if the file argument is a fifo, pipe or socket   --  goto 4)
2) if the file argument is a tape drive             --  goto 4)
3) for all other cases try fseeko(), if that fails  --  goto 4)

4) use getchar()

you should notice a dramatic increase in speed from something like the
following:

'hexdump -s 500m -n32 /dev/random'

cheers.
alex

> 
> The key point:  You cannot unconditionally call lseek()
> to skip over data.  Instead, treat lseek() as an optimization
> that can be used under some circumstances.  The
> question then becomes one of figuring out when
> that optimization can be enabled.
> 
> Tim
> 
-------------- next part --------------
diff --git a/usr.bin/hexdump/display.c b/usr.bin/hexdump/display.c
index 991509d..8c8b065 100644
--- a/usr.bin/hexdump/display.c
+++ b/usr.bin/hexdump/display.c
@@ -35,8 +35,10 @@ static char sccsid[] = "@(#)display.c	8.1 (Berkeley) 6/6/93";
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
+#include <sys/ioctl.h>
 #include <sys/param.h>
 #include <sys/stat.h>
+#include <sys/conf.h>
 
 #include <ctype.h>
 #include <err.h>
@@ -368,7 +370,7 @@ next(char **argv)
 void
 doskip(const char *fname, int statok)
 {
-	int cnt;
+	int type;
 	struct stat sb;
 
 	if (statok) {
@@ -380,16 +382,38 @@ doskip(const char *fname, int statok)
 			return;
 		}
 	}
-	if (S_ISREG(sb.st_mode)) {
-		if (fseeko(stdin, skip, SEEK_SET))
+	if (S_ISFIFO(sb.st_mode) || S_ISSOCK(sb.st_mode)) {
+		noseek();
+		return;
+	}
+	if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) {
+		if (ioctl(fileno(stdin), FIODTYPE, &type))
 			err(1, "%s", fname);
-		address += skip;
-		skip = 0;
-	} else {
-		for (cnt = 0; cnt < skip; ++cnt)
-			if (getchar() == EOF)
-				break;
-		address += cnt;
-		skip -= cnt;
+		/*
+		 * Most tape drives don't support seeking,
+		 * yet fseeko() would succeed.
+		 */
+		if (type & D_TAPE) {
+			noseek();
+			return;
+		}
+	}
+	if (fseeko(stdin, skip, SEEK_SET)) {
+		noseek();
+		return;
 	}
+	address += skip;
+	skip = 0;
+}
+
+void
+noseek(void)
+{
+	int count;
+
+	for (count = 0; count < skip; ++count)
+		if (getchar() == EOF)
+			break;
+	address += count;
+	skip -= count;
 }
diff --git a/usr.bin/hexdump/hexdump.h b/usr.bin/hexdump/hexdump.h
index be85bd9..1d4bb85 100644
--- a/usr.bin/hexdump/hexdump.h
+++ b/usr.bin/hexdump/hexdump.h
@@ -97,6 +97,7 @@ u_char	*get(void);
 void	 newsyntax(int, char ***);
 int	 next(char **);
 void	 nomem(void);
+void	 noseek(void);
 void	 oldsyntax(int, char ***);
 size_t	 peek(u_char *, size_t);
 void	 rewrite(FS *);

