FreeBSD CVSROOT scripts and spaces in directory and file names

Ruslan Ermilov ru at FreeBSD.org
Sat Jun 21 14:23:00 PDT 2003


Hi there!

Our local CVSROOT scripts are based on a 1997 year version of FreeBSD
CVSROOT scripts, slightly modified to fit local needs, and we recently
noticed a problem with not getting email notifications if the affected
repository path has spaces in its name.  I have verified that the
modern CVSROOT scripts (which I am planning to migrate to soon) are
subject to this problem as well.

There is a couple of problems actually.

The first problem turns out to be a known limitation in cvs(1), in
a way how it constructs the argument to the log_accum.pl script
(using the %s in CVSROOT/loginfo).  With %s, both repository path
and the list of files (added, removed, or changed) in that directory
are passed as a single argument to the log_accum.pl script, separated
by a single whitespace character, which obviously creates a problem
for a parser in log_accum.pl that does a simple split().

The solution that works here and was inspired by reading the following
comment in src/contrib/cvs/src/logmsg.c, describing the % format
specifiers,

	/* All other characters, we insert an empty field (but
	   we do put in the comma separating it from other
	   fields).  This way if future CVS versions add formatting
	   characters, one can write a loginfo file which at least
	   won't blow up on an old CVS.  */
	/* Note that people who have to deal with spaces in file
	   and directory names are using space to get a known
	   delimiter for the directory name, so it's probably
	   not a good idea to ever define that as a formatting
	   character.  */

was to use %{ s} to get a known " ," delimiter.  (Chances still
are that the file/directory name will contain the " ," sequence,
but this is less likely.)

With this fix, no emails were lost anymore, but the contents were
still damaged.  If a directory name had a space in its name, the
log_accum.pl::get_revision_number() that does a simple split()
when parsing the "Repository revision" string from "cvs -Qn status",
hits the whitespace problem.  That was easy to fix, by limiting
splitting to four elements only.  Then I hit a real PROBLEM.

The log_accum.pl script parses the standard log message generated
by logmsg.c, to build three lists of added, removed, and modified
files, each hashed by the "tag" (yes, it's possible to commit to
different branches in one commit, and even to commit to a single
file twice in a single commit).  I first thought that I could just
take the list of files from the argument to the script (generated
by %{ s}), but there is no indication of "added/removed/changed",
and, more importantly, there is no "tag" information.  And the
problem with building these lists by parsing the log message is
that the lists of added, removed, and modified files are formatted
by logmsg.c as separated by a single whitespace character (and to
make it fit the 70-column width), which creates the same problem
if a file has a space in a name.

My first hack was to patch logmsg.c to have it list files one per
line, so, for example, instead of

	Modified files:
		a b c 

I have now been getting

	Modified files:
		a 
		b 
		c 

This worked well, but I wanted a real SOLUTION.  The solution
I have found is to implement the %T modifier for "loginfo"
scripts, which prints the tag (or HEAD for trunk).  Combined
with the %V and %v modifiers which print "NONE" for added
and removed files, respectively, I have been able to fix the
log_accum.pl's logic to not depend on the contents of the log
message to build lists of added, removed, and modified files.

Attached in this email are:

- a patch for src/contrib/cvs that implements the %T (tag)
  modifier for CVSROOT/loginfo scripts,

- a patch for log_accum.pl and loginfo that uses %T to deal
  with whitespaces in directory and file names.

I'd like some input here before I submit this patch to CVS
developers.

P.S.  The patch also removes two /usr/local/bin/awake uses
from log_accum.pl, as these made it unuseable for third
party application.  Josef, Peter suggested me to ask you
to please address this particular problem with "awake",
and to make this script useful for third-parties again.


Cheers,
-- 
Ruslan Ermilov		Sysadmin and DBA,
ru at sunbay.com		Sunbay Software Ltd,
ru at FreeBSD.org		FreeBSD committer
-------------- next part --------------
Index: doc/cvs.texinfo
===================================================================
RCS file: /home/ncvs/src/contrib/cvs/doc/cvs.texinfo,v
retrieving revision 1.1.1.13
diff -u -p -r1.1.1.13 cvs.texinfo
--- doc/cvs.texinfo	2 Dec 2002 03:13:37 -0000	1.1.1.13
+++ doc/cvs.texinfo	21 Jun 2003 20:21:34 -0000
@@ -12888,6 +12888,8 @@ separators.  The format characters are:
 @table @t
 @item s
 file name
+ at item T
+tag
 @item V
 old version number (pre-checkin)
 @item v
Index: src/logmsg.c
===================================================================
RCS file: /home/ncvs/src/contrib/cvs/src/logmsg.c,v
retrieving revision 1.11
diff -u -p -r1.11 logmsg.c
--- src/logmsg.c	2 Dec 2002 03:17:48 -0000	1.11
+++ src/logmsg.c	21 Jun 2003 20:24:12 -0000
@@ -683,6 +683,15 @@ title_proc (p, closure)
 				  strlen (str_list) + strlen (p->key) + 5);
 		    (void) strcat (str_list, p->key);
 		    break;
+		case 'T':
+		    str_list =
+			xrealloc (str_list,
+				  (strlen (str_list)
+				   + (li->tag ? strlen (li->tag) : 0)
+				   + 10)
+				  );
+		    (void) strcat (str_list, (li->tag ? li->tag : "HEAD"));
+		    break;
 		case 'V':
 		    str_list =
 			xrealloc (str_list,
@@ -767,6 +776,7 @@ logfile_write (repository, filter, messa
        `}' as separators.  The format characters are:
 
          s = file name
+	 T = tag
 	 V = old version number (pre-checkin)
 	 v = new version number (post-checkin)
 
Index: src/mkmodules.c
===================================================================
RCS file: /home/ncvs/src/contrib/cvs/src/mkmodules.c,v
retrieving revision 1.12
diff -u -p -r1.12 mkmodules.c
--- src/mkmodules.c	2 Sep 2002 05:57:13 -0000	1.12
+++ src/mkmodules.c	21 Jun 2003 20:16:33 -0000
@@ -69,6 +69,7 @@ static const char *const loginfo_content
     "# characters are:\n",
     "#\n",
     "#   s = file name\n",
+    "#   T = tag\n",
     "#   V = old version number (pre-checkin)\n",
     "#   v = new version number (post-checkin)\n",
     "#\n",
-------------- next part --------------
diff --exclude=CVS -ru CVSROOT-freebsd/loginfo CVSROOT/loginfo
--- CVSROOT-freebsd/loginfo	Fri Feb 28 05:35:48 2003
+++ CVSROOT/loginfo	Sat Jun 21 22:55:15 2003
@@ -27,4 +27,4 @@
 #DEFAULT (echo ""; id; echo %s; date; cat) >> $CVSROOT/CVSROOT/commitlog
 # or
 #DEFAULT (echo ""; id; echo %{sVv}; date; cat) >> $CVSROOT/CVSROOT/commitlog
-DEFAULT		$CVSROOT/CVSROOT/log_accum.pl %s
+DEFAULT		$CVSROOT/CVSROOT/log_accum.pl %{ TVvs}
diff --exclude=CVS -ru CVSROOT-freebsd/log_accum.pl CVSROOT/log_accum.pl
--- CVSROOT-freebsd/log_accum.pl	Fri Feb 28 20:28:11 2003
+++ CVSROOT/log_accum.pl	Sat Jun 21 23:00:25 2003
@@ -34,10 +34,7 @@
 #
 ############################################################
 my $STATE_NONE    = 0;
-my $STATE_CHANGED = 1;
-my $STATE_ADDED   = 2;
-my $STATE_REMOVED = 3;
-my $STATE_LOG     = 4;
+my $STATE_LOG     = 1;
 
 my $BASE_FN       = "$cfg::TMPDIR/$cfg::FILE_PREFIX";
 my $LAST_FILE     = $cfg::LAST_FILE;
@@ -233,7 +230,7 @@
 	while (<RCS>) {
 		if (/^[ \t]*Repository revision/) {
 			chomp;
-			my @revline = split;
+			my @revline = split(" ", $_, 4);
 			$revision = $revline[2];
 			$revline[3] =~ m|^$CVSROOT/+(.*),v$|;
 			$rcsfile = $1;
@@ -635,8 +632,7 @@
 # Initialize basic variables
 #
 my $input_params = $ARGV[0];
-my ($directory, @filenames) = split " ", $input_params;
-#@files = split(' ', $input_params);
+my ($directory, @fileinfo) = split " ,", $input_params;
 
 my @path = split('/', $directory);
 my $dir;
@@ -647,6 +643,23 @@
 }
 $dir = $dir . "/";
 
+my @filenames;
+my %added_files;		# Hashes containing lists of files
+my %changed_files;		# that have been changed, keyed
+my %removed_files;		# by branch tag.
+foreach (@fileinfo) {
+	my ($tag, $oldrev, $newrev, $filename) = split(/,/, $_, 4);
+	push @filenames, $filename;
+	if ($oldrev eq "NONE") {
+		push @{ $added_files{$tag} }, $filename;
+	} elsif ($newrev eq "NONE") {
+		push @{ $removed_files{$tag} }, $filename;
+	} else {
+		push @{ $changed_files{$tag} }, $filename;
+	}
+	&append_line($TAGS_FILE, $tag);
+}
+
 #
 # Throw some values at the developer if in debug mode
 #
@@ -697,7 +710,6 @@
 
 	&do_changes_file(@text);
 	&mail_notification(@text);
-	system("/usr/local/bin/awake", $directory);
 	&cleanup_tmpfiles();
 	exit 0;
 }
@@ -705,49 +717,26 @@
 #
 # Iterate over the body of the message collecting information.
 #
-my %added_files;		# Hashes containing lists of files
-my %changed_files;		# that have been changed, keyed
-my %removed_files;		# by branch tag.
-
 my @log_lines;			# The lines of the log message.
 
-my $tag = "HEAD";		# Default branch is HEAD.
 my $state = $STATE_NONE;	# Initially in no state.
 
 while (<STDIN>) {
 	s/[ \t\n]+$//;		# delete trailing space
 
-	# parse the revision tag if it exists.
-	if (/^Revision\/Branch:(.*)/)	{ $tag = $1;	 next; }
-	if (/^[ \t]+Tag: (.*)/)		{ $tag = $1;	 next; }
-	if (/^[ \t]+No tag$/)		{ $tag = "HEAD"; next; }
-
-	# check for a state change, guarding against similar markers
-	# in the log message itself.
-	unless ($state == $STATE_LOG) {
-		if (/^Modified Files/)	{ $state = $STATE_CHANGED; next; }
-		if (/^Added Files/)	{ $state = $STATE_ADDED;   next; }
-		if (/^Removed Files/)	{ $state = $STATE_REMOVED; next; }
-		if (/^Log Message/)	{ $state = $STATE_LOG;	   next; }
+	# don't do anything if we're not in a state.
+	if ($state == $STATE_NONE) {
+		$state = $STATE_LOG if /^Log Message/;
+		next;
 	}
 
-	# don't so anything if we're not in a state.
-	next if $state == $STATE_NONE;
-
 	# collect the log line (ignoring empty template entries)?
 	if ($state == $STATE_LOG) {
 		next if /^(.*):$/ and $cfg::TEMPLATE_HEADERS{$1};
 
 		push @log_lines, $_;
 	}
-
-	# otherwise collect information about which files changed.
-	my @files = split;
-	push @{ $changed_files{$tag} },	@files if $state == $STATE_CHANGED;
-	push @{ $added_files{$tag} },	@files if $state == $STATE_ADDED;
-	push @{ $removed_files{$tag} },	@files if $state == $STATE_REMOVED;
 }
-&append_line($TAGS_FILE, $tag);
 
 #
 # Strip leading and trailing blank lines from the log message.
@@ -896,7 +885,6 @@
 	&mail_notification(@log_msg);
 }
 
-system("/usr/local/bin/awake", $directory);
 &cleanup_tmpfiles();
 exit 0;
 # EOF
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20030622/90711b1a/attachment.bin


More information about the freebsd-hackers mailing list