svn commit: r196981 - head/usr.bin/unzip

Tim Kientzle kientzle at freebsd.org
Wed Sep 9 15:16:11 UTC 2009


Andrey Chernov wrote:
> On Tue, Sep 08, 2009 at 03:55:13PM +0000, Roman Divacky wrote:
>> +	 * Detect whether this is a text file.  ...  but libarchive
>> +	 * does not read the central directory, so we have to
>> +	 * guess ...
>> +	 */
>> +	if (a_opt && n == 0) {
>> +		for (p = buffer; p < end; ++p) {
>> +			if (!isascii((unsigned char)*p)) {
>> +				text = 0;
>> +				break;
>> +			}
>> +		}
>> +	}
>> +
> 
> If I understand the purpose of this code right, better use
> isalnum()+ispunct()+isspace()
> combination to count non-ASCII people too.
> Also setlocale() call must be added to the main() for that.

Personally, I would rather see unzip just ignore the -a
option entirely, but I suppose that's probably infeasible.

Since this is only to support -a (which does end-of-line
conversions), I would suggest using a rather different
set of heuristics that examines end-of-line sequences
and control characters only:
   * Any byte value <31 that's not CR or LF: not text
   * LF not preceded by CR: not text
   * CR not followed by LF: not text (or at least, not DOS text)
   * Otherwise, it is text.

At a minimum, this dodges the locale issue.
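
For illustration, the heuristic above could be sketched roughly as
follows.  This is only a sketch, not the committed unzip code: the
function name looks_like_dos_text() and its buffer/length interface
are made up for this example.  It follows the list literally, so a
TAB (byte value 9) counts as "not text", and a CR that happens to be
the last byte of the buffer is rejected because the following LF
cannot be seen.

```c
#include <stddef.h>

/*
 * Hypothetical helper (name and interface are assumptions).
 * Returns 1 if buf[0..len) passes the end-of-line/control-character
 * heuristic, 0 otherwise.  No locale-dependent <ctype.h> calls are used.
 */
static int
looks_like_dos_text(const char *buf, size_t len)
{
	const unsigned char *p = (const unsigned char *)buf;
	size_t i;

	for (i = 0; i < len; ++i) {
		if (p[i] == '\r') {
			/* CR not followed by LF: not (DOS) text. */
			if (i + 1 >= len || p[i + 1] != '\n')
				return (0);
			++i;	/* consume the LF of this CRLF pair */
		} else if (p[i] == '\n') {
			/* CRLF pairs are consumed above, so this LF is bare. */
			return (0);
		} else if (p[i] < 31) {
			/* Any other low control byte: not text. */
			return (0);
		}
	}
	return (1);
}
```

A caller would pass the first chunk read from the archive entry, e.g.
text = looks_like_dos_text(buffer, end - buffer), reusing the names
from the quoted diff.  Bytes above 127 are accepted unchanged, which is
why no setlocale() call is needed.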

Someday, I'll get around to filling in the seek support
that libarchive needs for reading central directories;
then unzip can look at the "text file" bit (which
is no more reliable than anything described above) and
this code can just go away.

Tim

