bin/145748: hexdump(1) %s format qualifier broken

Wed Apr 21 23:09:40 UTC 2010

On Wed, Apr 21, 2010 at 9:50 AM, Bruce Evans <brde at optusnet.com.au> wrote:
> On Wed, 21 Apr 2010, Garrett Cooper wrote:
>
>> On Tue, Apr 20, 2010 at 7:33 AM, Wayne Sierke <ws at au.dyndns.ws> wrote:
>> >> The fact that "%4s" fails and isn't noted in the addendum is a failure
>> >> according to the specifications of hexdump as per the manpage; "%.4s"
>> >> passing is a reasonable workaround for broken "%[:digit:]s"
>> >> functionality.
>> >
>> > I should have made my earlier reply more explicit. It doesn't seem to be
>> > a failure.
>>
>> The issue with %4s failing is still a failure. The non-issue with
>> %.4s, %0.4s etc not failing is not a failure; it's just a bit more
>> obfuscated logic.
>
> The behaviour is as documented.  %4s is an invalid format since it has
> a field width but no precision.  %.4s is a valid format since it has a
> precision.
>
>> > The part of the hexdump(1) manpage quoted previously:
>> >
>> >     o   A byte count or field precision is required for each ``s''
>> >         conversion character (unlike the fprintf(3) default which
>> >         prints the entire string if the precision is unspecified).
>>
>> That statement is misleading. It should make the above statement with
>> field width, not [field] precision. FWIW, the statement `field
>> precision' makes absolutely no sense in the terminology used by
>> printf(3), and is most likely a typo.
>
> Nothing misleading there.  The man page should and does match the code,
> which takes a field precision.  The statement `field precision' exactly
> matches printf(3) terminology.

printf(3) doesn't explicitly mention `field precision' in the manpage,
but yeah... I can kind of see the implicit details in the wording.

> I think the field width, if any, is supposed to be silently ignored,
> and the man page doesn't say enough about that, and the code may have
> bugs with it, causing the present confusion.  I haven't checked exactly
> why hexdump uses the precision and not the field width, but this
> behaviour makes some sense.  Use of the field width would pad the
> string, while use of the precision clips it, and hexdump apparently
> only supports the latter.
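
To make the pad-versus-clip distinction concrete, here is a minimal
sketch. It uses Python's % operator rather than hexdump itself, on the
assumption that Python follows the printf(3) rules for the s
conversion; the sample string is made up for illustration.

```python
# Width pads, precision clips (printf(3)-style rules for %s).
data = "abcdefgh"            # made-up sample input

padded = "%4s" % "ab"        # field width 4: short input is padded on the left
unclipped = "%4s" % data     # width is only a minimum, so nothing is cut off
clipped = "%.4s" % data      # precision 4: at most 4 characters are printed
both = "%6.4s" % data        # clip to 4 characters, then pad out to width 6

for label, value in [("%4s (short)", padded), ("%4s (long)", unclipped),
                     ("%.4s", clipped), ("%6.4s", both)]:
    print("%-12s -> %r" % (label, value))
```

Only the precision bounds how many characters are consumed, which is
the property a byte-oriented dumper needs; the width alone never
limits the input.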

bpad() in display.c handles this part. I need to figure out what's
going on there to better assess why the field width couldn't
deliver this functionality and why it instead has to be reimplemented
via a considerable amount of logic in conv.c, display.c, and parse.c.

>> And finally, yes I agree that %s is illegal because you can't qualify
>> the number of characters required for each format unit -- something
>> that's required for hexdump to function. %4s, etc with precision not
>> being specified is legal however.
>
> %4s doesn't have any precision.  I think %<something>s is supposed to
> be legal if there is a precision or a byte count.  However, without
> these, silently ignoring the field width in %4s reduces it to %s, so
> it should cause the same error as %s.
>
>> > And as observed hexdump does accept the required value when passed a
>> > "field precision" - the numeric value immediately after the period in
>> > "%.4s" (NB not a "field width" - as described in fprintf(3) and slightly
>> > more clearly in printf(3)).
>>
>> From printf(3):
>>
>>     o   An optional decimal digit string specifying a minimum field width.
>>         If the converted value has fewer characters than the field width,
>>         it will be padded with spaces on the left (or right, if the
>>         left-adjustment flag has been given) to fill out the field width.
>>
>>     o   An optional precision, in the form of a period . followed by an
>>         optional digit string.  If the digit string is omitted, the
>>         precision is taken as zero.  This gives the minimum number of
>>         digits to appear for d, i, o, u, x, and X conversions, the number
>>         of digits to appear after the decimal-point for a, A, e, E, f,
>>         and F conversions, the maximum number of significant digits for
>>         g and G conversions, or the maximum number of characters to be
>>         printed from a string for s conversions.
>>
>> Note the word `optional' in the first and second clauses. `.' isn't
>> required except to disambiguate precision from field width.
>
> The "." is part of the syntax for a precision, so it is required to specify
> a precision.
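
The syntax point can be sketched as a small parser that splits one
conversion into width and precision. This is an illustration only, not
hexdump's parse.c: the regular expression and the names parse_spec and
s_conversion_ok are hypothetical, and the check models just the
"precision required for s" half of the man page rule (it ignores the
byte-count alternative).

```python
import re

# Optional flags, optional width, optional ".precision", conversion letter.
SPEC = re.compile(r"%([-+ #0]*)(\d+)?(?:\.(\d*))?([a-zA-Z])")

def parse_spec(fmt):
    """Return (width, precision, conversion); absent parts are None."""
    m = SPEC.fullmatch(fmt)
    if m is None:
        raise ValueError("not a single conversion: %r" % fmt)
    _flags, width, prec, conv = m.groups()
    width = int(width) if width is not None else None
    if prec is not None:
        # A "." with no digits means precision zero, per printf(3).
        prec = int(prec) if prec else 0
    return width, prec, conv

def s_conversion_ok(fmt):
    """Hypothetical check: an s conversion must carry a precision."""
    _width, prec, conv = parse_spec(fmt)
    return conv != "s" or prec is not None

for fmt in ("%s", "%4s", "%.4s", "%0.4s"):
    print(fmt, parse_spec(fmt), s_conversion_ok(fmt))
```

Under this reading "%4s" parses as width 4 with no precision, so it is
rejected for the same reason as a bare "%s", while "%.4s" and "%0.4s"
both carry a precision of 4.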

    I've done a lot of mental digesting and additional reading, and
yes, I now see what you and Wayne were trying to tell me. I had
misunderstood the importance of precision for %s conversions because I
was reading "precision" in its mathematical sense, which doesn't make
sense when applied to string formats.
    Please close this bug; it's invalid.
Thanks,
-Garrett