Massive libxo-zation that breaks everything

Wed Mar 4 06:04:48 UTC 2015

On  3 Mar, David Chisnall wrote:
> On 3 Mar 2015, at 01:32, Andrey Chernov <ache at FreeBSD.org> wrote:
>> 
>> So, why you ever need to modify wc? Just load wc inside your
>> json/xml/etc writer, replacing its printf at the ld-elf.so level.
> 
> You can't get structured output from printf() because printf() takes
> unstructured input.  It's a string with some variables pasted in, but
> no awareness of context.
> 
> The libxo changes to the tools are simply marking the output as having
> a defined structure.  The library then translates this abstract
> structure into something that can be parsed easily by external tools.
> 
> If your argument is that the UNIX philosophy is simple tools doing one
> thing well, then please remember the context of this philosophy: It
> dates back to the original UNIX systems *that did not support shared
> libraries* and was an argument used to justify not implementing them. 
> This is why globbing is in the shell instead of a shared library and
> why some variant of mv *.a *.b works on every command-line interface
> except for a UNIX shell.
> 
> Even with that in mind, small changes to individual tools are a *lot*
> simpler than one massive monolithic tool that understands the output
> formats of every other tool in the base system and can transform them.
>  Why do you think a few library calls in each application is more
> complex *than an entire parser per tool*?

The proper *nix way is probably for all the tools to generate only
machine-parsable output, in XML or whatever.  When human-readable output
is desired, the output would be piped to a formatter that takes as an
argument a file describing the output format, which it would use as a
template.  The formatter should be simple because it only has to
understand two input formats: the machine-readable output common to all
the base tools, and the template file format.
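
A minimal sketch of such a formatter, under loud assumptions: the
function name humanize is hypothetical, the template is passed as a
string rather than a file, and tab-separated name=value records stand in
for XML so that the example stays short.  It only illustrates that a new
report needs a new template, not new code.

```sh
#!/bin/sh
# humanize: hypothetical stand-in for the template-driven formatter.
# stdin: one record per line, fields written as name=value, tab-separated
#        (an assumed stand-in for the real machine-readable format).
# $1:    template text containing {name} placeholders.
humanize() {
    TMPL="$1" awk -F '\t' '
    {
        line = ENVIRON["TMPL"]
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")
            if (eq == 0)
                continue                # not a name=value field, skip it
            placeholder = "{" substr($i, 1, eq - 1) "}"
            value = substr($i, eq + 1)
            # Replace every occurrence of the placeholder, scanning left
            # to right so that substituted values are never rescanned.
            out = ""
            rest = line
            while ((pos = index(rest, placeholder)) > 0) {
                out = out substr(rest, 1, pos - 1) value
                rest = substr(rest, pos + length(placeholder))
            }
            line = out rest
        }
        print line
    }'
}

# Same record, two different "reports", no recompiling:
record='proto=tcp4\tlocal=10.0.0.1.22\tstate=ESTABLISHED\n'
printf "$record" | humanize '{proto} {local} {state}'
# prints: tcp4 10.0.0.1.22 ESTABLISHED
printf "$record" | humanize '{local} is {state}'
# prints: 10.0.0.1.22 is ESTABLISHED
```

Placeholders with no matching field are left in the output untouched,
which at least makes a template/tool mismatch visible.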

netstat is a complicated example because it has a bunch of different
output formats to handle all of the different reports that it can
generate.  Multiple output template files would be required.

#!/bin/sh
[complicated argument handling here]
/usr/bin/netstat_xml "$@" | /usr/bin/humanize_xml -f "$netstatformatfile"

This scheme makes it easy to generate customized human-readable reports
because they only require new template files.  No messing around with
the source code and recompiling is necessary.  As it is, ordinary users
can't do that with netstat because it is setgid.

Unfortunately this doesn't map well to /rescue.

Translating human-readable output into a more machine-friendly form is a
lot more difficult and more likely to be buggy.  What happens when the
columns run together in the pretty human-oriented output because some of
the values got too large?  How do you do regression testing on that?
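
The run-together problem can be reproduced with a small sketch.  The
row format here is hypothetical (two right-aligned 8-character columns
with no separator), not netstat's actual format, and format_row and
count_fields are names made up for the illustration.

```sh
#!/bin/sh
# Hypothetical fixed-width report row: two right-aligned 8-character
# columns with no explicit separator between them.
format_row() {
    printf '%8s%8s\n' "$1" "$2"
}

# Count the whitespace-separated fields a naive downstream parser sees.
count_fields() {
    awk '{ print NF }'
}

format_row 1234 5678 | count_fields
# prints: 2   (padding happens to separate the columns)
format_row 12345678 87654321 | count_fields
# prints: 1   (both values fill their columns, so the fields merge)
```

A parser that passes its tests on small values silently returns wrong
data once the counters grow, which is the regression-testing problem.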