Re: My experiences with Rust

From: Isaac (.ike) Levy <ike_at_blackskyresearch.net>
Date: Mon, 25 Aug 2025 13:03:51 UTC
Hopefully fun and worthwhile side conversation, (apologies for continuing so far away from the subject line),

> On Aug 25, 2025, at 6:29 AM, Anthony Pankov <anthony.pankov@yahoo.com> wrote:
> 
> 
>>> On Aug 23, 2025, at 7:13 AM, Anthony Pankov <anthony.pankov@yahoo.com> wrote:
>>> On 23 August 2025, 1:32:25 you wrote:
>>> ..
>>>> For output-only data from kernel TO BE LOGGED, text-only format would be
>>>> strictly wanted to read/process using oldies-but-goodies tools like
>>>> less, grep, awk, sed and any other thing to handle texts.
>>> 
>>> I have a conceptual question. How to ensure reliability of recognizing
>>> events coded in log string? There is no information of all
>>> mutations which log string can take. 'grep "\d+ dropped"' will
>>> do the job well for "200 dropped" till log string has a form "dropped
>>> 100" or "150\s\sdropped". In latter cases event will be missed.
> 
>> Great question.  I'm not savvy to that '\d+' (egrep thing?), but
>> the problem you presented hits right at the core of this "text v binary output" conversation.
> 
>> Certainly, utilities may return data whose structure is not ideal
>> for every case, (even as "structure"), but depending on what needs
>> to accomplish, there are 50 years of UNIX utilities which can solve a
>> diverse number of problems users may face.
> 
>> Expanding the case you provided, lets say "myprogram" returns the
>> following "structured" output lines:
> 
>> $ myprogram
>> foo dropped 1/2 foo
>> bar sortof double! dropped bar
>> baz dropped 2/2 baz
>> bang sortof last bang
>> $
> 
> The spot point of this case  is "myprogram" with well known source . What about "outerprogram"
> or any tool from base system?
> How  to guarantee that I'll get all events of something "dropped"? Can
> I  be sure that "outerprogram" will always preserve output format and
> never write line like
> 
> "fatal: dropping all"

True, unless you wrote it, (and even then), you can't be sure "myprogram" is going to return consistent output.

This is the nature of inversion of control, and the common risk presented in any sufficiently complex system.

If one suspects the utility will output something unstructured, there is still an unbelievably large and powerful set of tools available to manage that output to satisfaction.

> 
> In this case my parser pipeline will completely miss this.

Exploring this for a moment:

If your parser pipeline has an absolute need to not miss things like "fatal: dropping all", one could simply look for the string "drop" instead of "dropped".

Given this output,

--
foo dropped 1/2 foo
bar sortof double! dropped bar
baz dropped 2/2 baz
fatal: dropping all
bang sortof last bang
--

$ myprogram | grep drop
foo dropped 1/2 foo
bar sortof double! dropped bar
baz dropped 2/2 baz
fatal: dropping all
$ 

Caught it!


Now, if the original program doesn't give enough output, (messages like "fatal: dropping all"), that could be worth investigating in the upstream program.

Next steps?
Putting on my sysadmin hat, as a user, I may investigate/replicate the event, perhaps read the man page, read the source code, and try to understand why this is happening.  Perhaps the program is doing something performance constrained and can't possibly output more?  Perhaps the program is simply being lazy in its output, perhaps begging for improvement?  Is this output a wrapper shell/other buffer issue, and not even the program at all?
Is this case something which happens so rarely, and is so expensive to improve, or so difficult to maintain long-term, that it's cleaner to just anticipate the output?

If the output is not in my power to change, yet my need still exists to catch it reliably, we all have an infinitude of tools and languages to solve output structure problems to satisfaction.
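As one small illustration of that, here is a rough sketch using awk to pull a count out of "drop" lines whether the number comes before or after the word, and with any amount of whitespace between them.  The sample lines and the function names (sample_log, drop_events) are made up for this example, they are not from any real program; lines which mention "drop" without a usable number are still printed, marked with "?".

```shell
# Hypothetical sample lines covering the log mutations raised earlier in the thread.
sample_log() {
    printf '%s\n' \
        '200 dropped' \
        'dropped 100' \
        '150  dropped' \
        'fatal: dropping all' \
        'bang sortof last bang'
}

# Print every line mentioning "drop", prefixed by a count and a tab.
# The count is the whole-number field directly before or after the first
# field starting with "drop"; if there is none, "?" is printed instead.
drop_events() {
    awk '
    /drop/ {
        count = "?"
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^drop/) {
                if (i > 1 && $(i-1) ~ /^[0-9]+$/) count = $(i-1)
                else if (i < NF && $(i+1) ~ /^[0-9]+$/) count = $(i+1)
                break
            }
        }
        print count "\t" $0
    }'
}

sample_log | drop_events
```

The point is only that awk's default field splitting already absorbs the whitespace variations, and that the "?" rows keep the unexpected lines visible instead of silently dropping them.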

Again, in any sufficiently complex system, ideal coordination between systems and subsystems is not possible.  Good software starts with trusting programs to do what they say they'll do; great software starts with a healthy mistrust, assuming programs will fail, and anticipating the common failures which affect the parts of them you depend on.
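One way to put that mistrust into a pipeline, sketched here under assumptions: describe the line formats you know about, and flag every line which matches none of them, so a new message like "fatal: dropping all" shows up as UNKNOWN rather than vanishing.  The KNOWN patterns, and the names myprogram_output and classify, are hypothetical and written only to fit the sample output used in this thread.

```shell
# Stand-in for the sample "myprogram" output used in this thread.
myprogram_output() {
    printf '%s\n' \
        'foo dropped 1/2 foo' \
        'bar sortof double! dropped bar' \
        'baz dropped 2/2 baz' \
        'fatal: dropping all' \
        'bang sortof last bang'
}

# Assumed "known good" line formats, as one extended regular expression.
KNOWN='^[a-z]+ dropped [0-9]+/[0-9]+ [a-z]+$|^[a-z]+ sortof '

# Tag each input line "ok" if it matches a known format, else "UNKNOWN".
classify() {
    awk -v known="$KNOWN" '
    $0 ~ known { print "ok\t" $0; next }
               { print "UNKNOWN\t" $0 }'
}

myprogram_output | classify
```

Piping the result through grep '^UNKNOWN' gives a list of lines worth a human look, whatever wording the upstream program comes up with next.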

--
One last thought back to PHK's original point:

If the output is in CBOR format, the entire problem above still exists, but now *requires* specific programs to read and understand, before a user or program can begin to understand it.

Best,
.ike