svn commit: r315662 - in head: contrib/bsnmp/snmp_mibII contrib/ipfilter/ipsend lib/libprocstat sys/netinet sys/sys usr.bin/netstat usr.bin/sockstat usr.bin/systat usr.sbin/tcpdrop usr.sbin/trpt

Fri Mar 24 17:45:10 UTC 2017

  John,

On Tue, Mar 21, 2017 at 02:24:30PM -0700, John Baldwin wrote:
J> > I have very much anticipated this comment from you, John.
J> > 
J> > I would like to remind you, that we have had this very exact conversation
J> > back when I removed kvm support from netstat/route.c. Let me search the
J> > archives:
J> > 
J> > https://lists.freebsd.org/pipermail/svn-src-head/2015-April/070480.html
J> > 
J> > This conversation has had a continuation on IRC, which I don't archive.
J> > 
J> > AFAIR, first I said that, with all my involvement in the networking stack,
J> > I had never experienced a need to run route stats on a core. The
J> > debugger was the only useful tool. And that opinion was seconded by
J> > other network hackers. Then we discussed that a proper tool should use
J> > dynamic type parsing and not kvm(3). You said that future gdb has python
J> > scripting and that would work fine. Meanwhile, you insisted that I restore
J> > the functionality. I resisted putting kvm(3) back into netstat/route.c, and
J> > instead I created a gdb script that prints exactly what 'netstat -anr -M core'
J> > prints. And I committed the script just to satisfy your demand:
J> > 
J> > tools/debugscripts/netstat-anr.gdb
J> > 
J> > Can you please fairly answer, have you (or anyone else) ever used the
J> > script during these 2 years?
J> 
J> You never updated crashinfo to use the script (the point of crashinfo is to
J> give an automated bit of information users can include in bug reports).
J> crashinfo came from Yahoo! where knowing the active state of the system
J> during a crash was indeed useful.  It wasn't necessarily about debugging a
J> panic in the network stack, but about obtaining information about the system
J> useful in debugging crashes in arbitrary parts of the kernel.  I don't work
J> at Y! anymore, so I'm not in the same environment.  Those things tend to be
J> more useful when dealing with a large deployment of heterogeneous systems
J> rather than doing focused development on a driver or a bunch of identical
J> systems with the same workload / role (e.g. appliances).

Since you stressed that it matters that the systems are heterogeneous, it looks
like you anticipated my reply that at Netflix we also do automated crash
collection. :)

Still, my personal experience is that when analyzing a crash, you aren't
interested in the full table, be it a routing table or a PCB list. You are
focused on the entry that crashed. This experience comes from my previous
job at Rambler, which is a Russian version of Yahoo! :)
Whenever I analyzed our internal crashes, or FreeBSD PRs, I always
PgDown-ed past these tons of information.

J> Also, the setgid thing is a red herring.  You don't need setgid to read from
J> a core, only to use kvm against a live system.  I'm all for using sysctls to
J> fetch data against a live system and only keeping kvm for use with core dumps
J> which doesn't require setgid.

Which means that if you want a tool to print out stats from a core, that should
be a separate tool. And the runtime tool, netstat, should be freed of kvm and of
the setgid bit.

Here we again come to the need for a debugger with better scripting support. What
are the expectations for the newer gdb, which has Python scripting?

-- 
Totus tuus, Glebius.