[PATCH] Netdump for review and testing -- preliminary version

Attilio Rao attilio at freebsd.org
Wed Oct 13 11:04:57 UTC 2010

2010/10/9 Robert Watson <rwatson at freebsd.org>:
> On Fri, 8 Oct 2010, Attilio Rao wrote:
>>> Netdump is composed, right now, by an userland "server" and a kernel
>>> "client". The former is run on the target machine (where the dump will
>>> phisically happen) and it is responsible for receiving  the packets
>>> containing coredumps frame and for correctly writing them on-disk. The
>>> latter is part of the kernel installed on the source machine (where the dump
>>> is initiated) and is responsible for building correctly UDP packets
>>> containing the coredump frames, pushing through the network interface and
>>> routing them appropriately.
> Hi Attilio:
> Network dumps would be a great addition to the FreeBSD debugging suite!  A
> few casual comments and questions, I need to spend some more time pondering
> the implications of the current netdump design later in the week.
> (1) Did you consider using tftp as the network dump protocol, rather than a
> custom protocol?  It's also a simple UDP-based, ACKed file transfer
> protocol, with the advantage that it's widely supported by exiting tftp
> daemons, packet sniffers, security devices, etc.  This wouldn't require
> using a stock tftpd, although that would certainly be a benefit as well.
>  The post-processing of crashdumps that seems to happen in the netdump
> server could, presumably, happen "offline" as easily...?

I don't understand the "offline" question but, really, I don't know
why tftp wasn't considered in the first place.
Let me do some small search and see how much we could benefit from it.

> (2) Have you thought about adding a checksum to the dump format, since
> packets are going over the wire on UDP, etc?  Even an MD5 sum wouldn't be
> too hard to arrange: all the code is in the kernel already, requires
> relatively little state storage, and is designed for streamed data.

Someone else already brought this point and I agree, we could use a
checksum here.

> (3) As the bounds checking/etc behavior in dumping grows more complex, it
> seems a shame to replicate it in architecture-specific code.  Could we pull
> the mediaoffset/mediasize checking into common code?  Also, a reserved
> size/offset of "0" meaning "no limit" sounds slightly worrying; it might be
> slightly more conservative to add a flags field to dumperinfo and have a
> flag indicating "size is no object" for netdump, rather than overloading the
> existing fields.

I don't agree with you here.
The real problem is that dumpsys is highly disk-specific (I've
commented about it somewhere, maybe the e-mail or maybe the commit
What we'd need is to have something like netdumpsys() which tries to
use network, but it is not short to make and the mediasize+mediaoffset
concept really rappresents an higher bound which can't be 0. I think
it is a reasonable compromise so far but it is subjected to further
improvements for sure.

> Some specific patch comments:
> + * XXX: This should be split into machdep and non-machdep parts
> What MD parts are in the file?

This is just a stale comment.

> The sysctl parts of the patch have a number of issues:

Sysctl are still not overhauled just because I'm not sure we want to
keep them. That is one of the points I raised in the main e-mail and
I'd really would like to hear opinions about if we should keep them
rather than having a setup userland process, etc.
I'm sorry about this, but please keep in mind that the file still
needs a lot of cleanup so some comments are a bit out of date and
other parts may not be still perfectly overhauled.

> +sysctl_force_crash(SYSCTL_HANDLER_ARGS)
> Does this belong in the netdump code?  We already have some of these options
> in debug.kdb.*, but perhaps others should be added there as well.

For this specific case, I'd really axe it out rather.

> +               /*
> +                * get and fill a header mbuf, then chain data as an
> extended
> +                * mbuf.
> +                */
> +               MGETHDR(m, M_DONTWAIT, MT_DATA);
> The idea of calling into the mbuf allocator in this context is just freaky,
> and may have some truly awful side effects.  I suppose this is the cost of
> trying to combine code paths in the network device driver rather than have
> an independent path in the netdump case, but it's quite unfortunate and will
> significantly reduce the robustness of netdumps in the face of, for example,
> mbuf starvation.

I'm not sure in which other way we could fix that actually. Maybe a
pre-allocated pool of mbufs?

> +       if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
> +           ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
> +           ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
> +           ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
> +               NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
> +                   "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
> +               return;
> +       }
> Are you sure you don't want to just check for ETHER here?

I'd really like to hear Ryan's or Ed's idea here.

> +       /* XXX: Probably should check if we're the recipient MAC address */
> +       /* Done ethernet processing. Strip off the ethernet header */
> Yes, quite probably.  What if you panic in promiscuous mode?
> +       /*
> +        * The first write (at offset 0) is the kernel dump header.  Flag it
> +        * for the server to treat specially.  XXX: This doesn't strip out
> the
> +        * footer KDH, although it shouldn't hurt anything.
> +        */
> The footer allows us to confirm that the tail end of a dump matches the
> beginning; while probably not strictly necessary in this scenario, it's not
> a bad idea given the lack of a checksum.

So I assume you are in favor of it, right?

> +       /* Default the nic to the first available interface */
> +       if ((ifn = TAILQ_FIRST(&ifnet)) != NULL) do {
> +               if ((ifn->if_flags & IFF_UP) == 0)
> +                       continue;
> More locking needed.

Please refer to the second patch I posted.
It should have proper locking and actually this code is just trimmed
(more locking in V_ifnet accessings, but not in this codepath).

> -       void    *if_pspare[7];
> +       struct  netdump_methods *if_ndumpfuncs; /* netdump virtual methods
> */
> +       void    *if_pspare[6];
>        int     if_ispare[4];
> In HEAD, at least, you should add your field and not use the spare.  The
> spare should only be used on a merge to a KBI-stable branch (such as 8.x).

Thanks, for picking this.

> I need to ponder your changes to ifnet and to the drivers some more; I
> recognize the benefits of maximal code alignment, but I worry a lot about
> these code paths having side effects (not least, calls into memory
> allocators, which in turn can enter VM, etc).  I wonder if, given that, we
> need to teach the mbuf allocator to be much more conservative if netdumping
> is active...

Sorry, which calls from drivers should get in the memory allocator?
Are you just referring to the mbuf headers allocation?
Changes to the drivers are mostly stright-forward -- they just try to
do polling in locked or unlocked mode, following DDB entering or not
(basically how other DDB-agnostic routines already do in FreeBSD for
the known locking problems we discussed several times in the past).

Thanks for your valuable feedback, looking forward to hear more.


Peace can only be achieved by understanding - A. Einstein

More information about the freebsd-net mailing list