Kernel panic when moving lots of data over network
Andrew Kinney
andykinney at advantagecom.net
Mon Jul 14 17:31:17 PDT 2003
On 9 Jul 2003, at 10:53, John Fox wrote:
> Strange problem on a new server we're setting up. It's very stable,
> except when moving a large amount of data onto it via the network. I
> begin moving approx 4GB of data onto it, and before the xfer can
> complete, the system panics and reboots. (I am generally able to get
> from 1 to 2 GB transferred before the panic occurs.)
>
I'm not really a kernel hacker, but I've solved lots of our own kernel
problems on 4.5-RELEASE, 4.7-RELEASE, and 4.8-RELEASE with the help
of others on this list.
We haven't had any problems exactly like what you described, but
I seem to remember some open PRs relating to SSH and/or the xl
network driver causing panics. You might want to browse through
them and see if any match your situation. FWIW, though, we run
4.8-RELEASE, SSH, and the xl driver (3com 905C-TX, I believe) on
one of our heavily used dual CPU machines and don't have any
problems, so I'd be surprised if any of those PRs had any bearing
on this. We don't do any large file transfers over SSH, though. We
usually use rsync for that since we deal with lots of little files that
get out of synch easily.
> #6 0xc021745f in xl_newbuf ()
> #7 0xc021761e in xl_rxeof ()
> #8 0xc0219296 in xl_watchdog ()
> #9 0xc01b662f in if_slowtimo ()
> #10 0xc0180799 in softclock ()
Here are some slightly educated guesses that you'll want to eliminate
one by one until you isolate the trouble:
1. My experience is that a lot of "trap 12" panics seem to come from
running out of some hard-limited kernel resource. Try logging the
output of sysctl vm.zone once a minute through cron to see if you're
bumping any of those limits. You'll also want to try logging sysctl
kvm_free in the same manner to make sure you're not running out
of KVA or KVM. Our system is set up with 2GB KVA (default is
1GB), which solved all the trap 12 issues our system was having
due to running out of KVA/KVM.
2. Check your RAM. Bad RAM caused us innumerable
headaches from seemingly random trap 12 problems on one of our
other systems. The panics usually hit on some buffer allocation,
especially when that was the primary activity in RAM. SSH is especially
sensitive to bad RAM. We could usually trigger a panic on a
system with bad RAM just by exercising SSH a bit.
3. Some unknown or known problem with the xl driver and long file
transfers over SSH. Check those PRs (sorry, don't know the
numbers off hand).
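
For the logging in item 1, something along these lines might do as a
starting point. It is only a sketch: the script name, log path, and
function names are made up for illustration, and it assumes a Python 3
interpreter is installed on the box. It simply appends the raw sysctl
output with a timestamp so you can look at the last snapshots written
before a panic.

```python
"""Append timestamped sysctl snapshots to a log file (illustrative sketch)."""
import subprocess
import sys
import time


def snapshot(name, run=subprocess.run):
    """Return the output of `sysctl <name>`, or an error note if it fails."""
    result = run(["sysctl", name], capture_output=True, text=True)
    if result.returncode != 0:
        return "ERROR: " + result.stderr.strip()
    return result.stdout.rstrip("\n")


def append_snapshots(names, log_path, run=subprocess.run, now=time.strftime):
    """Append one timestamped block per sysctl name to log_path."""
    stamp = now("%Y-%m-%d %H:%M:%S")
    # Open in append mode so earlier snapshots survive across runs.
    with open(log_path, "a") as log:
        for name in names:
            log.write(f"=== {stamp} {name}\n{snapshot(name, run)}\n")


# Hypothetical invocation: sysctl_log.py <log file> <sysctl name> [...]
if __name__ == "__main__" and len(sys.argv) > 2:
    append_snapshots(sys.argv[2:], sys.argv[1])
```

A crontab entry such as the following (paths are hypothetical) would run
it once a minute:

* * * * * /usr/local/bin/python3 /root/sysctl_log.py /var/log/sysctl-snapshot.log vm.zone

Keeping the raw output rather than parsing it avoids any assumption
about the exact vm.zone layout on your release.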
Sincerely,
Andrew Kinney
President and
Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net