repeating crashes with 8.1

Sat Oct 23 04:42:02 UTC 2010

Odd. Can you make any connection between this and the em complaints?

Jack

On Fri, Oct 22, 2010 at 6:59 PM, Mike Tancsa <mike at sentex.net> wrote:

> At 09:11 PM 10/22/2010, Mike Tancsa wrote:
>
>> At 08:01 PM 10/22/2010, Chris Morrow wrote:
>>
>>> Note, Warren and I attempted to test this this evening on a 10.04 Ubuntu
>>> box, no crashy-crashy...
>>>
>>
>>
> I was able to trigger the issue on box (c).  I was ping6ing box (a) when I
> did a hard down of (d)'s connected interface. The box then dropped to the
> debugger:
>
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x20:0xffffffff80740a50
> stack pointer           = 0x28:0xffffff800005a890
> frame pointer           = 0x28:0xffffff800005a930
>
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 12 (swi4: clock)
> [thread pid 12 tid 100007 ]
> Stopped at      in6_cksum+0x410:        movzwl  (%rsi),%r10d
> db> bt
> Tracing pid 12 tid 100007 td 0xffffff00025083e0
> in6_cksum() at in6_cksum+0x410
> icmp6_reflect() at icmp6_reflect+0x312
> icmp6_error() at icmp6_error+0x1ec
> nd6_llinfo_timer() at nd6_llinfo_timer+0x208
> softclock() at softclock+0x2a6
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop() at ithread_loop+0xb2
> fork_exit() at fork_exit+0x12a
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800005ad30, rbp = 0 ---
> db>
>
>
>
>
>> I was able to do it, but not on the box I expected.
>>
>> 4 boxes
>>
>> (a) Attacking host 2001:db8:1:1/64
>> (b) victim, not on a connected interface with (a). Outside interface - em0
>> - 2001:db8::2:1/64, inside interface - em1 - 2001:db8::3:1/64
>> (c) a host behind (b) 2001:db8::3:c/64
>> (d) a host behind (b), 2001:db8::3:d/64
>>
>>
>> hosts (c) and (d) have default gateways to (b).  (c) however, has a next
>> hop for (a) via (d).  So rather than go out its normal default gateway, it
>> takes an extra hop via (d).
>>
>> Start a ping6 from (a) to (c).  Then down (d)'s interface so that the
>> ping6 fails.  Let the ping keep running for an hour or two.  Eventually (b)
>> gets error messages like
>>
>> Oct 22 18:38:32 zoo kernel: em1: discard frame w/o packet header
>>
>> and crashes.
>>
>> Unfortunately, I thought it would be (c) that crapped out, not (b), and I
>> didn't have crash dumps enabled on the host.  Just in the process of setting
>> up a better environment.
>>
>>        ---Mike
>>
>>> -chris
>>>
>>> On 10/22/10 16:27, Joel Jaeggli wrote:
>>> > Ok I'll try testing that on some box I can reach with both hands.
>>> >
>>> > fyi nagasaki is:
>>> >
>>> > [root at nagasaki ~]# uname -a
>>> > FreeBSD nagasaki.bogus.com 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #13:
>>> > Sun May 30 22:19:23 UTC 2010
>>> > root at nagasaki.bogus.com:/usr/obj/usr/src/sys/GENERIC  i386
>>> > [root at nagasaki ~]#
>>> >
>>> >
>>> > On 10/22/10 1:17 PM, Randy Bush wrote:
>>> >>>>>>> Do you know how this panic is triggered ? Are you able to
>>> >>>>>>> create it on demand ?
>>> >>>>>>
>>> >>>>>> no i do not.  bring server up and it'll happen in half an hour.
>>> >>>>>> and the server was happy for two months.  so i am thinking hardware.
>>> >>>>>
>>> >>>>> Perhaps. The reason I ask is that I had a box go down last night with
>>> >>>>> the same set of errors.  The box has a number of ipv6 routes, but its
>>> >>>>> next hop was down and the problems started soon after. So I wonder if
>>> >>>>> it has something to do with that.  Do you have ipv6 on this box and
>>> >>>>> are all the next hop addresses correct / reachable ?
>>> >>>>>
>>> >>>>> Oct 22 02:06:02 i4 kernel: em1: discard frame w/o packet header
>>> >>>>> Oct 22 02:06:10 i4 kernel: em2: discard frame w/o packet header
>>> >>>>> Oct 22 02:06:21 i4 kernel: em1: discard frame w/o packet header
>>> >>>>
>>> >>>> it was co-incident with a border router being taken down for new router
>>> >>>> install.  that router was the v6 exit the server was using.  i have now
>>> >>>> pointed default6 to a different exit.  the server seems happy.
>>> >>>
>>> >>>
>>> >>> Are your servers still up ?  I guess the question now is how to
>>> >>> trigger this problem on demand.  Perhaps lots of inbound ipv6 traffic
>>> >>> with a bad next hop out ?  How recent are your sources ?  The kernel
>>> >>> said Oct 21st. Were the sources from then too ?
>>> >>
>>> >> yes, kernel and world from 21 oct
>>> >>
>>> >> chris had an idea on retrigger, install a static for a small dest that
>>> >> points to a hole.  send a packet to the small dest.
>>> >>
>>> >> randy
>>> >>
>>>
>>
>> --------------------------------------------------------------------
>> Mike Tancsa,                                      tel +1 519 651 3400
>> Sentex Communications,                            mike at sentex.net
>> Providing Internet since 1994                    www.sentex.net
>> Cambridge, Ontario Canada                         www.sentex.net/mike
>>
>
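
The reproduction steps quoted above (hosts (a) through (d)) can be laid out as a dry-run plan. This is only a sketch of the sequence described in the thread, not a verified trigger: nothing is executed, the commands are printed so they can be run by hand on FreeBSD hosts. The full address used for (a) and the interface name on (d) are assumptions, since the thread gives only 2001:db8:1:1/64 for (a) and does not name (d)'s interface. The function name repro_plan is hypothetical as well.

```python
# Dry-run sketch: builds the per-host command list; nothing is executed.
A_ADDR = "2001:db8:1::1"   # assumption: the thread only gives 2001:db8:1:1/64 for (a)
C_ADDR = "2001:db8::3:c"   # host (c), from the thread
D_ADDR = "2001:db8::3:d"   # host (d), from the thread
D_IFACE = "em0"            # assumption: (d)'s interface is not named in the thread


def repro_plan(a_addr=A_ADDR, c_addr=C_ADDR, d_addr=D_ADDR, d_iface=D_IFACE):
    """Return (host, command) pairs in the order the thread describes."""
    return [
        # (c) reaches (a) via (d) instead of its default gateway (b).
        ("c", f"route add -inet6 -host {a_addr} {d_addr}"),
        # Start a long-running ping6 from (a) to (c).
        ("a", f"ping6 {c_addr}"),
        # Take (d)'s interface down so the extra hop becomes unreachable.
        ("d", f"ifconfig {d_iface} down"),
    ]


if __name__ == "__main__":
    for host, cmd in repro_plan():
        print(f"({host})# {cmd}")
```

Per the thread, the ping then has to keep running for an hour or two before "discard frame w/o packet header" messages appear, so enabling crash dumps on (b) and (c) beforehand seems worthwhile.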