9.2-RELEASE Kernel panic, mbuf underflow
Shawn Wallbridge
shawn at wallbridge.net
Mon Nov 4 22:52:53 UTC 2013
On Nov 2, 2013, at 12:54 PM, John-Mark Gurney <jmg at funkthat.com> wrote:
> Shawn Wallbridge wrote this message on Tue, Oct 29, 2013 at 21:37 -0700:
>> I have a file server that keeps panic'ing with a mbuf cluster in the 17 Quadrillion range (2^64 - 2). I am pretty sure it's a buffer underflow.
>
> Ok, after tracking some stuff down, I do not think it has anything to
> do w/ mbufs, as the stats appear to be correct... The problem is that
> the mbuf clusters line takes into account the fact that some clusters
> might still be associated w/ packets (from usr.bin/netstat/mbuf.c):
>     printf("%ju/%ju/%ju/%ju mbuf clusters in use "
>         "(current/cache/total/max)\n",
>         cluster_count - packet_free, cluster_free + packet_free,
>         cluster_count + cluster_free, cluster_limit);
>
> notice how current is cluster_count - packet_free instead of something
> like cluster_count - cluster_free... And I just printed your values
> from vmcore.6, and apparently packet_count is 0, while packet_free is
> 5215...
>
> cluster_count is 2049, cluster_free is 1997..
>
> And because packet is a secondary zone of mbufs, things apparently get
> confused... So I wouldn't go down this road anymore... This looks
> like a simple race/accounting error in the status...
>
>> I have opened a PR, but I haven't had any movement on it. This happened while I was running 9.1-RELEASE as well.
>>
>> Here is the PR..
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=183424
>>
>> And I have uploaded the crash dumps here..
>>
>> http://www.wallbridge.net/crash/
>>
>> If anyone has any ideas, I would be grateful as this is a production box and it's really impacting us.
>
> Have you done a full fsck on the fs to make sure that there isn't any
> corruption on the disk that keeps popping up? I do realize that it
> will take a LONG time to fsck... Sadly, your last three cores
> (all on 9.2-R) are for different inodes...
>
> Could you tell me the path and filename of inodes: 3226539015,
> 3224134148 and 3343904256? It could help us track down which app is
> causing this and let us reproduce it...
>
> To find the inode on the fs use find <fs> -inum <inum>, so:
> find <fs> -inum 3226539015 -or -inum 3224134148 -or -inum 3343904256
>
> will do it in one pass so it won't take so long...
>
> Thanks.
>
> --
> John-Mark Gurney Voice: +1 415 225 5579
>
> "All that I will do, has been done, All that I have, has not."
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
Just wanted to update the list on this.
The machine has now crashed with the INVARIANTS kernel; the kernel dumps are here:
wallbridge.net/crash/20131104/core.txt.4.gz
wallbridge.net/crash/20131104/info.4.gz
wallbridge.net/crash/20131104/vmcore.4.gz
shawn