FFS Softdep Kernel Panic

Shawn Wallbridge shawn.wallbridge at imaginaryforces.com
Fri Nov 1 17:31:08 UTC 2013


On 11/1/13 9:56 AM, "Kirk McKusick" <mckusick at mckusick.com> wrote:

>> From: Shawn Wallbridge <shawn.wallbridge at imaginaryforces.com>
>> To: "freebsd-fs at freebsd.org" <freebsd-fs at freebsd.org>
>> Subject: FFS Softdep Kernel Panic
>> Date: Fri, 1 Nov 2013 05:21:21 +0000
>>
>> I am running a large (71TB) file (NFS w/ some Samba) server (9.2-RELEASE)
>> and it has been crashing almost daily. I have been trying to track down
>> the issue, but I haven't had any luck.
>>
>> The panic is:
>>
>> panic: handle_workitem_remove: bad file delta
>> cpuid = 9
>> KDB: stack backtrace:
>> #0 0xffffffff80947986 at kdb_backtrace+0x66
>> #1 0xffffffff8090d9ae at panic+0x1ce
>> #2 0xffffffff80b4143f at handle_workitem_remove+0x46f
>> #3 0xffffffff80b4133a at handle_workitem_remove+0x36a
>> #4 0xffffffff80b4069d at process_worklist_item+0x2bd
>> #5 0xffffffff80b450da at softdep_process_worklist+0x8a
>> #6 0xffffffff80b47a4d at softdep_flush+0x1ad
>> #7 0xffffffff808db67f at fork_exit+0x11f
>> #8 0xffffffff80cdc23e at fork_trampoline+0xe
>>
>>
>> I looked at the source for ffs_softdep.c and found this, which seems to be
>> the only place "bad file delta" shows up.
>>
>>         /*
>>          * Normal file deletion.
>>          */
>>         if ((dirrem->dm_state & RMDIR) == 0) {
>>                 ip->i_nlink--;
>>                 DIP_SET(ip, i_nlink, ip->i_nlink);
>>                 ip->i_flag |= IN_CHANGE;
>>                 if (ip->i_nlink < ip->i_effnlink)
>>                         panic("handle_workitem_remove: bad file delta");
>>                 if (ip->i_nlink == 0)
>>                         unlinked_inodedep(mp, inodedep);
>>                 inodedep->id_nlinkdelta = ip->i_nlink - ip->i_effnlink;
>>                 KASSERT(LIST_EMPTY(&dirrem->dm_jwork),
>>                     ("handle_workitem_remove: worklist not empty. %s",
>>                     TYPENAME(LIST_FIRST(&dirrem->dm_jwork)->wk_type)));
>>                 WORKITEM_FREE(dirrem, D_DIRREM);
>>                 FREE_LOCK(&lk);
>>                 goto out;
>>         }
>>
>> I have created a PR, but I haven't had any response (no one has even
>> downloaded the crash dumps I linked to).
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=183424
>>
>> Because this is a file server, in production, this is becoming a HUGE
>> problem and is costing us quite a bit of lost production each time it
>> crashes (and takes 4hrs to fsck).
>>
>> Thanks
>> shawn
>
>I have taken a look at your bug report and have a couple of questions
>about your system:
>
>Your kernel was built at the end of September. Has this problem
>persisted since that kernel was built, or has it shown up more recently?
>
>Are you running with journaled soft updates or just regular soft
>updates? You can use the mount command with no arguments to find out.
>
>       Kirk McKusick
>

Thank you.

This machine was originally built using 9.1-RELEASE, which had the problem
as well, so I updated to 9.2-RELEASE to try to resolve the issue.

I am running journaled soft updates:

/dev/da0p1 on /sam (ufs, NFS exported, local, journaled soft-updates)

I am building a kernel with invariants right now.

shawn


