12.1-RELEASE-p7 panic in zio_free_issue_4_6

Andriy Gapon avg at FreeBSD.org
Thu Oct 29 07:46:38 UTC 2020


On 29/10/2020 09:33, Christian Kratzer wrote:
> Hi,
> 
> On Thu, 29 Oct 2020, Andriy Gapon wrote:
>> On 28/10/2020 15:41, Christian Kratzer wrote:
>>> I traced things down to importing one of the zpools.
>>
>> I suspect that you have a silent corruption on that pool (perhaps because of
>> non-ECC RAM?).
> 
> This is on a DL380 G7 with 128GB of ECC RAM.  I have run memtest on this server
> before without any defects being found.
> 
> The SAS disks are on an LSI HBA. They also do not have defects according to
> smartctl.
> 
> This of course does not rule out that there might be an issue with the RAM, and
> I will need to recheck.
> 
> Also I suspect the server might not have enough RAM for doing dedup on this
> 2 x 7 disk raid-z2 of 1.2GB drives.
> 
> The pool was mostly in use for storing backups rsynced over night from two
> other servers.
> 
>> What you see can happen if a block pointer has a deduplication bit set, but the
>> block is not actually deduplicated or deduplication has never been enabled at
>> all.
> 
> Could I have run into an issue or a bug by trying to do too much dedup on this
> pool?
> 
>> It would help -- with analysis -- to get a vmcore (kernel crash dump) and to
>> install the corresponding kernel debug symbols (if not already).
> 
> I need to see why this server is not producing kernel crash dumps. My other setup
> does, so I should be able to get this done.
> 
>> As to recovery, I think that the best solution is to import the pool read-only
>> and to copy important data elsewhere.  Then re-create the pool.
> 
> I was about to do that but the crash also happens when trying to import read-only.
> 
> I will investigate if I can import based on an older snapshot or checkpoint but
> I am not sure if that will do what I want.
> 
> I will keep this pool around for a couple of days and will try to get a crash dump
> from the system.  After that I will have to delete and recreate the pool and just
> wait for backups to roll back in.


Okay, let's see if we can get a vmcore.
Otherwise, this is just guesswork on my part.
The problem could be very different from my initial impression.
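
To illustrate the "deduplication bit" mentioned above: a minimal sketch, in
Python rather than the kernel's C, of how a single flag in the block pointer's
64-bit property word decides which free path a block takes.  The bit position
(62) follows the block pointer layout in spa.h; the function names and the
"ddt"/"plain" labels are hypothetical and only serve the illustration.

```python
# Sketch only: mirrors the BF64_GET-style bit-field extraction used for
# block pointer properties. Bit 62 as the dedup flag follows the blkptr
# layout in spa.h; the helper names below are made up for illustration.

DEDUP_BIT = 62  # position of the dedup flag within the 64-bit blk_prop word


def bf64_get(word, low, length):
    """Return `length` bits of `word`, starting at bit `low`."""
    return (word >> low) & ((1 << length) - 1)


def bp_get_dedup(blk_prop):
    """Return 1 if the dedup flag is set in blk_prop, else 0."""
    return bf64_get(blk_prop, DEDUP_BIT, 1)


def free_path(blk_prop):
    """Hypothetical helper: label the free path a block pointer would take."""
    return "ddt" if bp_get_dedup(blk_prop) else "plain"


if __name__ == "__main__":
    clean = 0                      # no flags set
    flagged = 1 << DEDUP_BIT       # dedup flag set, nothing else
    print(free_path(clean), free_path(flagged))
```

If that one bit is set on a block that was never entered into the dedup table,
the free would be routed to the dedup path anyway, which is presumably where
things would go wrong.  A vmcore would show whether that is what happened here.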

-- 
Andriy



