kern/147890: [regression] ufs-related lock problem in RELENG_8
(18.04.2010 -> 20.04.2010)
Dmitry Pryanishnikov
lynx.ripe at gmail.com
Tue Jun 15 20:40:01 UTC 2010
>Number: 147890
>Category: kern
>Synopsis: [regression] ufs-related lock problem in RELENG_8 (18.04.2010 -> 20.04.2010)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Jun 15 20:40:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator: Dmitry Pryanishnikov
>Release: RELENG_8
>Organization:
Home
>Environment:
FreeBSD lynx.homenet 8.0-STABLE FreeBSD 8.0-STABLE #0: Tue Jun 15 15:05:05 EEST 2010 root at lynx.homenet:/databig/obj/databig/ftp/RELENG_8/src.100418/sys/lynx i386
>Description:
After upgrade RELENG_8 as of 10.04.2010 -> 14.06.2010 my desktop (previously running w/o problems for days) started to lock pretty fast (within 1-2 hours of uptime). I've narrowed down the time window of the problem appearance: kernel built from the sources as of date=2010.04.18.00.00.00 runs OK, one as of date=2010.04.20.00.00.00 already locks. Nothing "special" is required in order to trigger the condition: after boot I just log in as a non-root (via xdm - it starts xfce4 environment), then start the Thunar File Manager and GQview. Then I switch to text console (crash dump often doesn't complete when console is running X) and just do 'ps axww|grep D' from time to time. Eventually processes thunar-vfs-mime-cleaner-1 (stuck in uninterruptible wait state) start to accumulate. In this state, simple 'sync' also hangs the issuing shell, and (the most painful result) file operations (file creations/moves/...) don't get synched to the disk. Then deadlock resolver (after debug.deadl
kres.slptime_threshold seconds) detects the deadlock, which allows to get crash dump. I've got valid dumps both for kernel built from 14.06.2010 and from 20.04.2010 sources, they look similar. Here is the crash of the latter:
panic: deadlkres: possible deadlock detected for 0xcb52e4a0, blocked for 294235 ticks
(I've lowered debug.deadlkres.slptime_threshold manually after detecting hung thunar-vfs-mime-cleaner-1 processes)
(kgdb) print *(struct thread *)0xcb52e4a0
$1 = {td_lock = 0xc07db49c, td_proc = 0xcb5002a8, td_plist = {tqe_next = 0x0, tqe_prev = 0xcb5002b0}, td_runq = {tqe_next = 0x0, tqe_prev = 0xc07ce834}, td_slpq = {tqe_next = 0xcb52db90,
tqe_prev = 0xc7383018}, td_lockq = {tqe_next = 0x0, tqe_prev = 0x0}, td_cpuset = 0xc702ce38, td_sel = 0x0, td_sleepqueue = 0x0, td_turnstile = 0xc7ca2c00, td_umtxq = 0xcabc3940, td_tid = 100186,
td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xcb52e500}, sq_proc = 0xcb5002a8, sq_flags = 1}, td_flags = 4,
td_inhibitors = 2, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0xcb3a9df4, td_wmesg = 0xc0735016 "ufs", td_lastcpu = 2 '\002', td_oncpu = 255 'Ъ', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0',
td_locks = 0, td_rw_rlocks = 0, td_lk_slocks = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0xc0908618, td_intr_nesting_level = 0, td_pinned = 0,
td_ucred = 0xc7668580, td_estcpu = 0, td_slptick = 703512, td_blktick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 224, ru_ixrss = 12, ru_idrss = 4,
ru_isrss = 128, ru_minflt = 145, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 1, ru_nivcsw = 10}, td_incruntime = 0,
td_runtime = 23985873, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 11,
td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "initial thread\000-cle", td_fpop = 0x0, td_dbgflags = 0, td_ng_outbound = 0, td_osd = {
osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_rqindex = 32 ' ', td_base_pri = 128 '\200', td_priority = 128 '\200', td_pri_class = 3 '\003', td_user_pri = 128 '\200',
td_base_user_pri = 128 '\200', td_pcb = 0xeb6a3d90, td_state = TDS_INHIBITED, td_retval = {0, 1}, td_slpcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0,
c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, td_frame = 0xeb6a3d38, td_kstack_obj = 0xcb4ca3b8, td_kstack = 3949600768, td_kstack_pages = 2, td_unused1 = 0x0, td_unused2 = 0,
td_unused3 = 0, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xcb52e6bc, td_ar = 0x0, td_syscalls = 11802, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}},
td_dtrace = 0x0, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0}
(kgdb) print *(struct proc *)0xcb5002a8
$2 = {p_list = {le_next = 0xcabb3d48, le_prev = 0xcabb1000}, p_threads = {tqh_first = 0xcb52e4a0, tqh_last = 0xcb52e4a8}, p_slock = {lock_object = {lo_name = 0xc073b098 "process slock", lo_flags = 720896,
lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ucred = 0xc7668580, p_fd = 0xc781b900, p_fdtol = 0x0, p_stats = 0xc781ce00, p_limit = 0xc7826500, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0xcb500330, c_flags = 0, c_cpu = 0}, p_sigacts = 0xcabea000, p_flag = 268451840, p_state = PRS_NORMAL, p_pid = 2044,
p_hash = {le_next = 0x0, le_prev = 0xc6fe9ff0}, p_pglist = {le_next = 0xcabb3d48, le_prev = 0xc7800db8}, p_pptr = 0xc7043000, p_sibling = {le_next = 0xcabb3d48, le_prev = 0xc7043084}, p_children = {
lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc073b08b "process lock", lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ksi = 0xc750f410, p_sigqueue = {sq_signals = {__bits = {
0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xcb500368}, sq_proc = 0xcb5002a8, sq_flags = 1}, p_oppid = 0, p_vmspace = 0xc76febc8, p_swtick = 703502,
p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0,
ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {
rux_runtime = 23985873, rux_uticks = 0, rux_sticks = 1, rux_iticks = 0, rux_uu = 0, rux_su = 9993, rux_tu = 9993}, p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0,
rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xcb524218, p_lock = 0, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20,
p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0,
p_pendingcnt = 0, p_itimers = 0x0, p_magic = 3203398350, p_osrel = 800504, p_comm = "thunar-vfs-mime-cle", p_pgrp = 0xc7bd7040, p_sysent = 0xc07a4220, p_args = 0xc7bd6d40, p_cpulimit = 9223372036854775807,
p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc04d47e0 <knlist_mtx_lock>, kl_unlock = 0xc04d4810 <knlist_mtx_unlock>,
kl_assert_locked = 0xc04d4b60 <knlist_mtx_assert_locked>, kl_assert_unlocked = 0xc04d4b30 <knlist_mtx_assert_unlocked>, kl_lockarg = 0xcb500330}, p_numthreads = 1, p_md = {md_ldt = 0x0}, p_itcallout = {
c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 0, p_peers = 0x0, p_leader = 0xcb5002a8,
p_emuldata = 0x0, p_label = 0x0, p_sched = 0xcb500550, p_ktr = {stqh_first = 0x0, stqh_last = 0xcb500538}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0, p_pwait = {cv_description = 0xc073b8cc "ppwait",
cv_waiters = 0}}
I'm ready to run any additional kgdb commands against those crash dumps, to try patches on this desktop etc...
Here is the kernel config:
ident LYNX
makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols
options INCLUDE_CONFIG_FILE # Include this file in kernel
cpu I686_CPU # aka Pentium Pro(tm)
options SCHED_ULE
options SMP # Symmetric MultiProcessor Kernel
options PREEMPTION # Enable kernel thread preemption
options MUTEX_DEBUG
options WITNESS
device apic # I/O apic
options COMPAT_FREEBSD4
options COMPAT_FREEBSD5
options COMPAT_FREEBSD6
options COMPAT_FREEBSD7
options SYSVSHM
options SYSVSEM
options SYSVMSG
options KDB
options KDB_TRACE
options DDB
options DDB_NUMSYM
options BREAK_TO_DEBUGGER # BREAK on a serial console goes to DDB
options SYSCTL_DEBUG
options KTRACE #kernel tracing
options INVARIANTS
options INVARIANT_SUPPORT
options DEADLKRES #software deadlock resolver thread
options INET #Internet communications protocols
options INET6 #IPv6 communications protocols
options IPSEC #IP security (requires device crypto)
options IPSEC_NAT_T #NAT-T support, UDP encap of ESP
device ether #Generic Ethernet
device loop #Network loopback device
device bpf #Berkeley packet filter
device enc #IPsec interface
options IPFIREWALL #firewall
options IPFIREWALL_VERBOSE #enable logging to syslogd(8)
options IPFIREWALL_VERBOSE_LIMIT=100 #limit verbosity
options IPFIREWALL_FORWARD #packet destination changes
options IPFIREWALL_NAT #ipfw kernel nat support
options LIBALIAS # libalias library, performing NAT
options BPF_JITTER #support for BPF just-in-time compiler
options FFS #Fast filesystem
options SOFTUPDATES
options UFS_EXTATTR
options UFS_ACL
options UFS_DIRHASH
options UFS_GJOURNAL # Gjournal-based UFS journaling support.
options QUOTA #enable disk quotas
options SUIDDIR #The suiddir hack
options _KPOSIX_PRIORITY_SCHEDULING
options P1003_1B_SEMAPHORES
device pty #Pseudo ttys
options MSGBUF_SIZE=131072
device crypto # core crypto support
device pci
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
device psm # PS/2 mouse
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
device vga # VGA video card driver
device splash # Splash screen and screen saver support
options X86BIOS # x86 real mode BIOS emulator, required by atkbdc/dpms/vesa
options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
device sc
options SC_HISTORY_SIZE=20000
options SC_TWOBUTTON_MOUSE
device uart # Generic UART driver
device ata
device atadisk # ATA disk drives
options ATA_STATIC_ID # Static device numbering
device scbus # SCSI bus (required for SCSI)
device da # Direct Access (disks)
device cd # CD
device pass # Passthrough device (direct SCSI access)
options SND_PCM_64 # Process 32bit samples through 64bit integer
And the loader.conf:
autoboot_delay="5" # Delay in seconds before autobooting
beastie_disable="YES" # Turn the beastie boot menu on and off
kernel="kernel" # /boot sub-directory containing kernel and modules
kern.hz="1000" # Set the kernel interval timer rate
kern.maxdsiz="1G" # Set the max data size
kern.maxssiz="1G" # Set the max stack size
kern.maxtsiz="1G" # Set the max text size
kern.ipc.semmni=80 # Number of semaphore identifiers
kern.ipc.semmns=480 # Maximum number of semaphores in the system
hw.ata.atapi_dma=1 # Using DMA for ATAPI devices
hw.snd.verbose=2 # Make /dev/sndstat more verbose
hw.snd.maxautovchans=4 # Enable software sound mixer
hint.uart.0.flags="0x10" # device is potential system console
debug.minidump=1 # Use minidumps
debug.witness.watch=0 # Turn off the witness
ipdivert_load="YES"
accf_data_load="YES" # Wait for data accept filter
accf_dns_load="YES" # Wait for full DNS request accept filter
accf_http_load="YES" # Wait for full HTTP request accept filter
if_tap_load="YES" # Ethernet tunnel software network interface
if_tun_load="YES" # Tunnel driver (user process ppp)
if_vlan_load="YES" # IEEE 802.1Q VLAN network interface
miibus_load="YES" # miibus support, needed for some drivers
if_em_load="YES" # Intel(R) PRO/1000 Gigabit Ethernet
snd_hda_load="YES" # Intel High Definition Audio
usb_load="YES" # USB subsystem
uhci_load="YES" # UHCI USB Host Controller driver
ehci_load="YES" # USB Enhanced Host Controller driver
ugen_load="YES" # USB generic device, if all else fails ...
uhid_load="YES" # Human Interface Devices
ukbd_load="YES" # Keyboard
ulpt_load="YES" # Printer
ums_load="YES" # Mouse
umass_load="YES" # Mass Storage Devices
ucom_load="YES" # USB tty support:
uplcom_load="YES" # Prolific PL-2303/2303X/2303HX serial adapters
uftdi_load="YES" # FTDI FT2232C, FT8U100AX and FT8U232AM chips
uark_load="YES" # Arkmicro Technologies ARK3116 based USB serial adapter
snd_uaudio_load="YES" # USB audio
umodem_load="YES" # Modems
random_load="YES"
cpufreq_load="YES"
coretemp_load="YES"
ichwd_load="YES"
smbus_load="YES"
smb_load="YES"
ichsmb_load="YES"
speaker_load="YES"
atapicam_load="YES"
fdc_load="YES"
drm_load="YES"
radeon_load="YES"
vesa_load="YES"
acpi_video_load="YES"
firewire_load="YES"
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-bugs
mailing list