Possible kqueue related issue on STABLE/RC.

Jimmy Olgeni olgeni at olgeni.com
Wed Sep 11 16:01:24 UTC 2013


On Wed, 11 Sep 2013, Volodymyr Kostyrko wrote:

> 11.09.2013 18:07, Jimmy Olgeni wrote:
>
>> Perhaps I found something weird while running 9.2-RC3 FreeBSD
>> 9.2-RC3 #0 r255393 (ZFS-only setup).
>
>> Unfortunately I'm not able to get a minidump for the latest RC, but at this
>> point I suspect that something is going on with glib20 and kqueue on both
>> -STABLE and -RC.
>
> Can you spare some more info on this?

Sure, here goes:

> 1. What is your /etc/src.conf and /etc/make.conf files?

My /etc/src.conf:

===
PORTS_MODULES=emulators/virtualbox-ose-kmod sysutils/fusefs-kmod sysutils/pefs-kmod x11/nvidia-driver
===

My /etc/make.conf:

===
APACHE_PORT=www/apache22
DEFAULT_PGSQL_VER=92

WITH_NEW_XORG=yes

PERL_VERSION=5.14.4

.if (!empty(.CURDIR:M/usr/src*) || !empty(.CURDIR:M/usr/obj*))
.if !defined(NOCCACHE)
CC:=          /usr/local/libexec/ccache/world/cc
CXX:=         /usr/local/libexec/ccache/world/c++
.endif
.endif
===

> 2. Does your copy of sources has some third-party patches applied?

No extra patches were applied. For the RC tests I also removed the
whole /usr/src and checked it out from svn from scratch.

Currently I have this kernel config:

===
include		GENERIC

ident		RELENG_9

device		crypto		# core crypto support
device		cryptodev	# /dev/crypto for access to h/w
device		enc		# IPsec interface.

options 	DDB		# Enable the ddb debugger backend.

options		IPSEC		# IP security (requires device crypto)
options		IPSEC_NAT_T	# NAT-T support, UDP encap of ESP
options		IPSEC_FILTERTUNNEL	# filter ipsec packets from a tunnel

options 	SC_DFLT_FONT	# compile font in
makeoptions	SC_DFLT_FONT=cp437
options 	SC_HISTORY_SIZE=512	# number of history buffer lines
options		VGA_WIDTH90	# support 90 column modes

options 	RACCT		# Resource Accounting
options 	RCTL		# Resource Limits

# altq(9). Enable the base part of the hooks with the ALTQ option.
# Individual disciplines must be built into the base system and can not be
# loaded as modules at this point. In order to build a SMP kernel you must
# also have the ALTQ_NOPCC option.
options		ALTQ
options		ALTQ_CBQ	# Class Bases Queueing
options		ALTQ_RED	# Random Early Detection
options		ALTQ_RIO	# RED In/Out
options		ALTQ_HFSC	# Hierarchical Packet Scheduler
options		ALTQ_CDNR	# Traffic conditioner
options		ALTQ_PRIQ	# Priority Queueing
options		ALTQ_NOPCC	# Required for SMP build

options		TEKEN_UTF8
===

Also, my loader.conf:

===
autoboot_delay="5"

kern.cam.ada.legacy_aliases="0"
kern.cam.scsi_delay="1500"
net.inet.ip.fw.default_to_accept="1"
vm.pmap.pg_ps_enabled="1"

ahci_load="YES"
ipmi_load="YES"
zfs_load="YES"

geom_uzip_load="YES"

hw.memtest.tests="0"
hw.usb.no_pf="1"

vm.kmem_size_max="16G"
vm.kmem_size="12G"

vfs.root.mountfrom="zfs:rpool/zfsroot"

vfs.zfs.write_limit_override="1536M"
vfs.zfs.txg.synctime_ms="750"
vfs.zfs.vdev.min_pending="1"
vfs.zfs.vdev.max_pending="1"

kern.ipc.semmns="512"
kern.ipc.semmnu="256"
kern.ipc.shmmni="256"
kern.ipc.shmseg="256"

nvidia_load="YES"
vboxdrv_load="YES"
amdtemp_load="YES"
snd_hda_load="YES"

hint.p4tcc.0.disabled="1"
hint.acpi_throttle.0.disabled="1"
===

sysctl.conf:

===
debug.kdb.break_to_debugger=1
hw.snd.default_unit=2
kern.coredump=0
kern.ipc.shm_allow_removed=1
kern.ipc.somaxconn=4096
kern.maxfiles=25000
kern.maxvnodes=250000
kern.ps_arg_cache_limit=10000
kern.sched.preempt_thresh=224
machdep.kdb_on_nmi=0
machdep.panic_on_nmi=0
net.inet.icmp.log_redirect=0
net.inet6.ip6.v6only=0
net.link.ether.inet.log_arp_movements=0
vfs.hirunningspace=5242880
vfs.read_max=128
vfs.ufs.dirhash_maxmem=33554432
vfs.usermount=1
vfs.zfs.prefetch_disable=1
===

> 3. Does this happens on more than one PC i.e. are you sure hardware
> is not involved?

The first things I suspected were memory and CPU temperature.

Right now I have only one PC available to test it, but:

- Memory looks ok, at least according to Memtest86/Memtest86+ (tested
   from Ultimate Boot CD)

- CPU looks ok, meaning that it can process heavy workloads without a
   problem. I tried setting dev.cpu.0.freq=2200 to avoid overheating
   and starting 4 different poudriere builds with -J2. I have the CPU
   temperature in my prompt, and it hovers around 50C during the
   builds. Without gvfs it all works just fine. Running buildworld
   always seems to work, and running sysutils/stress (stress -v -t 5m
   --cpu 8 --io 4 --vm 2 --vm-bytes 128M --hdd 4) did not seem to
   bother the system either.

- ZFS scrub says that it's all OK on the storage side (initially I
   thought about something going wrong with ZFS due to bad tuning).
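
For anyone who wants to repeat the checks above, here is a rough sketch
of them as a script. Some things in it are assumptions: the temperature
sysctl name (dev.cpu.0.temperature, which amdtemp normally provides),
the pool name rpool (taken from vfs.root.mountfrom in loader.conf), and
the run/DRY_RUN helper, which is made up for illustration. By default it
only prints the commands; set DRY_RUN=0 to actually run them as root.

```sh
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# "run" and DRY_RUN are illustrative helpers, not part of any tool.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

hardware_checks() {
    # Cap the CPU clock to keep the temperature down (value from the message).
    run sysctl dev.cpu.0.freq=2200

    # Same sysutils/stress invocation as quoted in the message.
    run stress -v -t 5m --cpu 8 --io 4 --vm 2 --vm-bytes 128M --hdd 4

    # Assumed sysctl name for the temperature reading.
    run sysctl dev.cpu.0.temperature

    # Assumed pool name, taken from vfs.root.mountfrom.
    run zpool scrub rpool
    run zpool status -v rpool
}

hardware_checks
```

The dry-run default is there so the script can be read and checked
before it touches sysctl values or starts a scrub.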

> Can you try to build world WITH_CLANG_IS_CC? Clang generated code is
> known to produce an instant coredump in situations where gcc
> generated code hits a loop or becomes unresponsive.

I'm rebuilding r255473 with WITH_CLANG_IS_CC=yes right now (I also
removed ccache, which is the only suspicious thing I could see in my
make.conf).

I'll give it a try as soon as it's done building.
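
For reference, the rebuild amounts to roughly the following. This is a
sketch under a few assumptions: WITH_CLANG_IS_CC=yes has already been
added to /etc/src.conf, MYKERNEL is a placeholder for the real kernel
config name (not given in this message), and the run/DRY_RUN helper is
illustrative only. Passing -DNOCCACHE is one possible way to skip the
ccache override, since the make.conf block shown earlier is guarded by
.if !defined(NOCCACHE); removing that block works just as well.

```sh
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# "run" and DRY_RUN are illustrative helpers, not part of any tool.
DRY_RUN="${DRY_RUN:-1}"
KERNCONF="${KERNCONF:-MYKERNEL}"   # placeholder kernel config name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_with_clang() {
    # /etc/src.conf is assumed to already contain WITH_CLANG_IS_CC=yes.
    run cd /usr/src

    # -DNOCCACHE defines NOCCACHE, so the guarded CC/CXX override is skipped.
    run make -DNOCCACHE buildworld
    run make -DNOCCACHE buildkernel KERNCONF="$KERNCONF"
    run make -DNOCCACHE installkernel KERNCONF="$KERNCONF"

    # After rebooting into the new kernel, installworld would follow.
}

rebuild_with_clang
```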

--
jimmy

