FreeBSD 11.x grinds to a halt after about 48h of uptime
Hans Petter Selasky
hps at selasky.org
Sat Oct 15 16:21:28 UTC 2016
On 10/15/16 18:18, Ulrich Spörlein wrote:
> Hey all, while 11.x is -STABLE now, this has been happening to my machine
> ever since I upgraded it to 11-CURRENT years ago. I have no idea when this
> started, actually, but what always happens is this:
>
> - System and X11 are up and running, I keep it running overnight as I'm
> too lazy to reboot and restart everything.
> - There's a bunch of xterms, Chrome, Clementine-Player and some other
> programs running
> - Coming back to the machine the next day (or the day after) it will
> exit the screensaver just fine and then either I can use it for a couple
> of seconds before it freezes, or it's pretty much dead already. The
> mouse cursor still moves for a bit, but then also freezes (so is this a
> GPU problem??)
>
> Now what I currently see on the screen is a clock widget stuck at 18:04
> but conky itself has last updated at 18:00:18 ...
>
> This time I had some SSH sessions from another machine to see some more
> useful things. There was nothing in various logs under /var/log (I also
> can't run dmesg anymore ...)
> I had top(1) running in a loop, this is the last output:
>
> last pid: 25633; load averages: 0.27, 0.39, 0.36 up 1+23:03:28 18:00:12
> 202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting
>
> Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
> ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012M Other
> Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse
>
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 11 root 8 155 ki31 0K 128K CPU0 0 364.6H 772.95% idle
> 3122 uqs 15 28 0 7113M 5861M uwait 0 94:44 13.96% chrome
> 2887 uqs 28 22 0 1394M 237M select 2 172:53 6.98% chrome
> 2890 uqs 11 21 0 1034M 178M select 5 231:21 1.95% chrome
> 1062 root 9 21 0 440M 47220K select 0 67:09 0.98% Xorg
> 3002 uqs 15 25 5 1159M 172M uwait 2 19:09 0.00% chrome
> 3139 uqs 17 25 5 1163M 156M uwait 2 16:15 0.00% chrome
> 3001 uqs 18 25 5 1639M 575M uwait 0 16:05 0.00% chrome
> 12 root 24 -64 - 0K 384K WAIT -1 10:53 0.00% intr
> 3129 uqs 12 20 0 2820M 1746M uwait 6 8:36 0.00% chrome
> 2822 uqs 9 20 0 217M 81300K select 0 5:10 0.00% conky
> 3174 root 1 20 0 21532K 3188K select 0 4:20 0.00% systat
> 3130 uqs 16 20 0 1058M 131M uwait 4 3:03 0.00% chrome
> 2998 uqs 16 20 0 1110M 123M uwait 2 2:53 0.00% chrome
> 3165 uqs 10 20 0 1209M 215M uwait 6 2:52 0.00% chrome
> 3142 uqs 11 25 5 1344M 195M uwait 2 2:46 0.00% chrome
> 2876 uqs 19 20 0 580M 37164K select 3 2:42 0.00% clementine-player
> 20 root 2 -16 - 0K 32K psleep 6 2:25 0.00% pagedaemon
>
> I also had systat -vm running and it continued to update its screen ...
> for a short while, this is the last update before SSH died:
>
>
> Mem usage: 0k%Phy 5%Kmem
> Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
> Tot Share Tot Share Free in out in out
> Act 11051k 67868 71051992 255448 61840 count
> All 11051k 67924 71058776 262100 pages
> Proc: Interrupts
> r p d s w Csw Trp Sys Int Sof Flt ioflt 224 total
> 25 730 11 724 109 404 101 13 cow 2 ehci0 16
> zfod 3 ehci1 23
> 0.0%Sys 0.1%Intr 0.0%User 0.0%Nice 99.9%Idle ozfod 16 cpu0:timer
> | | | | | | | | | | %ozfod xhci0 264
> daefr 3 em0 265
> 50 dtbuf prcfr 94 hdac1 266
> Namei Name-cache Dir-cache 349167 desvn totfr ahci0 270
> Calls hits % hits % 349155 numvn react 5 cpu1:timer
> 121 121 100 253501 frevn pdwak 1 cpu2:timer
> pdpgs 29 cpu7:timer
> Disks md0 ada0 ada1 pass0 pass1 pass2 intrn 12 cpu3:timer
> KB/t 0.00 0.00 0.00 0.00 0.00 0.00 5318892 wire 41 cpu6:timer
> tps 0 0 0 0 0 0 9261404 act 12 cpu5:timer
> MB/s 0.00 0.00 0.00 0.00 0.00 0.00 1598184 inact 6 cpu4:timer
> %busy 0 0 0 0 0 0 cache vgapci0
> 61840 free
> 712304 buf
>
>
> Why do I have a Chrome tab using about 6G? What other sort of debugging
> output can be helpful to get to the bottom of this? The machine still
> responds to pings just fine, TCP connections get set up but the SSH
> handshake never completes.
>
> This always happens after 30-50h of uptime and is super annoying and has
> been going on for >1 year. Help?
>
> Note, I cut the power to the monitor overnight to save electricity. Can
> this mess up something in the Radeon card or X server? What combinations
> would be most useful to try next?
>
Hi,
Sounds like a memory leak. Can you track the memory use over time?
Did you look at the output from:
vmstat -m ?
--HPS
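
One way to track kernel memory use over time, as suggested above, is to save
periodic `vmstat -m` snapshots and compare them. The following is only a
rough sketch and not part of the original reply: the function names
(parse_vmstat_m, growth, snapshot) are made up for illustration, and the
regular expression assumes the FreeBSD 11 row layout of
"Type InUse MemUse HighUse Requests Size(s)" with MemUse printed as a number
followed by "K". Adjust it if your output looks different.

```python
import re
import subprocess

# Assumed row layout of `vmstat -m` on FreeBSD 11:
#   Type  InUse  MemUse(K)  HighUse  Requests  Size(s)
# Type names may contain spaces, so the name is matched lazily.
ROW = re.compile(
    r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<mem>\d+)K\s+\S+\s+(?P<req>\d+)"
)


def parse_vmstat_m(text):
    """Map each malloc type to its MemUse in KiB; other lines are skipped."""
    usage = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            usage[m.group("type")] = int(m.group("mem"))
    return usage


def growth(before, after):
    """Return (type, delta_kib) pairs for types that grew, largest first."""
    deltas = [(t, kib - before.get(t, 0)) for t, kib in after.items()]
    grown = [d for d in deltas if d[1] > 0]
    return sorted(grown, key=lambda d: d[1], reverse=True)


def snapshot():
    """Run `vmstat -m` and parse it (needs a system that provides it)."""
    out = subprocess.run(
        ["vmstat", "-m"], capture_output=True, text=True, check=True
    ).stdout
    return parse_vmstat_m(out)
```

Taking one snapshot() shortly after boot and another a day later, then
printing growth(first, second), should show which malloc types keep growing.
Since the machine eventually stops responding, writing each snapshot to disk
as it is taken is probably safer than keeping them only in memory.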
More information about the freebsd-current
mailing list