Machine stops for some seconds with ZFS
Thomas Burgess
wonslung at gmail.com
Wed Feb 3 10:38:47 UTC 2010
Why would you use a USB drive for L2ARC?
I would think that would make things slower. Have you tried setting up
without the USB drive?
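
If it helps, here is a rough sketch of how one might test without the cache
device. It assumes the pool is `data` and the cache device is `da0`, as in the
status output quoted in this message; the `run` wrapper and the `l2arc_off`
function are only illustrative helpers, not anything ZFS provides. By default
it prints the commands instead of executing them.

```shell
#!/bin/sh
# Assumed names, taken from the quoted `zpool status`: pool "data", cache "da0".
POOL=${POOL:-data}
CACHE_DEV=${CACHE_DEV:-da0}

# Hypothetical helper: print each command by default;
# set DRY_RUN=0 to actually execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: take the L2ARC device out of the pool and watch I/O.
l2arc_off() {
    run zpool status "$POOL"               # confirm the current layout
    run zpool remove "$POOL" "$CACHE_DEV"  # detach the cache device from the live pool
    run zpool iostat -v "$POOL" 5          # per-vdev I/O every 5 seconds; Ctrl-C to stop
}

l2arc_off
```

If the pauses go away during the next buildworld, the pendrive is a likely
suspect; the device can be put back later with `zpool add data cache da0`.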
On Wed, Feb 3, 2010 at 4:48 AM, Attila Nagy <bra at fsn.hu> wrote:
> Hello,
>
> After a long time, I've switched back to ZFS on my desktop. It runs
> 8-STABLE/amd64 with two SATA disks and a USB pendrive.
> One partition from each disk is used for the zpool, which is encrypted
> using GELI, and the pendrive is there for L2ARC:
>     NAME            STATE     READ WRITE CKSUM
>     data            ONLINE       0     0     0
>       mirror        ONLINE       0     0     0
>         ad0s1d.eli  ONLINE       0     0     0
>         ad1s1d.eli  ONLINE       0     0     0
>     cache
>       da0           ONLINE       0     0     0
>
> Today, after 12 days of uptime, the machine froze. I could ping it from
> a different machine, and could even telnet to its ssh port, but I
> couldn't get the ssh banner.
>
> Now I'm building a 9-CURRENT kernel and world to see whether the same
> problem persists with that, and during the make process I've noticed a
> strange thing.
> I build with -j4 (the machine has one dual core CPU), so the fans are
> screaming during the process. But every few minutes (I couldn't recognize
> any patterns in it) the machine goes completely silent (even more silent
> than normally), and everything halts.
> During this, the top running on the machine keeps refreshing, and I can
> type in pass-through ssh connections (that is, I use the machine in question
> to access other machines with ssh), but I can't open new ssh connections to
> it, and can't start anything new (for example from an open shell).
> ping keeps running without interruption during this, and top shows the
> following:
>
> last pid: 36503; load averages: 1.59, 3.04, 3.01 up 0+00:49:53 10:32:10
> 97 processes: 1 running, 96 sleeping
> CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Mem: 218M Active, 24M Inact, 639M Wired, 40M Cache, 6208K Buf, 1022M Free
> Swap: 4096M Total, 4096M Free
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 1342 root 1 44 0 3204K 620K select 0 0:02 0.00% make
> 1424 root 1 44 0 3204K 1036K select 0 0:01 0.00% make
> 1280 root 1 44 0 12540K 1900K select 0 0:01 0.00% hald-addon-storage
> 1234 haldaemon 1 44 0 24116K 4464K select 0 0:01 0.00% hald
> 93600 root 1 44 0 3204K 1028K select 0 0:00 0.00% make
> 1260 root 1 44 0 19704K 2688K select 0 0:00 0.00% hald-addon-mouse-sy
> 15142 bra 1 44 0 9332K 2864K CPU0 0 0:00 0.00% top
> 1263 root 1 44 0 12540K 1896K cgticb 0 0:00 0.00% hald-addon-storage
> 94415 bra 1 44 0 37944K 4992K select 1 0:00 0.00% sshd
> 35837 root 1 44 0 5252K 2424K select 1 0:00 0.00% make
> 95361 bra 1 44 0 37944K 4992K select 1 0:00 0.00% sshd
> 35973 root 1 44 0 3204K 1772K select 0 0:00 0.00% make
> 608 root 1 44 0 6892K 1436K select 1 0:00 0.00% syslogd
> 96928 root 1 44 0 3204K 728K select 0 0:00 0.00% make
> 94369 root 1 51 0 37944K 4584K sbwait 0 0:00 0.00% sshd
> 82631 root 1 50 0 37944K 4584K sbwait 0 0:00 0.00% sshd
> 16304 root 1 44 0 37944K 4576K zio->i 1 0:00 0.00% sshd
> 951 _ntp 1 44 0 6876K 1692K select 0 0:00 0.00% ntpd
> 1238 root 1 76 0 16768K 2372K select 0 0:00 0.00% hald-runner
> 4916 root 1 44 0 3204K 728K select 1 0:00 0.00% make
> 95338 root 1 49 0 37944K 4584K sbwait 1 0:00 0.00% sshd
> 1259 root 1 44 0 10280K 2712K pause 1 0:00 0.00% csh
> 33357 bra 1 44 0 21596K 4004K select 0 0:00 0.00% ssh
> 16405 bra 1 44 0 37944K 5012K zio->i 0 0:00 0.00% sshd
> 1044 root 1 44 0 9104K 1796K kqread 0 0:00 0.00% master
> 34765 root 1 76 0 8260K 1764K wait 1 0:00 0.00% sh
> 82685 bra 1 44 0 37944K 4960K select 1 0:00 0.00% sshd
> 1065 postfix 1 44 0 9100K 1872K kqread 0 0:00 0.00% qmgr
> 1237 root 17 44 0 27460K 4124K waitvt 0 0:00 0.00% console-kit-daemon
> 95362 bra 1 44 0 10216K 2612K ttyin 0 0:00 0.00% bash
> 34764 root 1 44 0 3204K 852K select 0 0:00 0.00% make
> 1222 root 1 49 0 21672K 1896K wait 0 0:00 0.00% login
> 35728 root 1 44 0 3204K 860K select 0 0:00 0.00% make
> 1064 postfix 1 44 0 9104K 1772K zio->i 1 0:00 0.00% pickup
> 82696 bra 1 44 0 10216K 2596K wait 0 0:00 0.00% bash
> 94417 bra 1 44 0 10216K 2596K wait 1 0:00 0.00% bash
> 35455 root 1 44 0 3204K 744K select 0 0:00 0.00% make
> 35774 root 1 44 0 3204K 728K select 1 0:00 0.00% make
> 16409 bra 1 44 0 10216K 2592K ttyin 0 0:00 0.00% bash
> 1155 root 1 44 0 7948K 1604K nanslp 0 0:00 0.00% cron
> 1077 messagebus 1 53 0 8092K 2060K select 0 0:00 0.00% dbus-daemon
> 1149 root 1 44 0 26012K 3960K select 1 0:00 0.00% sshd
> 35729 root 1 76 0 8260K 1760K wait 0 0:00 0.00% sh
> 4921 root 1 57 0 8260K 1748K wait 0 0:00 0.00% sh
> 825 root 1 76 0 39212K 2372K lockf 1 0:00 0.00% saslauthd
> 35460 root 1 76 0 8260K 1748K wait 0 0:00 0.00% sh
> 34761 root 1 48 0 8260K 1740K wait 1 0:00 0.00% sh
> 96923 root 1 50 0 8260K 1740K wait 0 0:00 0.00% sh
>
>
> As you can see, top reports that the machine is 100% idle while a make -j4
> buildworld is running. This lasts for a few seconds (10-20), then everything
> goes back to normal: the fans start to scream, the build continues and I can
> use the machine.
> This occasional halt is new to me (I've only just switched to ZFS on my
> desktop, and on a server it's harder to notice if you don't use it for
> interactive sessions), but I have seen the final freeze on more than one
> server.
> How could I help to debug this, and the final freeze?
>
> Thanks,