[Bug 240047] more and more processes get stuck waiting for ufs and zfs until system is rendered inaccessible

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Aug 22 21:41:28 UTC 2019


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240047

            Bug ID: 240047
           Summary: more and more processes get stuck waiting for ufs and
                    zfs until system is rendered inaccessible
           Product: Base System
           Version: 12.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs at FreeBSD.org
          Reporter: fuz at fuz.su

I'm at a conference, running an open FTP server.  Files are served by FTP via
ftpd(8), NFS via nfsd(8), and HTTP via Apache 2.4.  The server has its root on
UFS and the remaining files spread over three ZFS pools, one of which is
currently replacing a (working) disk:

$ zpool list -v
NAME                                     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
disk12                                  18.1T  14.0T  4.16T        -         -     3%    77%  1.00x  ONLINE  -
  da3                                   9.06T  6.98T  2.08T        -         -     3%    77%
  diskid/DISK-7JG9E40C%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  6.98T  2.08T        -         -     3%    77%
cache                                       -      -      -         -      -      -
  ada0p2                                 170G  3.98G   166G        -         -     0%     2%
disk34                                  18.1T  14.8T  3.33T        -         -     4%    81%  1.00x  ONLINE  -
  da2                                   9.06T  7.39T  1.67T        -         -     4%    81%
  da1                                   9.06T  7.41T  1.66T        -         -     4%    81%
cache                                       -      -      -         -      -      -
  ada0p5                                 170G  5.14G   165G        -         -     0%     3%
disk56                                  18.1T  14.0T  4.15T        -         -     1%    77%  1.00x  ONLINE  -
  replacing                             9.06T  6.97T  2.10T        -         -     1%    76%
    da0                                     -      -      -        -         -      -      -
    da4                                     -      -      -        -         -      -      -
  diskid/DISK-7PGVBGZC%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  7.01T  2.06T        -         -     1%    77%
cache                                       -      -      -         -      -      -
  ada0p6                                 170G  6.03G   164G        -         -     0%     3%

$ df -h
Filesystem         Size    Used   Avail Capacity  Mounted on
/dev/ada0p4        375G     68G    278G    20%    /
devfs              1.0K    1.0K      0B   100%    /dev
tmpfs               33G     76K     33G     0%    /var/run
tmpfs               33G    4.0K     33G     0%    /tmp
tmpfs               33G    156K     33G     0%    /var/log
fdescfs            1.0K    1.0K      0B   100%    /dev/fd
procfs             4.0K    4.0K      0B   100%    /proc
disk12              18T     14T    3.6T    80%    /disk12
disk34              17T     14T    2.8T    83%    /disk34
disk56              18T     14T    3.6T    80%    /disk56
disk34/zeug        3.6T    864G    2.8T    23%    /usr/home/fuz/zeug
<above>:/disk12     18T     14T    3.6T    80%    /export
<above>:/disk34     35T     32T    2.8T    92%    /export
<above>:/disk56     52T     49T    3.6T    93%    /export

Files are served over a 10 GbE connection with an average bandwidth of around
200 MB/s; the limit seems to be the number of I/O operations per second (IOPS):

$ zpool iostat
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
disk12      14.0T  4.16T    254      0  34.8M  6.16K
disk34      14.8T  3.33T    261     29  35.0M  1.20M
disk56      14.0T  4.15T    882     29   118M   191K
----------  -----  -----  -----  -----  -----  -----
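
As a rough cross-check of the table above, the read columns can be summed across
pools to get total read operations, total read bandwidth, and the mean size of
one read.  The following is only a sketch: the function name
iostat_read_summary is hypothetical, and it assumes the K/M/G suffixes printed
by zpool iostat are powers of 1024.

```shell
# Hypothetical helper: read `zpool iostat` output on stdin, sum the read
# operations and read bandwidth over all pool lines, and print the totals
# together with the mean size of a single read.
iostat_read_summary() {
    awk '
        # Convert a size such as 34.8M or 6.16K to bytes.
        # Assumption: suffixes are powers of 1024.
        function to_bytes(s,    n, u) {
            n = s + 0
            u = substr(s, length(s), 1)
            if (u == "K") return n * 1024
            if (u == "M") return n * 1024 * 1024
            if (u == "G") return n * 1024 * 1024 * 1024
            return n
        }
        # Pool lines have 7 fields and a purely numeric read-ops column;
        # this skips the header and separator lines.
        NF == 7 && $4 ~ /^[0-9]+$/ { ops += $4; bytes += to_bytes($6) }
        END {
            if (ops > 0)
                printf "%d read ops/s, %.1f MiB/s, %.0f KiB per read\n",
                    ops, bytes / 1048576, bytes / ops / 1024
        }
    '
}
```

It would be used as `zpool iostat | iostat_read_summary`.  Note that zpool
iostat without an interval argument reports averages since boot, so the result
describes the long-run load rather than the moment the processes get stuck.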

RAM is about half used and nothing seems to indicate any resource exhaustion.

$ vmstat
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr ad0 da0   in    sy    cs us sy id
0 0 0 1.0T  666M   451 1197 436   0 64834 14532   0   0 28631 18084 93822  0 17 83

The only sysctl set is kern.racct.enable=1



After a while, more and more httpd and ftpd processes get stuck in a ufs or
zfs wait state.  They cannot be killed.  I have since rebooted the server
several times and the problem keeps reappearing.
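
To quantify how many processes are stuck and on which wait channel, the output
of ps(1) can be tallied per command and wait channel.  This is a minimal sketch,
not something that was run on the affected machine: the function name
stuck_by_wchan is hypothetical, and it assumes command names contain no spaces
and that stuck processes show a state beginning with D (uninterruptible wait).

```shell
# Hypothetical helper: read `ps -axo state,wchan,comm` output on stdin and
# count processes in uninterruptible wait (state starting with D), grouped
# by command name and wait channel, most frequent first.
stuck_by_wchan() {
    awk 'NR > 1 && $1 ~ /^D/ { n[$3 " " $2]++ }     # skip the header line
         END { for (k in n) print n[k], k }' |
        sort -k1,1nr -k2
}
```

It would be used as `ps -axo state,wchan,comm | stuck_by_wchan`, run
periodically to see whether the ufs and zfs counts grow over time.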
