Re: close() taking minutes (ZFS related)
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Fri, 20 Jan 2023 14:26:39 UTC
On Fri, Jan 20, 2023 at 12:05:20PM +0100, Mateusz Guzik wrote:
! On 1/20/23, Peter <pmc@citylink.dinoex.sub.org> wrote:
! >
! > Does this look familiar with anybody? ("truss -dD ls -l something")
! >
! > 0.047760369 0.000017548 open("/etc/nsswitch.conf",O_RDONLY|O_CLOEXEC,0666) =
! > 4 (0x4)
! > 0.047845099 0.000011290 fstat(4,{ mode=-rw-r--r--
! > ,inode=13886,size=272,blksize=4096 }) = 0 (0x0)
! > 0.047932166 0.000034182 read(4,"#\n# nsswitch.conf(5) - name ser"...,4096) =
! > 272 (0x110)
! > 0.048007308 0.000010661 read(4,0x801848000,4096) = 0 (0x0)
! > 45.827173802 45.779056767 close(4) = 0 (0x0)
! > ^^^^^^^^^^^^
! > ...
! > 45.866272643 0.000020608 open("/etc/localtime",O_RDONLY,077) = 6 (0x6)
! > 45.866398048 0.000017731 fstat(6,{ mode=-r--r--r--
! > ,inode=41572,size=2309,blksize=4096 }) = 0 (0x0)
! > 45.866505596 0.000054084 read(6,"TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0"...,41448)
! > = 2309 (0x905)
! > 49.511709215 3.645026276 close(6) = 0 (0x0)
! > ^^^^^^^^^^^
! >
! > (The longest I've seen was more than 5 minutes)
! >
! > There is otherwise nothing wrong with the machine, only I am running
! > a postgres table merge for some 10 mio. rows. And only during that the
! > effect happens, reproducibly.
! >
!
! This is most likely stalls happening on atime update.
I agree, that is the most plausible location.
! Most likely the following will help:
! sysctl vfs.zfs.per_txg_dirty_frees_percent=30
!
! see https://github.com/openzfs/zfs/issues/13932 for more information
Thank You, but sadly that doesn't change it.
I did now revert some file-locking and async in postgres to defaults.
And I run truss from UFS, because otherwise it would already hang:
root ~# truss -dD /j/admn/bin/ls -l /j/admn/etc/
And so I get things like this:
...
0.010328158 0.000009112 fstatat(AT_FDCWD,"newsyslog.conf",{ mode=-rw-r--r-- ,inode=13822,size=1708,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
0.010367043 0.000008568 fstatat(AT_FDCWD,"mail",{ mode=drwxr-xr-x ,inode=39,size=30,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
0.010405598 0.000008903 fstatat(AT_FDCWD,"apmd.conf",{ mode=-rw-r--r-- ,inode=44969,size=1240,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
16.746787754 16.736352005 fstatat(AT_FDCWD,"libmap.conf",{ mode=-rw-r--r-- ,inode=13980,size=47,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
16.746863876 0.000012441 fstatat(AT_FDCWD,"rc.conf.d",{ mode=drwxr-xr-x ,inode=51,size=2,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
16.746908593 0.000010568 fstatat(AT_FDCWD,"autofs",{ mode=drwxr-xr-x ,inode=31,size=9,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
16.746951173 0.000010509 fstatat(AT_FDCWD,"rc.initdiskless",{ mode=-rw-r--r-- ,inode=14262,size=13680,blksize=13824 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
...
Is there an atime involved? (I tried to switch them of, but it seems
that requires a remount to become effective)
and 'ps' gives me the usual "zfs is busy waiting":
UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
0 90738 90737 8 23 0 13396 3156 tx->tx_s DX+ 4 0:00.00 /j/admn/bin/ls -l /j/admn/etc/
# sysctl vfs.zfs.txg.timeout=1
vfs.zfs.txg.timeout: 15 -> 1
Doesn't change things noticeably.
Do You have an idea where to look further? (dtrace should be working,
but I don't know where to look)