Single-threaded bottleneck in geli
Jan Bramkamp
crest at rlwinm.de
Mon Jul 6 18:21:46 UTC 2020
On 03.07.20 21:30, Alan Somers wrote:
> I'm using geli, gmultipath, and ZFS on a large system, with hundreds of
> drives. What I'm seeing is that under at least some workloads, the overall
> performance is limited by the single geom kernel process. procstat and
> kgdb aren't much help in telling exactly why this process is using so much
> CPU, but it certainly must be related to the fact that over 15,000 IOPs are
> going through that thread. What can I do to improve this situation? Would
> it make sense to enable direct dispatch for geli? That would hurt
> single-threaded performance, but probably improve performance for highly
> multithreaded workloads like mine.
>
> Example top output:
>   PID USERNAME PRI NICE SIZE  RES STATE  C   TIME   WCPU COMMAND
>    13 root      -8    -   0B  96K CPU46 46  82.7H 70.54% geom{g_down}
>    13 root      -8    -   0B  96K -      9  35.5H 25.32% geom{g_up}
>
> -Alan
The problem isn't GELI. The problem is that gmultipath lacks direct
dispatch support. About one and a half years ago I ran into the same
problem. Because I needed the performance, I looked at what gmultipath
did and found no reason why it has to run in the GEOM up and down
threads, so I patched in the flags claiming direct dispatch support.
That improved my read performance from 2.2GB/s to 3.4GB/s and my write
performance from 750MB/s to 1.5GB/s. The system worked for a few days
under high load (it saturated a 2 x 10Gb/s lagg(4) as a read-only
WebDAV server while receiving uploads via SFTP). It worked until I
attempted to shut down the system: it hung on shutdown and never
powered off. I had to power cycle the box via IPMI to recover. I never
found the time to debug this problem.
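
For readers unfamiliar with what "claiming direct dispatch support"
means, here is a rough userspace sketch of the idea. It is not the
patch described above, which is not included in this message. The flag
names follow those in sys/geom/geom.h as best recalled, and the values
are only illustrative. The model_* structs and functions are
hypothetical stand-ins. The real kernel check also looks at further
conditions (calling thread, unmapped I/O, stack usage) that are left
out here.

```c
/*
 * Userspace model of the GEOM direct-dispatch decision. Not kernel code.
 * Flag names mirror sys/geom/geom.h; the numeric values are illustrative.
 * All model_* names are hypothetical and exist only for this sketch.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define G_PF_DIRECT_SEND    0x10u  /* provider may complete I/O directly */
#define G_PF_DIRECT_RECEIVE 0x20u  /* provider may accept requests directly */
#define G_CF_DIRECT_SEND    0x10u  /* consumer may issue requests directly */
#define G_CF_DIRECT_RECEIVE 0x20u  /* consumer may accept completions directly */

struct model_provider {
	unsigned flags;
};

struct model_consumer {
	unsigned flags;
	struct model_provider *provider;  /* provider this consumer is attached to */
};

/* Downward path: can a request skip the g_down thread on this edge? */
bool
model_request_is_direct(const struct model_consumer *cp)
{
	assert(cp != NULL && cp->provider != NULL);
	return ((cp->flags & G_CF_DIRECT_SEND) != 0 &&
	    (cp->provider->flags & G_PF_DIRECT_RECEIVE) != 0);
}

/* Upward path: can a completion skip the g_up thread on this edge? */
bool
model_completion_is_direct(const struct model_consumer *cp)
{
	assert(cp != NULL && cp->provider != NULL);
	return ((cp->provider->flags & G_PF_DIRECT_SEND) != 0 &&
	    (cp->flags & G_CF_DIRECT_RECEIVE) != 0);
}

/* What a class does for the provider it exports to claim support. */
void
model_provider_claim(struct model_provider *pp)
{
	pp->flags |= G_PF_DIRECT_SEND | G_PF_DIRECT_RECEIVE;
}

/* What a class does for each consumer it attaches to claim support. */
void
model_consumer_claim(struct model_consumer *cp)
{
	cp->flags |= G_CF_DIRECT_SEND | G_CF_DIRECT_RECEIVE;
}
```

In this model both ends of an edge have to claim support before I/O
bypasses the queue threads on that edge. That would be consistent with
a single class in the middle of a stack, one that claims nothing,
pushing all traffic back through g_down and g_up.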