b_freelist TAILQ/SLIST
Alexander Motin
mav at FreeBSD.org
Fri Jun 28 08:57:19 UTC 2013
On 28.06.2013 09:57, Konstantin Belousov wrote:
> On Fri, Jun 28, 2013 at 12:26:44AM +0300, Alexander Motin wrote:
>> While doing some profiles of GEOM/CAM IOPS scalability, on some test
>> patterns I've noticed serious congestion with spinning on global
>> pbuf_mtx mutex inside getpbuf() and relpbuf(). Since that code is
>> already very simple, I've tried to optimize probably the only thing
>> possible there: switch bswlist from TAILQ to SLIST. As far as I can see,
>> the b_freelist field of struct buf is really used as a TAILQ in some other
>> places, so I've just added another SLIST_ENTRY field. And the result
>> turned out to be surprising -- I can no longer reproduce the issue at all.
>> Maybe it was just unlucky synchronization in a specific test, but I've
>> seen it on two different systems and rechecked the results with/without
>> the patch three times.
> This is too unbelievable.
I understand that it looks like magic. I was very surprised to see
contention there at all, but `pmcstat -n 10000000 -TS unhalted-cycles`
shows it too often and too repeatably:
PMC: [CPU_CLK_UNHALTED_CORE] Samples: 28052 (100.0%) , 12 unresolved

%SAMP IMAGE      FUNCTION             CALLERS
 46.4 kernel     __mtx_lock_sleep     relpbuf:22.3 getpbuf:22.0 xpt_run_devq:0.8
 13.3 kernel     _mtx_lock_spin_cooki turnstile_trywait
  4.3 kernel     cpu_search_lowest    cpu_search_lowest
  2.3 kernel     getpbuf              physio

Benchmark results confirm it.
> Could it in fact be, e.g., some cache line conflicts
> that cause the thrashing? Does it help if you add void *b_pad
> before b_freelist instead of adding b_freeslist?
No, this doesn't help. And previously I also tested it with
b_freeslist in place but without the other changes -- it didn't help either.
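
For readers who want to see what kind of operation is being discussed, here is a minimal userspace sketch of a mutex-protected free pool built on SLIST. It is not the patch and not kernel code: struct xbuf, pool, pool_mtx, pool_get() and pool_release() are hypothetical names, a pthread mutex stands in for pbuf_mtx, and unlike a blocking allocator this sketch simply returns NULL when the pool is empty.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct buf; only the free-list linkage matters. */
struct xbuf {
	int id;
	SLIST_ENTRY(xbuf) b_freeslist;	/* single "next" pointer */
};

SLIST_HEAD(xbuf_head, xbuf);

/* Hypothetical global pool, playing the role of bswlist in this sketch. */
static struct xbuf_head pool = SLIST_HEAD_INITIALIZER(pool);
static pthread_mutex_t pool_mtx = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return a buffer to the pool. SLIST_INSERT_HEAD touches only the new
 * element and the list head, so the critical section stays short.
 */
static void
pool_release(struct xbuf *bp)
{
	pthread_mutex_lock(&pool_mtx);
	SLIST_INSERT_HEAD(&pool, bp, b_freeslist);
	pthread_mutex_unlock(&pool_mtx);
}

/* Take a buffer from the pool; returns NULL if the pool is empty. */
static struct xbuf *
pool_get(void)
{
	struct xbuf *bp;

	pthread_mutex_lock(&pool_mtx);
	bp = SLIST_FIRST(&pool);
	if (bp != NULL)
		SLIST_REMOVE_HEAD(&pool, b_freeslist);
	pthread_mutex_unlock(&pool_mtx);
	return (bp);
}
```

With a TAILQ, the same two functions would also have to update the back-pointers of the neighbouring element and the head's tail pointer while holding the lock; whether that difference alone explains the measurements above is exactly what is in question here.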
>> The present patch is here:
>> http://people.freebsd.org/~mav/buf_slist.patch
>>
>> The question is how best to do it. What is the KPI/KBI policy for
>> struct buf? I could replace b_freelist by a union and keep KBI, but
>> partially break KPI. Or I could add another field, probably breaking
>> KBI, but keeping KPI. Or I could do something handmade with no breakage.
>> Or is this change just a bad idea?
> The same question about using a union for b_freelist/b_freeslist: is the
> effect of magically fixing the contention still there if b_freeslist
> is at the same offset as b_freelist?
Yes, it is.
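
To make the "same offset" variant concrete, here is a small layout sketch, assuming a C11 compiler and a userspace <sys/queue.h>. The structs tbuf and ubuf and the fields b_before and b_after are hypothetical placeholders, not the real struct buf and not the contents of either patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical baseline: only the TAILQ linkage, as before the change. */
struct tbuf {
	long b_before;			/* placeholder for preceding fields */
	TAILQ_ENTRY(tbuf) b_freelist;	/* two pointers: next and prev */
	long b_after;			/* placeholder for following fields */
};

/* Hypothetical union variant: both linkages share the same storage. */
struct ubuf {
	long b_before;
	union {
		TAILQ_ENTRY(ubuf) b_freelist;	/* two pointers */
		SLIST_ENTRY(ubuf) b_freeslist;	/* one pointer */
	};
	long b_after;
};

/* Both linkage fields start at the same offset inside the union. */
_Static_assert(offsetof(struct ubuf, b_freelist) ==
    offsetof(struct ubuf, b_freeslist), "linkage fields must overlap");

/* The union is as large as its biggest member, so the layout is unchanged. */
_Static_assert(sizeof(struct ubuf) == sizeof(struct tbuf),
    "union must not change the structure size");
_Static_assert(offsetof(struct ubuf, b_after) ==
    offsetof(struct tbuf, b_after), "later fields must not move");
```

A buffer in such a layout can be linked on only one of the two lists at a time, since both entries occupy the same bytes.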
> There is no K{B,P}I policy for struct buf in HEAD, just change it as
> you see fit.
Which one would you prefer, the original or
http://people.freebsd.org/~mav/buf_slist2.patch ?
Thank you.
--
Alexander Motin