rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections
Andreas Longwitz
longwitz at incore.de
Thu Jan 24 22:09:41 UTC 2019
>> I think the problem is the cmpxchg8b instruction used in
>> counter_u64_fetch(), because this machine instruction always writes to
>> memory, also when we only want to read and have (EDX:EAX) = (ECX:EBX):
>>
>> TEMP64 <- DEST
>> IF (EDX:EAX = TEMP64)
>> THEN
>> ZF <- 1
>> DEST <- ECX:EBX
>> ELSE
>> ZF <- 0
>> EDX:EAX <- TEMP64
>> DEST <- TEMP64
>> FI
>>
>> If one CPU increments the counter in pf_create_state() and another does
>> the fetch, then both CPU's may run the xmpxschg8b at once with the
>> chance that both read the same memory value in TEMP64 and the fetching
>> CPU is the second CPU that writes and so the increment is lossed. Thats
>> what I see without the above patch two or three times a week.
>
> Please try the following patch. The idea is to make the value to compare
> with unlikely to be equal to the memory content, for fetch_one().
During my research I first had the same idea, but it did not work. In
the actual coding eax/edx is not well defined before cmpxchg8b is
executed, but it does not help for the problem to do so.
> Also it fixes a silly bug in zero_one().
>
> diff --git a/sys/i386/include/counter.h b/sys/i386/include/counter.h
> index 7fd26d2a960..aa20831ba18 100644
> --- a/sys/i386/include/counter.h
> +++ b/sys/i386/include/counter.h
> @@ -78,6 +78,9 @@ counter_u64_read_one_8b(uint64_t *p)
> uint32_t res_lo, res_high;
>
> __asm __volatile(
> + "movl (%0),%%eax\n\t"
> + "movl 4(%0),%%edx\n\t"
> + "addl $0x10000000,%%edx\n\t" /* avoid write */
> "movl %%eax,%%ebx\n\t"
> "movl %%edx,%%ecx\n\t"
> "cmpxchg8b (%2)"
We can not avoid the write done by cmpxchg8b as can be seen from the
microcode given above, we always end up with "DEST <- TEMP". From the
Intel instruction reference manual:
The destination operand is written back if the comparision fails. (The
processor never produces a locked read without also producing a locked
write).
Maybe it is enough to prefix the cmpxchg8b with LOCK only in function
counter_u64_read_one_8b().
> @@ -120,11 +123,11 @@ counter_u64_zero_one_8b(uint64_t *p)
> {
> __asm __volatile(
> +"\n1:\n\t"
> "movl (%0),%%eax\n\t"
> - "movl 4(%0),%%edx\n"
> + "movl 4(%0),%%edx\n\t"
> "xorl %%ebx,%%ebx\n\t"
> "xorl %%ecx,%%ecx\n\t"
> -"1:\n\t"
> "cmpxchg8b (%0)\n\t"
> "jnz 1b"
> :
If jnz jumps back the instruction cmpxchg8b has load registers eax/edx
with (%0), therefor I do not understand the silly bug.
Andreas
More information about the freebsd-pf
mailing list