[PATCH] fix integer overflow in txg_delay()
    Martin Matuska 
    mm at FreeBSD.org
       
    Tue Aug  2 06:06:20 UTC 2011
    
    
  
Dňa 2. 8. 2011 7:33, Pawel Jakub Dawidek wrote / napísal(a):
> On Mon, Aug 01, 2011 at 12:35:33AM +0200, Martin Matuska wrote:
>> The txg_delay() function in txg.c uses the following initialization:
>> int timeout = ddi_get_lbolt() + ticks;
>>
>> Later, we have:
>>         while (ddi_get_lbolt() < timeout &&
>>             tx->tx_syncing_txg < txg-1 && !txg_stalled(dp))
>>                 (void) cv_timedwait(&tx->tx_quiesce_more_cv,
>> &tx->tx_sync_lock,
>>                     timeout - ddi_get_lbolt());
>>
>> The function txg_delay() is called from:
>> dsl_pool_tempreserve_space() and dsl_dir_tempreserve_space()
>>
>> In 24.855 days ddi_get_lbolt will be never smaller than timeout.
>>
>> Please review and/or comment the attached patch.
> Looks good to me. Can you elaborate a bit on consequences of such
> overflow? How the problem manifests itself?
>
> BTW. Is this something that affects IllumOS as well? If so, it would be
> nice to share with them.
>
>> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c
>> ===================================================================
>> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c	(revision 224527)
>> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c	(working copy)
>> @@ -488,7 +488,7 @@
>>  txg_delay(dsl_pool_t *dp, uint64_t txg, int ticks)
>>  {
>>  	tx_state_t *tx = &dp->dp_tx;
>> -	int timeout = ddi_get_lbolt() + ticks;
>> +	clock_t timeout = ddi_get_lbolt() + ticks;
>>  
>>  	/* don't delay if this txg could transition to quiesing immediately */
>>  	if (tx->tx_open_txg > txg ||
The overflow causes txg_delay() not delaying txg threads anymore.
It is called from dsl_pool_tempreserve_space() in dsl_pool.c and from
dsl_dir_tempreserve_space() in dsl_dir.c from busy transaction groups:
dsl_pool.c:
/*
* If this transaction group is over 7/8ths capacity, delay
* the caller 1 clock tick. This will slow down the "fill"
* rate until the sync process can catch up with us.
*/
if (reserved && reserved > (write_limit - (write_limit >> 3)))
txg_delay(dp, tx->tx_txg, 1);
dsl_dir.c:
err = arc_tempreserve_space(lsize, tx->tx_txg);
if (err == 0) {
struct tempreserve *tr;
tr = kmem_zalloc(sizeof (struct tempreserve), KM_SLEEP);
tr->tr_size = lsize;
list_insert_tail(tr_list, tr);
err = dsl_pool_tempreserve_space(dd->dd_pool, asize, tx);
} else {
if (err == EAGAIN) {
txg_delay(dd->dd_pool, tx->tx_txg, 1);
err = ERESTART;
}
dsl_pool_memory_pressure(dd->dd_pool);
}
In my interpretation, this overflow will increase thread concurrency and
memory pressure on the pool and therefore slow down write speed.
I have filed this yesterday as Illumos bug #1313.
https://www.illumos.org/issues/1313
-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
    
    
More information about the zfs-devel
mailing list