Corrupted pmap pm_vlist - pmap_remove_pte()
Alan Cox
alc at rice.edu
Tue Apr 17 14:58:08 UTC 2012

On 4/17/2012 4:48 AM, Konstantin Belousov wrote:
> On Mon, Apr 16, 2012 at 03:08:25PM -0400, Ewart Tempest wrote:
>> In FreeBSD 6.*, we have been seeing crashes in pmap_remove_pages() that only seem to occur in scaling scenarios:
>>
>> 2564    #ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
>> 2565                    pte = vtopte(pv->pv_va);
>> 2566    #else
>> 2567                    pte = pmap_pte(pmap, pv->pv_va);
>> 2568    #endif
>> 2569                    tpte = *pte;    <===== page fault here
>>
>> The suspicion is that the pmap's pm_pvlist list is getting corrupted. To this end, I have a question on the following logic in pmap_remove_pte() (see in-line comment):
>>
>>     1533 static int
>>     1534 pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va, pd_entry_t ptepde)
>>     1535 {
>>     1536 	pt_entry_t oldpte;
>>     1537 	vm_page_t m;
>>     1538
>>     1539 	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
>>     1540 	oldpte = pte_load_clear(ptq);
>>     1541 	if (oldpte & PG_W)
>>     1542 		pmap->pm_stats.wired_count -= 1;
>>     1543 	/*
>>     1544 	 * Machines that don't support invlpg, also don't support
>>     1545 	 * PG_G.
>>     1546 	 */
>>     1547 	if (oldpte & PG_G)
>>     1548 		pmap_invalidate_page(kernel_pmap, va);
>>     1549 	pmap->pm_stats.resident_count -= 1;
>>     1550 	if (oldpte & PG_MANAGED) {
>>     1551 		m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME);
>>     1552 		if (oldpte & PG_M) {
>>     1553 #if defined(PMAP_DIAGNOSTIC)
>>     1554 			if (pmap_nw_modified((pt_entry_t) oldpte)) {
>>     1555 				printf(
>>     1556 	"pmap_remove: modified page not writable: va: 0x%lx, pte: 0x%lx\n",
>>     1557 				    va, oldpte);
>>     1558 			}
>>     1559 #endif
>>     1560 			if (pmap_track_modified(va))
>>     1561 				vm_page_dirty(m);
>>     1562 		}
>>     1563 		if (oldpte & PG_A)
>>     1564 			vm_page_flag_set(m, PG_REFERENCED);
>>     1565 		pmap_remove_entry(pmap, m, va);
>>     1566 	}
>>     1567 	return (pmap_unuse_pt(pmap, va, ptepde));    <=======
>> *** Under what circumstances is it valid to free the page but not remove it from the pmap's pm_pvlist? Even the code comment for pmap_unuse_pt() begins "After removing a page table entry ... ". ***
> It is valid to not remove pv_entry when no pv_entry exists for the mapping.
> The pv_entry is created if the page is managed, see pmap_enter() code.
> The block above the return is executed when the page is managed, or at
> least pmap thinks so.
>
> The HEAD code will panic in pmap_pvh_free() if pmap_pvh_remove() cannot
> find the pv entry for the given page and the given pmap/va.
>
>>     1568 }
>>
>> If the tail end of the above function is changed as follows:
>>
>>     1565 		pmap_remove_entry(pmap, m, va);
>>     1565.5 	      return (pmap_unuse_pt(pmap, va, ptepde));
>>     1566 	}
>>     1567 	return (0);
>>
>> Then we don't see any crashes ... but is it the right thing to do?
> It should not be. Try testing this with an unmanaged mapping, such as
> /dev/mem pages mapped into the exiting process's address space.
>
> I am too new to know about any nuances of the RELENG_6 code.
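
The test suggested above can be sketched as a small userland helper. This is only an illustration, not code from the thread: the function name touch_device_page() is made up, and it assumes that a /dev/mem mapping is entered into the pmap as an unmanaged mapping, as described above. The helper maps one page, reads a byte so that a page table entry is actually installed, and leaves the mapping in place so that it is torn down when the process exits.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map one page of a memory
 * device read-only and read one byte from it, so that a page table
 * entry for the mapping exists.  The mapping is deliberately never
 * unmapped; it is removed when the calling process exits.
 *
 * Returns 0 on success and -1 if the device cannot be opened or mapped.
 */
int
touch_device_page(const char *dev, off_t physaddr)
{
	volatile unsigned char *p;
	size_t pagesz;
	void *map;
	int fd;

	assert(dev != NULL);
	fd = open(dev, O_RDONLY);
	if (fd == -1)
		return (-1);
	pagesz = (size_t)sysconf(_SC_PAGESIZE);
	map = mmap(NULL, pagesz, PROT_READ, MAP_SHARED, fd, physaddr);
	/* The mapping stays valid after the descriptor is closed. */
	close(fd);
	if (map == MAP_FAILED)
		return (-1);
	p = map;
	(void)p[0];		/* Fault the page in. */
	return (0);
}
```

A test program would call touch_device_page("/dev/mem", 0) as root and then simply return from main(), so that the exit path has at least one such mapping to remove. The physical address 0 is an arbitrary choice for the sketch.
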
The RELENG_6 code is doing essentially the same things as newer
versions. Crashes in this specific place are usually caused by DRAM
errors.
Alan
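
The difference between the original tail of pmap_remove_pte() and the proposed change can be illustrated with a toy userland model. Everything in it is a simplification with made-up names (struct pt_page, model_enter(), model_remove_original(), model_remove_proposed()); it assumes only that each valid mapping, managed or not, holds one reference on its page table page, and that pmap_unuse_pt() is what drops that reference. It is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page table page and its reference count. */
struct pt_page {
	int	wire_count;	/* One reference per valid mapping. */
	bool	freed;		/* Set when the last reference is dropped. */
};

/* Hypothetical stand-in for one page table entry. */
struct mapping {
	bool	valid;
	bool	managed;	/* Only managed mappings have a pv entry. */
};

/* Creating any mapping takes a reference on the page table page. */
void
model_enter(struct pt_page *pt, struct mapping *m, bool managed)
{
	m->valid = true;
	m->managed = managed;
	pt->wire_count++;
}

/* Models pmap_unuse_pt(): drop a reference, free the page at zero. */
int
model_unuse_pt(struct pt_page *pt)
{
	assert(pt->wire_count > 0);
	if (--pt->wire_count == 0) {
		pt->freed = true;
		return (1);
	}
	return (0);
}

/* Original tail: the reference is dropped for every mapping. */
int
model_remove_original(struct pt_page *pt, struct mapping *m)
{
	m->valid = false;
	/* For a managed mapping the pv entry would be removed here. */
	return (model_unuse_pt(pt));
}

/* Proposed tail: the reference is dropped only for managed mappings. */
int
model_remove_proposed(struct pt_page *pt, struct mapping *m)
{
	m->valid = false;
	if (m->managed)
		return (model_unuse_pt(pt));
	return (0);
}
```

In this model, removing one managed and one unmanaged mapping with the original tail brings the count back to zero, while the proposed tail leaves one reference behind and the page table page is never released.
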