Corrupted pmap pm_pvlist - pmap_remove_pte()

Ewart Tempest etempest at juniper.net
Mon Apr 16 19:12:08 UTC 2012


In FreeBSD 6.*, we have been seeing crashes in pmap_remove_pages() that seem to occur only in scaling scenarios:

#ifdef PMAP_REMOVE_PAGES_CURPROC_ONLY
		pte = vtopte(pv->pv_va);
#else
		pte = pmap_pte(pmap, pv->pv_va);
#endif
		tpte = *pte;	/* <=== page fault here */

The suspicion is that the pmap's pm_pvlist list is getting corrupted. To this end, I have a question on the following logic in pmap_remove_pte() (see in-line comment):

   1533 static int
   1534 pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va, pd_entry_t ptepde)
   1535 {
   1536 	pt_entry_t oldpte;
   1537 	vm_page_t m;
   1538 
   1539 	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
   1540 	oldpte = pte_load_clear(ptq);
   1541 	if (oldpte & PG_W)
   1542 		pmap->pm_stats.wired_count -= 1;
   1543 	/*
   1544 	 * Machines that don't support invlpg, also don't support
   1545 	 * PG_G.
   1546 	 */
   1547 	if (oldpte & PG_G)
   1548 		pmap_invalidate_page(kernel_pmap, va);
   1549 	pmap->pm_stats.resident_count -= 1;
   1550 	if (oldpte & PG_MANAGED) {
   1551 		m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME);
   1552 		if (oldpte & PG_M) {
   1553 #if defined(PMAP_DIAGNOSTIC)
   1554 			if (pmap_nw_modified((pt_entry_t) oldpte)) {
   1555 				printf(
   1556 	"pmap_remove: modified page not writable: va: 0x%lx, pte: 0x%lx\n",
   1557 				    va, oldpte);
   1558 			}
   1559 #endif
   1560 			if (pmap_track_modified(va))
   1561 				vm_page_dirty(m);
   1562 		}
   1563 		if (oldpte & PG_A)
   1564 			vm_page_flag_set(m, PG_REFERENCED);
   1565 		pmap_remove_entry(pmap, m, va);
   1566 	}
   1567 	return (pmap_unuse_pt(pmap, va, ptepde)); <======= *** Under what circumstances is it valid to free the page but not remove it from the pmap's pm_pvlist? Even the code comment for pmap_unuse_pt() begins "After removing a page table entry ... ". ***
   1568 }

If the tail end of the above function is changed as follows:

   1565 		pmap_remove_entry(pmap, m, va);
   1565.5 		return (pmap_unuse_pt(pmap, va, ptepde));
   1566 	}
   1567 	return (0);

Then we don't see any crashes ... but is it the right thing to do?
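
To make the behavioral difference between the two variants concrete, here is a minimal user-space toy model. It is not kernel code: every toy_* name and the TOY_PG_MANAGED value are hypothetical stand-ins. It also rests on an assumption about the real code, namely that pmap_unuse_pt() drops a reference on the page table page that holds the PTE (not on the mapped page), and that only PG_MANAGED mappings have an entry on pm_pvlist.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the PG_MANAGED bit; the value is illustrative. */
#define TOY_PG_MANAGED 0x1u

/* Toy page table page: counts the valid PTEs it still holds. */
struct toy_ptpage {
	int wire_count;
	int freed;		/* set once wire_count reaches zero */
};

/* Toy pmap: only tracks how many pv entries it owns. */
struct toy_pmap {
	int pv_entries;		/* stand-in for the length of pm_pvlist */
};

/* Assumed behavior of pmap_unuse_pt(): drop one page table page reference. */
static int
toy_unuse_pt(struct toy_ptpage *pt)
{
	if (--pt->wire_count == 0) {
		pt->freed = 1;
		return (1);
	}
	return (0);
}

/* Original tail: the reference is dropped for every removed PTE. */
static int
toy_remove_pte_original(struct toy_pmap *pm, struct toy_ptpage *pt,
    unsigned pte)
{
	if (pte & TOY_PG_MANAGED)
		pm->pv_entries--;	/* stand-in for pmap_remove_entry() */
	return (toy_unuse_pt(pt));
}

/* Modified tail: the reference is dropped only for managed PTEs. */
static int
toy_remove_pte_modified(struct toy_pmap *pm, struct toy_ptpage *pt,
    unsigned pte)
{
	if (pte & TOY_PG_MANAGED) {
		pm->pv_entries--;
		return (toy_unuse_pt(pt));
	}
	return (0);
}

/* Print the state of one toy pmap / page table page pair. */
static void
toy_report(const char *label, const struct toy_pmap *pm,
    const struct toy_ptpage *pt)
{
	printf("%s: pv_entries=%d wire_count=%d freed=%d\n",
	    label, pm->pv_entries, pt->wire_count, pt->freed);
}
```

In this model, removing one managed and one unmanaged PTE from a page table page that starts with wire_count 2 leaves the count at 0 with the original tail, but at 1 with the modified tail. Both variants handle the pv list identically. If the assumption above holds for the real code, the change would alter page table page accounting for unmanaged mappings rather than pm_pvlist handling, which may be worth checking before relying on it.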

Ewart

