git: f4c6843ec2b9 - main - xen: use correct cache attributes for Xen specific memory regions

Thu Aug 12 21:46:26 UTC 2021

On 8/12/21 10:01 AM, Roger Pau Monné wrote:
> On Thu, Aug 12, 2021 at 09:43:57AM -0700, John Baldwin wrote:
>> On 8/12/21 8:53 AM, Roger Pau Monné wrote:
>>> On Thu, Aug 12, 2021 at 08:20:51AM -0700, John Baldwin wrote:
>>>> On 8/12/21 12:24 AM, Roger Pau Monné wrote:
>>>>> The branch main has been updated by royger:
>>>>>
>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=f4c6843ec2b9aa5eff475778fb000ed6278c5b77
>>>>>
>>>>> commit f4c6843ec2b9aa5eff475778fb000ed6278c5b77
>>>>> Author:     Roger Pau Monné <royger at FreeBSD.org>
>>>>> AuthorDate: 2021-04-09 09:31:44 +0000
>>>>> Commit:     Roger Pau Monné <royger at FreeBSD.org>
>>>>> CommitDate: 2021-08-12 07:18:32 +0000
>>>>>
>>>>>        xen: use correct cache attributes for Xen specific memory regions
>>>>>        bus_activate_resource maps memory regions as uncacheable on x86, which
>>>>>        is more strict than required for regions allocated using xenmem_alloc,
>>>>>        so don't rely on bus_activate_resource and instead map the region
>>>>>        using pmap_mapdev_attr and VM_MEMATTR_XEN as the cache attribute.
>>>>>        Sponsored by: Citrix Systems R&D
>>>>
>>>> It would probably be cleaner to use bus_map_resource() for this instead.  It
>>>> would mean you would have to use a structure that writes to as the argument
>>>> to bus_read/write_* instead of using the resource directly.
>>>
>>> Those regions are usually handled to other subsystems of the kernel.
>>> They are mostly used to map memory from other domains and then perform
>>> IO on their behalf (like blkback and netback do), so it's not really
>>> possible to assert that all users of the regions would use
>>> bus_read/write_* to access them.
>>>
>>> I could however switch to using bus_map_resource if I can pass the
>>> desired memory cache attribute and get a linear address back. It looks
>>> like resource_map_request parameter allows to select the cache
>>> attribute.
>>
>> Yes, one of the use case for bus_map_resource is to permit passing a
>> non-default memory attribute.  (It also permits mapping sub-ranges, but
>> that isn't applicable in this case.)  The 'r_vaddr' in the resource_map
>> is the equivalent of rman_get_addr.  It'd be a little cleaner (and
>> perhaps friendly to future changes in this area) if your clients used
>> the 'r_vaddr' from the resource_map directly instead of using
>> rman_set_virtual() on the original resource.  (At some point I would like
>> to have each 'struct resource' keep a list of all the mappings created
>> from it and to forcefully invalidate/free any existing mappings that
>> still remain when a 'struct resource' is freed.  I imagine as part of
>> that I might end up with some assertions about the embedded resource
>> fields in the 'struct resource' being self-consistent).
> 
> Does the following diff seem better:
> 
> diff --git a/sys/dev/xen/bus/xenpv.c b/sys/dev/xen/bus/xenpv.c
> index 91004039a85e..748ead7f2f41 100644
> --- a/sys/dev/xen/bus/xenpv.c
> +++ b/sys/dev/xen/bus/xenpv.c
> @@ -120,7 +120,8 @@ xenpv_alloc_physmem(device_t dev, device_t child, int *res_id, size_t size)
>   {
>   	struct resource *res;
>   	vm_paddr_t phys_addr;
> -	void *virt_addr;
> +	struct resource_map_request margs;
> +	struct resource_map map;
>   	int error;
>   
>   	res = bus_alloc_resource(child, SYS_RES_MEMORY, res_id, LOW_MEM_LIMIT,
> @@ -135,9 +136,16 @@ xenpv_alloc_physmem(device_t dev, device_t child, int *res_id, size_t size)
>   		bus_release_resource(child, SYS_RES_MEMORY, *res_id, res);
>   		return (NULL);
>   	}
> -	virt_addr = pmap_mapdev_attr(phys_addr, size, VM_MEMATTR_XEN);
> -	KASSERT(virt_addr != NULL, ("Failed to create linear mappings"));
> -	rman_set_virtual(res, virt_addr);
> +
> +	resource_init_map_request(&margs);
> +	margs.memattr = VM_MEMATTR_XEN;
> +	error = bus_map_resource(child, SYS_RES_MEMORY, res, &margs, &map);
> +	if (error != 0) {
> +		vm_phys_fictitious_unreg_range(phys_addr, phys_addr + size);
> +		bus_release_resource(child, SYS_RES_MEMORY, *res_id, res);
> +		return (NULL);
> +	}
> +	rman_set_mapping(res, &map);
>   
>   	return (res);
>   }
> @@ -146,14 +154,14 @@ static int
>   xenpv_free_physmem(device_t dev, device_t child, int res_id, struct resource *res)
>   {
>   	vm_paddr_t phys_addr;
> -	vm_offset_t virt_addr;
> +	struct resource_map map;
>   	size_t size;
>   
>   	phys_addr = rman_get_start(res);
>   	size = rman_get_size(res);
> -	virt_addr = (vm_offset_t)rman_get_virtual(res);
> +	rman_get_mapping(res, &map);
>   
> -	pmap_unmapdev(virt_addr, size);
> +	bus_unmap_resource(child, SYS_RES_MEMORY, res, &map);
>   	vm_phys_fictitious_unreg_range(phys_addr, phys_addr + size);
>   	return (bus_release_resource(child, SYS_RES_MEMORY, res_id, res));
>   }
> 
> I also wonder if rman_set_mapping should clear the RF_UNMAPPED flag
> from the resource? Or the content of the flags field is opaque from
> rman PoV and it should be the caller of rman_set_mapping the one to
> clear the flag? (xenpv_alloc_physmem in the code above)
> 
> It's kind of pointless to copy the bus_unmap_resource logic in
> xenpv_free_physmem when it could be done by bus_release_resource if
> the RF_UNMAPPED flag was cleared.

Huh, I hadn't really imagined that use case, but it would be nice to
not have to touch the release path and optimize this case when you
aren't doing a subset mapping.  I would be fine with changing
rman_set_mapping to clear the flag.  In fact, arguably it would be
cleaner (but maybe just one more one-line change?) to have new
resources always have RF_UNMAPPED set even if it wasn't passed into
the rman_reserve_bounds() (you would do it in the case where you
allocate a resource before the RF_SHAREABLE loop) and then only
clear it if rman_set_mapping() is called.

-- 
John Baldwin