svn commit: r327950 - in head/sys/powerpc: aim include powerpc ps3

Alan Cox alc at rice.edu
Thu Jan 18 15:55:24 UTC 2018


On 01/18/2018 09:35, Konstantin Belousov wrote:
> On Thu, Jan 18, 2018 at 07:24:11AM -0800, Nathan Whitehorn wrote:
>>
>> On 01/17/18 01:44, Konstantin Belousov wrote:
>>> On Tue, Jan 16, 2018 at 09:30:29PM -0800, Nathan Whitehorn wrote:
>>>> On 01/16/18 11:32, Marius Strobl wrote:
>>>>> On Mon, Jan 15, 2018 at 03:20:49PM -0800, Nathan Whitehorn wrote:
>>>>>> On 01/15/18 09:53, Konstantin Belousov wrote:
>>>>>>> On Mon, Jan 15, 2018 at 09:32:56AM -0800, Nathan Whitehorn wrote:
>>>>>>>> That seems fine to me. I don't think a less clumsy way that does not
>>>>>>>> involve extra indirection is possible. Having PHYS_TO_DMAP() return NULL
>>>>>>>> is about the best thing I can come up with from a clumsiness standpoint,
>>>>>>>> since plenty of code checks for null pointers already, but it doesn't
>>>>>>>> cleanly handle the rarer case where you want to test for the existence
>>>>>>>> of direct maps in general without probing some Potemkin address.
>>>>>>>>
>>>>>>>> My one reservation about PMAP_HAS_DMAP or the like as a selector is that
>>>>>>>> it does not encode the full shape of the problem: one could imagine
>>>>>>>> having a direct map that only covers a limited range of RAM (I am not
>>>>>>>> sure whether the existence of dmaplimit on amd64 implies this can happen
>>>>>>>> with non-device memory in real life), for example. These cases are
>>>>>>>> currently covered by an assert() in PHYS_TO_DMAP(), whereas having
>>>>>>>> PHYS_TO_DMAP() return NULL allows a more flexible signalling and the
>>>>>>>> potential for the calling code to do something reasonable to handle the
>>>>>>>> error. A single global flag can't convey information at this kind of
>>>>>>>> granularity. Is this a reasonable concern? Or am I overthinking things?
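
For illustration, a minimal sketch of the NULL-return style described above,
where PHYS_TO_DMAP() would yield 0 for a page without a direct mapping; the
fallback helper named here is hypothetical, not an existing kernel interface:

	/* Sketch: zero a page, preferring the direct map when one exists. */
	static void
	example_zero_page(vm_page_t m)
	{
		vm_offset_t va;

		va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
		if (va == 0)
			/* Hypothetical fallback: transient kernel mapping. */
			va = example_map_page_transient(m);
		bzero((void *)va, PAGE_SIZE);
	}
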
>>>>>>> IMO it is an overreaction.  amd64 assumes that all normal memory is
>>>>>>> covered by the DMAP.  It must never fail.  See, for instance, the
>>>>>>> implementation of the sf bufs for it.
>>>>>>>
>>>>>>> If device memory not covered by the DMAP can exist, that is the driver's
>>>>>>> problem.  For instance, for NVDIMMs I wrote specific mapping code which
>>>>>>> establishes a kernel mapping when the memory is not covered by the EFI
>>>>>>> memory map and correspondingly not included in the DMAP.
>>>>>>>
>>>>>> Fair enough. Here's a patch with a new flag (DIRECT_MAP_AVAILABLE). I've
>>>>>> also retooled the sfbuf code to use this rather than its own flags that
>>>>>> mean the same things. The sparc64 part of the patch is untested.
>>>>>> -Nathan
>>>>>> Index: sparc64/include/vmparam.h
>>>>>> ===================================================================
>>>>>> --- sparc64/include/vmparam.h	(revision 328006)
>>>>>> +++ sparc64/include/vmparam.h	(working copy)
>>>>>> @@ -240,10 +240,12 @@
>>>>>>     */
>>>>>>    #define	ZERO_REGION_SIZE	PAGE_SIZE
>>>>>>    
>>>>>> +#include <machine/tlb.h>
>>>>>> +
>>>>>>    #define	SFBUF
>>>>>>    #define	SFBUF_MAP
>>>>>> -#define	SFBUF_OPTIONAL_DIRECT_MAP	dcache_color_ignore
>>>>>> -#include <machine/tlb.h>
>>>>>> -#define	SFBUF_PHYS_DMAP(x)		TLB_PHYS_TO_DIRECT(x)
>>>>>>    
>>>>>> +#define DIRECT_MAP_AVAILABLE	dcache_color_ignore
>>>>>> +#define	PHYS_TO_DMAP(x)	(DIRECT_MAP_AVAILABLE ? TLB_PHYS_TO_DIRECT(x) : 0)
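
As a rough sketch only, an MI consumer of the pair proposed in this patch
might look like the following, with the non-direct-map fallback left
hypothetical:

	vm_offset_t va;

	if (DIRECT_MAP_AVAILABLE) {
		/* Every normal page is reachable through the direct map. */
		va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
	} else {
		/* No usable direct map; fall back to a transient kernel
		 * mapping, e.g. an sf_buf (hypothetical helper shown). */
		va = example_map_page_transient(m);
	}
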
>>>>> What dcache_color_ignore actually indicates is the presence of
>>>>> hardware unaliasing support, in other words the ability to enter
>>>>> duplicate cacheable mappings into the MMU. While a direct map is
>>>>> available and used by MD code on all supported CPUs down to US-I,
>>>>> the former feature is only implemented in the line of Fujitsu SPARC64
>>>>> processors. IIRC, the sfbuf(9) code can't guarantee that there isn't
>>>>> already a cacheable mapping from a different VA to the same PA,
>>>>> which is why it employs dcache_color_ignore. Is that a general
>>>>> constraint of all MI PHYS_TO_DMAP users or are there consumers
>>>>> which can guarantee that they are the only users of a mapping
>>>>> to the same PA?
>>>>>
>>>>> Marius
>>>>>
>>>> With the patch, there are four uses of this in the kernel: the sfbuf
>>>> code, a diagnostic check on page zeroing, part of the EFI runtime code,
>>>> and part of the Linux KBI compat. The second looks safe from this
>>>> perspective and at least some of the others (EFI runtime) are irrelevant
>>>> on sparc64. But I really have no idea what was intended for the
>>>> semantics of this API -- I didn't even know it *was* an MI API until
>>>> this commit. Maybe kib can comment? If this is outside the semantics of
>>>> PHYS_TO_DMAP, then we need to keep the existing sfbuf code.
>>> sfbufs cannot guarantee that there is no other mapping of the page when
>>> the sfbuf is created.  For instance, one of the uses of sfbufs is to map
>>> page 0 of the image to read the ELF headers when doing image activation.
>>> The image might be mapped by other processes, and we do not control the
>>> address at which it is mapped.
>>>
>>> So the direct map accesses must work regardless of the presence of other
>>> page mappings, and the check for dcache_color_ignore is needed to allow
>>> MI code to take advantage of DMAP.
>>>
>> So: what do you want to happen with PHYS_TO_DMAP()? Do we want to claim 
>> to MI that a direct map is "available" in such circumstances, or 
>> "unavailable"? Should sfbuf retain a separate API? I have no preferences 
>> here and just want to close out this issue.
> Perhaps DMAP should be conditionally available to the MI layer, same as
> on powerpc?  I.e., your patch cited above looks right to me, unless I
> misunderstand Marius' response.
>
Yes, it should.  Only the sparc64 machines where dcache_color_ignore is
true should attempt to use the direct map in MI code.  The
machine-dependent uiomove_fromphys() on sparc64 shows the hoops one has
to jump through to use the direct map correctly on that architecture.
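
For readers without the sparc64 tree at hand, the shape of those hoops is
roughly the following.  This is a simplified sketch rather than the actual
uio_machdep.c code, and the two helper names are only illustrative:

	vm_paddr_t pa = VM_PAGE_TO_PHYS(m);
	vm_offset_t va;

	if (dcache_color_ignore != 0 || example_color_matches(m, offset)) {
		/* Either the CPU tolerates cacheable aliases in hardware
		 * (Fujitsu SPARC64) or the cache colors line up, so the
		 * direct map can be used as-is. */
		va = TLB_PHYS_TO_DIRECT(pa) + offset;
	} else {
		/* Otherwise a temporary kernel mapping with a matching
		 * cache color has to be set up (and the D-cache flushed)
		 * before the page contents can be touched. */
		va = example_map_with_matching_color(m, offset);
	}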
