svn commit: r273598 - in head: include sys/dev/acpica

Konstantin Belousov kostikbel at gmail.com
Fri Oct 24 19:45:52 UTC 2014


On Fri, Oct 24, 2014 at 07:33:07PM +0000, Rui Paulo wrote:
> On Oct 24, 2014, at 12:20 PM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> 
> On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote:
> > Author: rpaulo
> > Date: Fri Oct 24 18:39:15 2014
> > New Revision: 273598
> > URL: https://svnweb.freebsd.org/changeset/base/273598
> > 
> > Log:
> > HPET: create /dev/hpetN as a way to access HPET from userland.
> > 
> > In some cases, TSC is broken and special applications might benefit
> > from memory mapping HPET and reading the registers to count time.
> > Most often the main HPET counter is 32-bit only[1], so this only gives
> > the application a 300 second window based on the default HPET
> > interval.
> > Other applications, such as Intel's DPDK, expect /dev/hpet to be
> > present and use it to count time as well.
> > 
> > Although we have an almost userland version of gettimeofday() which
> > uses rdtsc in userland, it's not always possible to use it, depending
> > on how broken the multi-socket hardware is.
> Yes, and hpet userland mapping would be better handled through the same
> fake-vdso framework. As designed, it has discriminator to inform
> userspace about algorithm, and can happilly utilize HPET timecounter
> automatically mapped by kernel into the process address space.
>  
> I'm aware of that, but I found the vdso a bit confusing and decided to work on that later.
> 
> > +static int
> > +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td)
> > +{
> > +      struct hpet_softc *sc;
> > +
> > +        sc = cdev->si_drv1;
> > +        if (!sc->mmap_allow)
> > +          return (EPERM);
> > +      if (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0)
> > +            return (EBUSY);
> This is extra-weird.
> The devinuse business disallows simultaneous opens, which prevents
> other process from opening and mapping. But if the original caller
> does mmap and close, second process now is allowed to open and mmap.
> 
> That said, why do you need this devinuse at all ?
>  
> Hmm, I wanted to avoid multiple mmap's, but that doesn't work like you said.  I may just remove this restriction. 
> 
This is probably best.

> > +static int
> > +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr,
> > + int nprot, vm_memattr_t *memattr)
> > +{
> > +    struct hpet_softc *sc;
> > +
> > +    sc = cdev->si_drv1;
> > +        if (offset > rman_get_size(sc->mem_res))
> > +                return (EINVAL);
> > +    if (!sc->mmap_allow_write && (nprot & PROT_WRITE))
> > +      return (EPERM);
> > +      *paddr = rman_get_start(sc->mem_res) + offset;
> What is the memattr for the backing page ? Is it set to non-cached
> mode somehow ? I was not able to find place where would this happen.
>  
> I expect it to be set to non-cached since it's a device anyway, but I don't know where it is.  During my testing, I did not see any problems with cached values, though.
> 
I am not claiming that it is wrong, only that I do not see an easy reason
why it is right.  Just printing the *memattr would provide the confidence.

> > +  sc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL,
> > +         0600, "hpet%d", device_get_unit(dev));
> > +        if (sc->pdev) {
> > +                sc->pdev->si_drv1 = sc;
> > +      sc->mmap_allow = 1;
> > +                TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow",
> > +             &sc->mmap_allow);
> > +            sc->mmap_allow_write = 1;
> > +            TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write",
> > +                 &sc->mmap_allow_write);
> > +                SYSCTL_ADD_INT(device_get_sysctl_ctx(dev),
> > +                 SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> > +       OID_AUTO, "mmap_allow",
> > +           CTLFLAG_RW, &sc->mmap_allow, 0,
> > +                 "Allow userland to memory map HPET");
> Why is mmap_allow is per-instance, while mmap_allow_write taken from
> the global tunable ?
>  
> Are you asking why there's no sysctl for it?

IMO the allow-write must be controllable per-instance, or just managed
by /dev/hpet* unix permissions.  Having one global knob, which is
consulted at the module load, is not flexible.

What is the use-case for writing to HPET page ? To manually micro-adjust
the timer ?  Then, you probably want to disable write for instance used
by system timecounter or eventtimer, but allow for HPET utilized by
other consumers.


More information about the svn-src-head mailing list