svn commit: r273598 - in head: include sys/dev/acpica
Konstantin Belousov
kostikbel at gmail.com
Fri Oct 24 19:45:52 UTC 2014
On Fri, Oct 24, 2014 at 07:33:07PM +0000, Rui Paulo wrote:
> On Oct 24, 2014, at 12:20 PM, Konstantin Belousov <kostikbel at gmail.com> wrote:
>
> On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote:
> > Author: rpaulo
> > Date: Fri Oct 24 18:39:15 2014
> > New Revision: 273598
> > URL: https://svnweb.freebsd.org/changeset/base/273598
> >
> > Log:
> > HPET: create /dev/hpetN as a way to access HPET from userland.
> >
> > In some cases, TSC is broken and special applications might benefit
> > from memory mapping HPET and reading the registers to count time.
> > Most often the main HPET counter is 32-bit only[1], so this only gives
> > the application a 300 second window based on the default HPET
> > interval.
> > Other applications, such as Intel's DPDK, expect /dev/hpet to be
> > present and use it to count time as well.
> >
> > Although we have an almost userland version of gettimeofday() which
> > uses rdtsc in userland, it's not always possible to use it, depending
> > on how broken the multi-socket hardware is.
> Yes, and hpet userland mapping would be better handled through the same
> fake-vdso framework. As designed, it has discriminator to inform
> userspace about algorithm, and can happilly utilize HPET timecounter
> automatically mapped by kernel into the process address space.
>
> I'm aware of that, but I found the vdso a bit confusing and decided to work on that later.
>
> > +static int
> > +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td)
> > +{
> > + struct hpet_softc *sc;
> > +
> > + sc = cdev->si_drv1;
> > + if (!sc->mmap_allow)
> > + return (EPERM);
> > + if (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0)
> > + return (EBUSY);
> This is extra-weird.
> The devinuse business disallows simultaneous opens, which prevents
> other process from opening and mapping. But if the original caller
> does mmap and close, second process now is allowed to open and mmap.
>
> That said, why do you need this devinuse at all ?
>
> Hmm, I wanted to avoid multiple mmap's, but that doesn't work like you said. I may just remove this restriction.
>
This is probably best.
> > +static int
> > +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr,
> > + int nprot, vm_memattr_t *memattr)
> > +{
> > + struct hpet_softc *sc;
> > +
> > + sc = cdev->si_drv1;
> > + if (offset > rman_get_size(sc->mem_res))
> > + return (EINVAL);
> > + if (!sc->mmap_allow_write && (nprot & PROT_WRITE))
> > + return (EPERM);
> > + *paddr = rman_get_start(sc->mem_res) + offset;
> What is the memattr for the backing page ? Is it set to non-cached
> mode somehow ? I was not able to find place where would this happen.
>
> I expect it to be set to non-cached since it's a device anyway, but I don't know where it is. During my testing, I did not see any problems with cached values, though.
>
I am not claiming that it is wrong, only that I do not see an easy reason
why it is right. Just printing the *memattr would provide the confidence.
> > + sc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL,
> > + 0600, "hpet%d", device_get_unit(dev));
> > + if (sc->pdev) {
> > + sc->pdev->si_drv1 = sc;
> > + sc->mmap_allow = 1;
> > + TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow",
> > + &sc->mmap_allow);
> > + sc->mmap_allow_write = 1;
> > + TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write",
> > + &sc->mmap_allow_write);
> > + SYSCTL_ADD_INT(device_get_sysctl_ctx(dev),
> > + SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> > + OID_AUTO, "mmap_allow",
> > + CTLFLAG_RW, &sc->mmap_allow, 0,
> > + "Allow userland to memory map HPET");
> Why is mmap_allow is per-instance, while mmap_allow_write taken from
> the global tunable ?
>
> Are you asking why there's no sysctl for it?
IMO the allow-write must be controllable per-instance, or just managed
by /dev/hpet* unix permissions. Having one global knob, which is
consulted at the module load, is not flexible.
What is the use-case for writing to HPET page ? To manually micro-adjust
the timer ? Then, you probably want to disable write for instance used
by system timecounter or eventtimer, but allow for HPET utilized by
other consumers.
More information about the svn-src-head
mailing list