svn commit: r334595 - in head: sys/dev/hwpmc sys/kern sys/sys usr.sbin/pmcstat

Mon Jun 4 17:27:23 UTC 2018

On Mon, Jun 4, 2018 at 5:08 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> On Mon, Jun 04, 2018 at 01:10:23AM +0000, Matt Macy wrote:
>> @@ -2214,6 +2236,11 @@ pmc_hook_handler(struct thread *td, int function, void
>>
>>               pmc_capture_user_callchain(PCPU_GET(cpuid), PMC_HR,
>>                   (struct trapframe *) arg);
>> +
>> +             KASSERT(td->td_pinned == 1,
>> +                     ("[pmc,%d] invalid td_pinned value", __LINE__));
>> +             sched_unpin();  /* Can migrate safely now. */
> sched_pin() is called from pmc_post_callchain_callback(), which is
> called from userret(). userret() is executed with interrupts and
> preemption enabled, so there is a non-trivial chance that the thread
> already migrated.
>
> In fact, I do not see a need to disable migration for the thread if user
> callchain is planned to be gathered. You only need to remember the cpu
> where the interrupt occured, to match it against the request.  Or are
> per-cpu PMC registers still accessed during callchain collection ?

The buffers are pcpu. Although it would in principle be safe in this
case since I
don't modify the read/write indices. However, I'd have to add another field for
the CPU and it doesn't handle the case of multiple migrations.

>
>> +int
>> +pmc_process_interrupt(int cpu, int ring, struct pmc *pm, struct trapframe *tf,
>> +    int inuserspace)
>> +{
>> +     struct thread *td;
>> +
>> +     td = curthread;
>> +     if ((pm->pm_flags & PMC_F_USERCALLCHAIN) &&
>> +             td && td->td_proc &&
>> +             (td->td_proc->p_flag & P_KPROC) == 0 &&
>> +             !inuserspace) {
> I am curious why a lot of the pmc code checks for curthread != NULL and,
> like this fragment, for curproc != NULL.  I am sure that at least on x86,
> we never let curthread point to the garbage, even during the context
> switches.  NMI handler has the same cargo-cult check, BTW.

I didn't think they could be NULL, but have been cargo culting the
existing code.

> Also, please fix the indentation of the conditions block.

>
>> +             atomic_add_int(&curthread->td_pmcpend, 1);
> You can use atomic_store_int() there, I believe,  Then there would be
> no locked op executed at all, on x86.

Storing a 1 would enable me to early terminate the loop.

>
>> @@ -375,6 +375,7 @@ struct thread {
>>       void            *td_lkpi_task;  /* LinuxKPI task struct pointer */
>>       TAILQ_ENTRY(thread) td_epochq;  /* (t) Epoch queue. */
>>       epoch_section_t td_epoch_section; /* (t) epoch section object */
>> +     int             td_pmcpend;
> Why this member was not put into the zeroed region ?  Wouldn't a garbage
> there cause uneccessary ASTs ?

It would cause _1_ unnecessary check for callchains after initial
creation. Putting it in the zero area would break the ABI.