[REVIEW/TEST] polling(4) changes

dima _pppp at mail.ru
Fri Oct 7 03:18:44 PDT 2005


> d> > d> Seems to be a first considerable step regarding the ideas discussed in March :)
> d> > d> But my idea about the separate locking of each interface disappeared from this implementation. mtx_poll is good for protecting the pollrec array and other sensitive variables, but we could take advantage of SMP machines by writing polling loops like this:
> d> > d> 
> d> > d> for( i = 0; i < poll_handlers; ++i ) {
> d> > d>   mtx_lock( &iface_lock[i] );
> d> > d>   pr[i].handler(pr[i].ifp, POLL_ONLY, count);
> d> > d>   mtx_unlock( &iface_lock[i] );
> d> > d> }
> d> > 
> d> > What is the benefit here? The driver must have its own lock.
> d> 
> d> Well, consider the absence of the mtx_poll lock:
> d> 
> d> - mtx_lock( &mtx_poll );
> d>   for( i = 0; i < poll_handlers; ++i ) {
> d> +   mtx_lock( &iface_lock[i] );
> d>     pr[i].handler( pr[i].ifp, POLL_ONLY, count );
> d> +   mtx_unlock( &iface_lock[i] );
> d>   }
> d> - mtx_unlock( &mtx_poll );
> d> 
> d> So, several kernel threads on an SMP machine can poll different interfaces simultaneously, and mtx_poll would then only be needed in ether_poll_[de]register().
> 
> Imagining that we will have several polling threads in the future, the above design
> has some disadvantages, I think:
> 
> First, we still need to protect the pr[] array with some mutex while traversing
> it, and while editing it in ether_poll_[de]register(). Maybe like it was done in
> kern_poll.c, rev 1.21.
> 
> Second, the approach above won't give nice parallelization. Imagine two threads,
> both running the loop shown above. They will contend on the lock of each interface:
> 
> - t1 starts
> - t1 locks iface_lock[1]	- t2 starts
> - t1 polls pr[1]...		- t2 blocks on iface_lock[1]
> - t1 polls pr[1]...
> - t1 polls pr[1]...
> - t1 polls pr[1]...
> - t1 polls pr[1]...
> - t1 unlocks iface_lock[1]	- t2 locks iface_lock[1]
> - t1 locks iface_lock[2]	- t2 polls empty pr[1], quickly returns
> - t1 polls pr[2]...		- t2 unlocks iface_lock[1]
> - t1 polls pr[2]...		- t2 blocks on iface_lock[2]
> - t1 polls pr[2]...
> - t1 polls pr[2]...
> - t1 polls pr[2]...
> - t1 polls pr[2]...
> - t1 unlocks iface_lock[2]	- t2 locks iface_lock[2]
> - t1 locks iface_lock[3]	- t2 polls empty pr[2], quickly returns
> - t1 polls pr[3]...		- t2 unlocks iface_lock[2]
> 
> So, one thread does the work, and the other just trails behind it, picking up
> only a small number of packets, or even just wasting CPU cycles.

The loop body should really look like:
  if( mtx_trylock( &iface_lock[i] ) ) {
    pr[i].handler( pr[i].ifp, POLL_ONLY, count );
    mtx_unlock( &iface_lock[i] );
  }
I skipped this at first to make the idea clearer.
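
To make the pattern concrete, here is a minimal user-space sketch of the same
idea (POSIX threads, not kernel code; NIFACE, iface_lock and poll_iface are
made-up names). A thread that finds an interface already locked simply skips
it instead of queueing behind the holder, which avoids the convoy shown in
the trace above:

#include <pthread.h>
#include <stdio.h>

#define NIFACE 4

static pthread_mutex_t iface_lock[NIFACE];

/* Stand-in for pr[i].handler(pr[i].ifp, POLL_ONLY, count). */
static void
poll_iface(long tid, int i)
{
        printf("thread %ld polls iface %d\n", tid, i);
}

static void *
poll_loop(void *arg)
{
        long tid = (long)arg;
        int i;

        for (i = 0; i < NIFACE; i++) {
                if (pthread_mutex_trylock(&iface_lock[i]) == 0) {
                        poll_iface(tid, i);
                        pthread_mutex_unlock(&iface_lock[i]);
                }
                /* else: another thread is polling iface i right now; skip it. */
        }
        return (NULL);
}

int
main(void)
{
        pthread_t t1, t2;
        int i;

        for (i = 0; i < NIFACE; i++)
                pthread_mutex_init(&iface_lock[i], NULL);
        pthread_create(&t1, NULL, poll_loop, (void *)1L);
        pthread_create(&t2, NULL, poll_loop, (void *)2L);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return (0);
}

In the kernel the same effect would come from mtx_trylock(9); the important
property is that a busy interface is skipped, not waited on.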

> Really we do not have several kernel threads in polling. netisr_poll() is always
> run by one thread - swi1:net. Well, we also have the idle_poll thread, but it is a
> very special case. Frankly speaking, it can't work without help from netisr_poll().
> The current polling is designed for a single-threaded kernel, for RELENG_4. We
> can't achieve parallelization without a strong redesign. The future plans are to
> create per-interface CPU-bound threads. The plans can change. You are welcome to help.

idle_poll can significantly increase network response time. I'd suggest per-CPU (not per-interface) threads; this would keep the user_frac code much simpler. Not sure about the coding help in the next few weeks: my current project is in its pre-release stage and a kid is going to be born soon. I can join a bit later, though.
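
As a rough illustration of the per-CPU idea, here is a user-space sketch that
starts one polling thread per online CPU (again POSIX threads and hypothetical
names; CPU binding and the user_frac accounting are left out). Each such
thread would then run a try-lock loop over pr[] like the one sketched above:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Each per-CPU thread would run the try-lock loop over pr[] shown earlier;
 * here it only reports that it started.
 */
static void *
poll_loop(void *arg)
{
        long cpu = (long)arg;

        printf("polling thread for CPU %ld started\n", cpu);
        return (NULL);
}

int
main(void)
{
        long ncpu, i;
        pthread_t *tds;

        ncpu = sysconf(_SC_NPROCESSORS_ONLN);
        if (ncpu < 1)
                ncpu = 1;
        tds = malloc(ncpu * sizeof(*tds));
        if (tds == NULL)
                return (1);
        for (i = 0; i < ncpu; i++)
                pthread_create(&tds[i], NULL, poll_loop, (void *)i);
        for (i = 0; i < ncpu; i++)
                pthread_join(tds[i], NULL);
        free(tds);
        return (0);
}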


