Weekly status report (27th May)

Takuya ASADA syuu at dokukino.com
Tue May 31 11:31:13 UTC 2011


Sorry for the delay in sending this weekly status report.

* Overview
Here is the progress of the project:
 - Implement set affinity ioctl on BPF
  Experimental code is implemented and works
 - Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2
  Experimental code is implemented and works
 - Implement sample application
  Quick hack for tcpdump/libpcap, works
 - Implement multi-queue tap driver
  Experimental code is implemented, not tested
 - Implement interface to deliver queue information on network device driver
  Partially implemented on igb(4), not tested
 - Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2
  Not yet
 - Implement test case
  Not yet
 - Update man document, write description of sample code
  Not yet

* Detail
On an Ethernet card, bpf_mtap is called whenever a packet is received or
transmitted.
If the card supports multiqueue, every packet passing through bpf_mtap
belongs to either an RX queue or a TX queue.
To record this, I defined new members in the mbuf pkthdr.

In the if_start function of igb(4), I added the following lines:
  m->m_pkthdr.rxqid = (uint32_t)-1;
  m->m_pkthdr.txqid = [tx queue id];
And in the receive function:
  m->m_pkthdr.rxqid = [rx queue id];
  m->m_pkthdr.txqid = (uint32_t)-1;
Here (uint32_t)-1 marks the direction that does not apply to the packet.
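
The tagging convention above can be sketched in plain userland C. This is
only a model: struct pkthdr_model, QID_NONE, tag_tx() and tag_rx() are
hypothetical names for illustration and are not part of the igb(4) driver
or of the real struct pkthdr.

```c
#include <stdint.h>

/* Sentinel meaning "no queue id in this direction" (hypothetical name). */
#define QID_NONE ((uint32_t)-1)

/* Hypothetical stand-in for the two new pkthdr members. */
struct pkthdr_model {
	uint32_t rxqid;
	uint32_t txqid;
};

/* Transmit path: the packet belongs to a TX queue only. */
void
tag_tx(struct pkthdr_model *ph, uint32_t txq)
{
	ph->rxqid = QID_NONE;
	ph->txqid = txq;
}

/* Receive path: the packet belongs to an RX queue only. */
void
tag_rx(struct pkthdr_model *ph, uint32_t rxq)
{
	ph->rxqid = rxq;
	ph->txqid = QID_NONE;
}
```

Exactly one of the two members holds a real queue id after tagging, which
is what lets bpf_mtap tell the direction apart later.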

Then I defined the following members in the bpf descriptor:
  d->bd_qmask.qm_enabled
  d->bd_qmask.qm_rxq_mask[]
  d->bd_qmask.qm_txq_mask[]

Since the sizes of qm_rxq_mask[] and qm_txq_mask[] may differ from card to
card, we need to pass the number of queues from the driver to bpf and
allocate the arrays accordingly.
I added these counts to struct ifnet:
  d->bd_bif->bif_ifp->if_rxq_num
  d->bd_bif->bif_ifp->if_txq_num
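
A rough userland sketch of allocating the mask arrays from the queue counts
follows. It assumes bool arrays and calloc(); the kernel code would use its
own allocator, and struct qmask_model, qmask_alloc() and qmask_free() are
hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userland model of d->bd_qmask. */
struct qmask_model {
	bool qm_enabled;
	bool *qm_rxq_mask;	/* one entry per RX queue */
	bool *qm_txq_mask;	/* one entry per TX queue */
	uint32_t rxq_num;	/* copied from if_rxq_num */
	uint32_t txq_num;	/* copied from if_txq_num */
};

/* Allocate both masks, all entries cleared. Returns 0 on success, -1 on failure. */
int
qmask_alloc(struct qmask_model *qm, uint32_t rxq_num, uint32_t txq_num)
{
	qm->qm_enabled = false;
	qm->rxq_num = rxq_num;
	qm->txq_num = txq_num;
	/* Allocate at least one entry so a zero count never yields NULL. */
	qm->qm_rxq_mask = calloc(rxq_num ? rxq_num : 1, sizeof(bool));
	qm->qm_txq_mask = calloc(txq_num ? txq_num : 1, sizeof(bool));
	if (qm->qm_rxq_mask == NULL || qm->qm_txq_mask == NULL) {
		free(qm->qm_rxq_mask);
		free(qm->qm_txq_mask);
		qm->qm_rxq_mask = qm->qm_txq_mask = NULL;
		return -1;
	}
	return 0;
}

/* Release both masks. */
void
qmask_free(struct qmask_model *qm)
{
	free(qm->qm_rxq_mask);
	free(qm->qm_txq_mask);
	qm->qm_rxq_mask = qm->qm_txq_mask = NULL;
}
```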

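Now we can filter out unwanted packets in bpf_mtap like this:

LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
  if (d->bd_qmask.qm_enabled) {
    /* Skip this descriptor if the packet's RX queue is not selected. */
    if (m->m_pkthdr.rxqid != (uint32_t)-1 &&
        !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
      continue;
    /* Likewise for the TX queue. */
    if (m->m_pkthdr.txqid != (uint32_t)-1 &&
        !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
      continue;
  }
  /* ... rest of the existing bpf_mtap processing ... */
}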
d->bd_qmask.qm_enabled should be FALSE by default to keep compatibility
with existing applications.
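
The per-descriptor decision can be modelled as a small standalone predicate.
This is a sketch, not the kernel code: struct qfilter and qfilter_accept()
are hypothetical names, and the range check against the queue counts is an
extra safety assumption that the loop above does not have.

```c
#include <stdbool.h>
#include <stdint.h>

#define QID_NONE ((uint32_t)-1)	/* same sentinel as (uint32_t)-1 above */

/* Hypothetical read-only view of a descriptor's queue mask. */
struct qfilter {
	bool enabled;
	const bool *rxq_mask;
	uint32_t rxq_num;
	const bool *txq_mask;
	uint32_t txq_num;
};

/* Returns true when a packet with these queue ids should be delivered. */
bool
qfilter_accept(const struct qfilter *f, uint32_t rxqid, uint32_t txqid)
{
	/* Masking disabled: behave like unmodified BPF. */
	if (!f->enabled)
		return true;
	/* Reject out-of-range or unselected RX queue ids. */
	if (rxqid != QID_NONE &&
	    (rxqid >= f->rxq_num || !f->rxq_mask[rxqid]))
		return false;
	/* Reject out-of-range or unselected TX queue ids. */
	if (txqid != QID_NONE &&
	    (txqid >= f->txq_num || !f->txq_mask[txqid]))
		return false;
	return true;
}
```

With enabled left at false, every packet is accepted, which matches the
compatibility default described above.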

And here are the ioctls for setting/getting the queue mask:
  #define BIOCENAQMASK    _IO('B', 137)
    This does d->bd_qmask.qm_enabled = TRUE
  #define BIOCDISQMASK    _IO('B', 138)
    This does d->bd_qmask.qm_enabled = FALSE
  #define BIOCRXQLEN      _IOR('B', 133, int)
    Returns ifp->if_rxq_num
  #define BIOCTXQLEN      _IOR('B', 134, int)
    Returns ifp->if_txq_num
  #define BIOCSTRXQMASK   _IOWR('B', 139, uint32_t)
    This does d->bd_qmask.qm_rxq_mask[*addr] = TRUE
  #define BIOCGTRXQMASK   _IOR('B', 140, uint32_t)
    Returns d->bd_qmask.qm_rxq_mask[*addr]
  /* XXX: We should have rxq_mask[*addr] = FALSE ioctl too */
  #define BIOCSTTXQMASK   _IOWR('B', 141, uint32_t)
    This does d->bd_qmask.qm_txq_mask[*addr] = TRUE
  /* XXX: We should have txq_mask[*addr] = FALSE ioctl too */
  #define BIOCGTTXQMASK   _IOR('B', 142, uint32_t)
    Returns d->bd_qmask.qm_txq_mask[*addr]
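
To make the intended semantics concrete, here is a userland model of how
such a handler might dispatch the commands. The enum values are hypothetical
stand-ins for the BIOC* numbers, struct qstate and qmask_ioctl() are made-up
names, MAXQ is an arbitrary bound for this model, and the EINVAL range check
is an assumption rather than something the experimental code is known to do.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAXQ 16	/* assumed upper bound, only for this model */

/* Hypothetical stand-ins for the BIOC* ioctl commands. */
enum qcmd {
	Q_ENAQMASK, Q_DISQMASK, Q_RXQLEN, Q_TXQLEN,
	Q_STRXQMASK, Q_GTRXQMASK, Q_STTXQMASK, Q_GTTXQMASK
};

/* Model state; rxq_num and txq_num must not exceed MAXQ. */
struct qstate {
	bool enabled;
	uint32_t rxq_num, txq_num;
	bool rxq_mask[MAXQ], txq_mask[MAXQ];
};

/* Returns 0 on success or EINVAL; *addr is input and/or output. */
int
qmask_ioctl(struct qstate *s, enum qcmd cmd, uint32_t *addr)
{
	switch (cmd) {
	case Q_ENAQMASK:  s->enabled = true;  return 0;
	case Q_DISQMASK:  s->enabled = false; return 0;
	case Q_RXQLEN:    *addr = s->rxq_num; return 0;
	case Q_TXQLEN:    *addr = s->txq_num; return 0;
	case Q_STRXQMASK:
		if (*addr >= s->rxq_num) return EINVAL;
		s->rxq_mask[*addr] = true;
		return 0;
	case Q_GTRXQMASK:
		if (*addr >= s->rxq_num) return EINVAL;
		*addr = s->rxq_mask[*addr];	/* queue index in, mask bit out */
		return 0;
	case Q_STTXQMASK:
		if (*addr >= s->txq_num) return EINVAL;
		s->txq_mask[*addr] = true;
		return 0;
	case Q_GTTXQMASK:
		if (*addr >= s->txq_num) return EINVAL;
		*addr = s->txq_mask[*addr];
		return 0;
	}
	return EINVAL;
}
```

Note that the "get" commands read the queue index from *addr and write the
result back through the same pointer, so they carry data in both directions.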

However, packets that come through bpf_tap have no mbuf, so we cannot
determine a queue id for them.
So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK for them.
If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets
through bpf_tap are ignored.
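
The bpf_tap rule reduces to one boolean expression. A minimal sketch, with
tap_accept() as a hypothetical helper name:

```c
#include <stdbool.h>

/*
 * Decide whether a packet without an mbuf (the bpf_tap path) is delivered.
 * It is dropped only when masking is enabled and qm_other_mask is clear.
 */
bool
tap_accept(bool qm_enabled, bool qm_other_mask)
{
	return !qm_enabled || qm_other_mask;
}
```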

If we only care about the CPU affinity between a packet and a thread (= bpf
descriptor), checking PCPU_GET(cpuid) is enough.
But if we want to handle queue affinity, we probably need the
structures described above.

* Argument
I discussed this project with some Japanese BSD hackers. They
questioned the plan and raised two points:

- Isn't it possible to filter by queue id in the BPF filter language by
extending it?

- Do we really need to expose queue information and threads to user
applications?
Most BPF applications probably need to merge the packet streams from
their threads in the end.
For example, a sniffer such as tcpdump or wireshark prints its packet
dump on a screen, and before printing it has to merge the per-queue
packet streams into one stream.
If so, isn't it better to merge the streams in the kernel rather than in
userland?
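
For reference, the userland merge being discussed could look roughly like
the sketch below. It assumes each per-queue stream is already sorted by a
timestamp; struct pkt, struct stream and merge_streams() are hypothetical
names, and a real sniffer would carry packet data rather than only a
timestamp and a queue id.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical captured-packet record: timestamp plus source queue. */
struct pkt {
	uint64_t ts;
	uint32_t qid;
};

/* One per-queue stream, sorted by ts; pos is the next unread packet. */
struct stream {
	const struct pkt *pkts;
	size_t len;
	size_t pos;
};

/*
 * Merge n time-sorted streams into out (capacity cap), oldest first.
 * Returns the number of packets written.
 */
size_t
merge_streams(struct stream *s, size_t n, struct pkt *out, size_t cap)
{
	size_t written = 0;

	while (written < cap) {
		size_t best = n;	/* n means "no candidate yet" */

		/* Pick the stream whose head packet has the smallest timestamp. */
		for (size_t i = 0; i < n; i++) {
			if (s[i].pos == s[i].len)
				continue;
			if (best == n ||
			    s[i].pkts[s[i].pos].ts < s[best].pkts[s[best].pos].ts)
				best = i;
		}
		if (best == n)
			break;		/* all streams drained */
		out[written++] = s[best].pkts[s[best].pos++];
	}
	return written;
}
```

The linear scan over stream heads is fine for a handful of queues; with
many queues a heap would be the usual replacement.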


I'm not really sure about the use cases of BPF. Maybe there are use cases
that would benefit from multithreaded BPF?

syuu

