Re: BSD video capture emulation question

Mon Jul 14 08:30:22 PDT 2003

On Sun, 13 Jul 2003, John-Mark Gurney wrote:
>
> If the mbuf code separated it out, this MIGHT be a possibility, I'm
> pretty sure we don't want to run into some of the problems with resource
> contention on mbufs..  Also, the buffer space may need more fine
> management...  The idea of a sink sending a packet that a source fills
> also is kinda weird (necessary for some DMA operations)..

There seems to be some lack of clarity in these discussions about what
level of API you are trying to create.  There are at least two
possibilities:

a) The low-level API for shifting bulk data and timing information
   between hardware devices and/or processing modules.  Here the
   device drivers and encoders/decoders are the providers and consumers
   of the API, and we're inevitably talking about a kernel interface.

b) A higher-level API to control the 'plumbing'.  Here, user-interface
   programs are the consumers of the API, with the details of the
   bulk transfer mechanisms being hidden below the API.

This talk of Netgraph etc. seems to be addressing problem a), while I
thought you were originally talking about b).

If we are talking about a), I'd argue strongly that a single solution is
unlikely to fit all cases, and in particular that raw video needs to be
treated differently from compressed.

Raw video:
   - Has very high bandwidth requirements (165 Mbit/sec for D1 resolution).
     This requires short-cut routings wherever possible, and flow control
     isn't really practical: you typically want to discard whole frames
     rather than buffering them when things can't keep up.  mbufs are
     probably not appropriate - a constrained number of whole-field
     buffers is more useful, with an mmap() style interface if the
     data absolutely has to pass into user-space.

   - Needs to negotiate format options in advance and supply
     data in the right format; having an intermediate module to
     fix up formatting (byte order, colourspace transform etc.)
     is hugely inefficient compared to having it generated in the
     right format in the first place.  Even when this can't be
     achieved 'for free' by programming the hardware correctly,
     you really want to integrate it at one end or the other -
     for example, if you need to do colourspace transformation
     in software, you want the MPEG decoder to do it while it's
     creating the pixels, rather than suffer the cache-busting
     effect of having it write those pixels to RAM and then having
     another process come along and transform them asynchronously.

   - Typically has the timing information implicit: buffers full of data
     arrive from the source or are presented to the display in real-time,
     with a known delay through the device.
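
To make the raw-video points a bit more concrete, here is a rough
user-space sketch in Python (obviously not kernel code, and not a
proposal for the actual interface).  The names FieldPool, capture,
acquire and release are made up for illustration, and the 720x576,
25 frames/sec, 16 bits/pixel parameters are only my assumption about
how the D1 figure is arrived at.  It shows a constrained number of
whole-field buffers where an incoming field is simply dropped when
the consumer hasn't kept up, instead of being queued:

```python
from collections import deque


def raw_bitrate(width, height, fps, bits_per_pixel):
    """Bits per second for uncompressed video with the given geometry."""
    return width * height * fps * bits_per_pixel


class FieldPool:
    """Hypothetical fixed pool of whole-field buffers (illustration only)."""

    def __init__(self, nbufs, field_bytes):
        self.field_bytes = field_bytes
        # All buffers are allocated up front; the pool never grows.
        self.free = deque(bytearray(field_bytes) for _ in range(nbufs))
        self.ready = deque()
        self.dropped = 0

    def capture(self, data):
        """Producer side: store one field, or drop it if no buffer is free."""
        if len(data) != self.field_bytes:
            raise ValueError("expected exactly one whole field")
        if not self.free:
            # No flow control: discard the whole field and carry on.
            self.dropped += 1
            return False
        buf = self.free.popleft()
        buf[:] = data
        self.ready.append(buf)
        return True

    def acquire(self):
        """Consumer side: take the oldest filled buffer, or None."""
        return self.ready.popleft() if self.ready else None

    def release(self, buf):
        """Consumer side: hand a buffer back for reuse."""
        self.free.append(buf)


# Assumed D1 parameters: 720x576, 25 frames/sec, 16 bits/pixel (4:2:2).
print(raw_bitrate(720, 576, 25, 16))  # 165888000
```

With a pool of two buffers, a third capture() before any release()
returns False and bumps the dropped counter; nothing ever blocks the
source.  An mmap() style interface would hand out the same buffers
to user-space rather than copying them.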

Compressed video, on the other hand:

   - Is relatively low bandwidth; moving the bytes around consumes
     only a small amount of the total system resources.

   - Does not come in any fixed block size (other than 188-byte TS
     packets), so a socket/mbuf style stream interface is entirely
     appropriate.  Flow control is also useful in many (but not all)
     contexts.

   - Format conversion by filter modules is much more reasonable,
     partly because you can afford the inefficiency at the lower
     rate, but also because the kind of transforms you want
     (between PES/PS/TS etc.) are the same sort of thing that
     network stacks typically implement efficiently by twiddling
     headers in mbufs, rather than the every-pixel (and hence
     touch-every-byte) transforms needed on raw video.

   - Timing is typically signalled explicitly by timestamps in the
     data itself rather than being implied by the arrival time of
     the data.  Indeed, encoder output tends to arrive at a very
     un-smooth rate.
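
As an illustration of the header-twiddling point, here is a small
Python sketch that splits an already aligned stream into 188-byte TS
packets and rewrites the PID in place.  The function names are
invented for this example; the only things taken as given are the
188-byte packet size, the 0x47 sync byte, and the 13-bit PID held in
the low 5 bits of byte 1 plus all of byte 2.  Note that the payload
bytes are never touched:

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def split_ts_packets(stream):
    """Yield whole 188-byte packets from an aligned TS byte stream."""
    last_start = len(stream) - TS_PACKET_SIZE
    for off in range(0, last_start + 1, TS_PACKET_SIZE):
        pkt = stream[off:off + TS_PACKET_SIZE]
        if pkt[0] != TS_SYNC_BYTE:
            raise ValueError("lost sync at offset %d" % off)
        yield pkt


def get_pid(pkt):
    """Return the 13-bit PID from a TS packet header."""
    return ((pkt[1] & 0x1F) << 8) | pkt[2]


def set_pid(pkt, pid):
    """Rewrite the PID in a mutable packet, leaving the flag bits alone."""
    if not 0 <= pid <= 0x1FFF:
        raise ValueError("PID must fit in 13 bits")
    pkt[1] = (pkt[1] & 0xE0) | (pid >> 8)
    pkt[2] = pid & 0xFF


# Example: remap PID 0x11 to 0x100 without touching the payload.
packet = bytearray([TS_SYNC_BYTE, 0x40, 0x11, 0x10]) + bytearray(184)
set_pid(packet, 0x100)
print(hex(get_pid(packet)))  # 0x100
```

A remultiplexing filter built this way touches two bytes per packet,
which is the sort of cost that is easy to afford at compressed-stream
rates.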

Then there's your stated aim that things like USB videocams shouldn't have
to be implemented with all the logic in the kernel (an aim I agree with
BTW).  So, you end up with several different APIs for the core data
transfer, with scope for a unifying higher-layer API on top.  But it's a
lot of work....