Proposed new doc hierarchy for closed-captions / transcripts from conferences

Murray Stokely murray at stokely.org
Fri Jan 29 21:01:07 UTC 2010


No comments?  I will proceed with this plan, then.

         - Murray

On Sun, Jan 17, 2010 at 11:57 PM, Murray Stokely <murray at stokely.org> wrote:
> As some of you might be aware, I have been working on getting closed
> captions for the videos of FreeBSD-related talks at conferences.  In
> the last month I've started using YouTube's machine learning to
> produce a first automatic transcript and then paying human editors
> through Amazon Mechanical Turk to improve the technical vocabulary
> and general editing of the transcripts.
>
> There are now four videos in the BSD Conferences YouTube channel with
> relatively good quality, human-edited English-language transcripts
> (see e.g. the pointers at
> http://freebsd.stokely.org/2010/01/improved-conference-captions-from.html).
>
> The caption files themselves are simple ASCII text files with one line
> for the start/end time of the text to be displayed, one or two lines
> for the text to be displayed, and a blank line to separate each record
> from the next.
>
> I would like to start checking in these text files under
> doc/en_US.ISO8859-1/captions/ for a number of reasons.
>
> 1. I want to make it easier for others to correct any mistakes in
> the captions.
> 2. I want to make it easier for translators to produce localized
> captions for the most popular videos.
> 3. I want to keep a centralized repository of the captions outside
> of YouTube, so other hosting sites or systems are able to use them.
> 4. I want to increase the discoverability of technical content
> discussed in the conference talks with indexable transcripts open to
> search engines.
>
> The blog post above has some example text files that I'd like to check
> in.  It then becomes a matter of choosing the hierarchy.
>
> I might suggest:
>
> doc/${LANG}/captions/${YEAR}/${CONFERENCE}/${TALK}
>
> e.g.
>
> doc/en_US.ISO8859-1/captions/2009/asiabsdcon/mckusick-kernelinternals.sbv
>
> Thoughts?
>
>    - Murray
>
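
The record layout described in the quoted message (one time line, one or
two text lines, a blank separator) could be read with something like the
sketch below.  The message does not spell out the timestamp syntax; the
H:MM:SS.mmm,H:MM:SS.mmm form used here is an assumption, and the names
Caption and parse_captions are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed time-line layout: H:MM:SS.mmm,H:MM:SS.mmm (not stated in the message).
TIME_LINE = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


@dataclass
class Caption:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text, with multiple lines joined by a space


def _seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_captions(data: str) -> List[Caption]:
    """Split caption text into records separated by blank lines."""
    if not data.strip():
        return []
    captions = []
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        match = TIME_LINE.match(lines[0].strip())
        if match is None:
            raise ValueError(f"unrecognized time line: {lines[0]!r}")
        fields = match.groups()
        text = " ".join(line.strip() for line in lines[1:])
        captions.append(Caption(_seconds(*fields[:4]), _seconds(*fields[4:]), text))
    return captions
```

Parsing into records like this would make it straightforward to check a
corrected or translated file for malformed time lines before it is
checked in.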

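The proposed doc/${LANG}/captions/${YEAR}/${CONFERENCE}/${TALK} layout
can be illustrated with a small helper.  This is only a sketch: the
function name caption_path is hypothetical, and the default ".sbv"
suffix is taken from the single example path in the quoted message.

```python
from pathlib import PurePosixPath


def caption_path(lang: str, year: int, conference: str, talk: str,
                 ext: str = ".sbv") -> PurePosixPath:
    """Build doc/${LANG}/captions/${YEAR}/${CONFERENCE}/${TALK} for one talk."""
    return PurePosixPath("doc") / lang / "captions" / str(year) / conference / (talk + ext)


# Reproduces the example path given in the quoted message.
print(caption_path("en_US.ISO8859-1", 2009, "asiabsdcon", "mckusick-kernelinternals"))
```

Keeping the language as the first component under doc/ means a
translated caption file would differ from the original only in that one
directory name.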

