Translations (was Re: svn commit: r43974 - head/en_US.ISO8859-1/books/handbook/advanced-networking)

Benedict Reuschling bcr at FreeBSD.org
Tue Feb 18 09:13:16 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hi guys,

Thanks for your interest in translation. I think that the improvements
we can make will also make it easier for doc committers to change the
docs without worrying too much about translations with regard to
whitespace. I could be wrong, but I think we could combine whitespace
and content changes in the future. For now, my primary focus is on
making the translation process more automated and making it easier
for translations to catch up.

>>> If there is any way I or other committers can make this easier
>>> for translators, please post on the -doc mailing list or to me
>>> individually.
>> 
>> One thing I would recommend is to separate the content that does
>> not need translation (e.g. tags, entities, etc.) from the content
>> that needs translation.  This way, these contents would serve as
>> positioning blocks when merging from upstream (English).
> 
> That sounds interesting, and leads well into the next part:
> 
>>> We are also trying to modernize the translation process, and 
>>> automate some of the work that translators are currently forced
>>> to do.  Anyone who would like to help with that is welcome.
>> 
>> That would be great!  How can we help, or is there some kind of
>> TODO list?
> 

Well, I have a couple of loose threads that we need to tie together to
have a complete and mostly automated system.

> There are several things that would be helpful.
> 
> We could use help from people who are experienced with using the 
> .po/.pot/.mo tools (gettext) on other platforms.
> 
> The basic process is to separate all the content from the markup 
> automatically.  Then an editor can be used to add translations, and
> the tool puts a translated file back together from it.  This allows
> the translation program to remember existing translations, so
> translating one document helps translate others.
> 
Yes, that is a good description. I've put a (hopefully) more thorough
description on the wiki page below.
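
To make the idea concrete, here is a minimal sketch in Python of the
separate / translate / reassemble cycle. It is only an illustration:
the function names are made up for this mail, the "translation memory"
is a plain dictionary, and real tools such as po4a or itstool also
handle entities, attributes, and context, which this does not.

```python
import re

# Hypothetical helpers for illustration only; not part of po4a or itstool.
TAG = re.compile(r"(<[^<>]+>)")


def split_markup(doc):
    """Split a document into a list of (is_markup, text) pieces."""
    pieces = []
    for part in TAG.split(doc):
        if part:
            pieces.append((bool(TAG.fullmatch(part)), part))
    return pieces


def extract_strings(pieces):
    """Return the translatable strings in order, skipping pure whitespace."""
    return [text.strip() for is_markup, text in pieces
            if not is_markup and text.strip()]


def reassemble(pieces, memory):
    """Rebuild the document, replacing every string found in the memory."""
    out = []
    for is_markup, text in pieces:
        key = text.strip()
        if is_markup or key not in memory:
            out.append(text)  # markup and unknown strings stay untouched
        else:
            out.append(text.replace(key, memory[key]))
    return "".join(out)


# The same sentence appears twice; one memory entry translates both.
doc = ("<para>The color of the file.</para>\n"
       "<note><para>The color of the file.</para></note>")
memory = {"The color of the file.": "The colour of the file."}
print(reassemble(split_markup(doc), memory))
```

The markup pieces pass through unchanged, which is why a string that
was translated once can be reused wherever it shows up again.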

> textproc/itstool is one of the automatic separator programs.  I've
> been somewhat stymied trying to figure out how to get the Python
> libxml2 implementation used by it to find FreeBSD documentation XML
> catalogs. Documentation on this is... let's just say sparse.
> 

Another tool is textproc/po4a, which Thomas Abthorpe showed me at one
of the hacker lounges two years ago in Canada. He set up a test
project for a translation into en_GB to see how these tools fit our
toolchain.

He sent me the following mini-howto, which I extended a bit for a
first commit to the German doc repo recently:

<mini-howto>
install textproc/po4a
install editors/poedit # optional
install devel/gtranslator # optional
install lokalize # somewhere in kde4, too lazy to look

To prep a new translation inline, in the same directory, the following
works.  Replace en_GB with the language of your choice.

po4a-gettextize -f xml -m article.sgml -p article.pot # renders to pot file
#<do translation>, save as article_en_GB.po
po4a-translate -f xml -m article.sgml -p article_en_GB.po \
    -l article_en_GB.sgml

To convert existing translations to po files, the following should
work, provided the sgmlised files match tag for tag:

po4a-gettextize -M ISO8859-1 -f xml \
    -m ../../../en_US.ISO8859-1/articles/<article>/article.sgml \
    -l article.sgml -p article_de_DE.po
</mini-howto>

The biggest problem that I have come across so far is that after the
translated strings are put back into the XML document, the formatting
is screwed up, as the po tools don't know our rules. So it would be
really cool if we had a tool that fixes the formatting for us and that
we could run after the po4a-translate step.
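
As a starting point for such a tool, the re-wrapping part could look
roughly like the Python sketch below. The 70-column width and the two
spaces per nesting level are assumptions I picked for the example, and
the function name is made up; a real tool would also have to work out
the nesting depth from the XML and leave verbatim elements alone.

```python
import textwrap


def rewrap(text, depth, width=70, step=2):
    """Collapse whitespace in one paragraph and re-wrap it.

    depth is the nesting level of the paragraph; width and step are
    assumed values, not taken from any official rule set.
    """
    prefix = " " * (depth * step)
    return textwrap.fill(
        " ".join(text.split()),      # normalise all runs of whitespace
        width=width,
        initial_indent=prefix,
        subsequent_indent=prefix,
        break_long_words=False,      # never cut a long tag or URL in half
        break_on_hyphens=False,      # keep option names like -M intact
    )


sample = ("After the translated strings are put back   into the\n"
          "document, the line breaks and indentation no longer follow "
          "the conventions used in the rest of the file.")
print(rewrap(sample, depth=2))
```

Since only whitespace between words is touched, the translated text
itself comes out unchanged.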

Another issue is that when I tried po4a-gettextize to extract the
strings from an already existing translation, the tool bailed out
early because it can only match string by string. The translation and
the English original have drifted too far apart, and a translator
sometimes writes two sentences for one English sentence, so the tool
cannot cope with it. I don't have a solution for this yet, but it
would be a shame to throw away the already existing translations and
not use them to feed a translation memory.
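
This does not solve the problem, but a small check like the following
Python sketch could at least show where the two files stop matching
tag for tag, so the translation can be realigned by hand before
running po4a-gettextize. The function names are made up, and the
regular expression is a simplification: it ignores comments, CDATA,
and the case-insensitivity of SGML tag names.

```python
import re
from itertools import zip_longest

# Matches the name of an opening or closing tag, e.g. "para" or "/para".
TAG_NAME = re.compile(r"<(/?[A-Za-z][\w.-]*)")


def tag_sequence(doc):
    """Return the tag names in document order."""
    return TAG_NAME.findall(doc)


def first_divergence(original, translation):
    """Return (index, original_tag, translated_tag) for the first
    mismatch, or None if both tag sequences are identical."""
    pairs = zip_longest(tag_sequence(original), tag_sequence(translation))
    for index, (orig_tag, trans_tag) in enumerate(pairs):
        if orig_tag != trans_tag:
            return index, orig_tag, trans_tag
    return None


english = "<sect1><title>A</title><para>One.</para></sect1>"
german = ("<sect1><title>A</title><para>Eins.</para>"
          "<para>Zwei.</para></sect1>")
print(first_divergence(english, german))
```

A missing tag on either side shows up as None in the result, which
covers the case where one file simply ends earlier than the other.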

> The PC-BSD folks are using tools like Pootle, but not for DocBook
> (as far as I know).  They have a web site for translation, useful
> as an example of what can be done: http://pootle.pcbsd.org/
> 
> Benedict (CCed) has made some progress with some of these tools.
> I think there are plans to add a page to the wiki, but don't know
> if it is present yet.

I put everything in the DevSummit wiki page and will start a separate
one with the results from the summit to continue from there:

https://wiki.freebsd.org/201405DevSummit/Translation

Regards

Benedict Reuschling
Documentation Committer
The FreeBSD Project
The FreeBSD Documentation Project
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - https://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBCgAGBQJTAyQbAAoJEAQa31nbPD2LLVsH/1DzRh3y2Wm5YTfo+OzmQTz9
9XKZTTaP3kF20HqGamMPOgEJdY8Szo6sBNJyPoPtkzBWHcsUv7s2W49MGwIRkdwc
yACdIPrladNXc77BpgwNsqoWGLsSElllqe2vYQNlICUrZbqgHC2a52eTtOi1QJZ5
P98M7wDtHFaxLdxKb17qBaL9/mvx5HhLXc6ZwP1GbKHvPWUtTN08H/BiGRc5cy1s
yTPqP/xmQNm7l3VTURfMqyuJykG+nbxpr6FvupDSo25rDfbxi0z+wcUeR1c7hLol
Vsh2p3Spx48Cc0Ccfj4vP+4Bdt6eiJ/O8jsofroZZKk1l+zBMhxbuVt3TugVgX4=
=u0pT
-----END PGP SIGNATURE-----

