Report #6: Unicode support

Dmitry Selyutin ghostman.sd at gmail.com
Wed Aug 6 01:30:12 UTC 2014


Hello everyone!

Here is the latest news on the Unicode support project[0]. You can
always check my repository[1].
Over the past few days I've been working on storing the Unicode
collation data in a more appropriate format than before (strange
tables with binary search right in the C source files). Following one
of Pedro's suggestions, I've used <db.h> types and functions for
this.
I've tried to achieve portability across all platforms that support
<db.h> the way FreeBSD does (I took care of some subtle things like
endianness too). A full set of functions for working with collation
databases is provided, as well as Python bindings (they were written
while creating the CLDR database, but turned out to be so useful that
I decided to commit them too). Right now the code lives under the
lib/libcolldb directory, though there may be a better place for it
(especially for the Python bindings). Any suggestions? I'd like to
keep this stuff visible to other developers (at first I wanted to
hide it in xlocale_private.h, but I found it really useful), and a
library was the first thing that came to mind.
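To illustrate the endianness point, here is a minimal sketch of
byte-order-independent records. Everything in it is an assumption made
for illustration: the layout (big-endian 32-bit code points as the key,
three big-endian 16-bit weights per collation element as the value),
the function names, and the weights are hypothetical and are not the
actual libcolldb format or API. A plain dict stands in for the <db.h>
database.

```python
import struct

# Hypothetical layout, NOT the actual libcolldb format:
#   key   = the code points of a sequence, each a big-endian uint32
#   value = collation elements, each three big-endian uint16 weights
KEY_UNIT = struct.Struct(">I")
ELEMENT = struct.Struct(">HHH")


def pack_key(codepoints):
    """Encode a sequence of code points with a fixed byte order."""
    return b"".join(KEY_UNIT.pack(cp) for cp in codepoints)


def pack_elements(elements):
    """Encode (primary, secondary, tertiary) weight tuples."""
    return b"".join(ELEMENT.pack(p, s, t) for p, s, t in elements)


def unpack_elements(data):
    """Decode a value produced by pack_elements()."""
    return [ELEMENT.unpack_from(data, off)
            for off in range(0, len(data), ELEMENT.size)]


def lookup(db, codepoints):
    """Return the collation elements for a sequence, or None."""
    raw = db.get(pack_key(codepoints))
    return None if raw is None else unpack_elements(raw)


# A dict stands in for the on-disk database; the weights are made up.
db = {}
db[pack_key([0x0061])] = pack_elements([(0x1000, 0x0020, 0x0002)])
```

Because both keys and values are explicit byte strings, the same
records decode identically on little-endian and big-endian hosts.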
I was too tired to rewrite all the existing functions to support
collation databases; I hope to finish that tomorrow.
The normalization and canonicalization parts are already done;
collation itself also seems to be nearing completion, though there is
still much to be done.
I'd like to thank Pedro and especially Konrad Jankowski, who found the
strength to return to his project and lent a helping hand (and is
still helping right now).
There is still much to be done: since I got hooked on this part of
the work, I haven't been able to respond to all of Pedro's and
Konrad's mails over the past few days.
So the next goals are to rewrite the collation algorithms again so
that they work, and to begin testing.

[0] https://wiki.freebsd.org/SummerOfCode2014/Unicode
[1] https://socsvn.freebsd.org/socsvn/soc2014/ghostmansd

-- 
With best regards,
Dmitry Selyutin

