Tracking CLDR version in collation definitions
munro at ip9.org
Thu Sep 6 13:55:14 UTC 2018
Hello FreeBSD hackers,
A problem that PostgreSQL users (and probably users of other
database-like systems) occasionally run into is that collation
definitions change and on-disk indexes become corrupted, because the
order the entries were stored in no longer matches the order the new
definitions produce. This was one motivation for PostgreSQL to adopt
optional support for ICU, and to track ucol_getVersion() and detect
when it changes so that the user can be warned that dependent indexes
need to be rebuilt. However, for various reasons many users prefer to
use the OS collation support, which remains the default, and
PostgreSQL supports both ways.
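To make the "track the version and detect when it changes" idea
concrete, here is a minimal sketch of the comparison a database could
run when it opens an index. It is not PostgreSQL's code; the names
sketch_check_version and the SKETCH_* constants are made up for
illustration. The caller is assumed to supply both strings: one
recorded when the index was built, one reported by the collation
provider now (in the ICU case, something derived from
ucol_getVersion()).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; nothing here is an existing API. */
enum sketch_version_check {
    SKETCH_VERSION_SAME,    /* versions match, index can be trusted */
    SKETCH_VERSION_CHANGED, /* versions differ, warn and suggest a rebuild */
    SKETCH_VERSION_UNKNOWN  /* provider gave no version, nothing to compare */
};

/*
 * Compare the version recorded at index build time with the version
 * the collation provider reports now. A NULL or empty string on
 * either side is treated as "no version available".
 */
enum sketch_version_check
sketch_check_version(const char *recorded, const char *current)
{
    if (recorded == NULL || current == NULL ||
        recorded[0] == '\0' || current[0] == '\0')
        return SKETCH_VERSION_UNKNOWN;
    return strcmp(recorded, current) == 0 ?
        SKETCH_VERSION_SAME : SKETCH_VERSION_CHANGED;
}
```

The version is treated as an opaque string here: any difference at all
is reported as a change, with no attempt to decide which one is newer.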
I'd like to be able to track collation definition versions for libc
collations too. There doesn't currently seem to be a good way to do
that. Am I missing something?
Here's the idea I had:
1. Add a new option -V to localedef(1) so that an arbitrary version
string can be stored in some spare space in the header of LC_COLLATE
2. Add a new libc function: const char *querylocaleversion(int mask,
3. Modify the perl scripts under tools/tools/locale/tools/... to
invoke localedef(1) either with a version set by the maintainer in
unicode.conf (e.g. "30.0.3"), or perhaps extracted from CLDR data files
I've attached a proof-of-concept patch which has a very rough
implementation of steps 1 and 2. It probably needs better bounds
checking, more thought about how to report lack of version string (""
or NULL?), and other details. Before doing any further work on that I
thought I'd check if people think the idea has legs, or know of an
existing way to get this information.
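For discussion, the shape of steps 1 and 2 can be sketched roughly as
follows. This is not the attached patch and not the real LC_COLLATE
layout: the struct, the field size and all sketch_* names are
assumptions made up for illustration. It shows a fixed-size version
field in a header, a bounds-checked setter (what a localedef -V option
might do), and a getter that reports a missing version as NULL, which
is only one of the two options mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical size of the spare header space. */
#define SKETCH_VERSION_LEN 16

/* Hypothetical header; the real LC_COLLATE header looks different. */
struct sketch_collate_header {
    char magic[8];
    char version[SKETCH_VERSION_LEN]; /* NUL-padded, all zero = unset */
};

/*
 * Store a version string in the header. Returns 0 on success, or -1
 * (leaving the header untouched) if the string plus its terminating
 * NUL does not fit.
 */
int
sketch_set_version(struct sketch_collate_header *h, const char *v)
{
    size_t len = strlen(v);

    if (len >= SKETCH_VERSION_LEN)
        return -1;
    memset(h->version, 0, sizeof(h->version));
    memcpy(h->version, v, len);
    return 0;
}

/*
 * Return the stored version, or NULL if none was set or the field is
 * not NUL-terminated (e.g. a damaged file).
 */
const char *
sketch_get_version(const struct sketch_collate_header *h)
{
    if (h->version[0] == '\0')
        return NULL;
    if (memchr(h->version, '\0', sizeof(h->version)) == NULL)
        return NULL;
    return h->version;
}
```

Returning "" instead of NULL would only change the first check in
sketch_get_version(); callers would then test for an empty string.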
I also considered less invasive approaches to detect collation
changes: using a checksum (i.e. the program needs to know how to find the
LC_COLLATE files), or using the FreeBSD version on the basis that
collations should only change when the base system is upgraded
(generating false positives). I don't really like those approaches
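For comparison, the checksum approach could look something like the
sketch below. It hashes a file with 64-bit FNV-1a, which is just one
convenient non-cryptographic choice and not something proposed above;
the sketch_* names are made up, and the path to the LC_COLLATE file is
assumed to be supplied by the caller, which is exactly the part the
program would have to know how to find.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_FNV_OFFSET UINT64_C(0xcbf29ce484222325)
#define SKETCH_FNV_PRIME  UINT64_C(0x100000001b3)

/* Fold len bytes into a running 64-bit FNV-1a hash. */
uint64_t
sketch_fnv1a(uint64_t hash, const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= buf[i];
        hash *= SKETCH_FNV_PRIME;
    }
    return hash;
}

/*
 * Hash the whole file at path. Returns 0 and stores the hash in *out
 * on success, or -1 if the file cannot be opened or read.
 */
int
sketch_checksum_file(const char *path, uint64_t *out)
{
    unsigned char buf[4096];
    uint64_t hash = SKETCH_FNV_OFFSET;
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
        hash = sketch_fnv1a(hash, buf, n);
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    *out = hash;
    return 0;
}
```

A program would record the hash when an index is built and compare it
later. Any byte-level change to the file counts as a new version, even
one that does not affect sort order.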
I'd be grateful for any feedback, flames etc.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 12511 bytes
Desc: not available