svn commit: r339313 - in head: share/ctypedef tools/tools/locale tools/tools/locale/etc

Yuri Pankov yuripv at FreeBSD.org
Thu Oct 11 18:30:14 UTC 2018


Author: yuripv
Date: Thu Oct 11 18:30:12 2018
New Revision: 339313
URL: https://svnweb.freebsd.org/changeset/base/339313

Log:
  Restore some of the ctype definitions reported in the PR from pre-CLDR
  data, namely the 0xE000-0xF8FF private use area and the 0xFF00-0xFFFF
  half- and fullwidth punctuation.
  
  While here, update tools/tools/locale/README based on my experience
  rebuilding the locale data.
  
  PR:		225692
  Reviewed by:	bapt, cem (previous version)
  Approved by:	re (gjb), kib (mentor)
  Differential Revision:	https://reviews.freebsd.org/D17471

Modified:
  head/share/ctypedef/en_US.UTF-8.src
  head/tools/tools/locale/README
  head/tools/tools/locale/etc/common.UTF-8.src
  head/tools/tools/locale/etc/manual-input.UTF-8

Modified: head/share/ctypedef/en_US.UTF-8.src
==============================================================================
--- head/share/ctypedef/en_US.UTF-8.src	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/share/ctypedef/en_US.UTF-8.src	Thu Oct 11 18:30:12 2018	(r339313)
@@ -6241,6 +6241,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -6277,6 +6283,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic

Modified: head/tools/tools/locale/README
==============================================================================
--- head/tools/tools/locale/README	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/README	Thu Oct 11 18:30:12 2018	(r339313)
@@ -2,23 +2,37 @@
 
 To generate the locales:
 
-Tools needed: java, perl, devel/p5-Tie-IxHash, converters/p5-Text-Iconv and
-textproc/p5-XML-Parser
+Tools needed:
+	java (openjdk >= 8)
+	perl
+	converters/p5-Text-Iconv
+	devel/p5-Tie-IxHash
+	textproc/p5-XML-Parser
 
-fetch cldr data from: http://cldr.unicode.org
-extract in a directory ~/unicode/cldr/v30.0.3 for example
-fetch unidata from http://www.unicode.org/Public/zipped/ (latest version)
-extract in a directory ~/unicode/UNIDATA/9.0.0 for example
+Fetch CLDR data from: http://unicode.org/Public/cldr/.  You need all of the
+core.zip, keyboards.zip, and tools.zip.
 
-Note that the prebuilt cldr tools are not working on freebsd, it needs to
-be rebuilt:
-cd $CLDRDIR/tools/java
-ant build
+Extract:
+	mkdir -p ~/unicode/cldr/v33.0
+	cd ~/unicode/cldr/v33.0
+	for f in core keyboards tools; do unzip ~/$f.zip; done
 
-either modify tools/tools/locales/etc/unicode.conf or export variables:
-CLDRDIR="~/unicode/cldr/v30.0.3"
-UNIDATADIR="~/unicode/UNIDATA/9.0.0"
+Fetch unidata (UCD.zip) from http://www.unicode.org/Public/zipped/latest.
 
-run:
-make POSIX
-make install
+Extract:
+	mkdir -p ~/unicode/UNIDATA/11.0.0
+	cd ~/unicode/UNIDATA/11.0.0
+	unzip ~/UCD.zip
+
+Either modify tools/tools/locale/etc/unicode.conf or export variables:
+	CLDRDIR=~/unicode/cldr/v33.0; export CLDRDIR
+	UNIDATADIR=~/unicode/UNIDATA/11.0.0; export UNIDATADIR
+
+Build the CLDR tools:
+	cd $CLDRDIR/tools/java
+	ant jar
+
+Run:
+	make POSIX
+	make
+	make install

Modified: head/tools/tools/locale/etc/common.UTF-8.src
==============================================================================
--- head/tools/tools/locale/etc/common.UTF-8.src	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/etc/common.UTF-8.src	Thu Oct 11 18:30:12 2018	(r339313)
@@ -6241,6 +6241,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -6277,6 +6283,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic

Modified: head/tools/tools/locale/etc/manual-input.UTF-8
==============================================================================
--- head/tools/tools/locale/etc/manual-input.UTF-8	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/etc/manual-input.UTF-8	Thu Oct 11 18:30:12 2018	(r339313)
@@ -877,6 +877,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -913,6 +919,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic
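
The restored punct entry names its ranges symbolically. The sketch below is not part of the commit; it resolves those endpoints to code points so the coverage inside 0xFF00-0xFFFF can be checked by eye. It assumes each symbol is the Unicode character name with spaces written as underscores, which holds for the twelve names used here but is not claimed for every symbol in the file. The helper names (code_point, resolve, in_ranges) are hypothetical.

```python
import unicodedata

# Range endpoints copied from the restored punct entry in the diff.
PUNCT_RANGES = [
    ("FULLWIDTH_EXCLAMATION_MARK", "FULLWIDTH_SOLIDUS"),
    ("FULLWIDTH_COLON", "FULLWIDTH_COMMERCIAL_AT"),
    ("FULLWIDTH_LEFT_SQUARE_BRACKET", "FULLWIDTH_GRAVE_ACCENT"),
    ("FULLWIDTH_LEFT_CURLY_BRACKET", "HALFWIDTH_KATAKANA_MIDDLE_DOT"),
    ("FULLWIDTH_CENT_SIGN", "FULLWIDTH_WON_SIGN"),
    ("HALFWIDTH_FORMS_LIGHT_VERTICAL", "HALFWIDTH_WHITE_CIRCLE"),
]


def code_point(symbol):
    # Assumption: the symbol is a Unicode character name with
    # underscores in place of spaces.
    return ord(unicodedata.lookup(symbol.replace("_", " ")))


def resolve(ranges):
    # Turn (first, last) symbol pairs into (first, last) code point pairs.
    return [(code_point(first), code_point(last)) for first, last in ranges]


def in_ranges(cp, resolved):
    # True if cp falls inside any of the inclusive ranges.
    return any(lo <= cp <= hi for lo, hi in resolved)


RESOLVED = resolve(PUNCT_RANGES)

if __name__ == "__main__":
    for lo, hi in RESOLVED:
        print(f"U+{lo:04X}..U+{hi:04X} ({hi - lo + 1} code points)")
```

The gaps between the ranges are the fullwidth digits and letters and the halfwidth kana, which the punct entry deliberately leaves out.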

