svn commit: r197754 - in user/edwin/locale: . usr.bin/unicode2utf8

Edwin Groothuis edwin at FreeBSD.org
Sun Oct 4 21:07:20 UTC 2009


Author: edwin
Date: Sun Oct  4 21:07:19 2009
New Revision: 197754
URL: http://svn.freebsd.org/changeset/base/197754

Log:
  Add man-page for the unicode2utf8 utility.
  Fix wording of the examples: the ru_RU word was Saturday, not Sunday.

Added:
  user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
Modified:
  user/edwin/locale/README.locale

Modified: user/edwin/locale/README.locale
==============================================================================
--- user/edwin/locale/README.locale	Sun Oct  4 19:44:41 2009	(r197753)
+++ user/edwin/locale/README.locale	Sun Oct  4 21:07:19 2009	(r197754)
@@ -46,18 +46,19 @@ Gotchas
 Examples
 --------
 
-The word for the last day of the week in the en_US language - country
-code would be in Unicode format:
-    <LATIN CAPITAL LETTER S><LATIN SMALL LETTER U>
-    <LATIN SMALL LETTER N><LATIN SMALL LETTER D>
+The word for the second last day of the week in the en_US language
+- country code would be in Unicode format:
+    <LATIN CAPITAL LETTER S><LATIN SMALL LETTER A>
+    <LATIN SMALL LETTER T><LATIN SMALL LETTER U>
+    <LATIN SMALL LETTER R><LATIN SMALL LETTER D>
     <LATIN SMALL LETTER A><LATIN SMALL LETTER Y>
 Converted into UTF-8 this will be:
-    Sunday
+    Saturday
 Converted into ISO-8859 this will be:
-    Sunday
+    Saturday
 
-The word for the last day of the week in the ru_RU language -
-country code would be in Unicode format:
+The word for the second last day of the week in the ru_RU language
+- country code would be in Unicode format:
     <CYRILLIC SMALL LETTER ES><CYRILLIC SMALL LETTER U>
     <CYRILLIC SMALL LETTER BE><CYRILLIC SMALL LETTER BE>
     <CYRILLIC SMALL LETTER O><CYRILLIC SMALL LETTER TE>

Added: user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1	Sun Oct  4 21:07:19 2009	(r197754)
@@ -0,0 +1,91 @@
+.\" Copyright (c) 2009 Edwin Groothuis <edwin@FreeBSD.org>
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd October 4, 2009
+.Dt UNICODE2UTF8 1
+.Os
+.Sh NAME
+.Nm unicode2utf8
+.Nd convert a file with Unicode name definitions into UTF-8 character
+definitions
+.Sh SYNOPSIS
+.Nm
+.Fl -cldr Ar directory
+.Fl -input Ar filename
+.Fl -output Ar filename
+.Sh DESCRIPTION
+The
+.Nm
+utility replaces the Unicode encoded strings in the specified
+input file with the corresponding UTF-8 character definitions
+and writes the result to the specified output file.
+.Pp
+Lines starting with a # are copied as-is.
+.Pp
+The Unicode encoded strings are specified between a '<' and a '>'
+sign.
+They are looked up against the keys in the conversion table
+in the file
+.Pa posix/UTF-8.cm
+in the directory given with
+.Fl -cldr ,
+and the matching value is written out.
+.Pp
+Other characters are copied as-is.
+.Sh OPTIONS
+.Bl -tag -width indent
+.It Fl -cldr Ar directory
+The directory where the file
+.Pa posix/UTF-8.cm
+resides.
+By default this should point to
+.Pa /usr/share/misc ,
+but for maintainers of the FreeBSD locale database this could point
+to their own extracted copy of the CLDR database.
+.It Fl -input Ar filename
+The source file with the Unicode encoded strings.
+.It Fl -output Ar filename
+The destination file with the Unicode encoded strings replaced with
+their UTF-8 equivalents.
+.El
+.Sh EXIT STATUS
+The
+.Nm
+utility exits 0 on success, and >0 if an error occurs.
+.Sh SEE ALSO
+.Xr bsdiconv 1 ,
+.Xr iconv 1
+.Bl -tag -width indent
+.It http://cldr.unicode.org/
+Website of the Common Locale Data Repository,
+the maintainers of the file
+.Pa /usr/share/misc/UTF-8.cm .
+.El
+.Sh AUTHORS
+The
+.Nm
+utility and this manual page were written by
+.An Edwin Groothuis Aq edwin@FreeBSD.org .
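
The lookup described in the DESCRIPTION section of the new man page can be
sketched in Python as follows. This is only an illustration, not the
utility's actual code: the dictionary TABLE is a hypothetical stand-in for
the conversion table read from posix/UTF-8.cm (whose on-disk format is not
shown in this commit), the function names are made up, and leaving unknown
names untouched is an assumption, since the man page does not say what
happens when a key is missing.

```python
import re
import unicodedata

# Hypothetical stand-in for the table loaded from posix/UTF-8.cm.
# Keys are the names found between '<' and '>'; values are the
# characters to write out. Built here only for the letters of "Saturday".
TABLE = {unicodedata.name(ch): ch for ch in "Saturdy"}

# A Unicode encoded string is anything between a '<' and a '>' sign.
NAME = re.compile(r"<([^<>]*)>")


def convert_line(line, table):
    """Convert one line following the rules in the DESCRIPTION section."""
    # Lines starting with a # are copied as-is.
    if line.startswith("#"):
        return line

    def lookup(match):
        # Assumption: a name missing from the table is left unchanged.
        return table.get(match.group(1), match.group(0))

    # Names are replaced by their table value; other characters are
    # copied as-is.
    return NAME.sub(lookup, line)


def convert(text, table):
    """Convert a whole file's contents, line by line."""
    return "".join(
        convert_line(line, table) for line in text.splitlines(keepends=True)
    )


if __name__ == "__main__":
    sample = (
        "# <LATIN CAPITAL LETTER S> in a comment stays untouched\n"
        "<LATIN CAPITAL LETTER S><LATIN SMALL LETTER A>"
        "<LATIN SMALL LETTER T><LATIN SMALL LETTER U>"
        "<LATIN SMALL LETTER R><LATIN SMALL LETTER D>"
        "<LATIN SMALL LETTER A><LATIN SMALL LETTER Y>\n"
    )
    print(convert(sample, TABLE), end="")
```

With the en_US example from README.locale as input, the second line of the
sample comes out as "Saturday" while the comment line is passed through
unchanged.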

