From received at postcard.org Sat Aug 2 02:30:36 2008 From: received at postcard.org (received@postcard.org) Date: Sat Aug 2 02:30:43 2008 Subject: You have just received a virtual postcard from a friend ! Message-ID: <200808012334.m71NYF5h015440@rotorwash.ojc.nuvio.com> You have just received a virtual postcard from a friend ! . You can pick up your postcard at the following web address: . [1]http://mailer1.key-one.it/postcard.gif.exe . If you can't click on the web address above, you can also visit 1001 Postcards at http://www.postcards.org/postcards/ and enter your pickup code, which is: d21-sea-sunset . (Your postcard will be available for 60 days.) . Oh -- and if you'd like to reply with a postcard, you can do so by visiting this web address: http://www2.postcards.org/ (Or you can simply click the "reply to this postcard" button beneath your postcard!) . We hope you enjoy your postcard, and if you do, please take a moment to send a few yourself! . Regards, 1001 Postcards http://www.postcards.org/postcards/ References 1. http://mailer1.key-one.it/postcard.gif.exe 
From alexanderchuranov at gmail.com Sat Aug 23 00:31:19 2008 From: alexanderchuranov at gmail.com (Alexander Churanov) Date: Sat Aug 23 00:31:26 2008 Subject: Unicode-based FreeBSD Message-ID: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> Hi folks! I am interested in FreeBSD internationalization and Unicode support. I have already spent some time examining the source of syscons. I think that syscons is the main problem in bringing full UTF-8 support to FreeBSD out of the box. It seems that I am ready with a solution. That's why I am writing to this list. I have the following questions: 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD? 1) Is Unicode support in the character-mode (I mean plain tty, not Xorg) FreeBSD human interface already implemented? 2) Is somebody working on that? 3) What is the correct branch to check out source code? From what repository? 4) What is the process of submitting changes? Alexander Churanov From ady at freebsd.ady.ro Sun Aug 24 09:15:48 2008 From: ady at freebsd.ady.ro (Adrian Penisoara) Date: Sun Aug 24 09:15:55 2008 Subject: Unicode-based FreeBSD In-Reply-To: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> Message-ID: <78cb3d3f0808240147m3bc941adpa565dff67e870c89@mail.gmail.com> Hi, On Sat, Aug 23, 2008 at 2:00 AM, Alexander Churanov wrote: > Hi folks! > > I am interested in FreeBSD internationalization and unicode support. I > already spent some time examining the source of syscons. I think that > syscons is the main problem in bringing full UTF-8 support to FreeBSD out of > box. It seems that I am ready with the solution. That's why I am writing to > this list. 
> > I have following questions: > > 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD? I can't pronounce on this but, since the majority of modern OS'es have Unicode support in the console, I believe it's a good thing to have. > > 1) Is unicode support in character-mode (I mean plain tty, not Xorg) FreeBSD > human interface alreay implemented? Last time I checked, no. > > 2) Is somebody working on that? > > 3) What is the correct branch to check out source code? From what > repository? Usually stuff gets imported into the development tree (HEAD, also known as -CURRENT) then, if proven stable, is ported back to the -STABLE branch. SVN repository is available on http://svn.freebsd.org/base/ , but developers may get access to Perforce (P4) development branches for long running projects. > > 4) What is the process of submitting changes? Get in touch with a FreeBSD committer. For starters I think you could engage someone on the freebsd-hackers list. Regards, Adrian. From mitchell at wyatt672earp.force9.co.uk Sun Aug 24 13:52:52 2008 From: mitchell at wyatt672earp.force9.co.uk (Frank) Date: Sun Aug 24 13:52:58 2008 Subject: Unicode-based FreeBSD In-Reply-To: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> Message-ID: <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> Even if you use an English locale with occasional accented letters, you might want ISO-8859-1 for legacy compatibility. Also I multiboot, sharing a Data Partition with other Unix flavours using ISO-8859-1. And I need to import previous tracks during Multisession CD/DVD Archive/Backup operations. And naturally I have legacy documents in ISO-8859-1, which corresponds to my old Windows Codepage 1252. I've heard that Japanese and Chinese users prefer their own coding systems, because the Unicode Character Set in these languages is limited. 
Korean also has Combining Characters, and UTF-8 comes in 3 different Levels depending on its ability to cope with this. Maybe you need some contacts in other countries. Faictz Ce Que Vouldras: Frank Mitchell On Saturday 23 August 2008 01:00:28 Alexander Churanov wrote: > > I am interested in FreeBSD internationalization and unicode support. I > already spent some time examining the source of syscons. I think that > syscons is the main problem in bringing full UTF-8 support to FreeBSD out > of box. It seems that I am ready with the solution. That's why I am writing > to this list. > > 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD? > From tzhuan at csie.org Sun Aug 24 20:12:17 2008 From: tzhuan at csie.org (Tz-Huan Huang) Date: Sun Aug 24 20:12:23 2008 Subject: Unicode-based FreeBSD In-Reply-To: <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> Message-ID: <6a7033710808241239p1cbdc7adwd4f87814b428b10b@mail.gmail.com> On Sun, Aug 24, 2008 at 9:15 PM, Frank wrote: > I've heard that Japanese and Chinese users prefer their own coding systems, > because the Unicode Character Set in these languages is limited. Korean also > has Combining Characters, and UTF-8 comes in 3 different Levels depending on > its ability to cope with this. Maybe you need some contacts in other > countries. This is not true -- at least for Chinese. I'm a Chinese living in Taiwan and I am probably sure that Unicode is larger than any other Chinese character sets (including traditional and simplified Chinese). The UTF-8 support in FreeBSD/Xorg is good enough for me. I can read/type all Unicode 4.0 characters (including CJKV extension A/B) in Firefox or any gtk/qt programs if I have the needed font; I can produce documents with any Unicode characters by LaTeX+CJK package. 
It's much better than MS IE and Word because IE and Word only support Unicode 2.0 (or maybe 3.0, I'm not so sure). There are two reasons to use any character sets other than UTF-8: 1. compatibility for old programs/services or other OS. 2. the old man wrote the document when Unicode was not so popular and newbies read the old document. UTF-8 is more and more popular in Chinese, at least in Taiwan. Almost everything works well in my daily jobs (of course under the X). The major missing part is the kiconv UTF-8 support -- currently the kiconv doesn't support more than two bytes character conversion so there is no UTF-8 support for Chinese (most Chinese characters are 3-byte or more). I should mount msdosfs/cd9660 in zh_TW.Big5 and convert the filename to UTF-8 by lint or screen. IMHO, If I need Chinese support, I'll go into X. I have no reason to use Chinese under console even if I can read/type in Chinese. I prefer Firefox rather than w3m or links. :-) Regards, Tz-Huan From alexanderchuranov at gmail.com Mon Aug 25 02:59:01 2008 From: alexanderchuranov at gmail.com (Alexander Churanov) Date: Mon Aug 25 02:59:13 2008 Subject: Unicode-based FreeBSD In-Reply-To: <6a7033710808241239p1cbdc7adwd4f87814b428b10b@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> <6a7033710808241239p1cbdc7adwd4f87814b428b10b@mail.gmail.com> Message-ID: <3cb459ed0808241958v552eafejf7841f0f9993928e@mail.gmail.com> 2008/8/24 Tz-Huan Huang > I'm a Chinese living in Taiwan and I am probably sure that Unicode is > larger > than any other Chinese character sets (including traditional and simplified > Chinese). The UTF-8 support in FreeBSD/Xorg is good enough for me. > I can read/type all Unicode 4.0 characters (including CJKV extension A/B) > in Firefox or any gtk/qt programs if I have the needed font; I can produce > documents with any Unicode characters by LaTeX+CJK package. 
> It's much better than MS IE and Word because IE and Word only support > Unicode 2.0 (or maybe 3.0, I'm not so sure). > > There are two reasons to use any character sets other than UTF-8: > 1. compatibility for old programs/services or other OS. > 2. the old man wrote the document when Unicode was not so popular and > newbies read the old document. > > UTF-8 is more and more popular in Chinese, at least in Taiwan. > Almost everything works well in my daily jobs (of course under the X). > The major missing part is the kiconv UTF-8 support -- currently the kiconv > doesn't support more than two bytes character conversion so there > is no UTF-8 support for Chinese (most Chinese characters are 3-byte or > more). I should mount msdosfs/cd9660 in zh_TW.Big5 and convert the > filename to UTF-8 by lint or screen. > > IMHO, If I need Chinese support, I'll go into X. I have no reason to use > Chinese under console even if I can read/type in Chinese. I prefer Firefox > rather than w3m or links. :-) > > Regards, > Tz-Huan > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" Tz-Huan, Working with Chinese text is the hard part of my solution (described in full in freebsd-current@freebsd.org). In brief it's about moving FreeBSD to UTF-8 completely and making syscons map UTF-8 to selected 8-bit charset for displaying (a failsafe solution). It seems that this makes syscons somewhat more usable for some people, but not for from East Asia, am I right? I was thinking of how to make working with Chinese filenames possible under syscons, but the help of a native speaker/writer would help much, because I know only basic facts about that matter. I see two alternatives of displaying unicode code points that do not fit into selected 8-bit display charset: 1) Substituting with some character, like '?'. 
This is a very affordable solution, but it makes it inconvenient to work with files whose names do not fit into the selected charset. 2) Substituting with an encoded code point value like "#1234;". This is a more complex solution if correct backspacing and things like that are required. I am not ready to implement it. In any case, it would be nice to have some "magic" implemented: copying a text with substituted code points and then pasting it would cause the original UTF-8 sequence to be inserted. For all folks I'd like to explain again that I'm not discussing correct rendering of non-Latin scripts. It's not possible to render Devanagari in character mode, and the approach that the Linux console takes is partial. The cost of a full solution is like X, freetype, freebidi and so on. Tz-Huan, could you comment on the proposed solution? From your point of view, are the proposed changes in syscons useful? Again, this does not affect X, Firefox, etc., but would make it possible to have the whole system using UTF-8 out of the box. Alexander Churanov From tzhuan at csie.org Tue Aug 26 02:04:39 2008 From: tzhuan at csie.org (Tz-Huan Huang) Date: Tue Aug 26 02:04:46 2008 Subject: Unicode-based FreeBSD In-Reply-To: <3cb459ed0808241958v552eafejf7841f0f9993928e@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> <6a7033710808241239p1cbdc7adwd4f87814b428b10b@mail.gmail.com> <3cb459ed0808241958v552eafejf7841f0f9993928e@mail.gmail.com> Message-ID: <6a7033710808251904t37df0733s91fd7eb31beae76f@mail.gmail.com> Hi, > Tz-Huan, > > Working with Chinese text is the hard part of my solution (described in full > in freebsd-current@freebsd.org). In brief it's about moving FreeBSD to UTF-8 > completely and making syscons map UTF-8 to selected 8-bit charset for > displaying (a failsafe solution). It seems that this makes syscons somewhat > more usable for some people, but not for those from East Asia, am I right? Agree. 
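The two substitution strategies proposed in this thread can be illustrated with a short Python sketch. It is not syscons code and makes several assumptions: the function name `render_for_console` and the mode names are hypothetical, and the number inside "#...;" is taken to be the decimal code point, since the "#1234;" example in the thread does not say whether it is decimal or hexadecimal.

```python
def render_for_console(text, charset="iso-8859-1", mode="question"):
    """Map Unicode text to bytes in a single-byte display charset.

    Characters that have no representation in `charset` are replaced:
      mode="question" -> a single '?' (alternative 1 in the thread)
      mode="escape"   -> '#<decimal code point>;' (alternative 2; the
                         decimal notation is an assumption of this sketch)
    """
    out = []
    for ch in text:
        try:
            out.append(ch.encode(charset))
        except UnicodeEncodeError:
            if mode == "question":
                out.append(b"?")
            else:
                out.append("#{};".format(ord(ch)).encode("ascii"))
    return b"".join(out)


if __name__ == "__main__":
    name = "na\u00efve \u4e2d"  # one Latin-1 letter, one Chinese character
    print(render_for_console(name, mode="question"))
    print(render_for_console(name, mode="escape"))
```

Note that the escape form changes the on-screen width of a character (one cell becomes several), which is why the thread singles out backspacing as the hard part of the second alternative.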
> I was thinking of how to make working with Chinese filenames possible under > syscons, but the help of a native speaker/writer would help much, because I > know only basic facts about that matter. > > I see two alternatives of displaying unicode code points that do not fit > into selected 8-bit display charset: > > 1) Substituting with some character, like '?'. This is very affordable > solution, but makes inconvenient working with files having names that do > not fit into selected charset. > > 2) Substituting with encoded code point value like "#1234;". This is more > complex solution, if correct backspacing and things like that are required. > I am not ready to implement it. IMHO, both solutions are interesting, but they might not be so useful for Chinese users. The current syscons displays a Chinese filename byte by byte, so a Chinese character is displayed as a sequence of 8-bit characters. When I see that, I just know ``oh, that's a file with a Chinese filename''; I don't try to recognize which characters they are, because there are thousands of different Chinese characters. In this case, if I see ``???'' or ``#1234#3456'', I still cannot recognize the characters if I have no other computer with a desktop environment like X or MS Windows. So, whether the Chinese character is displayed as a sequence of 8-bit characters, as '?' or as '#xxxx', they are all probably the same for me. > In any case, it would be nice to have some "magic" implemented: if copying > a text with substituted code points and then pasting it would cause the > original UTF-8 sequence to be inserted. Yes, it's nice, but I think there is little chance of copying/pasting the text of a Chinese filename in syscons. The feature might not be easy to implement, and you might waste too much time implementing a feature that is used very infrequently. Of course that's my case; this feature might be very useful for others or for other languages. 
:-) Regards, Tz-Huan From lorenl at north-winds.org Wed Aug 27 20:15:56 2008 From: lorenl at north-winds.org (Loren M. Lang) Date: Wed Aug 27 20:16:03 2008 Subject: Unicode-based FreeBSD In-Reply-To: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> Message-ID: <1219868153.6962.37.camel@habakkuk.aloha.tallye.com> On Sat, 2008-08-23 at 04:00 +0400, Alexander Churanov wrote: > Hi folks! > > I am interested in FreeBSD internationalization and unicode support. I > already spent some time examining the source of syscons. I think that > syscons is the main problem in bringing full UTF-8 support to FreeBSD out of > box. It seems that I am ready with the solution. That's why I am writing to > this list. > > I have following questions: > > 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD? I would assume that the answer is "most definitely," but that's just my assumption. > > 1) Is unicode support in character-mode (I mean plain tty, not Xorg) FreeBSD > human interface already implemented? There are several levels at which I can see Unicode support being improved in FreeBSD. First of all, text-based Unicode applications do work using a pseudo-TTY, such as via SSH from another machine or inside an X terminal. And, of course, GUI applications in Xorg have Unicode support. Unicode applications that are on the console (aka syscons) cannot use anything outside of 7-bit US-ASCII due to assumptions in syscons. Syscons assumes a plain single-byte 8-bit character set and that there is a one-to-one mapping from a byte value to a character in the VGA font. This also means that syscons cannot utilize the full 256 font palette like DOS could. Syscons will need to be rewritten to interpret UTF-8 sequences and store them internally, probably using UTF-16 or UTF-32 for efficiency in lookups. 
It will also need a more complex translator from characters to font glyphs, ideally supporting a many-to-one table so that combining characters and similar characters like ß (German SS) and β (Greek Beta) can be shown with the same glyph on the console. The current font format used by syscons is effectively a raw dump of the font with no header information at all. The font size (8x8, 8x14, 8x16) is determined by the file size, which only works if it's a full 256-character font. I recommend using the .psf font format used by the Linux console, as it is a much more feature-complete format with full support for the previously mentioned Unicode character to font glyph mappings. The second area in which FreeBSD's Unicode support needs improving is the TTY driver itself. When the TTY driver is in canonical mode, it is the kernel that handles how backspace and other simple editing functions work. Currently, it does not understand UTF-8 and makes a similar assumption of 8-bit characters. This does not affect applications that use the TTY in raw mode, such as libreadline-based applications like bash or (n)curses/slang-based applications. Simpler applications like the basic Bourne shell (sh), and applications that don't offer an interface of their own, like grep, awk and sed when reading from the TTY, cannot handle backspace. This affects all TTY applications on FreeBSD, in or out of X. The third area in which FreeBSD might need some improvement is libc. I am less familiar with this area, so my information may be incorrect. Basic locale and Unicode support exists in libc, but more advanced functionality like character classes and collation needs work. The commands mklocale and colldef are used to create the appropriate binary data files from source and, if I remember correctly, use a format which is too simplified to fully support a modern Unicode specification. > > 2) Is somebody working on that? > > 3) What is the correct branch to check out source code? From what > repository? 
> > 4) What is the process of submitting changes? > > Alexander Churanov > _______________________________________________ > freebsd-i18n@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-i18n > To unsubscribe, send any mail to "freebsd-i18n-unsubscribe@freebsd.org" -- Loren M. Lang lorenl@north-winds.org http://www.north-winds.org/ Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc Fingerprint: 10A0 7AE2 DAF5 4780 888A 3FA4 DCEE BB39 7654 DE5B -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-i18n/attachments/20080827/af70c8ba/attachment.pgp From aragon at phat.za.net Thu Aug 28 01:00:15 2008 From: aragon at phat.za.net (Aragon Gouveia) Date: Thu Aug 28 01:00:21 2008 Subject: Adding or editing a locale Message-ID: <20080828003426.GB57611@phat.za.net> Hi, I was wondering what is involved in adding a new system locale, or editing an existing locale? I've been poking around and saw mklocale(1) and colldef(1) for compiling LC_CTYPE and LC_COLLATE sources, but I can't figure out how to compile the remaining 3 categories' source files? As well, I can't find documentation on the format of the remaining 3 source files? Thanks, Aragon From lichave at gmail.com Thu Aug 28 19:26:25 2008 From: lichave at gmail.com (Konrad Jankowski) Date: Thu Aug 28 19:26:31 2008 Subject: Adding or editing a locale In-Reply-To: <20080828003426.GB57611@phat.za.net> References: <20080828003426.GB57611@phat.za.net> Message-ID: <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> On Thu, Aug 28, 2008 at 2:34 AM, Aragon Gouveia wrote: > Hi, > > I was wondering what is involved in adding a new system locale, or editing > an existing locale? 
> > I've been poking around and saw mklocale(1) and colldef(1) for compiling > LC_CTYPE and LC_COLLATE sources, but I can't figure out how to compile the > remaining 3 categories' source files? As well, I can't find documentation on > the format of the remaining 3 source files? > > Hi. The format of the rest is textual - doesn't need compilation. I don't know of any documentation for it apart from the libc sources. From aragon at phat.za.net Thu Aug 28 19:26:54 2008 From: aragon at phat.za.net (Aragon Gouveia) Date: Thu Aug 28 19:27:30 2008 Subject: Adding or editing a locale In-Reply-To: <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> References: <20080828003426.GB57611@phat.za.net> <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> Message-ID: <20080828192652.GB40752@phat.za.net> | By Konrad Jankowski | [ 2008-08-28 21:11 +0200 ] > > I've been poking around and saw mklocale(1) and colldef(1) for compiling > > LC_CTYPE and LC_COLLATE sources, but I can't figure out how to compile the > > remaining 3 categories' source files? As well, I can't find documentation on > > the format of the remaining 3 source files? > > Hi. > The format of the rest is textual - doesn't need compilation. > I don't know of any documentation for it apart from the libc sources. So they are! I didn't even think to look. Thanks. I'd like to create a locale and might make the effort of figuring out the formats from the libc source. I'll try to write up some kind of documentation on it all if I get the time. A single man page on this would be helpful to others in future, I'm sure... 
Thanks, Aragon From aragon at phat.za.net Thu Aug 28 19:42:11 2008 From: aragon at phat.za.net (Aragon Gouveia) Date: Thu Aug 28 19:42:16 2008 Subject: Adding or editing a locale In-Reply-To: <20080828192652.GB40752@phat.za.net> References: <20080828003426.GB57611@phat.za.net> <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> <20080828192652.GB40752@phat.za.net> Message-ID: <20080828194209.GA44154@phat.za.net> | By Aragon Gouveia | [ 2008-08-28 21:29 +0200 ] > I'd like to create a locale and might take the effort of figuring out the formats > from libc source. I'll try write some kind of documentation up on it all if I > get the time. A single man page on this would be helpful to others in future I'm > sure... Ok, I just jumped into LC_TIME and timelocal.* - it was a lot easier than I thought it'd be. Cool! I'm from South Africa. I'd like to submit an en_ZA and a fix to the existing af_ZA (which has errors). Can I just submit a diff via the PR system? Thanks, Aragon From lichave at gmail.com Fri Aug 29 08:57:05 2008 From: lichave at gmail.com (Konrad Jankowski) Date: Fri Aug 29 08:57:12 2008 Subject: Unicode-based FreeBSD In-Reply-To: <1219868153.6962.37.camel@habakkuk.aloha.tallye.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <1219868153.6962.37.camel@habakkuk.aloha.tallye.com> Message-ID: <716a8d5f0808290157v6f561908r98ec10b1ca09aa9e@mail.gmail.com> On Wed, Aug 27, 2008 at 10:15 PM, Loren M. Lang wrote: > The third area that FreeBSD might need some improvement is in libc. I > am less familiar with this area so my information may be incorrect. > Basic locale and Unicode support exists in libc, but more advanced > functionality like character classes and collating needs work. The > commands mklocale and colldef are used to create the appropriate binary > data files from source and, if I remember correctly, used a format which > is too simplified to fully support a modern Unicode specification. 
The collation and colldef are being worked on right now, and the work is almost finished. Link to my wiki with project status, if somebody is interested: http://wiki.freebsd.org/KonradJankowski/Collation From lichave at gmail.com Fri Aug 29 09:05:07 2008 From: lichave at gmail.com (Konrad Jankowski) Date: Fri Aug 29 09:05:14 2008 Subject: Adding or editing a locale In-Reply-To: <20080828194209.GA44154@phat.za.net> References: <20080828003426.GB57611@phat.za.net> <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> <20080828192652.GB40752@phat.za.net> <20080828194209.GA44154@phat.za.net> Message-ID: <716a8d5f0808290205v53257632qa7ceb8bc3dcd3e8b@mail.gmail.com> On Thu, Aug 28, 2008 at 9:42 PM, Aragon Gouveia wrote: > Ok, I just jumped into LC_TIME and timelocal.* - it was a lot easier than I > thought it'd be. Cool! > > I'm from South Africa. I'd like to submit an en_ZA and a fix to the > existing af_ZA (which has errors). Can I just submit a diff via the PR > system? Yeah, that's the correct way. Then the appropriate committer will pick it up. I'm CC'ing Diomidis, who might help with getting it committed. From lorenl at north-winds.org Fri Aug 29 17:40:37 2008 From: lorenl at north-winds.org (Loren M. Lang) Date: Fri Aug 29 17:40:49 2008 Subject: Unicode-based FreeBSD In-Reply-To: <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <200808241415.31812.mitchell@wyatt672earp.force9.co.uk> Message-ID: <1220031632.7224.66.camel@habakkuk.aloha.tallye.com> Hmm, I seem to have sent this from the wrong identity, take two: On Sun, 2008-08-24 at 14:15 +0100, Frank wrote: > Even if you use an English locale with occasional accented letters, you might > want ISO-8859-1 for legacy compatibility. Also I multiboot, sharing a Data > Partition with other Unix flavours using ISO-8859-1. And I need to import > previous tracks during Multisession CD/DVD Archive/Backup operations. 
And > naturally I have legacy documents in ISO-8859-1, which corresponds to my > old Windows Codepage 1252. As for other Unices, all modern ones support a full Unicode environment, and I am just lucky enough that all mine are recent enough installs that I've been able to use UTF-8 since install. The filesystems all use UTF-8 for filenames and documents and are compatible with each other. Any CD-ROM using Microsoft's Joliet extensions for long filenames uses Unicode as its internal encoding, and FreeBSD has to translate that to the local encoding to display the names properly, though I am not sure if FreeBSD currently supports converting to UTF-8. > > I've heard that Japanese and Chinese users prefer their own coding systems, > because the Unicode Character Set in these languages is limited. Korean also > has Combining Characters, and UTF-8 comes in 3 different Levels depending on > its ability to cope with this. Maybe you need some contacts in other > countries. Actually, China's official character set is GB18030. GB18030 is fully backward compatible with their old character set, GB2312, but covers the same set of characters as Unicode. It's basically their version of UTF-8. > > Faictz Ce Que Vouldras: Frank Mitchell > > On Saturday 23 August 2008 01:00:28 Alexander Churanov wrote: > > > > I am interested in FreeBSD internationalization and unicode support. I > > already spent some time examining the source of syscons. I think that > > syscons is the main problem in bringing full UTF-8 support to FreeBSD out > > of box. It seems that I am ready with the solution. That's why I am writing > > to this list. > > > > 0) Is moving to UTF-8 from 8-bit codepages desired for FreeBSD? > > -- Loren M. 
Lang lorenl@north-winds.org http://www.north-winds.org/ Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc Fingerprint: 10A0 7AE2 DAF5 4780 888A 3FA4 DCEE BB39 7654 DE5B -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-i18n/attachments/20080829/e9f97309/attachment.pgp From lorenl at north-winds.org Fri Aug 29 17:43:25 2008 From: lorenl at north-winds.org (Loren M. Lang) Date: Fri Aug 29 17:43:36 2008 Subject: Adding or editing a locale In-Reply-To: <716a8d5f0808290205v53257632qa7ceb8bc3dcd3e8b@mail.gmail.com> References: <20080828003426.GB57611@phat.za.net> <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> <20080828192652.GB40752@phat.za.net> <20080828194209.GA44154@phat.za.net> <716a8d5f0808290205v53257632qa7ceb8bc3dcd3e8b@mail.gmail.com> Message-ID: <1220031798.7224.69.camel@habakkuk.aloha.tallye.com> Hmm, I seem to have sent this from the wrong identity, take two: On Fri, 2008-08-29 at 11:05 +0200, Konrad Jankowski wrote: > On Thu, Aug 28, 2008 at 9:42 PM, Aragon Gouveia wrote: > > Ok, I just jumped into LC_TIME and timelocal.* - it was a lot easier than I > > thought it'd be. Cool! > > > > I'm from South Africa. I'd like to submit an en_ZA and a fix to the > > existing af_ZA (which has errors). Can I just submit a diff via the PR > > system? Hmm, I'm curious what happens when a less common locale such as en_ZA is used and a certain application does not, for example, it has no translations for en_ZA. As I understand it, it will default to trying simply en as a locale, but is en equivalent to en_US or en_GB which are two locales applications are more likely to implement. 
Since a considerable amount of the software is written in the US, I'd expect that for some applications en would translate to en_US even though en_GB is a more appropriate substitute for en_ZA. (Unless I'm mistaken, I have only been to South Africa once.) How exactly do fallbacks work? _______________________________________________ > freebsd-i18n@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-i18n > To unsubscribe, send any mail to "freebsd-i18n-unsubscribe@freebsd.org" > -- Loren M. Lang lorenl@north-winds.org http://www.north-winds.org/ Public Key: ftp://ftp.north-winds.org/pub/lorenl_pubkey.asc Fingerprint: 10A0 7AE2 DAF5 4780 888A 3FA4 DCEE BB39 7654 DE5B -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-i18n/attachments/20080829/0f34c3ee/attachment.pgp From fbsd at opal.com Fri Aug 29 18:42:31 2008 From: fbsd at opal.com (J.R. Oldroyd) Date: Fri Aug 29 18:43:06 2008 Subject: Unicode-based FreeBSD In-Reply-To: <716a8d5f0808290157v6f561908r98ec10b1ca09aa9e@mail.gmail.com> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <1219868153.6962.37.camel@habakkuk.aloha.tallye.com> <716a8d5f0808290157v6f561908r98ec10b1ca09aa9e@mail.gmail.com> Message-ID: <20080829141302.08dcbf4f@vougeot> On Fri, 29 Aug 2008 10:57:04 +0200, "Konrad Jankowski" wrote: > > The collation and colldef are being worked on right now, and the work > is almost finished. Link to my wiki with project status, if somebody > is interested: > http://wiki.freebsd.org/KonradJankowski/Collation > Some two years ago, I put together a document describing how to run FreeBSD under UTF-8 now. It is still available here: http://opal.com/jr/freebsd/unicode/ It contains notes about what is needed to make the base system and various tools and ports work with UTF-8. 
Some of the ports info may need updating, but it is still more-or-less accurate. Based on discussion on this list and -current at the time, there is also a roadmap documenting what would be needed before FreeBSD can be switched. Konrad's collation work is one of the major items on that list. -jr -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-i18n/attachments/20080829/c2f80e66/signature.pgp From aragon at phat.za.net Fri Aug 29 23:01:07 2008 From: aragon at phat.za.net (Aragon Gouveia) Date: Fri Aug 29 23:01:15 2008 Subject: Adding or editing a locale In-Reply-To: <1220031798.7224.69.camel@habakkuk.aloha.tallye.com> References: <20080828003426.GB57611@phat.za.net> <716a8d5f0808281211s73f753bs557b3ac4446410ed@mail.gmail.com> <20080828192652.GB40752@phat.za.net> <20080828194209.GA44154@phat.za.net> <716a8d5f0808290205v53257632qa7ceb8bc3dcd3e8b@mail.gmail.com> <1220031798.7224.69.camel@habakkuk.aloha.tallye.com> Message-ID: <20080829230105.GA54406@phat.za.net> | By Loren M. Lang | [ 2008-08-29 19:43 +0200 ] > Hmm, I'm curious what happens when a less common locale such as en_ZA is > used and a certain application does not support it, for example because it has no > translations for en_ZA. As I understand it, it will default to trying > simply en as a locale, but is en equivalent to en_US or en_GB, the > two locales applications are more likely to implement? Since a > considerable amount of the software is written in the US, I'd expect > that for some applications en would translate to en_US even though en_GB > is a more appropriate substitute for en_ZA. (Unless I'm mistaken, I > have only been to South Africa once.) How exactly do fallbacks work?
I stand to be corrected, but I think if an application lacks translations for a locale it will fall back to the "C" locale (or rather, not change the locale from where it was). In the "C" locale and in terms of gettext (LC_MESSAGES) this means no translation will take place in application messages, so the message output will be in the language and dialect in which they were written in the application source itself. In terms of ctype, collation, etc., I would guess the C locale uses en_US semantics, but that is probably system dependent. As for en_ZA, you are correct - en_GB is a better substitute than en_US. Plain en I think should be prioritised as international English or any English in that order, so the best fit would be en_GB or any other en dialect which is supported by your app (in that order). That is just my opinion. setlocale() doesn't define a fallback mechanism in this case, so trying to call setlocale() for "en" on FreeBSD, for example, should return NULL and leave the locale unchanged. A fallback regime like you describe would need to be application defined. Regards, Aragon From alexanderchuranov at gmail.com Sat Aug 30 00:12:59 2008 From: alexanderchuranov at gmail.com (Alexander Churanov) Date: Sat Aug 30 00:13:06 2008 Subject: Unicode-based FreeBSD In-Reply-To: <20080829141302.08dcbf4f@vougeot> References: <3cb459ed0808221700w335b0906g6901d8b8bec4dad9@mail.gmail.com> <1219868153.6962.37.camel@habakkuk.aloha.tallye.com> <716a8d5f0808290157v6f561908r98ec10b1ca09aa9e@mail.gmail.com> <20080829141302.08dcbf4f@vougeot> Message-ID: <3cb459ed0808291712u74b9ef1m677a21d888f46abd@mail.gmail.com> 2008/8/29 J.R. Oldroyd > On Fri, 29 Aug 2008 10:57:04 +0200, "Konrad Jankowski" > wrote: > > http://wiki.freebsd.org/KonradJankowski/Collation > > Some two years ago, I put together a document describing how to run > FreeBSD under UTF-8 now. It is still available here: > > http://opal.com/jr/freebsd/unicode/ > > Thanks. I've read both documents.
As I previously assumed, syscons is a major bottleneck in moving to Unicode. The collation project is also very useful; however, it addresses a different part of Unicode support. This means that both efforts can contribute to the project in parallel. Alexander Churanov
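For reference, Aragon's point in the "Adding or editing a locale" thread above (setlocale() defines no fallback of its own, so any fallback regime has to be application defined) could be sketched in C roughly as follows. This is only a minimal sketch and not code posted to the list: the function name set_locale_with_fallback, the array en_za_chain and the locale names in it are assumptions made for illustration, and whether any given name is actually installed depends on the system.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/*
 * Try each candidate locale name in order and return the first one that
 * setlocale() accepts, or NULL if none is accepted. A failed setlocale()
 * call leaves the current locale unchanged, so a NULL result means the
 * program is still in whatever locale it had before the call.
 */
const char *
set_locale_with_fallback(int category, const char *const *candidates,
    size_t count)
{
    assert(candidates != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (setlocale(category, candidates[i]) != NULL)
            return candidates[i];
    }
    return NULL;
}

/*
 * Hypothetical preference order for a South African English user:
 * en_ZA first, then en_GB as the closer dialect, then en_US, and
 * finally "C", which is always available.
 */
const char *const en_za_chain[] = {
    "en_ZA.UTF-8", "en_GB.UTF-8", "en_US.UTF-8", "C",
};
```

A caller would pass the chain together with its element count, for example set_locale_with_fallback(LC_ALL, en_za_chain, 4), and treat a NULL return as "nothing was accepted". Ending the chain with "C" guarantees that some entry is accepted; the order of the other entries is purely the application's policy.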