Re: Grep with non-ascii

From: Tomoaki AOKI <junchoon_at_dec.sakura.ne.jp>
Date: Fri, 03 Feb 2023 11:39:48 UTC
On Fri, 3 Feb 2023 11:06:42 +0100
Eivind Nicolay Evensen <eivinde@terraplane.org> wrote:

> Hello.
> 
> I just noticed this today:
> 
> elg!ene[~]> printf "bø\nhei\nøl\n" | grep ø
> grep: trailing backslash (\)
> elg!ene[~]> echo $LC_CTYPE $LANG
> nb_NO.ISO8859-1 nb_NO.ISO8859-1
> 
> While I have the result I envisioned with gnugrep:
> 
> elg!ene[~]> printf "bø\nhei\nøl\n" | ggrep ø
> bø
> øl
> 
> Also, on OpenIndiana, linux and Netbsd, grep gives the proper result.
> 
> Is lib/libc/regex the right place to look into this if I
> find the time, or does anybody know this enough to know the
> problem?
> 
> Regards
> -- 
> Eivind Nicolay Evensen

This is possibly a locale problem, or it may depend on which command-line
shell you are using.

I tried copying and pasting your command into my command line and got the
result below.

% printf "bø\nhei\nøl\n" | grep ø
bø
øl

I'm using LC_ALL=ja_JP.UTF-8 and LANG=ja_JP.UTF-8 as the locale, and
shells/zsh as the command-line shell.

What happens if you switch the locale to nb_NO.UTF-8?
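
As a rough sketch of how to check this: ø is a single byte (0xF8) in
ISO8859-1 but two bytes (0xC3 0xB8) in UTF-8, so it may help to look at
the bytes your shell actually hands to grep. The hexbytes helper is only a
name I made up for illustration, and the last command assumes that the
nb_NO.UTF-8 locale is installed (check with locale -a).

```sh
# hexbytes (hypothetical helper): print the bytes of its argument as hex,
# to see how a typed or pasted character was really encoded.
hexbytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hexbytes "$(printf '\370')"       # ø as ISO8859-1, prints f8
hexbytes "$(printf '\303\270')"   # ø as UTF-8, prints c3b8

# Retry the original test with UTF-8 bytes under a UTF-8 locale, for this
# one pipeline only. Octal escapes are used so that the result does not
# depend on the terminal's own encoding.
# Assumption: nb_NO.UTF-8 is available on the system.
printf 'b\303\270\nhei\n\303\270l\n' |
    LC_ALL=nb_NO.UTF-8 grep "$(printf '\303\270')"
```

Running hexbytes ø with the character typed directly should show which of
the two encodings your terminal is sending.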

-- 
Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>