grep and anchoring
Eir Nym
eirnym at gmail.com
Sun Jun 26 14:41:52 UTC 2016
> On 26 Jun 2016, at 16:34, Polytropon <freebsd at edvax.de> wrote:
>
> On Sun, 26 Jun 2016 15:10:57 +0200, Daniël de Kok wrote:
>> Dear all,
>>
>> After a BSD hiatus of many years, I am tinkering with FreeBSD again.
>> I’ve run into some strange issue with grep and beginning of line (^)
>> anchoring:
>>
>> —
>> % echo "1234 1234 1234" | egrep -o '^….'
>> 1234
>>  123
>> 4 12
>> % echo "123412341234" | egrep -o '^....'
>> 1234
>> 1234
>> 1234
>> —
>>
>> Any idea what is going on here?
>
> I think what you see here is a typical "UTF-8 fsck-up".
> The first search pattern contains an ellipsis ("…",
> a single character, 3 bytes long in UTF-8, standing in
> for 3 dots), and a single dot (".", one byte long,
> 1 character); the second pattern contains four dots
> (4 x ".", each 1 byte long, 1 character).
> Of course grep interprets "…" and "..." differently.
> In my mailer, I can see the difference clearly as the
> ellipsis … is displayed in monospace font as a _one_
> character wide symbol on the screen.
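A quick way to check the byte counts from a shell, as a minimal sketch. It assumes a POSIX printf, wc, tr and od; the variable names are made up for illustration. The ellipsis is written as the three octal bytes of its UTF-8 encoding (U+2026), so the check does not depend on how the mail client or terminal renders the character.

```sh
# U+2026 HORIZONTAL ELLIPSIS, spelled out as its UTF-8 bytes in octal.
# wc -c counts bytes; tr strips the padding some wc implementations add.
ellipsis_bytes=$(printf '\342\200\246' | wc -c | tr -d ' ')

# Three ordinary ASCII dots, one byte each.
dots_bytes=$(printf '...' | wc -c | tr -d ' ')

echo "ellipsis: $ellipsis_bytes bytes, three dots: $dots_bytes bytes"

# Hex dumps show the actual difference: e2 80 a6 versus 2e 2e 2e.
printf '\342\200\246' | od -An -tx1
printf '...' | od -An -tx1
```

Both forms are the same number of bytes, but the ellipsis is one character to a UTF-8 aware tool and is not the regex metacharacter ".", so it only matches a literal "…" in the input.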
>
I think this was automatic spell correction, and he meant four dot symbols (.), not a ‘…’ followed by a ‘.’
> Or is this just an "enrichment" your MUA added? :-)
>
> I'm quite sure you run into similar problems when you
> include ligatures (like st, ft, ffi, ck or the like)
> or one of the many different hyphens and spaces in a
> search pattern. :-)
>
> Otherwise, your example seems to show the expected
> behaviour.
>
> % echo "1234 1234 1234" | egrep -o '^....'
> 1234
>  123
> 4 12
>
> % echo "123412341234" | egrep -o '^....'
> 1234
> 1234
> 1234
>
> The first 4-character match is "1234", the next is " 123",
> and the last is "4 12" (each 4 characters wide, as the
> space character " " is also "any character" that matches
> the . pattern). In the second example, the groups match
> 4 characters each ("1234" x 3).
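The chunking described above can be reproduced without the anchor, as a small sketch. It assumes a grep that supports -E and -o (used here instead of egrep), and the variable name is made up for illustration. What the output should be once the ^ anchor is added is exactly the question of this thread, so the sketch leaves the anchor out on purpose.

```sh
# Without an anchor, -o prints each successive non-overlapping match
# of "any four characters"; the space counts as a character too.
chunks=$(printf '1234 1234 1234\n' | grep -Eo '....')

# Print each chunk in brackets so the leading space stays visible.
printf '%s\n' "$chunks" | while IFS= read -r chunk; do
    printf '[%s]\n' "$chunk"
done
```

The input is 14 characters long, so three 4-character chunks are printed and the trailing "34" is left over without a match.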
>
> What different results did you expect? Or am I misinterpreting
> your question?
>
>
> --
> Polytropon
> Magdeburg, Germany
> Happy FreeBSD user since 4.0
> Andra moi ennepe, Mousa, ...