ot: regular expression help

Matthew Seaman m.seaman at infracaninophile.co.uk
Tue Jul 7 14:49:42 UTC 2009


Aryeh M. Friedman wrote:
> I am attempting to make (without the Perl extensions) a regular 
> expression that when used as a delim will split words on any 
> punctuation/whitespace character *EXCEPT* "$" (for java people I want to 
> feed it into something like this):
> 
> for(String foo:input.split([insert regex here]))
>    ...

Well, there's no way to say "all foo except bar" using standard regexes, so
you can't use the [:punct:] character class. You'll have to roll your own
class.

If your input is ASCII then see ispunct(3) for a handy list of all the
ASCII punctuation characters.  I guess you'll need a RE something like this:

   []!"#%&'\(\)\*\+,\./:;<=>?@[\\^_`{\|}~-[:space:]]+

although that's completely untried, quite likely to not have all the
metacharacters properly escaped (exactly what is or isn't a metacharacter
depends on the RE implementation you're using) and is probably horribly
confused due to the inclusion of '[' '-' and ']' amongst the characters
matched in the range.  
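
For the Java case in the question, a hand-rolled class along these lines
might look like the sketch below. It assumes java.util.regex syntax (not
POSIX), where only '-', '[', '\' and ']' need a backslash inside the class,
and it uses \s in place of [:space:]. The class name HandRolledDelimiters
and the method words() are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class HandRolledDelimiters {
    // ASCII punctuation as listed in ispunct(3), minus '$', plus whitespace (\s).
    // Escaped inside the class: '-', '[', '\' and ']'. Everything else is literal.
    static final Pattern DELIMS =
        Pattern.compile("[!\"#%&'()*+,\\-./:;<=>?@\\[\\\\\\]^_`{|}~\\s]+");

    // Split the input on runs of delimiter characters; '$' stays inside words.
    static String[] words(String input) {
        return DELIMS.split(input);
    }

    public static void main(String[] args) {
        // prints [cost, $5, or, foo$bar, maybe]
        System.out.println(Arrays.toString(words("cost: $5, or foo$bar (maybe)")));
    }
}
```

Note that if the input starts with a delimiter, split() returns an empty
string as the first element, so the caller may want to skip that.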

If you're using anything other than ASCII, then I suspect you're going
to have problems with RE libs anyhow, unless you can somehow use PCRE.  
Property escapes such as \p{isPunct} and \p{isWhite} (the exact names vary
between implementations) for matching Unicode punctuation or whitespace are
probably what you need.
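
As a rough illustration of the Unicode route, here is a sketch using
java.util.regex rather than PCRE, since the question is about Java. It
assumes the general-category escapes \p{P} (punctuation), \p{S} (symbols)
and \p{Z} (separators), plus Java's "&&[^...]" class intersection to
subtract '$'. \p{S} is included because Unicode files '$', '+', '<' and
friends under symbols, not punctuation. The class name UnicodeDelimiters
and the method words() are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class UnicodeDelimiters {
    // Union of punctuation, symbols, separators and ASCII whitespace,
    // intersected with "anything but '$'" (a java.util.regex extension).
    static final Pattern DELIMS =
        Pattern.compile("[\\p{P}\\p{S}\\p{Z}\\s&&[^$]]+");

    // Split the input on runs of delimiter characters; '$' stays inside words.
    static String[] words(String input) {
        return DELIMS.split(input);
    }

    public static void main(String[] args) {
        // Input contains a no-break space, a euro sign, an em dash and an ellipsis.
        String input = "prix\u00a0: 5\u00a0\u20ac \u2014 $total\u2026 fin";
        // prints [prix, 5, $total, fin]
        System.out.println(Arrays.toString(words(input)));
    }
}
```

Only '$' is exempted here; other currency signs are still treated as
delimiters.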

Even so, your best choice would probably be to separately check strings
for the presence of $ characters -- maybe transform those $ characters to
something else -- and then split on any remaining punctuation characters.
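
A minimal sketch of that transform-then-split idea in Java follows. It
assumes the input never contains the private-use character U+E000, which
is used here as the stand-in for '$', and it relies on Java's \p{Punct}
(ASCII punctuation) and \s. The names DollarPlaceholderSplit, PLACEHOLDER
and words() are invented for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class DollarPlaceholderSplit {
    // Assumption: this private-use character never occurs in real input.
    private static final char PLACEHOLDER = '\uE000';

    // All ASCII punctuation (including '$') plus whitespace.
    private static final Pattern DELIMS = Pattern.compile("[\\p{Punct}\\s]+");

    static String[] words(String input) {
        // 1. Hide every '$' behind the placeholder so the split ignores it.
        String masked = input.replace('$', PLACEHOLDER);
        // 2. Split on the remaining punctuation and whitespace.
        String[] parts = DELIMS.split(masked);
        // 3. Put the '$' characters back.
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].replace(PLACEHOLDER, '$');
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [a$b, c, d, $e]
        System.out.println(Arrays.toString(words("a$b, c;d $e")));
    }
}
```

The advantage is that the delimiter class stays the stock one, at the cost
of two extra passes over the string.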

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                  Kent, CT11 9PW
