Subject: Tokenize question: tokenize on words, spaces and punctuation
From: Martin Holmes <mholmes@xxxxxxx>
Date: Wed, 16 Mar 2011 17:33:43 -0700
Hi there,

This is really a question for XPath regex gurus:

I need to tokenize a string of text so that words, punctuation marks and spaces each come out as separate tokens. So from this:

Oh, what a great day!

I need to get:

('Oh', ',', ' ', 'what', ' ', 'a', ' ', 'great', ' ', 'day', '!')

I've been hacking away at this for a while, but regexps aren't my strong suit. Can anyone help?

Cheers,
Martin
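
For illustration only, here is a minimal Python sketch of one reading of the requirement above. It is not an XPath solution. It assumes a "word" is a run of \w characters, that every whitespace character is its own token, and that anything else counts as a single punctuation token. The names TOKEN and split_tokens are made up for this example. The pattern sticks to \w, \s and a negated character class, which XPath 2.0 regular expressions also accept, but it has not been tried in an XPath processor. Note that XPath's tokenize() treats matches as separators and drops them, so a port would more likely go through something like xsl:analyze-string.

```python
import re

# Assumption: a token is one of
#   - a run of word characters (letters, digits, underscore),
#   - a single whitespace character,
#   - a single character that is neither (treated here as punctuation).
TOKEN = re.compile(r"\w+|\s|[^\w\s]")

def split_tokens(text):
    """Return every token in order, keeping spaces and punctuation."""
    return TOKEN.findall(text)

print(split_tokens("Oh, what a great day!"))
# ['Oh', ',', ' ', 'what', ' ', 'a', ' ', 'great', ' ', 'day', '!']
```

Because the three alternatives together match any single character, joining the tokens back together reproduces the original string, which makes a handy sanity check.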