
Re: XSLT1.0 and wildcards

Subject: Re: XSLT1.0 and wildcards
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Tue, 03 Oct 2006 10:21:27 +0200
Pankaj Bishnoi wrote:
For example, the address line looks like this: melkweg 51a.

This means I have to map it like this: street = melkweg, number = 51,
extension = a
Are there such wildcards, and/or is there a better way to do this?

Hi Pankaj,


Yes, there's a better way. As a matter of fact, address parsing happens to be a specialized field of its own. Depending on your needs, there are numerous ways to approach it. Since you appear to live in Holland and the address exists in Amsterdam, consider the following address lines:

1) Melkweg 51 A
2) Plein 40-45 123-IV
3) 1ste J vd Heijdenstraat 12-hs

ad 1) this is a common extension
ad 2) street is "Plein 40-45", nr is "123", suffix (floor) is "IV"
ad 3) suffix 'hs' stands for 'huis', which means ground floor in Holland. Note the number in the street name.


Perhaps you had thought of all this already. International addresses pose even more challenges: the French and the English place their street numbers (as ordinals) at the start of the address line. I hope you won't have to deal with non-Western characters or Hebrew digits. Hopefully nobody entered the postal code or city name on the same line ;-)

(That's why postal companies offer products to normalize addresses to some well-known format. But beware: they offer about 95% matches, and the rest will still drop out.)

Now, a solution in XSLT 1 will be quite a challenge. I think you will have to pass the address line multiple times through the translate-filter that was proposed by Michael.

If you can resort to XSLT 2, or to a filter before processing, a regular expression can do the job. On the client side you may be able to use JavaScript with a regular expression; on the server side you may use Java/.NET/Perl/PHP with a regular expression to filter your data. The regular expression may look like this (needs tweaking):

^(.*) ([0-9]+)([ -]?([a-zA-Z]+))?$

$1 contains streetname
$2 contains number
$4 contains suffix (use $3 if you want it to include the space or hyphen)

With the + after [a-zA-Z], the regex works for the above three examples, including the multi-letter suffixes "IV" and "hs" (spaces are important in the regex).
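
As an illustration of the "filter before processing" route, here is a minimal Python sketch that applies a variant of this regex to the example lines. It assumes a + after the letter class so that multi-letter suffixes such as "IV" and "hs" match. The name split_address is hypothetical, and Python merely stands in for whichever server-side language you use; real address data will need more tweaking than this.

```python
import re

# Assumption: `+` after [a-zA-Z] so suffixes like "IV" and "hs" are captured.
ADDRESS_RE = re.compile(r"^(.*) ([0-9]+)([ -]?([a-zA-Z]+))?$")


def split_address(line):
    """Return (street, number, suffix), or None if the line does not match.

    Hypothetical helper for illustration; suffix is "" when absent.
    """
    match = ADDRESS_RE.match(line.strip())
    if match is None:
        return None
    # Group 3 is the suffix including its leading space or hyphen;
    # group 4 is the bare suffix.
    street, number, _with_separator, suffix = match.groups()
    return street, number, suffix or ""


if __name__ == "__main__":
    examples = [
        "melkweg 51a",
        "Melkweg 51 A",
        "Plein 40-45 123-IV",
        "1ste J vd Heijdenstraat 12-hs",
    ]
    for line in examples:
        print(line, "->", split_address(line))
```

Because (.*) is greedy, the street name swallows everything up to the last space that is followed by a number, which is what keeps "40-45" and "1ste" inside the street name. Lines that do not match return None, so they can be set aside for manual review.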

Cheers,
-- Abel Braaksma
