Pattern Matching

You can use regular expressions to specify match patterns for the rows in the input file. Only those rows in the input file that match the pattern you specify are output to XML when the file is converted. The simplest way to define a match pattern is to use the Match Pattern property in the Row Element Name section of the Properties window.

Example

Consider the following input file:

Make,Model,Year,Mileage
BMW,R1150RS,2004,14274
Kawasaki,GPz1100,1996,60234
Ducati,ST2,1997,24000
Moto Guzzi,LeMans,2001,12393
BMW,R1150R,2002,17439
Ducati,Monster,2000,15682
Aprilia,Futura,2001,17320

If you specify a simple regular expression, such as ^B, for the Match Pattern property, Stylus Studio displays the input file in the Convert to XML Editor as shown in Figure 188. Green check marks identify the rows that match the pattern, and red X's identify the rows that do not. (You can also display matching rows in a contrasting color by clicking the Highlight Matching Rows button. See Document Pane Display Features for more information about this feature.)

Figure 188. Match Pattern - Definition and Display

Note that the match pattern also appears as a new node in the schema pane. This new node, the only one defined for this adapter at this point, uses the default row element name (row) and the value of the expression.

Since the match pattern selects only those rows that begin with the letter B, the adapter creates the following XML document when it is run against the input file:

<?xml version="1.0" encoding="utf-8"?>
<root>
	<row>
		<Make>BMW</Make>
		<Model>R1150RS</Model>
		<Year>2004</Year>
		<Mileage>14274</Mileage>
	</row>
	<row>
		<Make>BMW</Make>
		<Model>R1150R</Model>
		<Year>2002</Year>
		<Mileage>17439</Mileage>
	</row>
</root>

See Working with Nodes to learn about adding nodes and the row element name/match pattern pairs that define them.
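
The filtering in this example can be approximated outside Stylus Studio. The following Python sketch is illustrative only and is not Stylus Studio's implementation: the function name convert is hypothetical, Python's re module stands in for the regular expression engine the product uses, and the sketch assumes that the first row supplies the element names and is not itself tested against the pattern.

```python
import csv
import re
import xml.etree.ElementTree as ET

INPUT = """\
Make,Model,Year,Mileage
BMW,R1150RS,2004,14274
Kawasaki,GPz1100,1996,60234
Ducati,ST2,1997,24000
Moto Guzzi,LeMans,2001,12393
BMW,R1150R,2002,17439
Ducati,Monster,2000,15682
Aprilia,Futura,2001,17320
"""

def convert(text, match_pattern, row_name="row"):
    """Return an XML string holding only the rows that match match_pattern.

    Hypothetical helper: assumes the first line holds the field names.
    """
    lines = text.splitlines()
    header = next(csv.reader([lines[0]]))    # element names from the first row
    pattern = re.compile(match_pattern)
    root = ET.Element("root")
    for line in lines[1:]:
        if not pattern.search(line):         # test the whole raw line
            continue
        row = ET.SubElement(root, row_name)
        for name, value in zip(header, next(csv.reader([line]))):
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")

xml_text = convert(INPUT, "^B")
print(xml_text)
```

With the pattern ^B, the resulting root element contains two row children, one for each BMW line; every other line is skipped.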

Sample Regular Expressions

The following table presents some commonly used regular expressions.

Expression       Matches
^ABC             Match all lines starting with "ABC"
^[Aa][Bb][Cc]    Match all lines starting with "ABC", "abc", or any mix of uppercase and lowercase ("Abc", for example)
AAA              Match all lines containing "AAA"
^(DEF|GHI)       Match all lines starting with "DEF" or "GHI"
XYZ$             Match all lines ending with "XYZ"
XYZ\$            Match all lines containing "XYZ$" (the backslash makes the dollar sign a literal character)

Go to http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html to learn about the specific regular expression implementation supported in Stylus Studio. Go to http://www.boost.org/libs/regex/doc/syntax.html for additional examples of regular expression usage.
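
The sample expressions can be tried against a few test lines before you use them as match patterns. The sketch below uses Python's re module as a stand-in; for patterns this simple it is expected to behave the same way as the implementation referenced above, but that equivalence is an assumption. The helper name matching and the test lines are made up for illustration. The alternation is written without spaces around the | character, because whitespace inside a pattern is matched literally by default.

```python
import re

def matching(pattern, lines):
    """Return the lines in which pattern is found anywhere (search semantics)."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

# Made-up test lines, one or two per sample expression.
lines = [
    "ABC one",
    "abC two",
    "xAAAx",
    "DEF three",
    "GHI four",
    "endXYZ",
    "cost XYZ$ 5",
]

starts_abc = matching(r"^ABC", lines)               # anchored at line start
starts_abc_any = matching(r"^[Aa][Bb][Cc]", lines)  # either case per letter
contains_aaa = matching(r"AAA", lines)              # unanchored: anywhere
starts_def_ghi = matching(r"^(DEF|GHI)", lines)     # alternation
ends_xyz = matching(r"XYZ$", lines)                 # anchored at line end
literal_dollar = matching(r"XYZ\$", lines)          # escaped $ is literal

for name, found in [
    ("^ABC", starts_abc),
    ("^[Aa][Bb][Cc]", starts_abc_any),
    ("AAA", contains_aaa),
    ("^(DEF|GHI)", starts_def_ghi),
    ("XYZ$", ends_xyz),
    ("XYZ\\$", literal_dollar),
]:
    print(name, "->", found)
```

Note the difference between the last two expressions: XYZ$ requires the line to end with "XYZ", while XYZ\$ looks for the four characters "XYZ$" anywhere in the line.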

Specifying Multiple Match Patterns

You can specify multiple match patterns for a single file. If we define a new match pattern, ^K, this results in a new node (<row> "^K") in the schema pane, which now displays both nodes (see Figure 189). When an input file is converted, Stylus Studio matches the patterns in the order in which the nodes that represent them are defined in the schema. Blank patterns are always matched last.

When you define multiple match patterns, the document pane displays a gray square alongside rows that match a pattern other than the one, if any, associated with the currently selected row. In Figure 189, for example, row 3 is the currently selected row; it matches the pattern ^K we have defined. Because row 3 is the active row, Stylus Studio displays gray squares in rows 2 and 6 (which match the pattern ^B defined previously).

Figure 189. Gray Squares Identify Rows That Match Other Patterns
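
The matching order described in this section can be sketched as a first-match-wins lookup. This is a minimal Python illustration, not Stylus Studio's code: the function name first_matching_node is hypothetical, nodes are modeled as (row element name, match pattern) tuples, and the sketch assumes that a blank pattern accepts any row, which is why it is tried only after every non-blank pattern.

```python
import re

def first_matching_node(line, nodes):
    """Return the first (row_name, pattern) node that matches line, or None.

    Nodes are tried in schema order, except that nodes with a blank
    pattern are moved to the end (assumed here to match any row).
    """
    ordered = [n for n in nodes if n[1]] + [n for n in nodes if not n[1]]
    for row_name, pattern in ordered:
        if pattern == "" or re.search(pattern, line):
            return (row_name, pattern)
    return None

# The two nodes from the example, in the order they were defined.
nodes = [("row", "^B"), ("row", "^K")]

bmw = first_matching_node("BMW,R1150RS,2004,14274", nodes)
kawasaki = first_matching_node("Kawasaki,GPz1100,1996,60234", nodes)
ducati = first_matching_node("Ducati,ST2,1997,24000", nodes)

# A blank-pattern node listed first is still tried last.
# The element name "other" is a made-up example.
with_blank = [("other", ""), ("row", "^B")]
bmw_with_blank = first_matching_node("BMW,R1150R,2002,17439", with_blank)
ducati_with_blank = first_matching_node("Ducati,ST2,1997,24000", with_blank)
```

In this sketch the Ducati row matches neither ^B nor ^K, so it is picked up only when a blank-pattern node is present.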

Working with Nodes

In addition to defining nodes using the Match Pattern field of the Properties window, you can also use the Set Node and Match Pattern dialog box, shown here:

Figure 190. Set Node and Match Pattern Dialog Box

This dialog box allows you to:

  • Define a new node - even one that does not match a row in the current input file. For example, we could define a match pattern for Triumph motorcycles (<row> "^T") even though there are no Triumph motorcycles in the input file.
  • Clone an existing node - this allows you to copy an existing node and modify its match pattern to create a new node.
  • Edit an existing node. (You can also do this in the Properties window, of course.)

When you open the dialog box, the Row Element Name and Match Pattern fields contain default values that reflect the currently selected row in the document pane or node in the schema pane.
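
As a way to reason about these operations, a node can be modeled as a row element name/match pattern pair. The Python sketch below is only a conceptual model under that assumption; the Node class, the count_matches helper, and the element name triumph are hypothetical and do not correspond to any Stylus Studio API.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Node:
    row_element_name: str   # element written for each matching row
    match_pattern: str      # regular expression tested against the row

rows = [
    "BMW,R1150RS,2004,14274",
    "Kawasaki,GPz1100,1996,60234",
    "Ducati,ST2,1997,24000",
    "Moto Guzzi,LeMans,2001,12393",
    "BMW,R1150R,2002,17439",
    "Ducati,Monster,2000,15682",
    "Aprilia,Futura,2001,17320",
]

schema = [Node("row", "^B")]

# Define a new node; nothing requires it to match a row in the current input.
schema.append(Node("row", "^T"))

# Clone an existing node, changing only its match pattern.
schema.append(replace(schema[0], match_pattern="^K"))

# Edit a node: swap in a copy with a different row element name.
schema[1] = replace(schema[1], row_element_name="triumph")

def count_matches(node, lines):
    """Count the lines that the node's match pattern selects."""
    return sum(1 for line in lines if re.search(node.match_pattern, line))

counts = {node.match_pattern: count_matches(node, rows) for node in schema}
print(counts)
```

The ^T node selects no rows from this input, which mirrors the Triumph example: a node can exist in the schema without matching anything in the current file.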

Defining a New Node
To define a new node:
1. Select a row in the document pane or a node in the schema pane.
2. Select ConvertToXML > Add Node and Pattern from the Stylus Studio menu.

Alternative: Select Add Node and Pattern from the document pane or schema pane shortcut menu.

The Set Node and Pattern dialog box appears.

3. Change the default values in the Row Element Name and Match Pattern fields.
4. Click OK.

Cloning a Node
To clone a node:
1. Select the node in the schema pane that you want to clone.

Alternative: Select the row in the document pane that is represented by a row element name/match pattern pair you want to clone.

2. Select ConvertToXML > Clone Node and Pattern from the Stylus Studio menu.

Alternative: Select Clone Node and Pattern from the document pane or schema pane shortcut menu.

The Set Node and Pattern dialog box appears.

3. Change the default values in the Row Element Name and Match Pattern fields as needed.
4. Click OK.

Editing a Node
To edit a node:
1. Select the node in the schema pane that you want to edit.

Alternative: Select the row in the document pane that is represented by a row element name/match pattern pair you want to edit.

2. Select ConvertToXML > Edit Node and Pattern from the Stylus Studio menu.

Alternative: Select Edit Node and Pattern from the document pane or schema pane shortcut menu.

Alternative: Double-click the node.

The Set Node and Pattern dialog box appears.

3. Change the default values in the Row Element Name and Match Pattern fields as needed.
4. Click OK.

Removing a Node

When you remove a node, you are deleting the row element name/match pattern pair from the adapter you are defining.

To remove a node:
1. Select the node in the schema pane that you want to remove.

Alternative: Select the row in the document pane that is represented by a row element name/match pattern pair you want to remove.

2. Select ConvertToXML > Remove Node and Pattern from the Stylus Studio menu.

Alternative: Select Remove Node and Pattern from the document pane or schema pane shortcut menu.

Alternative: Press the Delete key.

A warning message appears.

3. Click Yes to remove the node; otherwise, click No.
 