
A utility to make msxsl more useful

  • From: Andrew Bunner <bunner@m...>
  • To: xml-dev@i...
  • Date: Fri, 04 Sep 1998 14:39:12 -0700

msxsl utility

  I wrote a small Perl script that can be used to preprocess XML files
before sending them to msxsl. Why might you want to do this? So you can
expand ENTITY references and do something like <INCLUDE
HREF="included_file.xml"/>.

  It's very basic and very small so I just attached it to this message for
anyone who's interested.

  Here's the syntax for using it from the DOS command prompt...

C:\<your path to Perl>\Perl.exe expand.pl myfile.xml > temp.xml
msxsl -i temp.xml -s myfile.xsl -o output.html

  myfile.xml can define entities in its internal and external DTD by saying
<!ENTITY entityname 'VALUE'> or <!ENTITY entityname SYSTEM 'filepath'>. You
can use single or double quotes.
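
  To make the two declaration forms concrete, here is a minimal sketch of
how they can be picked apart with regular expressions. The sample document,
the entity names and the file name footer.xml are made up for illustration,
and the patterns are a simplified version of what the attached script does,
not a full DTD parser.

```perl
use strict;
use warnings;

# Hypothetical sample input; names and path are for illustration only.
my $doc = <<'XML';
<!DOCTYPE page [
  <!ENTITY company 'Example Corp'>
  <!ENTITY footer SYSTEM "footer.xml">
]>
<page>&company;</page>
XML

my (%inline, %external);

# Inline form: <!ENTITY name 'value'>, single or double quotes.
while ($doc =~ /<!ENTITY\s+(\w+)\s+['"]([^'"]*)['"]\s*>/g) {
    $inline{$1} = $2;
}

# External form: <!ENTITY name SYSTEM 'path'>; only the path is recorded here.
while ($doc =~ /<!ENTITY\s+(\w+)\s+SYSTEM\s+['"]([^'"]+)['"]\s*>/g) {
    $external{$1} = $2;
}

print "$_ = $inline{$_}\n" for sort keys %inline;
print "$_ -> file $external{$_}\n" for sort keys %external;
```

  The inline pattern cannot match the SYSTEM form because it expects a quote
right after the entity name, so each declaration lands in exactly one hash.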

  I also made it so you can include a file by saying <INCLUDE
HREF="filetoinclude"/>
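
  As a rough sketch of what the include step amounts to, the snippet below
writes a throwaway file and then splices it into a document string. The
helper name slurp, the temporary file and the sample markup are assumptions
for the example; the attached script uses its own readFile routine and works
on the file named on the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file to include; its content is made up for illustration.
my ($fh, $path) = tempfile(UNLINK => 1);
print $fh "<note>shared text</note>";
close $fh;

# Hypothetical helper: read a whole file into a string.
sub slurp {
    my ($file) = @_;
    open(my $in, '<', $file) or die "Couldn't open $file: $!\n";
    local $/;               # undefine the record separator to read everything
    my $text = <$in>;
    close $in;
    return $text;
}

my $xml = qq{<doc><INCLUDE HREF="$path"/></doc>};

# Replace each <INCLUDE HREF="..."/> with the contents of the named file.
# Repeating until nothing matches lets included files include others.
1 while $xml =~ s{<INCLUDE\s+HREF=["']([^"']+)["']\s*/>}{slurp($1)}e;

print "$xml\n";
```

  Note that a file which includes itself would make this loop run forever,
so a real tool would want some guard against that.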

  Basically, I'm trying to find ways to make msxsl usable now. I was sort
of hoping some Java programmers would leap to the rescue and turn msxml (or
some equivalent parser) into a type of preprocessor for msxsl but, failing
that, I worked up a quick and dirty way to do what I want. Hopefully
someone else will find it useful.


main();

sub main {
	$xml = (&readFile($ARGV[0]));
    %externalEntities = &parseExternalDTD($xml);
    %internalEntities = &parseInternalDTD($xml);
    my($moreToGo) = (1);
    while ($moreToGo) {
    	$moreToGo = &expandEntities(%externalEntities, %internalEntities) | &expandLinks(%externalEntities, %internalEntities);
	}
    print $xml;
}

# $_[0] = file name or path
# returns full text of file
sub readFile {
	my($contents);
	my(@fileInfo) = stat($_[0]);
	open(F, $_[0]) or die "Couldn't open $_[0]\n";
	read F, $contents, $fileInfo[7];
	close(F);
    return $contents;
}

# $_[0] full text of an XML document
# returns hash of external entities and what they reference
sub parseExternalDTD {
	# Looking for...  <!DOCTYPE foo SYSTEM 'bar.dtd'>
	unless ($_[0] =~ /<!DOCTYPE\s+\w+\s+SYSTEM\s+['"]([^"']+)/) {
        return ();  # empty list, so the caller ends up with an empty hash
    }
    my($dtdPath) = ($1);
    my($dtd) = &readFile($dtdPath);
    my(%entities) = (&extractEntities($dtd));
    return %entities;
}

# $_[0] full text of XML document
# returns hash of internally defined entities and what they reference
sub parseInternalDTD {
	my(%entities) = (&extractEntities($_[0]));
    return %entities;
}

# $_[0] text, possibly containing <!ENTITY> declarations
# returns entity hash of names and values
sub extractEntities {
	my($text) = $_[0];
	my(%entities);
    my($entityName, $entityPath);
    # Looking for <!ENTITY foo 'bar'> or <!ENTITY foo SYSTEM 'bar'>
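    while ($text =~ /<!ENTITY/) {
        if ($text =~ s/<!ENTITY\s+(\w+)\s+['"]([^'"]*)['"]>//s) {
            $entities{$1} = $2;
        } elsif ($text =~ s/<!ENTITY\s+(\w+)\s+SYSTEM\s+['"]([^'"]+)['"]>//s) {
            ($entityName, $entityPath) = ($1, $2);
            $entities{$entityName} = &readFile($entityPath);
        } else {
            # Unrecognized declaration (e.g. a parameter entity): drop the
            # keyword so the loop cannot spin forever on it.
            $text =~ s/<!ENTITY//;
        }
    }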
    return %entities;
}

# @_ is a hash of entities and what they expand to
# works on global variable $xml searching for &foo; references
# returns true if it was able to make any replacements
sub expandEntities {
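    my(%entities) = @_;
    my($gotOne) = (0);
    # Match only names we have a definition for, so that references such as
    # &amp; or &lt; are left alone instead of being replaced with nothing.
    my($names) = join('|', map { quotemeta($_) } keys(%entities));
    return 0 if $names eq '';
    while ($xml =~ s/\&($names);/$entities{$1}/) {
        $gotOne = 1;
    }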
    return $gotOne;
}

sub expandLinks {
	my($gotOne) = (0);
	# We're looking for... <INCLUDE HREF="foo"/>
    # This is not a complete implementation! A real XML processor would
    # look for any type of link that's defined to have SHOW="EMBED" and ACTUATE="AUTO"
    # ...but that's too much work for what I'm after
    while ($xml =~ s/<INCLUDE\s+HREF=["']([^"']+)["']\/>/&readFile($1)/se) {
    	$gotOne = 1;
	}
    return $gotOne;
}

-- Andrew

   Andrew Bunner
   President, Founder Mass Quantities, Inc.
   Professional Supplements for the Perfect Physique
   http://www.massquantities.com 
