Subject:XSD validation scalability issue Author:Yitzhak Khabinsky Date:04 Dec 2012 10:57 AM Originally Posted: 04 Dec 2012 10:40 AM
Hello,
My environment:
• SS X14 R2 Enterprise Suite build 1893h
• OS: Windows 7 64-bit
• RAM: 12 GB
• Java: Java 7 update 9
• Java Virtual machine parameters: -Xms8m -Xmx64m -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000
I am trying to run XSD validation on a 118 MB XML file.
Both the Saxonica 9.4.0.6 and the Java built-in validators produce an error.
My next step was to comment out 99.99% of the XML and re-run the XSD validation. Alas, the same outcome – XSD validation error.
It seems that the XSD validators mentioned above still load even the commented-out nodes of the XML into the DOM:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.stylusstudio.debugger.saxon.Validate.main(Validate.java:93)
Caused by: java.lang.OutOfMemoryError: Java heap space
at com.sun.org.apache.xerces.internal.util.XMLStringBuffer.append(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanData(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanComment(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanComment(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:405)
at net.sf.saxon.event.Sender.send(Sender.java:152)
at com.saxonica.Validate.processFile(Validate.java:514)
at com.saxonica.Validate.doValidate(Validate.java:347)
at com.saxonica.Validate.main(Validate.java:62)
... 5 more
P.S. I prefer to use the Saxonica validator because of its detailed validation information.
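For anyone who wants to reproduce the Java built-in validation path outside the IDE, here is a minimal sketch using the standard JAXP javax.xml.validation API. The class name JaxpValidateSketch and its validate helper are made up for illustration, and this is not necessarily how Stylus Studio invokes the validator internally.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
class JaxpValidateSketch {

    /** Validates xml against xsd; returns the recoverable validation errors (empty if valid). */
    static List<String> validate(Source xsd, Source xml) throws SAXException, IOException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(xsd);
        Validator validator = schema.newValidator();

        final List<String> errors = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }

            @Override
            public void error(SAXParseException e) {
                // Collect schema violations and keep validating.
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                // Not well-formed input: stop immediately.
                throw e;
            }
        });

        // The document is read as a stream; no DOM tree is requested here.
        validator.validate(xml);
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java JaxpValidateSketch schema.xsd document.xml
        List<String> errors = validate(new StreamSource(new File(args[0])),
                                       new StreamSource(new File(args[1])));
        for (String message : errors) {
            System.out.println(message);
        }
        System.out.println(errors.isEmpty() ? "valid" : errors.size() + " error(s)");
    }
}
```

Running it with the same -Xmx64m setting should show whether the failure is tied to the IDE or to the parser itself.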
Subject:XSD validation scalability issue Author:Ivan Pedruzzi Date:20 Jan 2013 12:33 AM
The out of memory error was caused by a very large section of the XML document that was inside a comment. Saxon validates using a SAX filter which materializes comments as text nodes in memory.
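To illustrate the point, here is a small sketch (the class name CommentProbe is made up for this post) that registers a SAX LexicalHandler and records the size of the largest comment the parser reports. It assumes the JDK's built-in Xerces parser, the same one that appears in the stack trace above. There the whole comment appears to arrive in a single buffer, so memory use grows with the size of the comment, not with the amount of live markup.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical probe class, for illustration only.
class CommentProbe {

    /** Parses the input and returns the length of the largest comment reported by the parser. */
    static int largestComment(InputSource in) throws Exception {
        final int[] max = {0};
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                max[0] = Math.max(max[0], length);
            }
        };
        reader.setContentHandler(handler);
        // Standard SAX2 property for receiving comment events.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);

        reader.parse(in);
        return max[0];
    }

    public static void main(String[] args) throws Exception {
        // Build a document where almost everything sits inside one comment.
        StringBuilder xml = new StringBuilder("<root><!--");
        for (int i = 0; i < 100000; i++) {
            xml.append("<item>commented out</item>");
        }
        xml.append("--><item>live</item></root>");

        int size = largestComment(new InputSource(new StringReader(xml.toString())));
        System.out.println("largest comment: " + size + " chars");
    }
}
```

With a comment of roughly 118 MB and a 64 MB heap (-Xmx64m), a buffer of that size cannot fit, which matches the OutOfMemoryError raised inside scanComment.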