Subject:XSD validation scalability issue Author:Yitzhak Khabinsky Date:04 Dec 2012 10:57 AM Originally Posted: 04 Dec 2012 10:40 AM
Hello,
My environment:
SS X14 R2 Enterprise Suite build 1893h
OS: Windows 7 64-bit
RAM: 12 GB
Java: Java 7 update 9
Java Virtual machine parameters: -Xms8m -Xmx64m -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000
I am trying to run XSD validation on a 118 MB XML file.
Both the Saxonica 9.4.0.6 and the Java built-in validators produce an error.
My next step was to comment out 99.99% of the XML and re-run the XSD validation. Alas, the outcome was the same: an XSD validation error.
It seems that the validators mentioned above still load even the commented-out nodes of the XML into the DOM.
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.stylusstudio.debugger.saxon.Validate.main(Validate.java:93)
Caused by: java.lang.OutOfMemoryError: Java heap space
at com.sun.org.apache.xerces.internal.util.XMLStringBuffer.append(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanData(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanComment(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanComment(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:405)
at net.sf.saxon.event.Sender.send(Sender.java:152)
at com.saxonica.Validate.processFile(Validate.java:514)
at com.saxonica.Validate.doValidate(Validate.java:347)
at com.saxonica.Validate.main(Validate.java:62)
... 5 more
P.S. I prefer to use the Saxonica validator because of its detailed validation information.
Subject:XSD validation scalability issue Author:Ivan Pedruzzi Date:20 Jan 2013 12:33 AM
The out of memory error was caused by a very large section of the XML document that was inside a comment. Saxon validates using a SAX filter which materializes comments as text nodes in memory.
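One possible workaround, which nobody in this thread has confirmed, is to strip the comments from the file before handing it to the validator, so that the parser never has to buffer the huge comment. Below is a minimal sketch in Java. The class name CommentStripper and its methods are hypothetical, the files are assumed to be UTF-8, and the sketch does not treat CDATA sections or a DOCTYPE internal subset specially, so a literal "<!--" inside those would also be removed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: copies XML text while dropping <!-- ... --> spans.
// Only a few characters are held at a time, so memory use does not depend
// on the size of the file or of any single comment.
final class CommentStripper {
    private static final String OPEN = "<!--";

    static void strip(Reader in, Writer out) throws IOException {
        Reader r = new BufferedReader(in);
        boolean inComment = false;
        int matched = 0; // how many characters of "<!--" have been seen so far
        int dashes = 0;  // consecutive '-' characters seen inside a comment
        int c;
        while ((c = r.read()) != -1) {
            if (inComment) {
                if (c == '-') {
                    dashes++;
                } else if (c == '>' && dashes >= 2) {
                    inComment = false; // "-->" closes the comment
                    dashes = 0;
                } else {
                    dashes = 0;
                }
            } else if (c == OPEN.charAt(matched)) {
                if (++matched == OPEN.length()) {
                    inComment = true; // full "<!--" seen, start skipping
                    matched = 0;
                }
            } else {
                // Not a comment opener after all: emit the held-back prefix.
                out.write(OPEN, 0, matched);
                if (c == '<') {
                    matched = 1; // this '<' may itself start "<!--"
                } else {
                    matched = 0;
                    out.write(c);
                }
            }
        }
        if (!inComment) {
            out.write(OPEN, 0, matched); // flush a trailing partial prefix
        }
        out.flush();
    }

    // Convenience wrapper for small strings, mainly for experimenting.
    static String strip(String xml) {
        StringWriter out = new StringWriter();
        try {
            strip(new StringReader(xml), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Usage: java CommentStripper input.xml output.xml
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CommentStripper <input.xml> <output.xml>");
            return;
        }
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             Writer out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            strip(in, out);
        }
    }
}
```

The output file could then be validated in place of the original. Line numbers in validation messages will refer to the stripped copy whenever a removed comment spanned several lines.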