Re: Convert an XML Schema validation task into a form that is s

  • From: "Liam R. E. Quin" <liam@fromoldbooks.org>
  • To: Roger L Costello <costello@mitre.org>, "xml-dev@l..." <xml-dev@l...>
  • Date: Tue, 01 Sep 2020 23:27:54 -0400

On Tue, 2020-09-01 at 12:49 +0000, Roger L Costello wrote:
> Hi Folks,
> 
> I am reading a book [1] on machine learning and the book says some
> pretty interesting things:
> 
> "In the search for more speed, machine learning researchers started
> taking advantage of special hardware found in some computers,
> originally designed to improve graphics performance.

There have been papers from a group at Intel working on speeding up XML
processing using hardware. Oh - while i was writing this i think Tony
Graham mentioned one.

See
https://www3.cs.stonybrook.edu/~mikepo/papers/gnort-regexp.raid09.pdf
for a paper on implementing regular expression matching in GPU
hardware; regular expression matching is at the core of XSD validation.
The speedup they report in that paper is 60%, though, which is not huge.
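To make the "regular expressions are at the core of validation" point concrete, here is a minimal sketch in Python. It is not how xmllint or Xerces are implemented; it only shows that a content model can be treated as a regular expression over the sequence of child element names. The element names and the content model (title, author+, chapter*) are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model for <book>: one title, one or more author,
# zero or more chapter. Each name is followed by a space so that adjacent
# names cannot run together.
BOOK_MODEL = re.compile(r"(title )(author )+(chapter )*")

def children_match(element, model):
    """Return True if the sequence of child element names matches the model."""
    names = "".join(child.tag + " " for child in element)
    return model.fullmatch(names) is not None

good = ET.fromstring("<book><title/><author/><author/><chapter/></book>")
bad = ET.fromstring("<book><author/><title/></book>")

print(children_match(good, BOOK_MODEL))  # children are in the declared order
print(children_match(bad, BOOK_MODEL))   # author appears before title
```

A real validator compiles each content model to an automaton once and runs it as the child elements arrive, but the matching problem is the same one the paper moves onto the GPU.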


I just validated a 43MByte XML file against its DTD using xmllint; it
took 0.4 seconds. Xerces-C took 0.184 seconds.

The word count program, wc, took 0.015 seconds:

$ time < with-sources.xml wc -l
647399

real	0m0.015s
user	0m0.009s
sys	0m0.006s

so validating with Xerces was about 10 times slower than just counting
the number of newline characters. The "rev" command, however, which
saves each line in a buffer and then reverses it, takes two seconds,
and sed -e s/girl/boy/g takes 0.14 seconds.

So the speed of parsing doesn't seem to be a huge problem. The xmllint
program builds an in-memory tree which probably accounts for the extra
time.

The expat xmlwf command takes about 0.16 seconds on the same file; the
overhead of validating seems very small.

With the schema instead of a DTD it's only very slightly slower in
xmllint.
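For anyone wanting to repeat this kind of comparison on their own files, a rough sketch of a timing helper in Python follows. The timings above were taken with the shell's time command; the helper name, the best-of-three choice, and the example command line are assumptions for illustration, since the exact invocations are not given here.

```python
import subprocess
import sys
import time

def best_wall_time(argv, runs=3):
    """Run argv several times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal I/O does not dominate the measurement.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    # Example (one plausible invocation, not necessarily the one used above):
    #   python time_cmd.py xmllint --noout --valid with-sources.xml
    cmd = sys.argv[1:]
    if cmd:
        print(f"{best_wall_time(cmd):.3f}s  {' '.join(cmd)}")
```

Taking the fastest of several runs reduces the effect of a cold file cache, which matters when the whole job takes a fraction of a second. Process start-up time is included in the figure, as it is with time.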


For a 60% speedup i'm not sure i'd be very interested, because of
potentially becoming tied to proprietary graphics card software.

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org


