[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

RELAX NG and datatypes make XML generation NP-hard

  • To: <xml-dev@l...>
  • Subject: RELAX NG and datatypes make XML generation NP-hard
  • From: "Robert C. Lyons" <boblyons@u...>
  • Date: Thu, 3 Apr 2003 08:31:21 -0500
  • Cc: <ht@c...>
  • Importance: Normal
  • In-reply-to: <f5badffqhjk.fsf@e...>
  • Reply-to: <boblyons@u...>

3sat problem
After reading Henry's posting below, I realized
that the following problem is also NP-hard:

  Given an arbitrary RELAX NG schema, which is
  allowed to contain XML Schema Datatypes, is
  there an XML document that conforms to the
  schema?

Here's the proof:

The 3SAT problem is NP-complete. Any 3SAT problem
can be encoded into a RELAX NG schema, such that
an XML document contains a solution to the 3SAT
problem if and only if the document conforms to
the RELAX NG schema. Thus, solving the XML
generation problem (for RELAX NG with XML Schema
Datatypes) is at least as hard as solving 3SAT.

For example, consider the following 3SAT problem:

  ( x1 OR x2 OR x3 ) AND
  ( x2 OR NOT(x3) OR NOT(x4) ) AND
  ( x3 OR x4 OR NOT(x1) )

This 3SAT problem can be encoded as the following
RELAX NG schema:

<?xml version="1.0" encoding="UTF-8"?>
<element name="three_sat_solution"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary=
           "http://www.w3.org/2001/XMLSchema-datatypes">
    <data type="string">
        <param name="pattern">x1=[01],x2=[01],x3=[01],x4=[01]</param>
        <param name="pattern">(.*x1=1.*)|(.*x2=1.*)|(.*x3=1.*)</param>
        <param name="pattern">(.*x2=1.*)|(.*x3=0.*)|(.*x4=0.*)</param>
        <param name="pattern">(.*x3=1.*)|(.*x4=1.*)|(.*x1=0.*)</param>
    </data>
</element>

The following XML document conforms to this RELAX NG schema
and solves the 3SAT problem:

<?xml version="1.0" encoding="UTF-8"?>
<three_sat_solution>x1=1,x2=0,x3=0,x4=1</three_sat_solution>

The following XML document does not conform to the RELAX NG schema
and does not solve the 3SAT problem:

<?xml version="1.0" encoding="UTF-8"?>
<three_sat_solution>x1=1,x2=1,x3=0,x4=0</three_sat_solution>

For more information on 3SAT (a.k.a., 3-CNF), see
http://www.wikipedia.org/wiki/Boolean_satisfiability_problem.

Since RELAX NG has a solid mathematical foundation, I suspect that the
XML generation problem can be solved in linear time when the problem is
restricted to RELAX NG schemas that do not use W3C XML Schema Datatypes.

Best regards,

Bob

<sig name    = 'Bob Lyons'
     title   = 'XML and Java Consultant'
     company = 'Unidex, Inc.'
     phone   = '+1-732-975-9877'
     email   = 'boblyons@u...'
     url     = 'http://www.unidex.com/'
     product = 'XML Convert: transforms flat files to XML and vice versa' />

-----Original Message-----
From: Henry S. Thompson [mailto:ht@c...]
Sent: Friday, March 28, 2003 4:59 PM
To: xml-dev@l...
Subject: ID/IDREF makes XML generation NP-hard


Somewhat surprisingly, it turns out that answering the question, for an
arbitrary XML DTD, "Are there any valid instances of the document type
defined by this DTD?", is an NP-hard problem.

This is true, because it is possible to encode a 3SAT problem [1] in a
DTD, so that there is an instance of the DTD's document type iff the
problem can be satisfied.  So finding a valid instance is as hard a
problem as solving the 3SAT problem, and 3SAT is known to be NP-hard.

[snip]


PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
 

Stylus Studio has published XML-DEV in RSS and ATOM formats, enabling users to easily subcribe to the list from their preferred news reader application.


Stylus Studio Sponsored Links are added links designed to provide related and additional information to the visitors of this website. they were not included by the author in the initial post. To view the content without the Sponsor Links please click here.

Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.