[Home] [By Thread] [By Date] [Recent Entries]
Somewhat surprisingly, it turns out that answering the question, for an
arbitrary XML DTD, "Are there any valid instances of the document type
defined by this DTD?", is an NP-hard problem.
This is true, because it is possible to encode a 3SAT problem [1] in a
DTD, so that there is an instance of the DTD's document type iff the
problem can be satisfied. So finding a valid instance is as hard a
problem as solving the 3SAT problem, and 3SAT is known to be NP-hard.
Here's the DTD for a 4-variable, 3-clause example. The 'clauses'
element is the encoding of the 3SAT problem
(v1 | -v2 | v3) & (-v1 | v3 | v4 ) & (v2 | -v3 | -v4)
The complexity arises in finding a set of values for the 'value' IDs
of the children of <bindings> which are consistent with some sequence of
choices from the <clauses>.
The choices interspersed among the value bindings ensure that every
variable gets exactly one binding, which in turn makes it impossible
to choose <v2n> for one clause and <v2y> for the other, without
causing an IDREF failure.
<!ELEMENT root (bindings, clauses)>
<!ELEMENT bindings (v1, (v1y|v1n) , v2, (v2y|v2n),
v3, (v3y|v3n) , v4, (v4y|v4n))>
<!ELEMENT clauses (( v1y | v2n | v3y ) ,
( v1n | v3y | v4y ) ,
( v3n | v4n | v2y ))>
<!ELEMENT v1 EMPTY>
<!ATTLIST v1 value ID #REQUIRED>
<!ELEMENT v2 EMPTY>
<!ATTLIST v2 value ID #REQUIRED>
<!ELEMENT v3 EMPTY>
<!ATTLIST v3 value ID #REQUIRED>
<!ELEMENT v4 EMPTY>
<!ATTLIST v4 value ID #REQUIRED>
<!ELEMENT v1y EMPTY>
<!ATTLIST v1y value IDREF #FIXED "v1true">
<!ELEMENT v1n EMPTY>
<!ATTLIST v1n value IDREF #FIXED "v1false">
<!ELEMENT v2y EMPTY>
<!ATTLIST v2y value IDREF #FIXED "v2true">
<!ELEMENT v2n EMPTY>
<!ATTLIST v2n value IDREF #FIXED "v2false">
<!ELEMENT v3y EMPTY>
<!ATTLIST v3y value IDREF #FIXED "v3true">
<!ELEMENT v3n EMPTY>
<!ATTLIST v3n value IDREF #FIXED "v3false">
<!ELEMENT v4y EMPTY>
<!ATTLIST v4y value IDREF #FIXED "v4true">
<!ELEMENT v4n EMPTY>
<!ATTLIST v4n value IDREF #FIXED "v4false">
There is at least one solution:
<!DOCTYPE root SYSTEM "3sat.dtd">
<root>
<bindings>
<v1 value="v1false"/><v1n/>
<v2 value="v2false"/><v2n/>
<v3 value="v3false"/><v3n/>
<v4 value="v4false"/><v4n/>
</bindings>
<clauses>
<v2n/>
<v1n/>
<v3n/>
</clauses>
</root>
Henry S. Thompson
Richard Tobin
[1] http://www.wikipedia.org/siki/Satisfiability_problem
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@c...
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
|

Cart



