Given an arbitrary, valid XML document,
is it at all reasonable to use XSLT to
analyse the document and make a fair
guess at which elements are containers,
which are atomic, etc.?
The sort of logic I'm getting at
might be:
Total element count=100
element x contains 80 descendants
none of which hold pcdata,
hence it's likely to be a container.
element y is atomic and first child of
element x, hence likely to be a title.
element 'para' occurs 38 times with
PCDATA content, hence likely to
be a paragraph. I'm loath to rely on
element names for anything but the basics.
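
A minimal XSLT 1.0 sketch of the heuristics above, offered as a starting point rather than a finished tool. It groups elements by name (Muenchian grouping) and guesses a role from content alone. The key name `by-name` and the wording of the labels are illustrative choices, and the "possible title" test is only an approximation of "atomic and first child".

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every element by name so each distinct name is reported once. -->
  <xsl:key name="by-name" match="*" use="name()"/>

  <xsl:template match="/">
    <xsl:text>Total element count=</xsl:text>
    <xsl:value-of select="count(//*)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Visit only the first occurrence of each element name. -->
    <xsl:for-each select="//*[generate-id() = generate-id(key('by-name', name())[1])]">
      <xsl:variable name="all" select="key('by-name', name())"/>
      <!-- Occurrences with a non-whitespace text node as a direct child. -->
      <xsl:variable name="with-text" select="$all[text()[normalize-space()]]"/>
      <!-- Occurrences with at least one child element. -->
      <xsl:variable name="with-children" select="$all[*]"/>
      <xsl:value-of select="name()"/>
      <xsl:text>: occurs </xsl:text>
      <xsl:value-of select="count($all)"/>
      <xsl:text> times, </xsl:text>
      <xsl:choose>
        <xsl:when test="$with-children and not($with-text)">likely container</xsl:when>
        <xsl:when test="$with-text and not($with-children)">
          <xsl:text>likely atomic</xsl:text>
          <!-- No occurrence has a preceding sibling element: always a first child. -->
          <xsl:if test="not($all[preceding-sibling::*])"> (always a first child, possible title)</xsl:if>
        </xsl:when>
        <xsl:when test="$with-text and $with-children">mixed content</xsl:when>
        <xsl:otherwise>always empty</xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The output is one line per distinct element name, which a human can then correct by hand. Note that an atomic document element would also be flagged as a possible title, since it has no preceding siblings either.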
Unknowns might be reported as:
Element z has only 3 children,
a 'candidate' container.
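
Reporting of this kind could be sketched per element instance, again in XSLT 1.0. The parameter `few` is a hypothetical cut-off (set to 3 here to match the example), not an established rule: elements with child elements at or below that number are only flagged as candidates.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical cut-off: this many children or fewer is only a candidate. -->
  <xsl:param name="few" select="3"/>

  <xsl:template match="/">
    <!-- Report every element that has at least one child element. -->
    <xsl:apply-templates select="//*[*]"/>
  </xsl:template>

  <xsl:template match="*">
    <xsl:text>Element </xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text> has </xsl:text>
    <xsl:value-of select="count(*)"/>
    <xsl:text> children, </xsl:text>
    <xsl:value-of select="count(.//*)"/>
    <xsl:text> descendants: </xsl:text>
    <xsl:choose>
      <xsl:when test="count(*) &lt;= $few">a 'candidate' container</xsl:when>
      <xsl:otherwise>likely a container</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because `few` is a top-level parameter, it can be overridden when the stylesheet is invoked, so the threshold can be tuned per document without editing the code.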
The overall objective would be to
do an initial analysis along these
lines, then manually finish off,
to permit transformation into HTML.
I'm hoping that a stylesheet might
break the back of the work before
handing off to a human.
Any feedback appreciated.
regards, DaveP
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list