Subject: Re: Transforming flat 'WordML' source to a hierarchical XML output.
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 11 Sep 2007 11:40:25 -0400
David,
If you show us the code you've tried (reduced to an illustration, please),
it will be easier to help.
From what we can see, it appears your diagnosis could be correct. If
you're using group-adjacent="w:pPr/w:listPr/w:ilvl/@w:val", you could
try "(w:pPr/w:listPr/w:ilvl/@w:val,'0')[1]", which would provide '0'
as a grouping key value for w:p elements that return nothing from that XPath.
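As a minimal, untested sketch of that idea: the stylesheet below groups adjacent w:p elements by list level and falls back to a default key when w:ilvl is absent. The namespace URI is the Word 2003 WordML one; the match on any element with w:p children, and the Group and Para result elements, are assumptions made only for illustration. It also swaps the '0' default for 'none', a made-up sentinel, so that non-list paragraphs do not fall into the same group as level 0 list items.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <xsl:output indent="yes"/>

  <!-- Assumption: the paragraphs are siblings under one parent element. -->
  <xsl:template match="*[w:p]">
    <xsl:for-each-group select="w:p"
        group-adjacent="(w:pPr/w:listPr/w:ilvl/@w:val, 'none')[1]">
      <!-- The key is the list level, or 'none' (hypothetical sentinel)
           for paragraphs that have no w:ilvl. -->
      <Group level="{current-grouping-key()}">
        <xsl:for-each select="current-group()">
          <!-- Join the text of all runs in the paragraph. -->
          <Para><xsl:value-of select="w:r/w:t" separator=""/></Para>
        </xsl:for-each>
      </Group>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

This only produces flat runs of same-level paragraphs; nesting each deeper run inside the preceding item would still need a further (probably recursive) step.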
Hm: w:p, nice element name.
Cheers,
Wendell Piez
At 10:27 AM 9/11/2007, you wrote:
I am using the following:
Saxon XSLT processor, version 8.9
XSLT 2.0
I am trying to process XML source generated by Microsoft Word (WordML).
WordML has no concept of hierarchy, so each paragraph in the source
looks like the examples below:
<w:p>
<w:pPr>
<w:pStyle w:val="Normal"/>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Normal Paragraph</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="0"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Top Level List</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="0"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Top Level List</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Bulleted"/>
<w:listPr>
<w:ilvl w:val="1"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Nested List Level 1</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Bulleted"/>
<w:listPr>
<w:ilvl w:val="1"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Nested List Level 1</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="2"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Nested List Level 2</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="3"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Nested List Level 3</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="4"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Nested List Level 4</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="4"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr>
<w:i/>
</w:rPr>
<w:t>Nested List Level 4</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="5"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr>
<w:b/>
</w:rPr>
<w:t>Nested List Level 5</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Number"/>
<w:listPr>
<w:ilvl w:val="5"/>
</w:listPr>
</w:pPr>
<w:r>
<w:rPr>
<w:u w:val="single"/>
</w:rPr>
<w:t>Nested List Level 5</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Normal"/>
</w:pPr>
<w:r>
<w:rPr/>
<w:t>Normal Paragraph</w:t>
</w:r>
</w:p>
This displays in Word as follows:
Normal Paragraph
1. Top Level List
2. Top Level List
* Nested List Level 1
* Nested List Level 1
1. Nested List Level 2
a. Nested List Level 3
i. Nested List Level 4
ii. Nested List Level 4
1. Nested List Level 5
2. Nested List Level 5
Normal Paragraph
I need the outcome to be as follows:
<Paragraph>Normal Paragraph</Paragraph>
<List type="numbered">
<Item>Top Level List</Item>
<Item>Top Level List
<List type="bulleted">
<Item>Nested List Level 1</Item>
<Item>Nested List Level 1
<List type="numbered">
<Item>Nested List Level 2
<List type="numbered">
<Item>Nested List Level 3
<List type="numbered">
<Item>Nested List Level 4</Item>
<Item>Nested List Level 4
<List type="numbered">
<Item>Nested List Level 5</Item>
<Item>Nested List Level 5</Item>
</List>
</Item>
</List>
</Item>
</List>
</Item>
</List>
</Item>
</List>
</Item>
</List>
<Paragraph>Normal Paragraph</Paragraph>
I think what is required is a grouping procedure that groups the paragraphs
by the value of the XPath 'w:pPr/w:listPr/w:ilvl/@w:val' for each
paragraph.
My attempts so far have been unsuccessful: not all paragraphs have a value
for 'w:pPr/w:listPr/w:ilvl/@w:val', and for those paragraphs the grouping
falls over.
I hope you can help me in this matter, thank you for reading.
======================================================================
Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================