Subject: RE: Transforming flat "WordML" source to a hierarchical XML output.
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 12 Sep 2007 11:38:59 +0100
There's an example of XSLT 2.0 code for converting a hierarchy expressed as
a flat structure with level numbers into a real XML hierarchy at
http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.html
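
The general idea can be sketched independently of XSLT. What follows is only an illustrative outline in Python, not the code from the paper: keep a stack of currently open lists, close the ones deeper than the incoming paragraph's level, and open new ones inside the last item of the enclosing list. Paragraphs with no w:ilvl are given a level of None, which avoids the problem of the grouping key being absent. The (level, style, text) tuples, the build_hierarchy name, the Body wrapper element and the style-to-type mapping are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the w:pStyle values in the sample to List/@type values.
LIST_TYPES = {"Number": "numbered", "Bulleted": "bulleted"}


def build_hierarchy(paragraphs):
    """Nest flat (level, style, text) tuples; level is None for non-list paragraphs."""
    root = ET.Element("Body")  # hypothetical wrapper element
    open_lists = []  # open_lists[i] is the currently open List at level i
    for level, style, text in paragraphs:
        if level is None:
            # An ordinary paragraph ends every open list.
            open_lists.clear()
            ET.SubElement(root, "Paragraph").text = text
            continue
        # Close any lists deeper than this paragraph's level.
        del open_lists[level + 1:]
        # Open lists until one exists at this paragraph's level.
        while len(open_lists) <= level:
            if open_lists:
                enclosing = open_lists[-1]
                # Nest inside the last Item of the enclosing list, creating
                # an empty Item if the source skipped a level.
                parent = enclosing[-1] if len(enclosing) else ET.SubElement(enclosing, "Item")
            else:
                parent = root
            list_type = LIST_TYPES.get(style, "numbered")
            open_lists.append(ET.SubElement(parent, "List", type=list_type))
        ET.SubElement(open_lists[level], "Item").text = text
    return root


# A shortened version of the sample from the original question.
sample = [
    (None, "Normal", "Normal Paragraph"),
    (0, "Number", "Top Level List"),
    (0, "Number", "Top Level List"),
    (1, "Bulleted", "Nested List Level 1"),
    (1, "Bulleted", "Nested List Level 1"),
    (2, "Number", "Nested List Level 2"),
    (None, "Normal", "Normal Paragraph"),
]
root = build_hierarchy(sample)
print(ET.tostring(root, encoding="unicode"))
```

The list type is taken from the style of the paragraph that opens the list; a change of style part-way through a list at the same level is ignored in this sketch.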
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: David Medley [mailto:DAVEMEDLEY@xxxxxxxxxx]
> Sent: 11 September 2007 15:27
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Transforming flat "WordML" source to a hierarchical XML output.
>
> I am using the following:
>
> Saxon XSLT processor, version 8.9
> XSLT 2.0
>
> I am trying to process XML source generated by Microsoft Word
> (WordML).
>
> WordML has no concept of hierarchy, so the paragraphs in the
> source look like this:
>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Normal"/>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Normal Paragraph</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="0"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Top Level List</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="0"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Top Level List</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Bulleted"/>
> <w:listPr>
> <w:ilvl w:val="1"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Nested List Level 1</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Bulleted"/>
> <w:listPr>
> <w:ilvl w:val="1"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Nested List Level 1</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="2"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Nested List Level 2</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="3"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Nested List Level 3</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="4"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Nested List Level 4</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="4"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr>
> <w:i/>
> </w:rPr>
> <w:t>Nested List Level 4</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="5"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr>
> <w:b/>
> </w:rPr>
> <w:t>Nested List Level 5</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Number"/>
> <w:listPr>
> <w:ilvl w:val="5"/>
> </w:listPr>
> </w:pPr>
> <w:r>
> <w:rPr>
> <w:u w:val="single"/>
> </w:rPr>
> <w:t>Nested List Level 5</w:t>
> </w:r>
> </w:p>
> <w:p>
> <w:pPr>
> <w:pStyle w:val="Normal"/>
> </w:pPr>
> <w:r>
> <w:rPr/>
> <w:t>Normal Paragraph</w:t>
> </w:r>
> </w:p>
>
> This displays in Word as follows:
>
> Normal Paragraph
> 1. Top Level List
> 2. Top Level List
>    * Nested List Level 1
>    * Nested List Level 1
>       1. Nested List Level 2
>          a. Nested List Level 3
>             i. Nested List Level 4
>             ii. Nested List Level 4
>                1. Nested List Level 5
>                2. Nested List Level 5
> Normal Paragraph
>
>
> I need the outcome to be as follows:
>
> <Paragraph>Normal Paragraph</Paragraph>
> <List type="numbered">
>   <Item>Top Level List</Item>
>   <Item>Top Level List
>     <List type="bulleted">
>       <Item>Nested List Level 1</Item>
>       <Item>Nested List Level 1
>         <List type="numbered">
>           <Item>Nested List Level 2
>             <List type="numbered">
>               <Item>Nested List Level 3
>                 <List type="numbered">
>                   <Item>Nested List Level 4</Item>
>                   <Item>Nested List Level 4
>                     <List type="numbered">
>                       <Item>Nested List Level 5</Item>
>                       <Item>Nested List Level 5</Item>
>                     </List>
>                   </Item>
>                 </List>
>               </Item>
>             </List>
>           </Item>
>         </List>
>       </Item>
>     </List>
>   </Item>
> </List>
> <Paragraph>Normal Paragraph</Paragraph>
>
>
> I think what is required is a grouping procedure that groups
> the paragraphs by the value of the XPath
> 'w:pPr/w:listPr/w:ilvl/@w:val' for each paragraph.
> My attempts have been unsuccessful because not all paragraphs
> have a value for 'w:pPr/w:listPr/w:ilvl/@w:val', so the
> grouping falls over.
>
> I hope you can help me in this matter. Thank you for reading.
>
>
> Thank you,
> David Medley
> IT Specialist
>
> Application Services, GBS
> IBM Office Internal: 299263 External: +44 (0) 1252 55 9263
> Mobile: +44 (0) 7790-778801
> E-mail: davemedley@xxxxxxxxxx
> Notes: David Medley/UK/IBM@IBMGB