On 15/02/2025 20:49, rick@xxxxxxxxxxxxxx wrote:
I have a flat file structure and am trying to add structure to it. Here is
my XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<div type="book" sID="book1">
<chapter sID="chapter1" n="1"/>
<p sID="p1.1" n="1"/>Paragraph 1 text.
<p eID="p1.1"/>
<p sID="p1.2" n="2"/>Paragraph 2 text with
<emphasis>emphasis</emphasis> added.
<p eID="p1.2"/>
<p sID="p1.3" n="3"/>Paragraph 3 text.
<p eID="p1.3"/>
<chapter eID="chapter1" n="1"/>
<chapter sID="chapter2" n="2"/>
<p sID="p2.1" n="1"/>Paragraph 1 text.
<p eID="p2.1"/>
<p sID="p2.2" n="2"/>Paragraph 2 text with
<emphasis>emphasis</emphasis> added.
<p eID="p2.2"/>
<p sID="p2.3" n="3"/>Paragraph 3 text.
<p eID="p2.3"/>
<p sID="p2.4" n="4"/>Paragraph 4 with <bold>bold</bold> text.
<p eID="p2.4"/>
<chapter eID="chapter2" n="2"/>
</div>
</root>
My desired output is this:
<?xml version="1.0" encoding="UTF-8"?>
<book>
<chapter number="1">
<title>chapter1</title>
<p>Paragraph 1 text.</p>
<p>Paragraph 2 text with <emphasis>emphasis</emphasis> added.</p>
<p>Paragraph 3 text.</p>
</chapter>
<chapter number="2">
<title>chapter2</title>
<p>Paragraph 1 text.</p>
<p>Paragraph 2 text with <emphasis>emphasis</emphasis> added.</p>
<p>Paragraph 3 text.</p>
<p>Paragraph 4 with <bold>bold</bold> text.</p>
</chapter>
</book>
Apart from the indentation, I think
    <xsl:template match="/root">
        <xsl:apply-templates select="./div[@sID='book1']"/>
    </xsl:template>
    <xsl:template match="div">
        <book>
            <xsl:for-each-group select="*|text()[normalize-space()]"
                                group-starting-with="chapter[@sID]">
                <chapter number="{@n}">
                    <title>{@sID}</title>
                    <xsl:for-each-group select="current-group() except ."
                                        group-starting-with="p[@sID]">
                        <p>
                            <xsl:apply-templates select="current-group() except ."/>
                        </p>
                    </xsl:for-each-group>
                </chapter>
            </xsl:for-each-group>
        </book>
    </xsl:template>
    <xsl:template match="p[@sID, @eID][not(node())] | chapter[@eID]"/>
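achieves that, assuming XSLT 3.0 with expand-text="yes" on the stylesheet
so that {@sID} inside <title> is evaluated as a text value template.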
Obviously, if both chapters and paragraphs have start and end marker
elements, it might also be worth wrapping group-starting-with around
group-ending-with (or going for XQuery windowing with start/end).
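A minimal, untested sketch of that nesting, assuming XSLT 3.0 with
expand-text="yes" and a shallow-copy default mode (both are my assumptions,
the stylesheet element is not shown above). Each group starts at its sID
marker, is then cut at the matching eID marker, and only the first subgroup
is kept:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    version="3.0" expand-text="yes">
        <!-- assumption: emphasis, bold etc. are copied through unchanged -->
        <xsl:mode on-no-match="shallow-copy"/>
        <xsl:template match="/root">
            <xsl:apply-templates select="div[@sID='book1']"/>
        </xsl:template>
        <xsl:template match="div">
            <book>
                <!-- a chapter starts at its sID marker ... -->
                <xsl:for-each-group select="*|text()[normalize-space()]"
                                    group-starting-with="chapter[@sID]">
                    <!-- ... and ends at its eID marker -->
                    <xsl:for-each-group select="current-group()"
                                        group-ending-with="chapter[@eID]">
                        <!-- only the first subgroup lies between the markers -->
                        <xsl:if test="position() = 1 and self::chapter[@sID]">
                            <chapter number="{@n}">
                                <title>{@sID}</title>
                                <!-- same start/end nesting for paragraphs -->
                                <xsl:for-each-group select="current-group()"
                                                    group-starting-with="p[@sID]">
                                    <xsl:for-each-group select="current-group()"
                                                        group-ending-with="p[@eID]">
                                        <xsl:if test="position() = 1 and self::p[@sID]">
                                            <p>
                                                <!-- drop the two marker elements -->
                                                <xsl:apply-templates
                                                    select="current-group()[not(self::p[@sID or @eID])]"/>
                                            </p>
                                        </xsl:if>
                                    </xsl:for-each-group>
                                </xsl:for-each-group>
                            </chapter>
                        </xsl:if>
                    </xsl:for-each-group>
                </xsl:for-each-group>
            </book>
        </xsl:template>
    </xsl:stylesheet>
The difference from the simpler version should only show up when there is
content between an end marker and the next start marker: here it is dropped
instead of being pulled into the preceding chapter or paragraph. The
groupings are nested inline rather than moved into a named template because
in XSLT 3.0 current-group() is not available inside a called template.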