Re: [xsl] Seek ways to make my streaming XSLT code run faster

Subject: Re: Seek ways to make my streaming XSLT code run faster (My streaming XSLT program has been running 12 hours and is only a quarter of the way to completion)
From: "Sheila Thomson coder@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 Aug 2025 22:47:52 -0000

Might it be quicker to load this document into an XML database and query it
with XQuery? Is that an option?

Sheila

On 9 August 2025 23:39:41 BST, "Martin Honnen martin.honnen@xxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>On 10/08/2025 00:25, Roger L Costello costello@xxxxxxxxx wrote:
>> Hi Folks,
>>
>> My XML document consists of 5 million <record> elements:
>>
>> <records>
>>        <record>...</record>
>>        <record>...</record>
>> </records>
>>
>> Each <record> element has a child element that indicates the type of (aviation) data in the record:
>>
>> <records>
>>      <record>
>>          <VHF_NAVAID_Primary_Records>...</VHF_NAVAID_Primary_Records>
>>      </record>
>>      <record>
>>          <Airport_SID_Primary_Records>...</Airport_SID_Primary_Records>
>>      </record>
>> </records>
>>
>> Each of the child elements contains elements appropriate to its type:
>>
>> <records>
>>      <record>
>>          <VHF_NAVAID_Primary_Records>
>>              <VOR_Identifier>ABC </VOR_Identifier>
>>              <DME_Ident>AND </DME_Ident>
>>          </VHF_NAVAID_Primary_Records>
>>      </record>
>>      <record>
>>          <Airport_SID_Primary_Records>
>>              <SID_Identifier>ABC </SID_Identifier>
>>          </Airport_SID_Primary_Records>
>>      </record>
>> </records>
>>
>> I want to find all <record> elements whose child is not an <Airport_SID_Primary_Records> element and whose child element contains an identifier element with value "ABC ". Here's the output I desire:
>>
>> <results>
>>      <result>
>>          <identifier>ABC </identifier>
>>          <record>VHF_NAVAID_Primary_Records</record>
>>          <field>
>>              <VOR_Identifier>ABC </VOR_Identifier>
>>          </field>
>>      </result>
>> </results>
>>
>> Identifier "ABC " is just one of 1900 identifiers. These identifiers are stored in an XML file, identifiers.xml:
>>
>> <identifiers>
>>     <identifier>ABC </identifier>
>>     <identifier>DEF </identifier>
>> </identifiers>
>>
>> I want to iterate over all 1900 identifiers and, for each of them, iterate over all 5 million records to see which records contain the identifier. There is a loop within a loop:
>>
>> For each of the 1900 identifiers do
>>      For each of the 5 million records do
>>           Check record against identifier
>>
>> I am using streaming XSLT to accomplish this task.
>>
>> My streaming program has been running 12 hours and it has only processed a quarter of the identifiers. I'd like to see if you have suggestions on ways to speed up my streaming program. I am thinking that this part of my program is probably slow:
>>
>> <xsl:for-each select="*[name(.) ne 'Airport_SID_Primary_Records']
>>                        [name(.) ne 'Airport_STAR_Primary_Records']
>>                        [name(.) ne 'Airport_Approach_Primary_Records']
>>                        [ends-with(name(.),'Primary_Records')]">
>>      <xsl:for-each select="*[(. eq $identifier) and (name(.) ne 'Recommended_Navaid')]">
>>          <result>
>>              <identifier><xsl:value-of select="$identifier"/></identifier>
>>              <record><xsl:value-of select="name(..)"/></record>
>>              <field><xsl:sequence select="."/></field>
>>          </result>
>>      </xsl:for-each>
>> </xsl:for-each>
>>
>> That code is for processing a <record> element. The code checks that the <record> element's child element is not an <Airport_SID_Primary_Records> element, not an <Airport_STAR_Primary_Records> element, not an <Airport_Approach_Primary_Records> element, and that its element name ends with "Primary_Records". If the <record> element's child element satisfies all those criteria, then the code iterates over all the elements inside the <record> element's child element whose value matches the identifier and whose element name is not "Recommended_Navaid".
>>
>> Is there a way to rewrite the code to make it execute faster?
>>
>> Here is my complete program:
>>
>> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>      xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>      exclude-result-prefixes="#all"
>>      version="3.0">
>>      <xsl:output method="xml" />
>>      <xsl:variable name="identifiers" select="doc('identifiers.xml')/*"/>
>>      <xsl:template name="main">
>>          <results>
>>              <xsl:for-each select="$identifiers/*">
>>                  <xsl:variable name="identifier" select="." as="xs:string"/>
>>                  <xsl:source-document href="records.xml" streamable="yes">
>
>
>So that approach processes the records.xml 1900 times with streaming.
>
>Is the order of the resulting elements important? Otherwise you could stream once and check all your identifiers for each record, as I tried to indicate in my first answer:
>
>
><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>    exclude-result-prefixes="#all"
>    version="3.0">
>
>    <xsl:output method="xml" />
>
>    <xsl:variable name="identifiers" select="doc('idents.xml')/*"/>
>
>    <xsl:template name="main">
>        <xsl:source-document href="records.xml" streamable="yes">
>            <results>
>                <xsl:for-each select="records/record">
>                    <xsl:variable name="record" select="copy-of(.)"/>
>                    <xsl:for-each select="$identifiers/identifier">
>                        <xsl:variable name="ident" select="."/>
>                        ... (use $ident to process $record)
>                    </xsl:for-each>
>                </xsl:for-each>
>            </results>
>        </xsl:source-document>
>    </xsl:template>
>
></xsl:stylesheet>
>
>
>That should give you the same elements as you intend, while streaming only once through the 5 million records. The order of the elements in the result may differ; I am not sure whether that matters.
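
The single-pass idea can be taken one step further by replacing the inner loop over all 1900 identifiers with a map lookup, so that each candidate field is tested once per record instead of 1900 times. What follows is a minimal, untested sketch rather than a definitive solution. It assumes that the identifiers live in identifiers.xml as in the original post, that a field matches an identifier by exact string value (trailing space included), and that the order of the results does not matter. The variable name ident-set is hypothetical.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="#all"
    version="3.0">

    <xsl:output method="xml"/>

    <!-- Hypothetical lookup table: one map entry per identifier string.
         Assumes identifiers.xml has the <identifiers>/<identifier> layout
         shown in the original post. -->
    <xsl:variable name="ident-set" as="map(xs:string, xs:boolean)"
        select="map:merge(doc('identifiers.xml')/identifiers/identifier
                          ! map { string(.) : true() })"/>

    <xsl:template name="main">
        <xsl:source-document href="records.xml" streamable="yes">
            <results>
                <xsl:for-each select="records/record">
                    <!-- Copy one record into memory; the rest of the body
                         navigates this small in-memory tree freely. -->
                    <xsl:variable name="record" select="copy-of(.)"/>
                    <!-- Same name filters as the original code, then a
                         single map lookup per field instead of a loop
                         over every identifier. -->
                    <xsl:for-each select="$record/*
                        [ends-with(name(), 'Primary_Records')]
                        [not(name() = ('Airport_SID_Primary_Records',
                                       'Airport_STAR_Primary_Records',
                                       'Airport_Approach_Primary_Records'))]
                        /*[name() ne 'Recommended_Navaid']
                          [map:contains($ident-set, string(.))]">
                        <result>
                            <identifier><xsl:value-of select="."/></identifier>
                            <record><xsl:value-of select="name(..)"/></record>
                            <field><xsl:sequence select="."/></field>
                        </result>
                    </xsl:for-each>
                </xsl:for-each>
            </results>
        </xsl:source-document>
    </xsl:template>

</xsl:stylesheet>
```

As with the stylesheet quoted above, the results come out in record order rather than grouped by identifier, so this only helps if that ordering is acceptable.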

Current Thread

Seek ways to make my streaming XSLT code run faster (My streaming XSLT program has been running 12 hours and is only a quarter of the way to completion)
- Roger L Costello costello@xxxxxxxxx - 9 Aug 2025 22:25:27 -0000
  - Martin Honnen martin.honnen@xxxxxx - 9 Aug 2025 22:39:40 -0000
    - Sheila Thomson coder@xxxxxxxxxxxxxxx - 9 Aug 2025 22:47:52 -0000 <=
      - Graydon graydon@xxxxxxxxx - 9 Aug 2025 22:57:35 -0000
    - Martin Honnen martin.honnen@xxxxxx - 9 Aug 2025 22:50:13 -0000
  - Liam R. E. Quin liam@xxxxxxxxxxxxxxxx - 9 Aug 2025 22:59:57 -0000
    - Liam R. E. Quin liam@xxxxxxxxxxxxxxxx - 9 Aug 2025 23:32:18 -0000
