Hi Chris,
> I want to remove leading/trailing whitespace from certain DITA block
> elements. For example, I want to turn this:
I have been working with DITA and have found that there is *no need* to
remove leading/trailing spaces when publishing to PDF or HTML.
However, one output format does need whitespace removed from the DITA
input: Microsoft Word (.docx) output.
I have implemented this feature in the following code:
https://github.com/AntennaHouse/ah-wml/blob/master/com.antennahouse.wml/xsl/dita2wml_convmerged3.xsl
https://github.com/AntennaHouse/ah-wml/blob/master/com.antennahouse.wml/xsl/dita2wml_text_map.xsl
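For reference, the general idea of trimming relative to the nearest enclosing block element can be sketched roughly as below. This is only a minimal XSLT 2.0 illustration and is not taken from the stylesheets above. The element list is copied from the $elements variable in your mail, and the variable names ($block, $first, $last, $s) are hypothetical.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Trim each text node relative to its nearest enclosing block element. -->
  <xsl:template match="text()" priority="1">
    <!-- Nearest block ancestor ([1] on the reverse ancestor axis is the closest one).
         The element list is an assumption taken from the quoted $elements variable. -->
    <xsl:variable name="block"
        select="ancestor::*[self::desc or self::dt or self::entry or self::glossterm
                            or self::li or self::p or self::pre or self::shortdesc
                            or self::title][1]"/>
    <!-- First and last text nodes under that block, in document order.
         Both are empty when the text node has no block ancestor. -->
    <xsl:variable name="first" select="($block//text())[1]"/>
    <xsl:variable name="last" select="($block//text())[last()]"/>
    <!-- Strip leading whitespace only from the first text node of the block. -->
    <xsl:variable name="s"
        select="if (. is $first) then replace(., '^\s+', '') else string(.)"/>
    <!-- Strip trailing whitespace only from the last text node of the block. -->
    <xsl:value-of
        select="if (. is $last) then replace($s, '\s+$', '') else $s"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that this sketch behaves differently from your templates in one respect: whitespace-only text nodes between nested blocks (for example, the line break between <li> and a child <p>) are also trimmed, because they count as the first or last text node of the outer block. That may or may not be what you want.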
What is your use case that requires removing leading/trailing whitespace?
Regards,
On 2022/01/03 10:58, Chris Papademetrious
christopher.papademetrious@xxxxxxxxxxxx wrote:
> Hi Dimitre,
>
>
>
> Just some feedback from a novice... For me, this would be difficult to remember to determine if a node is in a sequence:
>
>
>
> exists(index-of($seq, $n, id-equal#2))
>
>
>
> A one-word operator for this would be easier for me to remember:
>
>
>
> $n in $seq
>
> $n is $seq
>
>
>
>
>
> Hi everyone (again),
>
>
>
> I was able to use the [$n intersect $seq] trick again today! And I'm proud of how it turned out, so I wanted to share it with you.
>
>
>
> I want to remove leading/trailing whitespace from certain DITA block elements. For example, I want to turn this:
>
>
>
> <p> This is some text.</p>
>
>
>
> into this:
>
>
>
> <p>This is some text.</p>
>
>
>
> But there are two tricky aspects:
>
>
>
> 1. The leading/trailing whitespace could be buried in a lower-level inline element:
>
>
>
> <p> Here is some text.</p>
>
> <p><b> Here</b> is some text.</p>
>
> <p><b><i> Here</i></b> is some text.</p>
>
>
>
> so I need to match the first effectively rendered descendant text() node of these block elements.
>
>
>
> 2. Some DITA block elements allow other DITA block elements in them:
>
>
>
> <p> This is a paragraph element.</p>
>
> <li> This is a list element.</li>
>
>
>
> <li>
>
> <p> This is a paragraph element in a list element.</p>
>
> </li>
>
>
>
> so I need the sibling-adjacency check to stop at the lowest-level enclosing block element.
>
>
>
> Here are the templates I came up with:
>
>
>
>
>
> <!-- look for leading/trailing text() nodes in these block elements -->
>
> <xsl:variable name="elements" select="//(desc|dt|entry|glossterm|li|p|pre|shortdesc|title)"/>
>
>
>
> <!-- remove leading whitespace from leading text() nodes in block elements -->
>
> <xsl:template match="text()
>
> [matches(., '^\s+')]
>
> [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
>
> [not(ancestor-or-self::node()
>
> [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
>
> [preceding-sibling::node()]
>
> )]">
>
> <xsl:variable name="results">
>
> <xsl:next-match/> <!-- apply other templates, if needed -->
>
> </xsl:variable>
>
> <xsl:value-of select="replace($results, '^\s+', '')"/>
>
> </xsl:template>
>
>
>
> <!-- remove trailing whitespace from trailing text() nodes in block elements -->
>
> <xsl:template match="text()
>
> [matches(., '\s+$')]
>
> [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
>
> [not(ancestor-or-self::node()
>
> [ancestor::*[. intersect $elements][not(descendant::*[. intersect $elements])]]
>
> [following-sibling::node()]
>
> )]">
>
> <xsl:variable name="results">
>
> <xsl:next-match/> <!-- apply other templates, if needed -->
>
> </xsl:variable>
>
> <xsl:value-of select="replace($results, '\s+$', '')"/>
>
> </xsl:template>
>
>
>
>
>
> Basically, it goes something like:
>
>
>
> Find the text() node:
>
>
>
> * That has leading/trailing whitespace
>
> * That is within a block element that does not contain some other lower-level block element
>
> * That is not itself, or has no ancestor up to (but not including) that block element, with a preceding/following sibling
>
>
>
> The hardest part was figuring out how to get all ancestors up to the first block element, but not past that. The nesting of [descendant::*[...]] within [ancestor::*] is probably not the most performant way to do this, but it gets the job done.
>
>
>
> And by using <xsl:next-match/>, the templates can work together to remove both leading and trailing whitespace from the same text() node, if needed.
>
>
>
> - Chris
>
>
>
>
>
>
>
--
/*--------------------------------------------------
Toshihiko Makita
Development Group. Antenna House, Inc. Ina Branch
E-Mail tmakita@xxxxxxxxxxxxx
8077-1 Horikita Minamiminowa Vil. Kamiina Co.
Nagano Pref. 399-4511 Japan
Tel +81-265-76-9300 Fax +81-265-78-1668
Web site:
http://www.antenna.co.jp/
http://www.antennahouse.com/
--------------------------------------------------*/