Subject: Re: Find inconsistencies: Perl or XSLT?
From: Manuel Souto Pico <m.soutopico@xxxxxxxxx>
Date: Thu, 2 Dec 2010 01:23:49 +0100
Hi guys,
Thanks a lot for all your answers. It looks like XSLT can be used for
everything :)
What Michael wrote was exactly what I needed. I just tweaked the output
a bit to make it more human-readable:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
    exclude-result-prefixes="xd"
    version="2.0">
  <xsl:output method="text"/>
  <xsl:template match="file">
    <xsl:text>INCONSISTENCIES FOUND&#10;</xsl:text>
    <xsl:for-each-group select="unit" group-by="source">
      <!-- Report only sources that have more than one distinct translation -->
      <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
        <xsl:text>&#10;Segment [</xsl:text>
        <xsl:value-of select="current-grouping-key()"/>
        <xsl:text>]&#10;translated as:&#10;[</xsl:text>
        <!-- &#10; keeps the line break; a literal newline in an attribute is normalized to a space -->
        <xsl:value-of select="distinct-values(current-group()/target)"
            separator="] and&#10;["/>
        <xsl:text>].&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
So I get
INCONSISTENCIES FOUND
Segment [bleble]
translated as:
[pleple] and
[lolailo].
That function conflicts-for must be quite new, it's not in my O'Reilly book.
Once again: really, thanks a lot.
Cheers, Manuel
2010/12/1 Michael Kay <mike@xxxxxxxxxxxx>:
> On 01/12/2010 14:46, Manuel Souto Pico wrote:
>>
>> Dear all,
>>
>> I need to process some files and I know how to do it in Perl, but as
>> has been the case in the past with other things, perhaps
>> there's an (objectively) simpler or more efficient way to do it with
>> XSLT.
>>
>> I have a file like this
>>
>> <unit id="1">
>>   <source>blabla</source>
>>   <target>plapla</target>
>> </unit>
>> <unit id="2">
>>   <source>bleble</source>
>>   <target>pleple</target>
>> </unit>
>> <unit id="3">
>>   <source>bloblo</source>
>>   <target>ploplo</target>
>> </unit>
>> <unit id="4">
>>   <source>blabla</source>
>>   <target>plapla</target>
>> </unit>
>> <unit id="5">
>>   <source>bleble</source>
>>   <target>lolailo</target>
>> </unit>
>>
>> I think the example is illustrative enough.
>>
>> The target element contains the translation of the source element, and
>> the same source must always be translated in the same way, but
>> sometimes it isn't. So what I'd like to do is find two or more units with
>> the same source but different targets (like 2 and 5 in the
>> example, but unlike 1 and 4).
>>
>> In Perl I would use an XML module (or not) and put the source elements
>> in the keys of a hash and the target elements in their corresponding
>> values. When assigning a new key-value pair, if the key already
>> exists, I compare the values. If they are equal, they pass; otherwise they
>> are flagged and included in the report.
>>
>> The report in this case would be something like:
>>
>> The following inconsistencies have been found
>> 2: bleble -> pleple
>> 5: bleble -> lolailo
>>
>> Is it possible to do this in XSLT? Is it more efficient than doing it
>> in Perl as I was planning to? My knowledge of XSLT is very limited and
>> I can't see beyond transforming an XML file into another XML file.
>>
>> Thanks a lot for your opinion.
>> Manuel
>>
>>
> Something like this:
>
> <xsl:for-each-group select="unit" group-by="source">
>   <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
>     <conflicts-for source="{current-grouping-key()}">
>       <xsl:value-of select="distinct-values(current-group()/target)"/>
>     </conflicts-for>
>   </xsl:if>
> </xsl:for-each-group>
>
> Michael Kay
> Saxonica
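
For reference, the same grouping idea could also produce a report closer to the one sketched in the original question, with one line per unit giving its id, source and target. What follows is only a sketch under a few assumptions: an XSLT 2.0 processor (xsl:for-each-group is not available in 1.0), unit elements sitting directly under a root element named file (as the match pattern in the stylesheet earlier in the thread implies), and exactly one source and one target per unit. The heading text and the "->" separator are taken from the example report in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="text"/>
  <!-- Assumption: the unit elements are children of a root element named file -->
  <xsl:template match="file">
    <xsl:text>The following inconsistencies have been found&#10;</xsl:text>
    <!-- Group the units by the string value of their source element -->
    <xsl:for-each-group select="unit" group-by="source">
      <!-- Keep only groups whose units disagree on the target -->
      <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
        <!-- One report line per unit in the group: id, source, target -->
        <xsl:for-each select="current-group()">
          <xsl:value-of
              select="concat(@id, ': ', source, ' -&gt; ', target, '&#10;')"/>
        </xsl:for-each>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

With the five sample units wrapped in a file element, this should list units 2 and 5 and skip units 1 and 4, since their targets agree. Note that every unit of a conflicting group is listed, including units that share a target with another unit in the same group.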