Subject: RE: regex grouping precedence.
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 29 Sep 2004 09:19:41 +0100
The groups are numbered by counting left brackets: the 5th unescaped left
bracket starts group 5, regardless of where the closing brackets are, and
regardless of other operators such as "|".
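
A rough way to check the rule is sketched below. It uses Python's re module only because it is quick to run; Python numbers groups by the same left-bracket count, but its regex dialect is not the XPath one. Two things in the sketch are assumptions rather than part of the original post: " +" stands in for "\p{Zs}+" (which re does not support), and the "regrouped" pattern is only one guess at what the stylesheet intended. Note also that Python reports a group that took no part in the match as None, where regex-group() gives an empty string.

```python
import re

# Nested groups: numbering follows the order of the opening brackets.
nested = re.fullmatch(r"(...)((..)(.).)(....)", "abcdefghijk")
nested_groups = nested.groups()  # groups 1..5, in left-bracket order

# The posted pattern with the "x"-flag whitespace removed and "{{6}}"
# un-doubled (the doubling is only attribute-value-template escaping).
# Assumption: " +" replaces "\p{Zs}+", which Python's re lacks.
posted = r"([0-9]{6})(,,)|(,([0-9]{6}) +,(.*))$"
line = "500748,500748 ,Set My People Free"
hit = re.search(posted, line)

# "|" splits the whole pattern, so only the right-hand alternative
# matches here: groups 1 and 2 are unset, groups 3, 4 and 5 are set.
posted_groups = hit.groups()
unmatched_prefix = line[:hit.start()]  # text left for non-matching-substring

# Hypothetical regrouping: one extra left bracket around the alternation
# shifts every later group number up by one (title becomes group 6).
regrouped = r"([0-9]{6})((,,)|(,([0-9]{6}) +,(.*)))$"
fixed = re.search(regrouped, line)

print(nested_groups)
print(posted_groups, repr(unmatched_prefix))
print(fixed.group(1), fixed.group(5), fixed.group(6))
```

With the posted pattern the first six digits fall outside the match, which is consistent with the <n>500748</n> and empty <bibno/> in the output quoted below.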
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Pawson, David [mailto:David.Pawson@xxxxxxxxxxx]
> Sent: 29 September 2004 08:23
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: regex grouping precedence.
>
> http://www.w3.org/TR/xslt20/#element-matching-substring seems to say
> little about how nested grouping is numbered.
>
> (...) (....) ( ....)
> gives regex-group (1,2,3) OK.
>
> (...) (....)| ( ....)
> Is the third group counted as two due to the alternates?
> or still 3?
>
> (...) ( (..)(.).) ( ....)
> 1     2 3   4     5
> is how I would expect to number them,
> but I'm totally unsure.
>
> Using xslt 2 to parse a plain text file;
>
> Input string
> 500748,500748 ,Set My People Free
>
> regex
>
> <xsl:for-each select='tokenize($f, "[\r]?\n")'>
> <r>
>
> <xsl:analyze-string flags="ix"
> regex="([0-9]{{6}})
> (,,)|(,([0-9]{{6}})\p{{Zs}}+,(.*))$"
> select=".">
> <xsl:matching-substring>
> <bibno><xsl:value-of select="regex-group(1)"/></bibno>
> <ck><xsl:value-of
> select="normalize-space(regex-group(4))"/></ck>
> <ttl><xsl:value-of
> select="normalize-space(regex-group(5))"/></ttl>
> </xsl:matching-substring>
> <xsl:non-matching-substring>
> <n><xsl:value-of select="."/></n>
> </xsl:non-matching-substring>
> </xsl:analyze-string>
>
>
> </r>
> </xsl:for-each>
>
>
> output is
>
> <n>500748</n>
> <bibno/>
> <ck>500748</ck>
> <ttl>Set My People Free</ttl>
>
>
> Regards DaveP.