I realize this is the XSL list, and don't get me wrong, I *love*
XSLT. And while I'm singing XSLT's (and thus XPath's) praises, this
particular task looks like a fun one to attack with Hans-Jürgen's
FOXpath (which is an extension of XPath to handle the file
system).[1]
But that said, this strikes me as a task better handled by your shell
than your XSLT engine, no? In bash, e.g.,
$ fgrep -f filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml
gives you the answer, as it were, but not in the format you want.
I think to get the results you want (the phrase "[filename] was found
in [filepath]") you have to issue the fgrep command once for each
search term, instead of all-at-once. E.g., I think the following will
do the trick.
$ for fn in `cat filenames_from_directory_listing.txt` ; do fgrep -l -e "$fn" \
    dir1/*.xml dir2/*.xml | perl -pe "s,^.*\$,$fn was found in \$&,;" ; done
These methods presume that none of the names in
filenames_from_directory_listing.txt contain any whitespace.
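If some names do turn out to contain spaces, reading the list one line at a time avoids the word splitting that the backtick-`cat` loop performs. What follows is only a sketch under a few assumptions: the list has exactly one name per line, `grep -F` stands in for `fgrep` (the two are equivalent), and `report_matches` is a name I made up for illustration.

```shell
# Print "<name> was found in <file>" for every name in the list file
# (first argument) that occurs as plain text in any of the remaining files.
# report_matches is a hypothetical helper name, not an existing command.
report_matches() {
    list=$1
    shift
    # Read one name per line; IFS= and -r keep spaces and backslashes intact.
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while IFS= read -r fn || [ -n "$fn" ]; do
        [ -n "$fn" ] || continue    # skip blank lines
        # -F: fixed string, -l: print only the names of matching files
        grep -F -l -e "$fn" -- "$@" | while IFS= read -r path; do
            printf '%s was found in %s\n' "$fn" "$path"
        done
    done < "$list"
}
```

It would be called as `report_matches filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml`. Note that at least one XML file has to be passed; with none, grep would read from standard input instead.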
And, of course, one nice thing about this approach is that just by
using `egrep` instead of `fgrep`, you can search for regular
expressions, e.g.,
"meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)". :-)
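To get the "[filename] was found in [filepath]" phrasing out of a regular-expression search, one option is to ask grep for just the matched text. This is a rough sketch that assumes your grep supports `-o` (GNU and BSD grep do, but it is not in POSIX); `grep -E` is the equivalent of `egrep`, and `report_regex_matches` is a hypothetical name used only here.

```shell
# The extended regular expression from the example above.
pattern='meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)'

# For each file given, print "<matched text> was found in <file>" once per
# distinct match. report_regex_matches is a hypothetical helper name.
report_regex_matches() {
    for path in "$@"; do
        # -E: extended regex, -o: print only the matched part of each line
        # (assumption: -o is available; it is a common non-POSIX extension)
        grep -E -o -e "$pattern" -- "$path" | sort -u |
        while IFS= read -r hit; do
            printf '%s was found in %s\n' "$hit" "$path"
        done
    done
}
```

It would be called as `report_regex_matches dir1/*.xml dir2/*.xml`. The `sort -u` step is there so that a name mentioned several times in one file is reported only once for that file.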
Notes
-----
[1] See
https://www.balisage.net/Proceedings/vol17/html/Rennau01/BalisageVol17-Rennau01.html
> Hi this is my first post here - looking for help - apologies if
> there's something I've overlooked!
>
> I have a tokenized variable that contains list of filenames from a
> .txt of a directory listing. I want to look for those filenames in
> a number of xml files in a number of subdirectories. If the
> filename is found, I want to output that "filename" was found in
> "xmlfile".
>
> There are a lot of xml directories and they are not static. Same
> with xml files. The filenames are not tagged in the xml, so I'm
> just looking for their plain text occurrence in the file.
>
> Any help would be appreciated.
>
> to make the examples easier - I want to use
>
> $filenames_to_find (tokenized list of filenames from a .txt
> directory listing)
>
> to search against
>
> dir1/*.xml
> dir2/*.xml
> with the output being
>
> filename was found in xmlfilename
>
> I'm using an academic version of Oxygen XML so I think I have Saxon
> through that and I have the standalone Saxon file for running this
> from the command line.
>
> I've gotten this far, but it doesn't work. I know it's broken, but
> I don't know how to fix it!
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:xs="http://www.w3.org/2001/XMLSchema"
> xmlns:h="http://www.w3.org/1999/xhtml"
> exclude-result-prefixes="xs"
> version="3.0"
> expand-text="yes"
> >
>
> <xsl:variable name="filenames_from_directory_listing"
> as="xs:string"
> select="unparsed-text('filenames_from_directory_listing.txt')"/>
> <xsl:variable name="filenames_to_find"
> select="tokenize($filenames_from_directory_listing, '\s+')"/>
>
> <xsl:template match="/">
> <xsl:for-each select="collection('.?select=*.xml;recurse=yes')"/>
> <xsl:variable name="xml_filenames" select="."/>
> <xsl:for-each select="$filenames_to_find">
> <xsl:if test="(contains($t, .))">
> <xsl:message>{document-uri($xml_filenames)} contains {.}</xsl:message>
> </xsl:if>
> </xsl:for-each>
> </xsl:template>
> </xsl:stylesheet>
>
> Any suggestions? Clearly I am an XSL novice. Thanks for your patience.
--
Syd Bauman, NRP
Senior XML Programmer/Analyst
Northeastern University Women Writers Project
s.bauman@xxxxxxxxxxxxxxxx or
Syd_Bauman@xxxxxxxxxxxxxxxx
Syd Bauman s.bauman@xxxxxxxxxxxxxxxx - 4 Oct 2018 16:50:35 -0000