Hi Folks,

This is an outstanding discussion. Many questions have been raised, so it would be good to collect them all in one place:

> How many XML files are to be stored and queried? How big are they?

There are 50 million XML files, each 50MB in size.

> What's the complexity of the XML: is there deep nesting or is it flat?

The files are mostly flat (not deeply nested).

> Are the XML files volatile or static?

The XML files are relatively static - a few are updated to correct errors, but most stay the same.

> Are there requirements for further processing or consuming them as XML
> elsewhere or are they just a query source?

The XML files are just a query source. The results of the queries on the XML documents are used as input to SAS and SPSS analytics.

> What type of queries, with what frequency?

We want multiple people to be able to query multiple times a day. Right now the query frequency is low because the queries take days to run.

> What kind of queries will you need to perform? Full text queries? XPath? XQuery?

The queries are done using XPath and XQuery.

> Do you know or care what the document vocabularies are?

The XML elements and attributes are very well known. The structure of the XML is well known.

Question: What is your recommendation for storing and querying this huge amount of XML?

/Roger
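
For a sense of scale, the file count and file size quoted above imply a total corpus of roughly 2.5 petabytes. The following is only a back-of-the-envelope sketch of that arithmetic; it assumes "MB" means 10**6 bytes, and the names in it are hypothetical, not taken from any system discussed in the thread:

```python
# Rough corpus size from the figures in the post.
# Assumption: "MB" is decimal (10**6 bytes). Names are illustrative only.
FILE_COUNT = 50_000_000          # 50 million XML files
FILE_SIZE_BYTES = 50 * 10**6     # 50MB per file


def corpus_size_bytes(file_count: int = FILE_COUNT,
                      file_size: int = FILE_SIZE_BYTES) -> int:
    """Return the total size in bytes of file_count files of file_size bytes each."""
    return file_count * file_size


total = corpus_size_bytes()
print(f"{total / 10**15:.1f} PB")  # prints "2.5 PB"
```

With binary units (MiB) the total would come out slightly different, but the order of magnitude is the same.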



