
  • From: "Costello, Roger L." <costello@m...>
  • To: "xml-dev@l..." <xml-dev@l...>
  • Date: Thu, 5 Mar 2015 16:59:44 +0000

Hi Folks,

This is an outstanding discussion. Many questions have been raised, so it would be good to collect them in one place, along with the answers:

>	How many XML files are to be stored and queried? How big are they?

There are 50 million XML files, each 50 MB in size.
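
For a sense of scale, here is a quick back-of-the-envelope calculation of the total volume. It is only a sketch: it assumes every file is exactly 50 MB and uses decimal units (1 MB = 10**6 bytes); the variable names are illustrative.

```python
# Rough corpus size, assuming decimal units (1 MB = 10**6 bytes)
# and that every file is exactly 50 MB.
FILE_COUNT = 50_000_000
FILE_SIZE_MB = 50

total_mb = FILE_COUNT * FILE_SIZE_MB
total_tb = total_mb / 10**6   # 1 TB = 10**6 MB
total_pb = total_mb / 10**9   # 1 PB = 10**9 MB

print(f"{total_mb:,} MB = {total_tb:,.0f} TB = {total_pb} PB")
```

Under those assumptions the collection comes to roughly 2.5 PB, before any indexing or replication overhead.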

>	What's the complexity of the XML: is there deep nesting or is it flat?

The files are mostly flat (not deeply nested).

>	Are the XML files volatile or static?

The XML files are relatively static: a few are updated to correct errors, but most stay the same.

>	Are there requirements for further processing or consuming them as XML 
>	elsewhere or are they just a query source?

The XML files are just a query source. The results of the queries on the XML documents are used as input to SAS and SPSS analytics.

>	What type of queries, with what frequency?

We want multiple people to query multiple times a day. Right now the query frequency is low because the queries take days to run.

>	What kind of queries will you need to perform? Full-text queries? XPath? XQuery?

The queries are done using XPath and XQuery.
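
To make the shape of such a query concrete, here is a minimal sketch in Python of a path query over a small, flat document. The element names (`records`, `record`, `region`, `age`) and the helper `select_text` are hypothetical, invented for illustration, and are not taken from the actual files. Note also that the standard-library ElementTree module supports only a limited subset of XPath and no XQuery, so this shows the style of query rather than the tooling in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat document; the element names are invented for illustration.
SAMPLE = """\
<records>
  <record><region>north</region><age>34</age></record>
  <record><region>south</region><age>51</age></record>
  <record><region>north</region><age>47</age></record>
</records>"""


def select_text(xml_text, path):
    """Return the text of every element matching an ElementTree path expression."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path)]


# Select the <age> of every <record> whose <region> child is exactly "north".
north_ages = select_text(SAMPLE, "./record[region='north']/age")
print(north_ages)
```

The result is a plain list of values, which is the kind of tabular output that could then be handed to an analytics package.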

>	Do you know or care what the document vocabularies are?

The XML elements and attributes are very well known. The structure of the XML is well known.

Question: What is your recommendation for storing and querying this huge amount of XML?

/Roger
 


