Dear all,

This is a very interesting discussion. Let me add my two cents.

We often compare XML and JSON in their “raw” form. I prefer to compare them from a data modelling perspective.

I like to advocate that JSON is only one of many syntaxes/formats for one and the same general logical data model: (heterogeneous) DataFrames. It is as if the XML Infoset, the PSVI, or the XQuery and XPath Data Model (XDM) had alternate syntaxes, as RDF does (Turtle, RDF/XML, JSON-LD…). Other such DataFrame syntaxes/formats are Avro, Parquet, YAML, ProtoBufs, LibSVM, and so on. This is, in fact, the vision behind the JSONiq/JSound/RumbleDB ecosystem [1].

What (sadly) makes it hard for XML technologies to be accepted in the JSON/DataFrame community is a significant impedance mismatch between the models, meaning that XML “cannot be” one of the DataFrame syntaxes in a straightforward way: in JSON/DataFrames, names (object keys, array positions) are on the edges; in XML, names (element/attribute names) are on the nodes.

For this and other reasons, elements and attributes are not directly mappable to abstract data types like Maps and Lists, unlike JSON/DataFrame objects/structs and arrays. This partly explains why there are so many syntaxes and formats for DataFrames: these simple, high-level data types make the ecosystem easier to extend. It also explains why JSON support needed a non-trivial extension of the XDM in XQuery 3.1.

I said “sadly” because it still holds that 95% of the XML ecosystem is directly relevant for DataFrames, and I wish this were embraced more broadly by data scientists. This is why JSONiq reuses as much as possible from XQuery, and JSound from XML Schema.

Kind regards,
Ghislain

[1] https://www.research-collection.ethz.ch/handle/20.500.11850/523504
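P.S. To make the edges-versus-nodes point concrete, here is a minimal sketch in Python, using only the standard library. The sample documents and the helper names `json_edge_names` and `xml_node_names` are made up for illustration; they are not part of JSONiq, JSound or RumbleDB.

```python
import json
import xml.etree.ElementTree as ET

# JSON: names live on the edges. The parsed value is a plain map/list;
# the inner object does not know it was reached via the key "address".
doc = json.loads(
    '{"name": "Ada", "address": {"city": "London"}, "tags": ["a", "b"]}'
)
address = doc["address"]         # plain dict, carries no name of its own
renamed = {"location": address}  # same value, reached via a different edge label

# XML: names live on the nodes. An element carries its own name
# wherever it is placed.
root = ET.fromstring(
    "<person><name>Ada</name><address><city>London</city></address></person>"
)
address_el = root.find("address")
wrapper = ET.Element("wrapper")
wrapper.append(address_el)       # still an <address> element inside <wrapper>


def json_edge_names(value):
    """Return the edge labels (keys or positions) leaving a JSON value."""
    if isinstance(value, dict):
        return list(value.keys())
    if isinstance(value, list):
        return list(range(len(value)))
    return []


def xml_node_names(element):
    """Return the names carried by the child nodes of an XML element."""
    return [child.tag for child in element]


print(json_edge_names(doc))          # labels come from the parent map
print(json_edge_names(doc["tags"]))  # array positions play the same role
print(xml_node_names(root))          # labels come from the children themselves
print(wrapper[0].tag)                # the name travelled with the node
```

The dict under `"address"` can be re-attached under any other key without changing it, which is what lets objects and arrays map directly onto Maps and Lists. The `<address>` element keeps its name when moved, so a mapping to a Map has to decide what to do with that name.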



