Hi Folks,

Once again, many thanks for your excellent comments. Below I examine a tightly focused scenario (a common one, I think). I seek your thoughts on it.

SCENARIO

A web service is deployed. It has a URL. When the URL is dereferenced (by a client), the web service returns some data. The data has well-defined:

- syntax, expressed by a grammar-based language (XML Schema, Relax NG, or DTD)
- relationships, expressed by a Schematron schema
- semantics, expressed by a data dictionary, ontology, English prose, or some combination thereof

Over time the data changes. That is, the syntax, relationships, and semantics change.

There are N clients. A client dereferences the URL to retrieve the data. Each client is a "sink" of the data. That is, a client consumes the data; it does not forward it to another party.

PROBLEM

Identify ways for the web service to version the data.

OTHER STRATEGIES POSSIBLE

Below are two versioning strategies. Others are possible, but I think these two are a good starting point.

DATA VERSIONING STRATEGY #1 - ONE ACTIVE VERSION OF THE DATA

The web service supports only one version of the data at a time. The web service has only one URL. When the web service updates to a new version, it discontinues support for the old version. Clients are required to update in lockstep. Changes to the data are driven by changing data requirements.

Advantages

1. The data is unconstrained in how it evolves. That is, it is not limited to backward- or forward-compatible changes. This is good, as it allows the data to evolve based on application requirements rather than technology limitations.

2. The machinery behind the web service - application code and database - has to handle only one version. This is good: there is no redundancy, which may minimize cost.

3. The web service's business process is simply: "Here's the data." There is no need to create extensible schemas and no need to tell clients "Accept unknown elements." For my clients, extensible schemas are a security risk, so schemas that specify exactly what is permitted are good.

Disadvantages

1. Clients may not be able to migrate at the same pace as the web service. Dawdling clients will be locked out of the web service.

2. Each new version may entail large and costly changes to client applications. If changes to the data were designed for backward or forward compatibility (a technology constraint), rather than driven purely by changing data requirements (which may break compatibility), changes to client applications might be more incremental and less costly. (Note: this is conjecture. Evidence is needed. What are your thoughts on this?)

3. Many little changes would overwhelm the clients; so, under pressure from clients, the web service will likely be forced to make large, infrequent changes. This may be bad for clients who need "that little piece of data ASAP."

DATA VERSIONING STRATEGY #2 - MULTIPLE ACTIVE VERSIONS OF THE DATA

Every time a new version of the data is created, a new URL is created. Old versions are maintained. Clients use whatever version they desire. Changes to the data are driven by changing data requirements.

Advantages

1. Clients can upgrade to a new version at their leisure. No lockstep migration. This is good, as it allows clients to upgrade when they have the necessary resources and the need.

2. The data is unconstrained in how it evolves. That is, it is not limited to backward- or forward-compatible changes. This is good, as it allows the data to evolve based on application requirements rather than technology limitations.

3. The web service and clients are decoupled; each can evolve at its own pace. This is good, as the web service's advances are not hindered by dawdling clients.

4. The web service's business process is simply: "Here's the data." There is no need to create extensible schemas and no need to tell clients "Accept unknown elements." For my clients, extensible schemas are a security risk, so schemas that specify exactly what is permitted are good.

5. Rapid changes to the data can be made without impacting clients. This is good because oftentimes a small additional piece of data is needed, and it's needed ASAP.

Disadvantages

1. The machinery behind the web service - application code and database - must handle multiple concurrent versions. This is bad, as there may be redundancy and extra cost.

2. Each new version may entail large and costly changes to client applications. If changes to the data were designed for backward or forward compatibility (a technology constraint), rather than driven purely by changing data requirements (which may break compatibility), changes to client applications might be more incremental and less costly. (Note: this is conjecture. Evidence is needed. What are your thoughts on this?)

QUESTIONS

1. Can you think of other advantages and disadvantages of these two versioning strategies?

2. The above discussion pits:

- changing the data purely due to changes in data requirements, versus
- changing the data with a desire to keep it backward- or forward-compatible

"Evolving data due to application requirements versus technology limitations." Is it fair to say that these represent competing desires?

/Roger
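P.S. To make Strategy #2 concrete, here is a minimal Python sketch of the "one URL per version" idea. The version identifiers (v1, v2), the /data path, and the payloads are invented for illustration; nothing in the scenario prescribes them. The point is only that every old version stays dereferenceable alongside the new one, so no client is forced to migrate in lockstep.

```python
# Strategy #2 sketch: each data version lives at its own URL path.
# Version names, paths, and payloads below are hypothetical.

DATA_BY_VERSION = {
    # v1 of the data: a closed, minimal structure.
    "v1": "<book><title>Sample Title</title></book>",
    # v2 adds a new element; v1 clients are unaffected because
    # v1 remains available at its own URL.
    "v2": "<book><title>Sample Title</title><author>Sample Author</author></book>",
}

def dereference(path):
    """Resolve a versioned URL path such as /v1/data.

    Returns (status, payload): 200 with the payload for a known
    version, or 404 with None. Old versions are never retired,
    so clients upgrade at their leisure.
    """
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[1] == "data" and parts[0] in DATA_BY_VERSION:
        return 200, DATA_BY_VERSION[parts[0]]
    return 404, None
```

A v1 client keeps calling /v1/data indefinitely; publishing v2 only adds a new entry, it never changes an existing one. The cost, as noted under Disadvantages, is that the service must keep every entry in this table backed by working code and data.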
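P.P.S. On "schemas that specify exactly what is permitted": the sketch below shows the closed-content-model idea in Python rather than in XML Schema or Relax NG, using only the standard library. The element names are hypothetical. Any element not explicitly whitelisted fails validation, which is the opposite of telling clients "Accept unknown elements."

```python
import xml.etree.ElementTree as ET

# A closed (non-extensible) content model: for each permitted element,
# the exact set of child elements it may contain. Element names here
# are invented for the example.
PERMITTED = {
    "book": {"title", "author"},
    "title": set(),
    "author": set(),
}

def validate(xml_text, root_name="book"):
    """Return True only if every element in the document is permitted.

    Unknown elements anywhere in the tree cause rejection, so the
    schema admits exactly what is listed and nothing more.
    """
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        return False

    def check(elem):
        allowed = PERMITTED.get(elem.tag)
        if allowed is None:
            return False
        return all(child.tag in allowed and check(child) for child in elem)

    return check(root)
```

Under Strategy #2, each versioned URL would carry its own such closed model; under Strategy #1, the single model is simply replaced when the data changes.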