[Home] [By Thread] [By Date] [Recent Entries]

Subject: Approaches for validating database transformations using XML/XSLT tools?
From: "Roger L Costello costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 15 Mar 2026 18:49:44 -0000
Hi Folks,
I have been given a very short-term task (40 hours total) to do some testing
and I wanted to get your input before starting.
A person whom I will call Jane wrote Python code that extracts data from
tables in a Workday database, processes the data, and then inserts the
processed data into a user database called the Semantic database from which
users will make their queries. The structure of the Workday database is
considered too complicated for users to query directly. The Semantic database
is simpler (e.g., table names and column headers are more user-friendly) and
contains less data (users do not need all the details that are in the Workday
tables).
My task is to test whether the Python code$B!G(Bs transformation is
complete, consistent, and accurate.
My plan is to leverage XML tools.
I can use SQL to obtain the data that the Python code extracted from the
Workday database. The front-end application to Workday and Semantic allows me
to run SQL commands. The application can display the retrieved data in a raw
form or export it as a CSV file.
Here are two possible approaches that I could take:
1. Export to CSV and convert to XML via Excel
Get a CSV file from querying the Workday database. Do the same for the
Semantic database. Open each CSV file in Excel and then File $B"*(B Save As
XML. Now I have XML. (Yes!) I can then create an XML Schema for the Workday
data and an XML Schema for the Semantic data. I can also create a Schematron
schema to check constraints between the two datasets. Validate, validate,
validate.
The advantage of this approach is that I can get the data into XML quickly.
The disadvantage is that Excel, when generating XML, might silently change
some data-for example, dropping non-printable characters or converting data
from the character set used by Workday into UTF-8. Do you know whether Excel
performs such behind-the-scenes transformations? Such transformations would be
undesirable-I want full knowledge of what data is in the Workday database so I
can see if any data is missing or altered in the Semantic database.
2. Extract raw text and transform with XSLT
I could query the Workday database using the front-end application, capture
the raw data displayed by the application into a text file (select, copy,
paste), and then use the XSLT unparsed-text() function to read the data and
generate XML (while carefully inspecting for non-printable characters and
character-set issues). I would do the same for the Semantic database.
The advantage of this approach is that I would have more control over how the
XML is generated and there would be no silent, behind-the-scenes conversions.
The disadvantage is that writing XSLT to carefully inspect the data and
generate XML will take time, and I do not have much time.
Perhaps there are other approaches?
What do you recommend?
/Roger

Current Thread
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member