XML seems to add an unnecessary layer to this process and will introduce
other sources of errors -- especially if you add Excel to your pipeline.
It is a data mangler.
As Dave suggested, Python unit tests is a good start.
Then I'd write SQL scripts to compare the intended with the expected output
if you want more depth.
On Sun, Mar 15, 2026 at 2:52b/PM Dave Pawson dave.pawson@xxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> https://docs.python.org/3/library/unittest.html might be more appropriate.
>
> regards
>
> On Sun, 15 Mar 2026 at 18:49, Roger L Costello costello@xxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi Folks,
>>
>> I have been given a very short-term task (40 hours total) to do some
>> testing and I wanted to get your input before starting.
>>
>> A person whom I will call Jane wrote Python code that extracts data from
>> tables in a Workday database, processes the data, and then inserts the
>> processed data into a user database called the *Semantic* database from
>> which users will make their queries. The structure of the Workday database
>> is considered too complicated for users to query directly. The Semantic
>> database is simpler (e.g., table names and column headers are more
>> user-friendly) and contains less data (users do not need all the details
>> that are in the Workday tables).
>>
>> My task is to test whether the Python codebs transformation is
*complete,
>> consistent, and accurate*.
>>
>> My plan is to leverage XML tools.
>>
>> I can use SQL to obtain the data that the Python code extracted from the
>> Workday database. The front-end application to Workday and Semantic allows
>> me to run SQL commands. The application can display the retrieved data in
a
>> raw form or export it as a *CSV file*.
>>
>> Here are two possible approaches that I could take:
>>
>> *1. Export to CSV and convert to XML via Excel*
>>
>> Get a CSV file from querying the Workday database. Do the same for the
>> Semantic database. Open each CSV file in Excel and then *File b Save As
>> XML*. Now I have XML. (Yes!) I can then create an XML Schema for the
>> Workday data and an XML Schema for the Semantic data. I can also create a
>> Schematron schema to check constraints between the two datasets. Validate,
>> validate, validate.
>>
>> The advantage of this approach is that I can get the data into XML
>> quickly.
>>
>> The disadvantage is that Excel, when generating XML, might silently
>> change some databfor example, dropping non-printable characters or
>> converting data from the character set used by Workday into UTF-8. Do you
>> know whether Excel performs such behind-the-scenes transformations? Such
>> transformations would be undesirablebI want full knowledge of what data
is
>> in the Workday database so I can see if any data is missing or altered in
>> the Semantic database.
>>
>> *2. Extract raw text and transform with XSLT*
>>
>> I could query the Workday database using the front-end application,
>> capture the raw data displayed by the application into a text file
(select,
>> copy, paste), and then use the XSLT unparsed-text() function to read the
>> data and generate XML (while carefully inspecting for non-printable
>> characters and character-set issues). I would do the same for the Semantic
>> database.
>>
>> The advantage of this approach is that I would have more control over how
>> the XML is generated and there would be no silent, behind-the-scenes
>> conversions.
>>
>> The disadvantage is that writing XSLT to carefully inspect the data and
>> generate XML will take time, and I do not have much time.
>>
>> Perhaps there are other approaches?
>>
>> What do you recommend?
>>
>> /Roger
>> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/2607481> (by
>> email)
>>
>
>
> --
> Dave Pawson
> XSLT XSL-FO FAQ.
> Docbook FAQ.
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/975685> (by
> email <>)
|