Sorry to post a possibly silly and slightly off-topic question, but I am
struggling to narrow down the options for what should be one of the first
steps in working with XML files.
Essentially, I have a set of very long documents (about 100 pages each) in
.txt format, which are derived from companies' annual reports (I ran OCR on
the PDF files to extract the text).
Now I need to mark the different sections of these text files, so I can do
text analysis on them, using the section from which the text comes as a
variable (say, comparing the environmental sustainability section between
companies).
I want to mark the different sections according to a simple scheme.
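To illustrate the kind of markup I have in mind, here is a minimal sketch in
Python. The element names (report, section), the name attribute, the
mark_sections helper and the sample headings are only assumptions for
illustration, not an established scheme; it also assumes each section starts
with a heading on a line of its own, which OCR output may not guarantee.

```python
import xml.etree.ElementTree as ET


def mark_sections(text, headings):
    """Wrap the text that follows each known heading in a <section> element.

    Hypothetical helper: 'headings' is a set of exact heading lines.
    Lines that appear before the first heading are ignored.
    """
    root = ET.Element("report")
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in headings:
            # A known heading starts a new section.
            current = ET.SubElement(root, "section", name=stripped)
            current.text = ""
        elif current is not None:
            # Any other line belongs to the section currently open.
            current.text += line + "\n"
    return root


# Made-up sample text standing in for one OCR'd report.
sample = (
    "Chairman's Letter\n"
    "Dear shareholders...\n"
    "Environmental Sustainability\n"
    "We reduced emissions.\n"
)
headings = {"Chairman's Letter", "Environmental Sustainability"}

root = mark_sections(sample, headings)
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

With files marked up like this, the section name could then be read back as
the variable for the analysis.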
I have tried searching for XML editors, and I have found too many to choose
from. Can I please ask for advice to narrow down the field?
If the proposed solution were open source (like LibreOffice), even better (but
this is a nice-to-have rather than essential).
Thanks and apologies again for the naive question: the number of options I saw
confused me, so I thought it better to ask the experts.
Kind regards
Massimiliano Volpi