
  • To: "Alexey N. Shananin" <shananin@p...>
  • Subject: RE: doc2xml
  • From: "Manos Batsis" <m.batsis@b...>
  • Date: Thu, 6 Jun 2002 12:30:26 +0300
  • Cc: <xml-dev@l...>
  • Thread-index: AcINO1n5HJUzyctHQZ+SJg8R8Os82AAABg6w
  • Thread-topic: doc2xml


> From: Alexey N. Shananin [mailto:shananin@p...] 

> We work on the Linux platform, and it would be quite hard to keep one
> machine with Office just for that task. But if there's no other way...
> Do you use Microsoft DLLs to produce XML from doc?

Nope. Usually, users submit the MS Office HTML output via a web
interface; after that we follow the process described in my previous
post.
There is one more method, which unfortunately requires MS IE as the
browser: the remote user just copies and pastes from Word into the
DHTMLEdit component in the browser. No HTML export from MS Word is
required. We then submit the content of the DHTMLEdit component to the
server and follow the same procedure as before. This last method came
about after we built a web-based visual HTML editor (one of the many).

If your users run MS Windows, this will probably work fine since they'll
have IE. You can probably use a Linux server to do the rest (JTidy [1]),
although I haven't used the Java version of Tidy myself.

[1] http://lempinen.net/sami/jtidy/
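
A minimal sketch of the XSLT step in Java, assuming Tidy has already
turned the Office HTML into well-formed XHTML. It uses only the JAXP
classes that ship with the JDK. The class name XhtmlToXml, the
stylesheet, and the target <doc>/<title>/<para> vocabulary are invented
for illustration; a real stylesheet would map whatever Office markup you
care about. The sample input has no DOCTYPE, so the parser has no
external DTD to resolve.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XhtmlToXml {

    // Hypothetical stylesheet: keeps top-level <h1> and <p> elements of the
    // XHTML body, maps them to an invented vocabulary, and drops the rest.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:h='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='h'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<doc><xsl:apply-templates select='//h:body/*'/></doc>"
      + "</xsl:template>"
      + "<xsl:template match='h:h1'>"
      + "<title><xsl:value-of select='normalize-space(.)'/></title>"
      + "</xsl:template>"
      + "<xsl:template match='h:p'>"
      + "<para><xsl:value-of select='normalize-space(.)'/></para>"
      + "</xsl:template>"
      // Any other element (Office-specific wrappers, etc.) is ignored.
      + "<xsl:template match='*'/>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XHTML string and returns the result as text.
    static String transform(String xhtml) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xhtml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Stand-in for Tidy output; real input would come from the upload.
        String xhtml =
            "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
          + "<h1>Report</h1><p>First   paragraph.</p><div>layout junk</div>"
          + "</body></html>";
        System.out.println(transform(xhtml));
    }
}
```

The same entry point could be fed from a file or a servlet request by
swapping the StringReader for another stream; a SAX handler would be the
alternative when the mapping is easier to express in code than in XSLT.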

HTH,

Manos


> 
> On Thursday 06 June 2002 13:06, you wrote:
> > We use the HTML output of Microsoft Office after filtering it through
> > HTMLTidy (Java version available). To further process the XHTML to some
> > other format we use either XSLT or SAX-based code.
> >
> > > -----Original Message-----
> > > From: Alexey N. Shananin [mailto:shananin@p...]
> > > Sent: Thursday, June 06, 2002 12:00 PM
> > > To: xml-dev@l...
> > > Subject:  doc2xml
> > >
> > >
> > > Hi!
> > > I'm looking for a tool that can _correctly_ convert Microsoft Word
> > > documents to XML.
> > > I tried AbiWord, which has trouble with tables and pictures; I tried
> > > Davisor Offisor, which outputs some kind of crap; I tried Majix, which
> > > doesn't work.
> > > Is there any Java application that I could embed in my (Java) system
> > > to process Word documents?
> > > Java preferable, but not necessary...
> > >
> > > Thanks,
> > > Alexey.
> > >
> > > -----------------------------------------------------------------
> > > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > > initiative of OASIS <http://www.oasis-open.org>
> > >
> > > The list archives are at http://lists.xml.org/archives/xml-dev/
> > >
> > > To subscribe or unsubscribe from this list use the subscription
> > > manager: <http://lists.xml.org/ob/adm.pl>
> >
> 
