  • From: COUTHURES Alain <alain.couthures@a...>
  • To: Jack Bush <netbeansfan@y...>
  • Date: Tue, 28 Oct 2008 12:24:01 +0100

It sounds like light-html2xml (http://sourceforge.net/projects/light-html2xml) failed to convert the HTML page to well-formed XML, but I tried it with http://www.abc.com (where there is just one form element) and the resulting XML was correct.
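One way to narrow this down is to check the file that actually reaches the parser. Note that readData() as posted returns fileInHtml.getAbsolutePath(), which is the path of ABC.html rather than ABC.xml, so it may be the unconverted HTML that is being parsed. The following is only a minimal sketch using the JDK's own SAX parser (not JDOM); the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JDOM or light-html2xml.
final class WellFormedCheck {

    /**
     * Parses the file with the JDK's SAX parser.
     * Returns null if the file is well-formed XML,
     * otherwise a short description of the first fatal error.
     */
    static String firstError(File file)
            throws IOException, ParserConfigurationException {
        try {
            // DefaultHandler ignores all content events but rethrows
            // fatal (well-formedness) errors as SAXParseException.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(file, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException e) {
            return e.getMessage();
        }
    }
}
```

Calling WellFormedCheck.firstError(new File(name)) on the name returned by readData() should show whether the problem lies in the converter output or in which file is handed to the parser.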

Alain COUTHURES
<agenceXML>
http://www.agencexml.com
Bordeaux, France

Jack Bush wrote:
Hi All,

I appear to be having difficulty closing (possibly after flushing it first) an XML file, which is subsequently parsed without success. The error generated is:

org.jdom.input.JDOMParseException: Error on line 23: The element type "form" must be terminated by the matching end-tag "</form>".

Below are the code snippets. readData() retrieves (HTML) data from a website, saves it to a file, then converts it to XML format before returning the new filename:
public String readData() {
 
    try {
          URL url  = new URL("http://www.abc.com");
          URLConnection connection = url.openConnection();      
          InputStream isInHtml = url.openStream();   // throws an IOException    
          disInHtml = new DataInputStream(new BufferedInputStream(isInHtml));         
          System.out.flush();
          FileOutputStream fosOutHtml = null;
          fosOutHtml = new FileOutputStream("C:\\Temp\\ABC.html");
          int oneChar, count=0;
          while ((oneChar=disInHtml.read()) != -1)
              fosOutHtml.write(oneChar);
          isInHtml.close();
          disInHtml.close();
          fosOutHtml.flush();    // optional
          fosOutHtml.close();
          .....
    }
 
    try {
          File fileInHtml = new File("C:\\Temp\\ABC.html");
          FileReader frInHtml = new FileReader(fileInHtml);
          BufferedReader brInHtml = new BufferedReader(frInHtml);
          String string = "";
          while (brInHtml.ready())
              string += brInHtml.readLine() + "\n";
          fwOutXml  = new FileWriter("C:\\Temp\\ABC.xml");
          pwOutXml  = new PrintWriter(fwOutXml);
          light_html2xml html2xml = new light_html2xml();
          pwOutXml.print(html2xml.Html2Xml(string));
          System.out.flush();    // optional
          fwOutXml.flush();      // optional
          fwOutXml.close();
          pwOutXml.flush();      // optional
          pwOutXml.close();
          return fileInHtml.getAbsolutePath();
          ....
    }
}
 
// parseData reads the XML file using the name returned by readData()
public void parseData(String XMLFilename)
{
    try
    {
        FileReader frInXml = new FileReader(XMLFilename);
        BufferedReader brInXml = new BufferedReader(frInXml);
        SAXBuilder saxBuilder = new SAXBuilder("org.apache.xerces.parsers.SAXParser"); // JDOMParseException generated.
        ....
}
This code worked when it was in a single method, but I have since added some structure by splitting it into a number of methods.

This issue has arisen in the past, and I was able to resolve it by closing the XML file before reading it again. However, I don't have a solution for it this time round.
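The close-before-read pattern mentioned here can be sketched roughly as follows. This is an illustration only: HtmlToXml is a hypothetical stand-in for the converter (it is not the light-html2xml API), the class and method names are invented, and UTF-8 is an assumed encoding. It uses try/finally so that it also compiles on JDK 1.6.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical stand-in for the HTML-to-XML converter.
interface HtmlToXml {
    String convert(String html);
}

// Hypothetical helper showing one way to write and close the XML file.
final class XmlFileWriter {

    /**
     * Converts the HTML text, writes the result to xmlFile, and returns
     * the path of the XML file (not the HTML file) for the parser.
     */
    static String writeXml(String html, HtmlToXml converter, File xmlFile)
            throws IOException {
        // Encoding is an assumption; the original code used the platform default.
        Writer out = new OutputStreamWriter(new FileOutputStream(xmlFile), "UTF-8");
        try {
            out.write(converter.convert(html));
        } finally {
            // Closing the outermost writer flushes it and closes the
            // underlying stream, so no separate flush() calls are needed.
            out.close();
        }
        return xmlFile.getAbsolutePath();
    }
}
```

Because the writer is closed in the finally block before the method returns, the file is complete on disk by the time its name is passed on to be parsed.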

I am running JDK 1.6.0_10, NetBeans 6.1 and JDOM 1.1 on Windows XP.

Any assistance would be appreciated.

Many thanks,

Jack

