The standard Java interface for XML parsing, JAXP, provides two ways to
parse XML: DOM and SAX. A DOM (Document Object Model) parser reads the input
into an XML tree exposed through the org.w3c.dom.Node API. A SAX parser doesn't
create a result tree; instead it calls methods on a ContentHandler as it reads.
In general, DOM parsing is easier to understand, but SAX parsing can
be more efficient because many applications can avoid creating an
intermediate XML tree.
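To make the SAX style concrete, here is a minimal sketch that counts elements without ever building a tree. The class name SaxElementCounter and the helper countElements are hypothetical names chosen for this illustration; the handler extends the standard org.xml.sax.helpers.DefaultHandler, which implements ContentHandler.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of JAXP or Resin.
class SaxElementCounter {
  // Counts start tags as the parser reports them; no DOM tree is created.
  static int countElements(String xml) {
    final int[] count = {0};
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName,
                               String qName, Attributes attributes) {
        count[0]++;
      }
    };
    try {
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      parser.parse(new InputSource(new StringReader(xml)), handler);
    } catch (Exception e) {
      // Wrap checked parser exceptions so callers need no throws clause.
      throw new IllegalStateException("SAX parsing failed", e);
    }
    return count[0];
  }

  public static void main(String[] args) {
    System.out.println(countElements("<a><b/><b/></a>")); // prints 3
  }
}
```

Because the handler only keeps a counter, memory use stays flat regardless of document size, which is the efficiency advantage described above.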
For strict XML parsing to the DOM, the best technique is to use the
standard JAXP API. That way, you can configure your application to
use whichever XML parser is most convenient for you.
JAXP parsing uses the following steps:
- Create a DocumentBuilderFactory instance.
- Set any parser flags or properties.
- Create a DocumentBuilder, which is the parser.
- Parse the document.
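The four steps above can be sketched as follows, including the flag-setting step. This is only an illustration: the class name JaxpSteps is hypothetical, the input is an in-memory string rather than a file, and the two flags shown (setNamespaceAware and setIgnoringComments) are arbitrary examples of standard DocumentBuilderFactory settings, not ones the parser requires.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class walking through the JAXP steps.
class JaxpSteps {
  static Document parse(String xml) {
    try {
      // 1. Create a DocumentBuilderFactory instance.
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      // 2. Set any parser flags (illustrative choices).
      factory.setNamespaceAware(true);
      factory.setIgnoringComments(true);
      // 3. Create the DocumentBuilder, which is the parser.
      DocumentBuilder builder = factory.newDocumentBuilder();
      // 4. Parse the document.
      return builder.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      // Wrap checked parser exceptions so callers need no throws clause.
      throw new IllegalStateException("XML parsing failed", e);
    }
  }

  public static void main(String[] args) {
    Document doc = parse("<root xmlns='urn:example'><item/></root>");
    System.out.println(doc.getDocumentElement().getLocalName());    // prints root
    System.out.println(doc.getDocumentElement().getNamespaceURI()); // prints urn:example
  }
}
```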
Reading and Writing using the DOM
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import com.caucho.xml.*;
...
// Create a new parser using the JAXP API (javax.xml.parsers)
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder parser = factory.newDocumentBuilder();
// Parse the file into a DOM Document (org.w3c.dom)
Document doc = parser.parse("test.xml");
// Create a new XML printer (com.caucho.xml)
FileOutputStream os = new FileOutputStream("out.xml");
XmlPrinter printer = new XmlPrinter(os);
// Print the document
printer.print(doc);
os.close();
Printing DOM objects is easily done by using Resin's XmlPrinter API.
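When XmlPrinter is not on the classpath (for example, in code that must also run outside Resin), one portable alternative is an identity transform through the standard javax.xml.transform API. The sketch below is a swapped-in technique, not Resin's printer; the class name DomWriter and its methods are hypothetical, and the exact whitespace of the output may vary between transformer implementations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class: serializes a DOM using only standard JAXP.
class DomWriter {
  // A Transformer created without a stylesheet copies its input unchanged.
  static String toXml(Document doc) {
    try {
      Transformer identity = TransformerFactory.newInstance().newTransformer();
      identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      StringWriter out = new StringWriter();
      identity.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException("DOM serialization failed", e);
    }
  }

  // Parses a string into a DOM and writes it back out.
  static String roundTrip(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return toXml(doc);
    } catch (Exception e) {
      throw new IllegalStateException("XML round trip failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("<a><b>text</b></a>"));
  }
}
```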
JAXP is a standard interface which supports pluggable XML
parser implementations. JAXP selects the parser based on
system properties. You can set the properties to select
a different parser than the default one.
If you use libraries which include JAXP classes, those libraries
might default to another, probably slower, XML parser. You may
need to configure the system properties to ensure that you'll use
Resin's fast parser.
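To check which implementations JAXP has actually selected, a small diagnostic like the following can help. It is a sketch only: the class name JaxpImplementations is hypothetical, and the output depends entirely on the classpath and system properties of the JVM it runs in. The properties can also be set on the command line with the standard -D option of the java launcher.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic class: reports the JAXP factories in use.
class JaxpImplementations {
  static String describe() {
    StringBuilder sb = new StringBuilder();
    // The system property is null when nothing has been configured explicitly.
    sb.append("configured DocumentBuilderFactory property: ")
      .append(System.getProperty("javax.xml.parsers.DocumentBuilderFactory"))
      .append('\n');
    // The concrete classes show what JAXP resolved at run time.
    sb.append("DocumentBuilderFactory: ")
      .append(DocumentBuilderFactory.newInstance().getClass().getName())
      .append('\n');
    sb.append("SAXParserFactory: ")
      .append(SAXParserFactory.newInstance().getClass().getName())
      .append('\n');
    sb.append("TransformerFactory: ")
      .append(TransformerFactory.newInstance().getClass().getName())
      .append('\n');
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(describe());
  }
}
```

If the reported class is not the one you expect, a library on the classpath or a system property is probably overriding the default.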
JAXP Properties for Resin
system property | Resin value
javax.xml.parsers.DocumentBuilderFactory | com.caucho.xml.parsers.XmlDocumentBuilderFactory
javax.xml.parsers.SAXParserFactory | com.caucho.xml.parsers.XmlSAXParserFactory
javax.xml.transform.TransformerFactory | com.caucho.xsl.Xsl
JAXP Properties for Xalan/Xerces
system property | Xalan/Xerces value
javax.xml.parsers.DocumentBuilderFactory | org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
javax.xml.parsers.SAXParserFactory | org.apache.xerces.jaxp.SAXParserFactoryImpl
javax.xml.transform.TransformerFactory | org.apache.xalan.processor.TransformerFactoryImpl
Both resin.conf and web.xml let you configure system properties on
a per-application basis. The configuration looks like:
<web-app>
<system-property javax.xml.parsers.DocumentBuilderFactory=
"com.caucho.xml.parsers.XmlDocumentBuilderFactory"/>
...
</web-app>
Copyright © 1998-2002 Caucho Technology, Inc. All rights reserved.
Resin® is a registered trademark,
and HardCore™ and Quercus™ are trademarks of Caucho Technology, Inc.