Here is part 3 of a 4-part chapter excerpt from the book Java XML Programmers Reference by Mohammad Akif, Steven Brodhead, Andrei Cioroianu, James Hart, Eric Jung and Dave Writz; ISBN 1861005202; published July 2001, 750 pages.
Our series takes a look at Chapter 11: XML Tools for Information Appliances.
MinML
MinML, presumably standing for minimal XML, can be downloaded from http://www.wilson.co.uk/xml/minml.htm. It is the smallest parser reviewed in this book, and the fastest one mentioned in this chapter. It is also SAX 1.0 compliant, and consumes less memory than NanoXML.
However, it does not offer a pull parsing mechanism. Parsing is only available through the SAX 1.0 interface, which "pushes" events into your application code (see Push, Pull, and Object Model Parsing).
After you tell the parser to begin, MinML calls back (or pushes) into your application code to notify you of parse events. This model forces your code to maintain state information within the callback class(es), and to evaluate that state at each event. This is less programmer-friendly than pull parsers like kXML and XPP, but it's hard to argue with MinML's raw speed. For benchmarks see http://www.extreme.indiana.edu/~aslom/exxp/. MinML's lead might shrink if the benchmarks accounted for the state tracking that application code must do itself and that pull parsers do for us automatically.
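To make the state-tracking burden concrete, here is a minimal sketch of a SAX 1.0 push handler that collects the text of every title element. The class name TitleCollector, the element name "title", and the helper collect() are hypothetical. So that the sketch runs without extra JARs, it is driven by the JDK's built-in parser rather than MinML; only the SAX 1.0 callback signatures are shared with MinML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;

// Hypothetical handler: gathers the character data of each <title> element.
class TitleCollector extends HandlerBase {
    // State the application must keep itself in a push model.
    private boolean inTitle = false;
    private final StringBuilder text = new StringBuilder();
    final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attrs) {
        if ("title".equals(name)) {
            inTitle = true;      // remember where we are
            text.setLength(0);   // start a fresh buffer
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The event carries no context, so consult our own state.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String name) {
        if ("title".equals(name)) {
            titles.add(text.toString());
            inTitle = false;
        }
    }

    // Assumption: uses the JDK parser as a stand-in for MinML.
    static List<String> collect(String xml) throws Exception {
        TitleCollector handler = new TitleCollector();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.titles;
    }
}
```

The inTitle flag and the text buffer are exactly the kind of bookkeeping a pull parser would let you express as ordinary control flow instead.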
What's Supported, What's Not Supported
| Feature | Supported | Notes |
| --- | --- | --- |
| Document validation | No | DTDs are read but ignored |
| Well-formed XML only | Yes | Throws org.xml.sax.SAXException if not well-formed |
| Mixed content | No | Throws org.xml.sax.SAXException |
| Entity expansion | No | Throws org.xml.sax.SAXException if predefined or general entities are used in an XML document. Parameter entities in the DTD are OK. |
| SAX | Yes, SAX 1.0 | |
| DOM | No | |
| Comments | Ignored | |
| Processing Instructions | Ignored | |
| Namespaces | Indirectly | Prefixes aren't distinguished from local parts; <prefix:name> becomes an atomic element or attribute name |
| Document Locator | Yes | Provides document line and column information |
| JAR size | | 15.3KB |
As stated above, SAX 1.0 is implemented in its bare-bones state. Locales aren't supported. Warnings and recoverable errors aren't reported separately; all of them are reported as fatal errors. Public and system identifiers aren't supported.
Document locators, however, are supported. Document locators allow your application to locate the line and column number that triggered the SAX callback.
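As an illustration of what a document locator gives you, the following sketch records the line number reported at each startElement() callback. The class LineTracker and its track() helper are hypothetical, and the JDK's built-in parser stands in for MinML so the code runs without additional JARs; the setDocumentLocator() callback and the org.xml.sax.Locator interface are standard SAX 1.0.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;

// Hypothetical handler: maps each element name to the line it was reported on.
class LineTracker extends HandlerBase {
    private Locator locator;
    final Map<String, Integer> lines = new LinkedHashMap<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        // The parser hands over the locator before any other event.
        this.locator = locator;
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        if (locator != null) {
            lines.put(name, locator.getLineNumber());
        }
    }

    // Assumption: uses the JDK parser as a stand-in for MinML.
    static Map<String, Integer> track(String xml) throws Exception {
        LineTracker handler = new LineTracker();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)), handler);
        return handler.lines;
    }
}
```

Column numbers are available in the same way through getColumnNumber(); exactly which position within a tag is reported may differ from parser to parser.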
General and predefined entities are not supported, although parameter entities in a DTD are okay. Although this is a non-validating parser, DTDs are allowed (they are simply ignored). Processing instructions are also ignored, but they don't cause exceptions as they do in NanoXML. Finally, ignorable whitespace is not reported to the application.
There is no pull-parsing mechanism, even as an add-on. This is a strict no-nonsense SAX 1.0 push parser, which will require you to track all state while documents are parsed.
Finally, and perhaps most significantly, there is no way to build a document. There is no interface that can build and output a document tree. If you require more than just parsing in your application, MinML won't be enough for you.
MinML provides no way to natively build and output documents: there is no object model or element/node class from which to build documents.
Package uk.co.wilson.xml
In true minimalist fashion, this package contains only one class, MinML. Let's look at it briefly.
Class MinML
uk.co.wilson.xml
public class MinML
extends java.lang.Object
implements uk.org.xml.sax.Parser, uk.org.xml.sax.DocumentHandler, org.xml.sax.Locator, org.xml.sax.ErrorHandler
Although this is the only class in its package, it implements and uses many of the SAX 1.0 interfaces and classes. Those interfaces and classes must be distributed with MinML and must be on the CLASSPATH.
This class can be used in one of two ways:
- Extending it with your own class and overriding the SAX methods in which you are interested
- Creating an instance of the class, calling setDocumentHandler() on the instance, and calling its parse() method with an org.xml.sax.InputSource object or a java.io.Reader object
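The second approach can be sketched against the plain SAX 1.0 org.xml.sax.Parser interface. The class ElementCounter and its helpers are hypothetical. To keep the example runnable without the MinML JAR, jdkParser() obtains a SAX 1.0 parser from the JDK; with MinML on the CLASSPATH you would presumably pass a MinML instance instead (assuming it has a public no-argument constructor, which the text implies but does not state).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical handler: counts start tags seen during a parse.
class ElementCounter extends HandlerBase {
    int count = 0;

    @Override
    public void startElement(String name, AttributeList attrs) {
        count++;
    }

    // Works with any SAX 1.0 parser: register the handler, then parse.
    static int count(Parser parser, String xml) throws Exception {
        ElementCounter handler = new ElementCounter();
        parser.setDocumentHandler(handler);
        parser.parse(new InputSource(new StringReader(xml)));
        return handler.count;
    }

    // Assumption: the JDK's SAX 1.0 parser stands in for a MinML instance.
    static Parser jdkParser() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getParser();
    }
}
```

The first approach differs only in where the callbacks live: instead of a separate handler object, the overridden methods sit in your own subclass of the parser class.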
You will notice that class MinML implements uk.org.xml.sax.Parser and uk.org.xml.sax.DocumentHandler instead of the org.xml.sax.Parser and org.xml.sax.DocumentHandler. These two interfaces actually just extend their SAX counterparts and override only three methods. It is by overriding these methods that MinML implements one of its unique features: sending output to a java.io.Writer object.
SAX's DocumentHandler interface declares, among others, the methods startElement() and startDocument(), both of which return void. The versions in uk.org.xml.sax.DocumentHandler, however, return a java.io.Writer. If you override these methods in your application and return a Writer, MinML writes character data to that Writer object instead of calling back the application's characters() method.