Netdesign Manual

Part 3

Java XML Programmers Reference

Chapter 11: XML Tools for Information Appliances


October 8, 2001



Here is part 3 of a 4-part chapter excerpt from the book Java XML Programmers Reference by Mohammad Akif, Steven Brodhead, Andrei Cioroianu, James Hart, Eric Jung and Dave Writz; ISBN 1861005202; published July 2001, 750 pages.

Our series takes a look at Chapter 11: XML Tools for Information Appliances.

MinML

MinML, presumably standing for minimal XML, can be downloaded from http://www.wilson.co.uk/xml/minml.htm. It is the smallest parser reviewed in this book, and the fastest one mentioned in this chapter. It is also SAX 1.0 compliant, and consumes less memory than NanoXML.

However, it does not offer a pull parsing mechanism. Parsing is only available through the SAX 1.0 interface, which "pushes" events into your application code (see Push, Pull, and Object Model Parsing).

After you tell the parser to begin, MinML calls back (or pushes) into your application code to notify you of parse events. This model forces your code to maintain state information within the callback class(es), and to evaluate that state at each event. This is less programmer-friendly than pull parsers like kXML and XPP, but it's hard to argue with MinML's raw speed. For benchmarks see http://www.extreme.indiana.edu/~aslom/exxp/. That said, MinML's lead might shrink if the benchmarks accounted for the state tracking that application code must do itself, which pull parsers handle automatically.
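To make the state-tracking burden concrete, here is a minimal sketch of a SAX 1.0 handler that remembers whether it is inside a title element. It assumes the JDK's built-in parser as a stand-in, because MinML may not be on your classpath; the class name TitleCollector, the collect() helper, and the sample element names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

// Hypothetical example class; SAX 1.0 types are deprecated in the JDK but still present.
@SuppressWarnings("deprecation")
final class TitleCollector extends HandlerBase {
    // State the application must maintain between callbacks.
    private boolean inTitle = false;
    private final StringBuilder text = new StringBuilder();
    private final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attrs) {
        if ("title".equals(name)) {
            inTitle = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The event alone does not say which element we are in; the flag does.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String name) {
        if ("title".equals(name)) {
            titles.add(text.toString());
            inTitle = false;
        }
    }

    // Assumption: the JDK parser stands in for MinML here.
    static List<String> collect(String xml) {
        try {
            TitleCollector handler = new TitleCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collect(
                "<books><book><title>A</title></book><book><title>B</title></book></books>"));
    }
}
```

A pull parser would let the application ask for the next event from inside the code that handles a title, so the inTitle flag would not be needed.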


What's Supported, What's Not Supported

| Feature | Supported | Notes |
|---|---|---|
| Document validation | No | DTDs are read but ignored |
| Well-formed XML only | Yes | Throws org.xml.sax.SAXException if not well-formed |
| Mixed content | No | Throws org.xml.sax.SAXException |
| Entity expansion | No | Throws org.xml.sax.SAXException if predefined or general entities are used in an XML document. Parameter entities in the DTD are OK. |
| SAX | Yes, SAX 1.0 | |
| DOM | No | |
| Comments | Ignored | |
| Processing Instructions | Ignored | |
| Namespaces | Indirectly | Prefixes aren't distinguished from local parts; <prefix:name> becomes an atomic element or attribute |
| Document Locator | Yes | Provides document line and column information |
| JAR size | | 15.3KB |
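Because prefixes are not separated from local parts, application code that cares about them has to split the raw name itself. The following is a minimal sketch of such a helper; RawNameSplitter and split() are hypothetical names, not part of MinML, and the sketch does not resolve a prefix to a namespace URI (that would require tracking xmlns attributes as well).

```java
// Hypothetical helper, not part of MinML.
final class RawNameSplitter {
    /**
     * Splits a raw name such as "soap:Envelope" at the first colon.
     * Returns {prefix, localPart}; the prefix is "" when there is no colon.
     */
    static String[] split(String rawName) {
        int colon = rawName.indexOf(':');
        if (colon < 0) {
            return new String[] {"", rawName};
        }
        return new String[] {rawName.substring(0, colon), rawName.substring(colon + 1)};
    }

    public static void main(String[] args) {
        String[] parts = split("soap:Envelope");
        System.out.println("prefix=" + parts[0] + " local=" + parts[1]);
    }
}
```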

As stated above, SAX 1.0 is implemented in its bare-bones state. Locales aren't supported. Warnings and recoverable errors aren't supported; all errors and warnings are reported as fatal errors. Public and system identifiers aren't supported.

Document locators, however, are supported. Document locators allow your application to locate the line and column number that triggered the SAX callback.
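As an illustration of how a handler typically uses a locator, here is a minimal sketch that records the line number reported at each start tag. It assumes the JDK's built-in parser as a stand-in for MinML; PositionLogger and lines() are hypothetical names. Only line numbers are recorded, since column conventions can differ between parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;

// Hypothetical example class; SAX 1.0 types are deprecated in the JDK but still present.
@SuppressWarnings("deprecation")
final class PositionLogger extends HandlerBase {
    private Locator locator;
    private final List<String> log = new ArrayList<>();

    // The parser hands over its locator before the first parse event.
    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        // A parser is not obliged to supply a locator, so guard against null.
        int line = (locator == null) ? -1 : locator.getLineNumber();
        log.add(name + "@" + line);
    }

    // Assumption: the JDK parser stands in for MinML here.
    static List<String> lines(String xml) {
        try {
            PositionLogger handler = new PositionLogger();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lines("<a>\n<b/>\n</a>"));
    }
}
```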

There is no support for predefined or general entities, but parameter entities in a DTD are okay. Although this is a non-validating parser, DTDs are allowed (they are simply ignored). Processing instructions, although also ignored, don't throw exceptions as they do in NanoXML. Finally, ignorable whitespace is not reported to the application.

There is no pull-parsing mechanism, even as an add-on. This is a strict no-nonsense SAX 1.0 push parser, which will require you to track all state while documents are parsed.

Finally, and perhaps most significantly, there is no way to build a document. There is no interface that can build and output a document tree. If you require more than just parsing in your application, MinML won't be enough for you.


MinML provides no way to natively build and output documents: there is no object model, or element/node class from which to build documents


Package uk.co.wilson.xml

In true minimalist fashion, this package contains only one class, MinML. Let's look at it briefly.

Class MinML
uk.co.wilson.xml

public class MinML
    extends java.lang.Object
    implements uk.org.xml.sax.Parser,
               uk.org.xml.sax.DocumentHandler,
               org.xml.sax.Locator,
               org.xml.sax.ErrorHandler

Although this is the only class in its package, it implements and uses many of the SAX 1.0 interfaces and classes. Those interfaces and classes must be distributed with MinML and must be on the CLASSPATH.

This class can be used in one of two ways:

  • Extending it with your own class and overriding the SAX methods in which you are interested
  • Creating an instance of the class, calling setDocumentHandler() on the instance, and calling its parse() method with an org.xml.sax.InputSource object or a java.io.Reader object
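The second approach follows the standard SAX 1.0 wiring, sketched below. Since MinML may not be on your classpath, the sketch obtains the JDK's SAX 1.0 Parser as a stand-in; with MinML, the parser would presumably be an instance of uk.co.wilson.xml.MinML instead. HandlerWiring, ElementCounter, and countElements() are hypothetical names. The first approach (subclassing MinML) is not shown, because it requires the MinML class itself.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical example class; SAX 1.0 types are deprecated in the JDK but still present.
@SuppressWarnings("deprecation")
final class HandlerWiring {
    /** Counts start tags; stands in for an application's DocumentHandler. */
    static final class ElementCounter extends HandlerBase {
        int count;

        @Override
        public void startElement(String name, AttributeList attrs) {
            count++;
        }
    }

    static int countElements(String xml) {
        try {
            // Assumption: the JDK's SAX 1.0 Parser stands in for a MinML instance.
            Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
            ElementCounter handler = new ElementCounter();
            parser.setDocumentHandler(handler);                   // register the callbacks
            parser.parse(new InputSource(new StringReader(xml))); // events are pushed here
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c/></a>"));
    }
}
```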

You will notice that class MinML implements uk.org.xml.sax.Parser and uk.org.xml.sax.DocumentHandler instead of org.xml.sax.Parser and org.xml.sax.DocumentHandler. These two interfaces simply extend their SAX counterparts and override only three methods. It is by overriding these methods that MinML implements one of its unique features: sending output to a java.io.Writer object.

SAX's DocumentHandler interface declares, among others, the methods startElement() and startDocument(), both of which return void. The versions in uk.org.xml.sax.DocumentHandler, however, return a java.io.Writer. If you override these methods in your application and return a Writer, MinML writes character data to that Writer object instead of calling back the application's characters() method.
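The idea of routing character data to a handler-supplied Writer can be modeled with a toy driver. The sketch below does not use MinML or reproduce its signatures: the TextSinkHandler interface, the deliver() method, and the convention that a null return means "fall back to characters()" are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Toy model only; none of these names or signatures come from MinML.
final class WriterRouting {
    /** Hypothetical handler shape: return a Writer to receive text, or null for callbacks. */
    interface TextSinkHandler {
        Writer startElement(String name);
        void characters(String text);
    }

    /** Toy driver: sends one element's text to the returned Writer, else to characters(). */
    static void deliver(TextSinkHandler handler, String name, String text) throws IOException {
        Writer sink = handler.startElement(name);
        if (sink != null) {
            sink.write(text);
            sink.flush();
        } else {
            handler.characters(text);
        }
    }

    static String demo() {
        StringWriter body = new StringWriter();
        StringBuilder viaCallback = new StringBuilder();
        TextSinkHandler handler = new TextSinkHandler() {
            @Override
            public Writer startElement(String name) {
                // Only the body element's text is streamed to the Writer.
                return "body".equals(name) ? body : null;
            }

            @Override
            public void characters(String text) {
                viaCallback.append(text);
            }
        };
        try {
            deliver(handler, "title", "Hi");
            deliver(handler, "body", "Hello");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return "writer=" + body + ";callback=" + viaCallback;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The appeal of this design is that large text content can be streamed directly to its destination rather than being buffered and handed over through callbacks.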

