kAWT
We won't be using kAWT in
this chapter, but it's worth mentioning briefly. It is a simplified,
lightweight version of the AWT API for the KVM. If you are going to do any
serious GUI development with the KVM, this is the best package currently
available. It also optionally includes some very useful I/O and networking classes.
One benefit of using the kAWT is that
applications developed with it will run under the AWT with J2SE (although the
converse isn't true). Currently, there are ports for Palm OS, IBM's J9,
RIM BlackBerry, and the MID Profile (see Profiles, page 574).
The disadvantages to using the kAWT are:
- Higher storage capacity is required (the Palm OS port
is a 178KB PQA, including UI, I/O, and all networking classes)
- Loading kAWT applications takes longer
- It's non-standard (although there currently is no standard CLDC GUI implementation for PDAs)
Parsers
In this section, we will concentrate
on two XML parsers for lightweight clients:
- NanoXML: a slower parser with a DOM-style interface that offers document
generation and optional SAX 1.0 support
- MinML: a lean, fast SAX 1.0 parser without support for document generation
In addition to these, you might also want to check out kXML, TinyXML, and XPP.
They won't be used in this chapter, but we'll talk about a few of their
features briefly in Push, Pull, and Object Model Parsing, below.
Each of these five parsers has advantages and disadvantages
with varying support for W3C recommendations and standards. We will examine
some of these issues in this section. For the two parsers we review in detail,
there is a table of features included in corresponding sections.
We'll also discuss the three different types of parsers: push, pull, and object model, and the advantages and
disadvantages of each in regard to lightweight clients.
After we've reviewed XML parsers for lightweight clients,
we'll use some of the technologies discussed in the J2ME section (page 571),
along with an XML parser, to create a peer-to-peer sample address book
application.
Push, Pull and Object Model Parsing
There are currently three types of XML parser:
- Push parsers
- Object model parsers
- Pull parsers
Although push and object model parsers are the most popular
and well known, they are not always the best type of parser for lightweight
clients. We'll discuss this further in the following sections. The table below
lists lightweight XML parsers and the models they implement. Note that some parsers
give the option of parsing documents using different models. For comparison, a
heavyweight parser, Xerces-J, has been included:
| Parser | Type | Description |
|---|---|---|
| NanoXML | Push and Object Model | Versions 1.x of this lightweight DOM-style parser offer optional SAX 1.0 support. http://nanoxml.sourceforge.net/ |
| MinML | Push | An incredibly small parser offering SAX 1.0 support. http://www.wilson.co.uk/xml/minml.htm |
| TinyXML | Push and Object Model | Very small parser that offers both DOM- and SAX-style interfaces. No support for generating documents, just reading them. http://www.gibaradunn.srac.org/tiny/index.shtml CLDC/KVM port for the Palm OS available at: http://www.microjava.com/news/techtalk/tinyxml/ |
| XMLtp | Push | Offers a DOM-style tree interface. For non-lightweight clients, it has the optional feature of an element-style class that implements javax.swing.tree.MutableTreeNode and javax.swing.tree.TreeNode. This enables elements to be visualized directly by a javax.swing.JTree. http://mitglied.tripod.de/xmltp/intro.html |
| XParse-J | Object Model | Tiny parser that "aspires to be the smallest Java XML parser on the planet". XParse-J offers a custom DOM-style parsing interface. A JavaScript version is also available. http://www.webreference.com/xml/tools/xparse-j.html |
| kXML | Pull | Works "out-of-the-box" with J2ME. Includes an XML writer and WAP Binary XML support (WBXML), a binary encoding optimized for the mobile phone Wireless Application Protocol standard. http://www.kxml.org/ |
| XPP | Pull | XPP is small (21KB JAR) and fast and has both Java and C++ implementations. Supports namespaces and mixed content. Uses very little memory during parsing. http://www.extreme.indiana.edu/soap/xpp/ |
| KVMJab XMLParser | Push | Works "out of the box" with J2ME and the Java KVM. Only 5629 bytes, quite limited. http://www.alsutton.com/xmlparser/index.html |
| Xerces-J | Push and Object Model; no pull, but lazy parsing comes close | A classic heavyweight XML parser intended for servers and desktops. http://xml.apache.org/xerces-j/index.html |
Now let's discuss the three XML parser models in more depth.
Push Parsers
Push parsers are the class of
XML parsers that publish a set of interfaces, implemented by applications,
through which the parser relays document information.
SAX is the best-known XML push parsing interface. After your application tells
the SAX parser to begin parsing, the parser calls back (or pushes) into the
application code to notify the application of parse events. This model forces
application code to maintain state within the callback class(es), and to
evaluate that state at each event. That means many instance variables in the
callback class(es), as well as (possibly) getters and setters for those
variables. This isn't very developer-friendly, as it creates a lot of extra work.
Additionally, SAX and most other push parsers parse an entire
document at once. As soon as your code tells SAX to begin parsing a document,
the document is parsed in its entirety. For very large documents, this means
a lot of state information must be maintained, causing a potentially
large memory footprint (though not nearly as large as a DOM parser would
require). It also wastes processing and battery power when only part of the
document is needed.
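To make the state-keeping burden concrete, here is a minimal sketch of a push-style handler. It assumes a J2SE runtime with JAXP and uses the SAX 2 DefaultHandler class rather than the SAX 1.0 interfaces offered by NanoXML and MinML, so treat it as an illustration of the model and not as code for the KVM. The class name PushStateDemo, the sample document, and the events counter are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: extracts the text of the first <name> element.
class PushStateDemo extends DefaultHandler {
    // State the application must carry between callbacks.
    private boolean inName = false;
    private final StringBuffer current = new StringBuffer();
    String firstName;   // result, filled in by endElement
    int events = 0;     // counts start/end callbacks received

    public void startElement(String uri, String local, String qName, Attributes atts) {
        events++;
        if (firstName == null && "name".equals(qName)) {
            inName = true;
            current.setLength(0);
        }
    }

    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so it is accumulated.
        if (inName) current.append(ch, start, length);
    }

    public void endElement(String uri, String local, String qName) {
        events++;
        if (inName && "name".equals(qName)) {
            firstName = current.toString();
            inName = false;
        }
    }

    static PushStateDemo parse(String xml) throws Exception {
        PushStateDemo handler = new PushStateDemo();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book><name>Ann</name><phone>555-0100</phone></book>";
        PushStateDemo h = parse(xml);
        System.out.println("first name: " + h.firstName);
        // The parser kept pushing events after the name had been found.
        System.out.println("callbacks received: " + h.events);
    }
}
```

Even for this trivial task the handler needs a flag, a buffer, and a result field, and the callbacks for the remaining elements still arrive after the wanted value has been captured.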
We'll examine push parser issues in more detail in the NanoXML section (page 583).
Object Model Parsers
Object model parsers are the class of XML parsers that build in-memory
representations of XML documents using tree-like data structures. The most
popular are parsers conforming to the DOM Level 1 and Level 2 specifications,
but others exist (for example, NanoXML).
Object model parsers, unlike push parsers, don't usually
require the developer to maintain document state during parsing, but they have
their own drawbacks on lightweight clients. Most lightweight object model XML
parsers keep an entire parsed document in memory all the time, until the parser
and its resources are garbage collected. Parsing a large document with this
kind of parser, even if only one node from the whole document is required,
always means occupying large chunks of memory. This approach isn't desirable on
lightweight clients since their memory is constrained.
Also, as with push parsers, all the object model parsers for
lightweights that I know about parse an entire document at once. As soon as
your code tells the parser to begin parsing, the entire document is parsed so
an in-memory object model can be built. As with push parsers, this wastes a lot
of processor and battery power if the entire document is not required.
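The following sketch shows the same trade-off from the object model side. It assumes a J2SE runtime with JAXP and a DOM parser, not one of the lightweight parsers listed earlier; the class name ObjectModelDemo, its helper methods, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo: fetch one value from a fully built tree.
class ObjectModelDemo {
    // Parses the whole document and returns the in-memory tree.
    static Document load(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first element with the given tag name,
    // an empty string if it has no content, or null if it is absent.
    static String firstText(Document doc, String tag) {
        NodeList matches = doc.getElementsByTagName(tag);
        if (matches.getLength() == 0) return null;
        Node child = matches.item(0).getFirstChild();
        return child == null ? "" : child.getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book><name>Ann</name><phone>555-0100</phone></book>";
        Document doc = load(xml);
        System.out.println("first name: " + firstText(doc, "name"));
        // Every element is held in the tree, not just the one we wanted.
        System.out.println("elements in memory: "
                + doc.getElementsByTagName("*").getLength());
    }
}
```

No state variables are needed this time, but the whole tree is built and retained even though only one node is read.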
Lazy Parsing
Some heavyweight object model
parsers offer lazy parsing, for example Xerces-J. Parsing lazily means that the
object model is built and stored in memory only as the calling application requests
a node. However, usually the entire ancestor-or-self axis (with respect to the
requested node) is stored in memory after the request. Certainly, the entire
ancestor-or-self axis must be parsed when a node is requested. This isn't
optimal for constrained devices, but it's better than the
"parse-and-store-it-all-at-once" approach taken by present-day
lightweight object model XML parsers.
At the time of writing, no object model XML parsers are
available for lightweights that use lazy parsing. Hopefully, this will change
in the near future.
Pull Parsers
A newer player in the world of
XML parsing, pull parsers aren't nearly as prevalent as SAX and DOM parsers.
kXML and XPP appear to be the only feasible contenders today.
Pull parsers are particularly useful for lightweight clients
because they parse only the minimal chunk of a document necessary when an
application requests the next piece of data. The application can process this
data at its leisure and then ask for the next piece, spurring the parser to
parse just another small chunk of the document. This is similar to the workings
of a java.io.Reader.
The benefits of this approach are:
- Processing and battery power are used only when the application needs the
next piece of data; the application maintains control over its parsing needs
- Memory footprint is reduced; the parser only needs to maintain minimal state
information and a pointer to the current element, and an entire object model
does not remain memory-resident. (The document itself must remain in memory
for as long as parsing might continue, so it's to the application's advantage
to acquire what it needs quickly.)
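To illustrate how a pull parser does work only on request, here is a deliberately tiny, hypothetical tokenizer. It is not kXML or XPP and does not follow either API; it handles only start tags, end tags, and text (no attributes, comments, or entities), and the names TinyPullDemo, next, firstText, and charsRead are invented for this sketch.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical toy pull tokenizer: tags and text only.
class TinyPullDemo {
    static final int START_TAG = 0, END_TAG = 1, TEXT = 2, END_DOCUMENT = 3;

    private final Reader in;
    private int peeked = -2;   // -2 means no character is buffered
    int charsRead = 0;         // characters taken from the Reader so far
    String value;              // tag name or text of the latest event

    TinyPullDemo(Reader in) { this.in = in; }

    private int read() throws IOException {
        if (peeked != -2) { int c = peeked; peeked = -2; return c; }
        int c = in.read();
        if (c != -1) charsRead++;
        return c;
    }

    // Reads just enough input to produce the next event, then stops.
    int next() throws IOException {
        int c = read();
        if (c == -1) return END_DOCUMENT;
        StringBuffer sb = new StringBuffer();
        if (c == '<') {
            boolean closing = false;
            c = read();
            if (c == '/') { closing = true; c = read(); }
            while (c != '>' && c != -1) { sb.append((char) c); c = read(); }
            value = sb.toString();
            return closing ? END_TAG : START_TAG;
        }
        while (c != '<' && c != -1) { sb.append((char) c); c = read(); }
        peeked = c;            // keep the '<' (or end marker) for the next call
        value = sb.toString();
        return TEXT;
    }

    // Pulls events until the wanted element is found, then stops asking.
    static String firstText(TinyPullDemo p, String tag) throws IOException {
        int ev;
        while ((ev = p.next()) != END_DOCUMENT) {
            if (ev == START_TAG && p.value.equals(tag)) {
                return p.next() == TEXT ? p.value : "";
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<book><name>Ann</name><phone>555-0100</phone></book>";
        TinyPullDemo p = new TinyPullDemo(new StringReader(xml));
        System.out.println("first name: " + firstText(p, "name"));
        System.out.println(p.charsRead + " of " + xml.length() + " characters read");
    }
}
```

Because the application drives the loop, it simply stops calling next() once it has what it needs, and the rest of the input is never tokenized.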
Unfortunately, there is no standard interface yet for XML
pull parsing, like SAX for XML push parsing or DOM for XML object model
parsing. Therefore, although pull parsers sound great for lightweight clients,
they may be too immature for use today in production applications. Applications
would be stuck with the limitations of the parser selected by the development
team without a clear upgrade path. It may be difficult or impossible (without
rewriting the application) to change parsers in the future.
Let's briefly look at some code that demonstrates how pull
parsers work. This code is based on sample kXML code from http://www.microjava.com/news/techtalk/kxml/.
It outputs element names and document text. Note the use of recursion,
something atypical in applications using push and object model XML parsers.
public void traverse(Parser parser) throws Exception {
    boolean leave = false;
    while (!leave) {
        // request the next document event from the parser
        Event event = parser.read();
        switch (event.getType()) {
            case START_TAG:
                System.out.println("start: " + event.getName());
                traverse(parser); // recursive call handles this element's content
                break;
            case END_TAG:
                System.out.println("end: " + event.getName());
                leave = true; // return to the call handling the parent element
                break;
            case END_DOCUMENT:
                leave = true;
                break;
            case TEXT:
                System.out.println("text: " + event.getText());
                break;
        }
    }
}
Coming Up Next: A Look at NanoXML