NanoXML Version 2.0 Beta
The second major release of NanoXML isn't due until July 2001, but a beta release is available now. It lacks SAX 1.0 support and there is still no direct support for XML namespaces, but the author assures me that a 2.1 release will support both SAX 2.0 and namespaces. NanoXML 2.0 is quite different from 1.x, so the 1.x port for the Java KVM won't work with 2.0 yet. Hopefully, a KVM port will be made available.
The JAR grows from 6,047 bytes in version 1.6.8 to over 20,000 bytes in the beta release, more than tripling in size. So what do we gain with that extra size?
For a start, the classes are in a different package this time, net.n3.nanoxml, instead of nanoxml. Because of this, and because of interface and class changes within the package, we lose backwards compatibility with version 1.x. If you use 2.x, your code will not be usable with 1.x, although there is planned support for a "lite" version of 2.0 that is almost compatible with version 1.6. There are some advantages to using 2.0, however.
Probably the most significant enhancement is that the parser is now a single-pass parser. Version 1.x releases made multiple passes over the document and their performance suffered because of it. Parsing in 2.0 is significantly faster as a result.
Version 2.0 Beta occupies less memory while parsing than version 1.6.7, but the memory requirements still scale linearly with the size of the document. All elements are saved internally as a tree of XMLElement objects, with each XMLElement object containing a java.util.Properties object to store element attributes. These objects are kept in memory until they are garbage collected. As we shall see in another section, this can lead to memory fragmentation depending upon the garbage collector in the virtual machine you are using.
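To make the memory behavior concrete, here is a minimal sketch of such a tree. The class SketchElement and all of its methods are hypothetical stand-ins, not NanoXML's actual XMLElement. The only details taken from the description above are that each node keeps its attributes in a java.util.Properties object and that the whole tree stays in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for an in-memory element node; not the NanoXML class.
class SketchElement {
    private final String name;
    // One Properties object per element, as described in the text.
    private final Properties attributes = new Properties();
    private final List<SketchElement> children = new ArrayList<>();

    SketchElement(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void setAttribute(String key, String value) {
        attributes.setProperty(key, value);
    }

    String getAttribute(String key) {
        return attributes.getProperty(key);
    }

    void addChild(SketchElement child) {
        children.add(child);
    }

    // Counts this node plus all descendants. Every one of them is a live
    // object (with its own Properties) until the tree becomes unreachable.
    int countNodes() {
        int total = 1;
        for (SketchElement child : children) {
            total += child.countNodes();
        }
        return total;
    }
}
```

Because every element in the document becomes one node plus one Properties object, the number of live objects grows in step with the number of elements, which is the linear scaling mentioned above.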
Mixed content is now supported, for example:
<Request>ItemDetail
<ItemId>553</ItemId>
</Request>
but the XMLWriter class has some peculiarities when handling it (see the Child Methods section, page 606). Although the parser is still non-validating, the DTD is no longer completely ignored as it was in version 1.x. DTD declarations other than <!ATTLIST> appear to work. Predefined, general, and parameter entities are all supported.
Predefined entities are still supported. Additionally, any character can be referred to by its numeric reference (for example, &#64; for @). Predefined entities need not be declared in a DTD.
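As an illustration of what a numeric character reference means, the following sketch expands decimal references such as &#64; and hexadecimal ones such as &#x40; into the characters they name. It is a standalone example, not NanoXML's code; the class name CharRefSketch and the method expandCharRefs are made up for this purpose, and invalid or out-of-range references simply raise a runtime exception.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of NanoXML.
class CharRefSketch {
    // Matches decimal (&#64;) and hexadecimal (&#x40;) character references.
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(x[0-9A-Fa-f]+|[0-9]+);");

    static String expandCharRefs(String text) {
        Matcher m = CHAR_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            // A leading 'x' marks a hexadecimal code point.
            int codePoint = body.startsWith("x")
                    ? Integer.parseInt(body.substring(1), 16)
                    : Integer.parseInt(body, 10);
            String replacement = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Both spellings refer to the same character because 64 in decimal and 40 in hexadecimal are the same code point.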
General entities are macros for an XML document. They associate parsed text with a symbol and must be declared in the DTD. For example:
<!ENTITY copyright "© FishHeads, Inc. 2001">
Referencing this general entity in an XML document that uses the DTD in which copyright is declared can be done like so:
<rights>&copyright;</rights>
A parser that recognizes general entities should expand the parsed text to:
<rights>© FishHeads, Inc. 2001</rights>
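The macro-like behavior of general entities can be sketched as a table lookup. The class GeneralEntitySketch below is hypothetical and far simpler than a real parser: it assumes the declarations have already been collected into a map, does not expand entities nested inside replacement text, and leaves undeclared references untouched rather than reporting an error.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of NanoXML.
class GeneralEntitySketch {
    // Matches references of the form &name; (numeric references such as
    // &#64; do not match because '#' cannot start a name here).
    private static final Pattern ENTITY_REF =
            Pattern.compile("&([A-Za-z_][A-Za-z0-9_.-]*);");

    static String expand(String text, Map<String, String> entities) {
        Matcher m = ENTITY_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = entities.get(m.group(1));
            // Leave undeclared references as they are instead of failing.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With copyright mapped to its replacement text, passing the rights element from the example through expand yields the element with the notice spelled out in full.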
Just like general entities, parameter entities act as macros and are declared in the DTD. However, unlike general entities, their use is limited to the DTD; they cannot be referenced in the XML document itself. Since NanoXML isn't a validating parser, parameter entities aren't very useful. Perhaps they are provided as an intermediate step towards making NanoXML a validating parser. In any case, parameter entities are declared with the ENTITY keyword, a percent sign, a name, and the replacement value.
For example:
<!ENTITY % requestParameters "name CDATA #REQUIRED">
Whenever the parser encounters the reference %requestParameters; in the DTD, it will substitute the quoted string. Here's a usage example:
<!ATTLIST Request %requestParameters; date CDATA #IMPLIED>
A parser that recognizes parameter entities should expand the above to:
<!ATTLIST Request name CDATA #REQUIRED date CDATA #IMPLIED>
Note that all parameter entities must be declared before they are referred to in a DTD. Interestingly enough, referencing a parameter entity in this beta results in an XMLParseException, although declaring one causes no problems. Perhaps this will be fixed before a production release of NanoXML 2.0, but it's worth bearing in mind until then.
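The declare-before-use rule can be illustrated with a single left-to-right pass over the DTD text. The sketch below is not how NanoXML works internally; ParamEntitySketch is a hypothetical class that assumes the standard %name; reference syntax, handles only double-quoted replacement values, and does not add the padding spaces a conforming parser inserts around the replacement text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of NanoXML.
class ParamEntitySketch {
    // First alternative: a declaration such as
    //   <!ENTITY % requestParameters "name CDATA #REQUIRED">
    //   (group 1 = name, group 2 = replacement value).
    // Second alternative: a reference such as %requestParameters;
    //   (group 3 = name).
    private static final Pattern TOKEN = Pattern.compile(
            "<!ENTITY\\s+%\\s+([\\w.-]+)\\s+\"([^\"]*)\"\\s*>|%([\\w.-]+);");

    static String expand(String dtd) {
        Map<String, String> declared = new HashMap<>();
        Matcher m = TOKEN.matcher(dtd);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(1) != null) {
                // Remember the declaration and copy it through unchanged.
                declared.put(m.group(1), m.group(2));
                replacement = m.group();
            } else {
                // A reference is only legal if its declaration came earlier.
                String value = declared.get(m.group(3));
                if (value == null) {
                    throw new IllegalStateException(
                            "Parameter entity used before declaration: " + m.group(3));
                }
                replacement = value;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because declarations are recorded only as the scan reaches them, a reference that appears ahead of its declaration fails in the same way as a reference to an entity that was never declared.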
Package net.n3.nanoxml
This package consists of four interfaces and nine classes. The interfaces, IXMLBuilder, IXMLParser, IXMLReader, and IXMLValidator, are all intended to allow you to plug your own code into NanoXML. You could write your own reader, for example, and by implementing IXMLReader, it would then plug into the NanoXML framework. You might choose to do this if your data comes from an unconventional source, a Palm OS database for example.
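The plug-in idea can be sketched without relying on the real NanoXML signatures. Everything in the example below is hypothetical: SketchReader is a deliberately tiny stand-in for a reader contract (the actual IXMLReader declares its own methods), RecordReader imitates a source that hands out data in separate records, and SketchConsumer plays the role of framework code written only against the interface.

```java
// Hypothetical, simplified reader contract; not the real IXMLReader.
interface SketchReader {
    // Returns the next character, or -1 at the end of the input.
    int read();
}

// A source backed by separate records, standing in for something like
// a database on a handheld device.
class RecordReader implements SketchReader {
    private final String[] records;
    private int record = 0;
    private int offset = 0;

    RecordReader(String[] records) {
        this.records = records;
    }

    public int read() {
        // Skip over exhausted (or empty) records until a character is found.
        while (record < records.length) {
            if (offset < records[record].length()) {
                return records[record].charAt(offset++);
            }
            record++;
            offset = 0;
        }
        return -1;
    }
}

// Framework-side code: it knows only the interface, so any implementation
// can be plugged in without changing it.
class SketchConsumer {
    static String readAll(SketchReader reader) {
        StringBuilder sb = new StringBuilder();
        for (int c = reader.read(); c != -1; c = reader.read()) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

The consumer never learns where the characters come from, which is what lets a custom source sit behind the same interface as a file or a string.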
We won't cover the interfaces in too much detail, as there are concrete classes that implement them. We'll cover those classes, StdXMLBuilder, StdXMLParser, StdXMLReader, and NonValidator instead.
Class XMLElement
net.n3.nanoxml
public class XMLElement
implements java.io.Serializable
This class, even though it existed in version 1.x, has changed significantly. Some methods have been removed, and some new ones have been added.