Network Computing

Netdesign Manual

Part 2

Java XML Programmer's Reference

Chapter 11: XML Tools for Information Appliances


September 24, 2001



Here is part 2 of a 4-part chapter excerpt from the book Java XML Programmer's Reference by Mohammad Akif, Steven Brodhead, Andrei Cioroianu, James Hart, Eric Jung and Dave Writz; ISBN 1861005202; published July 2001, 750 pages.

Our series takes a look at Chapter 11: XML Tools for Information Appliances.

NanoXML

A NanoXML document is a tree of nanoxml.XMLElement objects. These correspond to the org.w3c.dom.Node interface in the DOM specification.

NanoXML does not implement the DOM interfaces. You build and retrieve document contents through a proprietary API, but an optional SAX 1.0 component exists for document retrieval. This API is covered in this chapter.

Originally written in April 2000, NanoXML has gone through a few iterations. The current release is 1.6.8. The next major release of NanoXML will be 2.0, and it is scheduled to be available in July 2001. The current beta release is promising, although it seems to have lost compatibility with 1.x releases. We will discuss both releases since they differ significantly in library size and features.

The web site for NanoXML is http://nanoxml.sourceforge.net/.

The source code is available under an open source license. The site is maintained by Marc De Scheemaecker, who is the author of the package. I have found him to be very responsive to support questions.

If your target platform is the Java KVM, the latest NanoXML won't do the job because it has dependencies on classes that are not included in the standard Java KVM.

Without a doubt, the greatest feature of NanoXML is also its smallest: its JAR file size. Only MinML is smaller, although the exact size depends on which version you use and whether you include the optional SAX component. You can get away with XML parsing in as little as 6047 bytes!

Unfortunately, NanoXML suffers from some performance issues and memory usage problems, which we will discuss.

Current Release – Version 1.6.8

What's Supported, What's Not Supported, and What's Optional


Feature                  Supported       Notes
Document validation      No
Well-formed XML only     Yes             nanoxml.XMLParseException thrown if malformed
Mixed content            No              Creates bugs in the internal document tree!
Entity expansion         Yes             Entities are specified in the XMLElement constructor as a hashtable of key-value pairs
SAX                      Yes, SAX 1.0
DOM                      No
Comments                 Ignored
Processing Instructions  No              PI in the preamble <?xml version="1.0" encoding="UTF-8"?> is ignored; subsequent PIs throw nanoxml.XMLParseException
Namespaces               Indirectly      Prefixes aren't distinguished from local parts; <prefix:name> becomes an atomic element or attribute
JAR size                                 6047 bytes; 8618 with SAX support


Version 1.6.8 is a non-validating parser. Any reference to a DTD or XML Schema is ignored, although there is support for entity expansion.
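Entity expansion driven by a table of key-value pairs can be pictured as a simple lookup. The following is a minimal sketch of that idea in plain Java, not NanoXML's own code; the class name EntityTableDemo, the method expand(), and the sample entities are assumptions made for illustration.

```java
import java.util.Hashtable;
import java.util.Map;

/** Hypothetical illustration of table-driven entity expansion; not NanoXML code. */
final class EntityTableDemo {

    /** Replaces each &name; reference with its value from the table, if present. */
    static String expand(String text, Map<String, String> entities) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            // Look for the terminating ';' only when we are at an '&'.
            int semi = (c == '&') ? text.indexOf(';', i + 1) : -1;
            if (semi > 0) {
                String value = entities.get(text.substring(i + 1, semi));
                if (value != null) {
                    out.append(value);
                    i = semi + 1;
                    continue;
                }
            }
            // Not a known entity reference: copy the character unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> entities = new Hashtable<>();
        entities.put("amp", "&");
        entities.put("lt", "<");
        entities.put("copy", "(c)"); // application-defined entity (assumed)
        System.out.println(expand("Fish &amp; chips &copy; 2001", entities));
    }
}
```

References that are not in the table are left untouched in this sketch; a real parser might instead report them as errors.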

Mixed content isn't supported, for example:

<Request>ItemDetail
<ItemId>553</ItemId>
</Request>

will result in an incorrect internal document representation.

XML namespaces aren't supported directly, although they won't cause any parsing difficulties. This SOAP envelope, for example, is parsed without problems:

<SOAP:Envelope xmlns:SOAP=
'http://schemas.xmlsoap.org/soap/envelope/'
xmlns:xsi=
'http://www.w3.org/1999/XMLSchema-instance'
xmlns:xsd='http://www.w3.org/1999/XMLSchema'
xmlns:SOAP-ENC='http://schemas.xmlsoap.org/soap/encoding/'
SOAP:encodingStyle=
'http://schemas.xmlsoap.org/soap/encoding/'>
</SOAP:Envelope>

The element <Envelope> is stored literally as <SOAP:Envelope> with no comprehension of the SOAP namespace prefix. The namespace declarations are kept as ordinary attributes, so the element contains five attributes: xmlns:SOAP, xmlns:xsi, xmlns:xsd, xmlns:SOAP-ENC, and SOAP:encodingStyle. Since document validation isn't supported, namespace URLs are not followed.
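The same behavior can be observed with any parser that is not namespace-aware. The sketch below uses the JDK's own DOM parser with namespace awareness switched off as a stand-in, since NanoXML itself is not assumed to be on the classpath; the class name PrefixDemo and its members are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Stand-in demo using the JDK DOM parser; this is not NanoXML's API. */
final class PrefixDemo {

    static final String SOAP =
        "<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'"
        + " xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'"
        + " xmlns:xsd='http://www.w3.org/1999/XMLSchema'"
        + " xmlns:SOAP-ENC='http://schemas.xmlsoap.org/soap/encoding/'"
        + " SOAP:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'>"
        + "</SOAP:Envelope>";

    /** Parses the text without namespace processing and returns the root element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false); // prefixes stay part of the name
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(SOAP);
        // The prefix and local part form one atomic name.
        System.out.println(root.getTagName());
        // The xmlns:* declarations count as ordinary attributes.
        System.out.println(root.getAttributes().getLength());
    }
}
```

An application that needs the namespace of an element has to split the name at the colon and look up the matching xmlns attribute itself.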

Comments are skipped by the parser and not stored internally. The first processing instruction in an XML document:

<?xml version="1.0" encoding="UTF-8"?>

is skipped and also not stored internally. Any subsequent processing instructions will throw a nanoxml.XMLParseException.

A SAX-compatible API can optionally be used with parsing (see Package nanoxml, page 586). If the SAX API is not used, retrieval of elements and attributes is through a completely proprietary API (see public class XMLElement, page 587).

Documents can also be built from scratch and written to any Writer object, or they can be modified using the addChild() and removeChild() methods.
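The build-then-write pattern can be sketched with a tiny tree class. This is not the NanoXML XMLElement class: the method names addChild() and removeChild() and the idea of writing to a Writer come from the text above, while TinyElement, setContent(), and render() are hypothetical names, and the signatures are assumptions.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a lightweight element tree; not NanoXML's XMLElement. */
final class TinyElement {
    private final String tagName;
    private String content = "";
    private final List<TinyElement> children = new ArrayList<>();

    TinyElement(String tagName) {
        this.tagName = tagName;
    }

    TinyElement setContent(String text) {
        this.content = text;
        return this;
    }

    void addChild(TinyElement child) {
        children.add(child);
    }

    void removeChild(TinyElement child) {
        children.remove(child);
    }

    /** Writes either the text content or the children, never both (no mixed content). */
    void write(Writer out) throws IOException {
        out.write("<" + tagName + ">");
        if (children.isEmpty()) {
            out.write(escape(content));
        } else {
            for (TinyElement child : children) {
                child.write(out);
            }
        }
        out.write("</" + tagName + ">");
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Convenience helper: serializes the tree into a String via a StringWriter. */
    static String render(TinyElement root) {
        StringWriter writer = new StringWriter();
        try {
            root.write(writer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        TinyElement request = new TinyElement("Request");
        request.addChild(new TinyElement("ItemId").setContent("553"));
        System.out.println(render(request));
    }
}
```

Because write() accepts any Writer, the same tree could be sent to a file, a socket, or an in-memory buffer without changes.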

The JAR file size of this release, excluding the optional SAX component, is 6047 bytes. Adding SAX functionality brings the library up to 8618 bytes. But this small size doesn't come without a price. As with most parsers reviewed in this chapter, NanoXML is not XML 1.0 compliant.

You have two choices for parsing: a DOM-style or SAX 1.0 interface. Both choices are multiple-pass parsers, iterating over the same document more than once in order to build an internal representation (this is true even of the SAX interface because it is built on top of the DOM-style interface). This negatively affects performance. Finally, even if the SAX parser is used, an entire document tree is built and kept in memory until the parser object is garbage collected. Not only does this lead to a large memory footprint when parsing large documents, but depending upon the garbage collection mechanism used by your VM, it may severely fragment the heap and prevent subsequent object creation. We discuss this issue in the Java KVM section (page 575).
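The layering described above, an event interface built on top of a tree, can be sketched as follows. The JDK's DOM parser stands in for NanoXML's tree, and the Listener interface is a simplified, hypothetical substitute for the SAX 1.0 handler; the point is only that no event fires until the whole tree is in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Sketch of an event API layered over a tree; names are illustrative only. */
final class TreeBackedEvents {

    /** Simplified, hypothetical callback interface (not the SAX 1.0 handler). */
    interface Listener {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    static void parse(String xml, Listener listener) {
        Node root;
        try {
            // Pass 1: the entire document is parsed into an in-memory tree.
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        // Pass 2: the finished tree is walked and events are replayed.
        walk(root, listener);
    }

    private static void walk(Node node, Listener listener) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            listener.characters(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments and other node types are skipped in this sketch
        }
        listener.startElement(node.getNodeName());
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, listener);
        }
        listener.endElement(node.getNodeName());
    }

    /** Collects the events for a document as strings, for inspection. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        parse(xml, new Listener() {
            public void startElement(String name) { events.add("start:" + name); }
            public void characters(String text) { events.add("text:" + text); }
            public void endElement(String name) { events.add("end:" + name); }
        });
        return events;
    }

    public static void main(String[] args) {
        trace("<Request><ItemId>553</ItemId></Request>").forEach(System.out::println);
    }
}
```

With this design the memory cost is that of the full tree regardless of which interface the application uses, which is the trade-off the paragraph above warns about.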

Parsing large documents with this version of NanoXML may be inappropriate for lightweight clients. However, for relatively small documents, it could be just the thing.


Copyright © 2008 United Business Media Limited