Network Computing (powered by InformationWeek Business Technology Network)




Netdesign Manual

Part 2

Java XML Programmers Reference

Chapter 11: XML Tools for Information Appliances


September 24, 2001



Class XMLParseException
nanoxml

public class XMLParseException
extends RuntimeException

This class is usually thrown when a non-well-formed document is parsed or a processing instruction that isn't in the preamble is encountered.

This class represents a NanoXML parsing exception. It extends java.lang.RuntimeException. Even though processing instructions that aren't in a document's preamble can certainly be part of a well-formed XML document, NanoXML doesn't like it and will throw an exception.

Package nanoxml.sax

Adding the optional SAX 1.0 parser to NanoXML increases the library's size by another 2,571 bytes (for a total of 8,618 bytes). This is quite small, but it also increases your dependencies. For example, the package makes use of java.net.URL, java.io.InputStream, and java.util.Locale among others. Depending upon your particular virtual machine and device profile, some or all of these classes may not be available. You might be able to get creative and rewrite some of the package if you want to reduce its dependencies as was done for the Java KVM.

Beyond the possibility that some required classes are missing, SAX has another drawback: it is a push parser. After you tell the parser to begin, it calls back (or pushes) into your application code to notify you of parse events. This model forces your code to maintain state within the callback class(es), and to evaluate that state at each event.

One of the nice things we've seen in the previous code examples is that there was no need for state information. This is much more programmer-friendly than the code we are about to see.

In SAX's defense, however, it will allow you to plug another parser underneath the hood without any code changes on your part. It's a standardized API. All you need to do is use different class files or a different JAR. If you're seeing performance or memory usage problems with NanoXML, this will allow you to plug another parser into your application without much work. However, you might be better off ignoring the SAX standard and using the pull model of parsing, such as that used by kXML and XPP.
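The pluggability described above can be sketched as follows. This is a minimal illustration, not code from NanoXML: the class PluggableSaxSketch and its methods are hypothetical names, and the JDK's built-in parser (obtained through javax.xml.parsers) is assumed as a stand-in for whichever SAX 1.0 parser you actually deploy. The point is that elementNames() depends only on the org.xml.sax.Parser interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Sketch: application code written only against the SAX 1.0 interfaces.
class PluggableSaxSketch {

    // Hypothetical helper: accepts any org.xml.sax.Parser implementation
    // and returns the element names in document order.
    static List<String> elementNames(Parser parser, String xml) {
        final List<String> names = new ArrayList<>();
        parser.setDocumentHandler(new HandlerBase() {
            @Override
            public void startElement(String name, AttributeList attrs) {
                names.add(name);
            }
        });
        try {
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no throws clause.
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }

    // Assumption: the JDK's parser stands in for the deployed one.
    // Returning a different Parser implementation here (for example,
    // an instance of nanoxml.sax.SAXParser) would be the only change.
    static Parser jdkParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getParser();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Request><Parameters><ItemId/></Parameters></Request>";
        System.out.println(elementNames(jdkParser(), xml));
    }
}
```

Because the handler never names a concrete parser class, replacing the parser is confined to jdkParser().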

Unfortunately, a standard pull API for XML parsing has yet to be decided upon, so if you choose a pull parser, your upgrade path is unclear.
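For comparison, here is a rough sketch of what pull-style code looks like. It assumes a JDK that ships the javax.xml.stream (StAX) API, which was standardized after this chapter was written; it is not the kXML or XPP API, and the class name PullSketch and the inline sample document are hypothetical. Note that the loop asks the parser for the next event, so no member variables are needed to carry state between callbacks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch of the pull model: the application drives the parser.
class PullSketch {

    static final String SAMPLE =
        "<Request name=\"ItemDetail\"><Parameters>"
        + "<ItemId type=\"Integer\">553</ItemId>"
        + "<ItemId type=\"Integer\">554</ItemId>"
        + "</Parameters></Request>";

    // Returns one line per ItemId element.
    static List<String> describeItems(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "ItemId".equals(reader.getLocalName())) {
                    // Attribute and text are read in the same place,
                    // so nothing has to be remembered between events.
                    String type = reader.getAttributeValue(null, "type");
                    String id = reader.getElementText();
                    lines.add("Type = " + type + " and item id = " + id);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped so that callers of this sketch need no throws clause.
            throw new IllegalStateException("parse failed", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeItems(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```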

Class SAXParser
nanoxml.sax

public class SAXParser
implements org.xml.sax.Parser

This class implements the org.xml.sax.Parser interface published by David Megginson. It's built on top of the class XMLElement so it has all the features (or lack thereof) outlined in the Features table on page 584. Here is a list of other features applicable to this particular parser:

Feature | Support for org.xml.sax.Parser | Notes
Locales | English language only | SAXException thrown if another type of locale is set with setLocale()
Whitespace | ignorableWhitespace() is never called | Leading whitespace in #PCDATA skipped
DTD validation | None | The objects implementing interface org.xml.sax.DTDHandler and interface org.xml.sax.EntityResolver in your application are never called back
Mixed content | None | XML such as <Request>widgets<Item>553</Item></Request> isn't permitted
Document locator | Support for line numbers and system identifiers | org.xml.sax.Locator.getLineNumber() and org.xml.sax.Locator.getSystemId() are supported
Processing instructions | processingInstruction() is never called |

Additionally, this parser only supports locales using the English language. It will throw a SAXException if another type of locale is set using the setLocale() method. Attribute data types are always reported as CDATA.

Since SAXParser makes use of the nanoxml.XMLElement class internally, it has to choose one of the XMLElement() constructors to use. These constructors dictate certain parsing behaviors (see the section public class XMLElement, page 587). The default parsing behavior is to treat element and attribute names case-insensitively, to skip leading whitespace in #PCDATA, and to expand only the entities &amp;, &lt;, &gt;, &apos;, and &quot;. However, this behavior can be overridden by deriving your own class from SAXParser and overriding its protected createTopElement() method to call a different XMLElement() constructor.

Error handlers and document locators are supported, as well as parsing from a URI.
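As a rough illustration of how an error handler and a document locator fit together in SAX 1.0, consider the sketch below. The class name LocatorSketch and the trace() helper are hypothetical, and the JDK's built-in parser is assumed as a stand-in for nanoxml.sax.SAXParser, so the exact line numbers reported may differ between parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.Parser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Sketch: record where each element starts and where a fatal error occurs.
class LocatorSketch extends HandlerBase {
    private Locator locator;                       // handed over by the parser
    final List<String> log = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String name, AttributeList attrs) {
        // A parser is not obliged to supply a locator, hence the null check.
        int line = (locator == null) ? -1 : locator.getLineNumber();
        log.add(name + "@" + line);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        log.add("fatal@" + e.getLineNumber());
        throw e;                                   // stop parsing
    }

    // Parses the text and returns the recorded events.
    static List<String> trace(String xml) {
        LocatorSketch handler = new LocatorSketch();
        try {
            // Assumption: JDK parser used in place of nanoxml.sax.SAXParser.
            Parser parser =
                SAXParserFactory.newInstance().newSAXParser().getParser();
            parser.setDocumentHandler(handler);
            parser.setErrorHandler(handler);
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by fatalError(); nothing more to do here.
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return handler.log;
    }

    public static void main(String[] args) {
        System.out.println(trace("<a>\n<b/>\n</a>"));
        System.out.println(trace("<a>\n<b>\n</a>"));
    }
}
```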

Usage and Examples

Let's look at what it would take to implement one of our previous examples using the SAX interface. This will give you a good idea about what I mean by having to maintain state in your application for push parsers like SAX.

This example reads the XML document from the section Attribute Methods (page 608), request.xml, and outputs the #PCDATA and type attribute value for each ItemId element. Recall the XML document looks like this:

<Request name="ItemDetail">
  <Parameters>
    <ItemId type="Integer">553</ItemId>
    <ItemId type="Integer">554</ItemId>
  </Parameters>
</Request>

Remember, we want the same output that the previous code (which used NanoXML) produced. To refresh your memory, the output was:

Type = Integer and item id = 553
Type = Integer and item id = 554

Here is the code that uses SAX. You'll have to put David Megginson's sax.jar for SAX 1.0 (http://www.megginson.com/SAX/SAX1/index.html) in your CLASSPATH, as well as nanoxml-sax.jar.

import nanoxml.sax.SAXParser;
import org.xml.sax.*;

public class RequestHandler extends HandlerBase {
    // State carried from startElement() to characters()
    private String _type;

    public RequestHandler() throws Exception {
        SAXParser parser = new SAXParser();
        parser.setDocumentHandler(this);
        parser.setErrorHandler(this);
        parser.parse("request.xml");
    }

    public void startElement(String name,
                             AttributeList attrs) throws SAXException {
        if (name.equals("ItemId")) {
            // The attribute is looked up as "TYPE"; by default the
            // parser treats names case-insensitively.
            String type = attrs.getValue("TYPE");
            _type = (type == null) ? "???" : type;
        }
    }

    public void characters(char ch[], int start,
                           int length) throws SAXException {
        System.out.print("Type = " + _type + " and item id = ");
        // Only ch[start] .. ch[start + length - 1] belongs to this event.
        System.out.println(new String(ch, start, length));
    }

    public static void main(String args[]) throws Exception {
        RequestHandler t = new RequestHandler();
    }
}

Notice the private member variable _type, which saves the value of the type attribute for the element currently being parsed. SAX delivers an element's attributes and its character data in separate callbacks, so the handler has to carry this state from one callback to the next. This is a small example, too. For more complex parsing, the amount of state needing to be saved increases.

This code is also quite a bit larger than the code that used nanoxml.XMLElement. SAX just isn't as programmer-friendly.
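To give a feel for how the saved state grows, here is a sketch of a slightly more defensive handler. It is an illustration under stated assumptions rather than part of the original example: the class name ItemIdCollector is hypothetical, the JDK's built-in parser stands in for nanoxml.sax.SAXParser (so the attribute is looked up as "type" rather than "TYPE"), and the handler collects its output in a list instead of printing it. It adds a flag so that text outside ItemId is ignored, and a buffer because a SAX parser may deliver one element's text in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Sketch: three pieces of state instead of one.
class ItemIdCollector extends HandlerBase {
    private String type;                              // saved in startElement()
    private boolean inItemId;                         // true inside <ItemId>
    private final StringBuilder text = new StringBuilder();
    final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attrs) {
        if ("ItemId".equals(name)) {
            String value = attrs.getValue("type");
            type = (value == null) ? "???" : value;
            inItemId = true;
            text.setLength(0);                        // start a fresh buffer
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inItemId) {
            text.append(ch, start, length);           // may be called repeatedly
        }
    }

    @Override
    public void endElement(String name) {
        if ("ItemId".equals(name)) {
            lines.add("Type = " + type + " and item id = "
                + text.toString().trim());
            inItemId = false;
        }
    }

    // Parses the text and returns one line per ItemId element.
    static List<String> collect(String xml) {
        ItemIdCollector handler = new ItemIdCollector();
        try {
            Parser parser =
                SAXParserFactory.newInstance().newSAXParser().getParser();
            parser.setDocumentHandler(handler);
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no throws clause.
            throw new IllegalStateException("parse failed", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<Request name=\"ItemDetail\">\n<Parameters>\n"
            + "<ItemId type=\"Integer\">553</ItemId>\n"
            + "<ItemId type=\"Integer\">554</ItemId>\n"
            + "</Parameters>\n</Request>";
        for (String line : collect(xml)) {
            System.out.println(line);
        }
    }
}
```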

