18 The XML Files: Development of XML/XSL Applications Using WebSphere Studio
2.1 XML Processor (parser)
An XML processor can be either a validating or a non-validating parser. Both kinds
of parsers report well-formedness violations in an XML document. According to the
XML 1.0 specification:
http://www.w3.org/TR/REC-xml#proc-types
Validating processors must, at user option, report violations of the
constraints expressed by the declarations in the DTD, and failures to fulfill the
validity constraints given in this specification. To accomplish this, validating
XML Processors must read and process the entire DTD and all external
parsed entities referenced in the document.
Non-validating processors are required to check only the document entity,
including the entire internal DTD subset, for well-formedness. While they are not
required to check the document for validity, they are required to process all the
declarations they read in the internal DTD subset and in any parameter entity
that they read. This is done up to the first reference to a parameter entity that
they do not read; that is to say, they must use the information in those
declarations to normalize attribute values, include the replacement text of internal
entities, and supply default attribute values. Except when standalone="yes", they
must not process entity declarations or attribute-list declarations encountered
after a reference to a parameter entity that is not read, since the entity may have
contained overriding declarations.
From the definition above, a validating parser must read the entire DTD and
check the XML document against it. A non-validating parser may not need the
entire DTD, but it must still apply the declarations it reads, such as default
values for attributes. Both parsers check for the well-formedness of the document.
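The behavior of a non-validating parser can be sketched with the standard JAXP API. The example below is only an illustration, not code from this book: the class name NonValidatingDemo, the helper parseOrNull, and the sample documents are assumptions made for the sketch. It parses a document whose internal DTD subset declares a default attribute value, then a document that is not well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NonValidatingDemo {
    // Hypothetical sample: the internal DTD subset declares a default
    // value for the "unit" attribute, which the instance omits.
    static final String WITH_DEFAULT =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE price [\n"
        + "  <!ELEMENT price (#PCDATA)>\n"
        + "  <!ATTLIST price unit CDATA \"USD\">\n"
        + "]>\n"
        + "<price>10</price>";

    // Hypothetical sample: <amount> is never closed, so this is not well-formed.
    static final String NOT_WELL_FORMED = "<price><amount>10</price>";

    // Parses in non-validating mode; returns null if the text is not well-formed.
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // explicit, although false is the JAXP default
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return null; // well-formedness violation
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseOrNull(WITH_DEFAULT);
        // The default from the internal subset is supplied without validation.
        System.out.println(doc.getDocumentElement().getAttribute("unit"));
        // The malformed document is rejected even in non-validating mode.
        System.out.println(parseOrNull(NOT_WELL_FORMED) == null);
    }
}
```

The point of the sketch is that turning validation off does not mean the DTD is ignored: declarations in the internal subset still affect the parsed result.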
Most parsers can be run in either validating or non-validating mode. Validating XML
documents is crucial in the development and testing stages of the software
development life cycle. However, validation has a performance cost. In
production, when the reliability of a system's data is already established and
the documents are expected to have complex DTDs and XML Schemas, validation can
be turned off. Some parsers are non-validating by default.
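As a rough illustration of switching modes, the following sketch parses the same document twice through JAXP, once with validation off and once with it on. The class name ValidationToggle, the method parse, and the sample document are hypothetical; the sketch assumes a JAXP implementation whose parser supports DTD validation, as the one bundled with the JDK does.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationToggle {
    // Hypothetical sample: well-formed, but invalid, because the DTD
    // requires a <name> child inside <item>.
    static final String INVALID =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE item [\n"
        + "  <!ELEMENT item (name)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "]>\n"
        + "<item/>";

    // Parses xml in the requested mode and returns the problems reported.
    static List<String> parse(String xml, boolean validating) {
        final List<String> problems = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // Validity violations arrive here as recoverable errors.
                    problems.add(e.getMessage());
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add(e.getMessage()); // fatal (well-formedness) error
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        // JAXP factories are non-validating unless told otherwise.
        System.out.println(DocumentBuilderFactory.newInstance().isValidating());
        System.out.println(parse(INVALID, false)); // no problems reported
        System.out.println(parse(INVALID, true));  // the content model violation
    }
}
```

Keeping the mode behind a single boolean makes it straightforward to validate during development and testing and to switch validation off through configuration in production.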
Parsers are of two types: tree-based or event-based. These
are discussed further in Chapter 3; here is an overview:
Tree-based parsing
In tree-based parsing, the parser attempts to create a hierarchical structure for
the entire document. For a huge document, this will be extremely
memory-intensive. The parser will make the elements and attributes available