W3C Document Object Model (DOM)
The W3C XML DOM Objects
- Element - An Element
- Attribute - An attribute
- Text - Text content of an element or attribute
- CDATASection - CDATA section
- EntityReference - Reference to an entity
- Entity - Indication of a parsed or unparsed entity
- ProcessingInstruction - A processing instruction
- Comment - Content of an XML comment
- Document - The Document object
- DocumentType - Reference to the <!DOCTYPE> declaration
- DocumentFragment - Reference to a fragment of a document
- Notation - Holder for a notation
Node-related Objects
- Node - A single node in the document tree.
- NodeList - A list of node objects.
- NamedNodeMap - A collection of nodes that can be accessed by name,
such as the attributes of an element
XML Document as a Tree of Nodes
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<GREETING>Hello from XML</GREETING>
<MESSAGE>Welcome to Programming XML in Java</MESSAGE>
</DOCUMENT>
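The tree structure of this document can be made visible with a short sketch. It assumes the JDK's built-in JAXP parser (`DocumentBuilderFactory`) rather than the Xerces `DOMParser` used later in these slides, so it runs without xerces.jar. The class name `TreeSketch` and its methods are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; not part of the original slides.
class TreeSketch
{
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<DOCUMENT>\n"
        + "<GREETING>Hello from XML</GREETING>\n"
        + "<MESSAGE>Welcome to Programming XML in Java</MESSAGE>\n"
        + "</DOCUMENT>\n";

    // Parses the XML text (JAXP, an assumption) and returns an indented outline.
    static String outline(String xml)
    {
        try {
            Node doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Appends one line per document, element, or non-blank text node.
    static void walk(Node node, int depth, StringBuilder out)
    {
        String label = null;
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                label = "Document";
                break;
            case Node.ELEMENT_NODE:
                label = "Element: " + node.getNodeName();
                break;
            case Node.TEXT_NODE: {
                // Whitespace between elements is also a text node; skip it.
                String text = node.getNodeValue().trim();
                if (text.length() > 0) label = "Text: " + text;
                break;
            }
        }
        if (label != null) {
            for (int i = 0; i < depth; i++) out.append("  ");
            out.append(label).append("\n");
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args)
    {
        System.out.print(outline(XML));
    }
}
```

Running `main` prints the Document node at the top, the DOCUMENT element beneath it, and each of GREETING and MESSAGE with its text as a child node one level deeper.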
Levels of DOM
- Level 0 - Early versions of DOM in web browsers
- Level 1 - Concentrates on the HTML and XML document models.
- Level 2 - Defines a set of objects and interfaces for accessing
and manipulating document objects. It includes a style sheet object model.
- Level 3 - Addresses document loading and saving, as well as content
models (such as DTDs and schemas) with document validation support.
DOM XML Parsers
Working with DOM
Problem: Print the number of <student> elements in this XML file:
<?xml version="1.0"?>
<course>
<name id="csci_2962">Programming XML in Java</name>
<teacher id="jp">
<name>John Punin</name>
</teacher>
<student id="js">
<name>John Smith</name>
<hw1>30</hw1>
<hw2>70</hw2>
<project>80</project>
<final>85</final>
</student>
<student id="gl">
<name>George Lucas</name>
<hw1>80</hw1>
<hw2>90</hw2>
<project>100</project>
<final>40</final>
</student>
<student id="lr">
<name>Elizabeth Roberts</name>
<hw1>60</hw1>
<hw2>95</hw2>
<project>50</project>
<final>90</final>
</student>
</course>
Using Xerces XML Parser
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
public class DOMCountNames
{
    public static void main(String[] args)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(args[0]);
            Document doc = parser.getDocument();
            .
            .
            .
        }
        catch(Exception e){
            e.printStackTrace(System.err);
        }
    }
}
The DOMParser Class
java.lang.Object
  |
  +--org.apache.xerces.framework.XMLParser
        |
        +--org.apache.xerces.parsers.DOMParser
- void parse(String systemId) - Parses the input source given by a system identifier
- Document getDocument() - Returns the document itself
Document Interface Methods
- Attr createAttribute(String name) - Creates an attribute of the given name
- Element createElement(String tagName) - Creates an element of the type given
- Text createTextNode(String data) - Creates a Text node
- Element getDocumentElement() - Gets the root element of the document
- Element getElementById(String elementId) - Gets the element with the given ID
- NodeList getElementsByTagName(String tagname) - Returns a NodeList of all the elements with a given tag name
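The creation methods listed above can be combined to build a small document in memory. The following is a minimal sketch, assuming the JDK's JAXP `DocumentBuilderFactory` to obtain an empty Document (the slides use Xerces instead). `BuildSketch` is a hypothetical name, and `appendChild`, `setValue` and `setAttributeNode` are standard DOM methods that do not appear in the list above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class for illustration; not part of the original slides.
class BuildSketch
{
    // Builds <course><name id="csci_2962">Programming XML in Java</name></course>.
    static Document build()
    {
        try {
            // JAXP is an assumption here; it supplies an empty Document.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

            Element course = doc.createElement("course");
            doc.appendChild(course);            // course becomes the root element

            Element name = doc.createElement("name");
            Attr id = doc.createAttribute("id");
            id.setValue("csci_2962");
            name.setAttributeNode(id);
            name.appendChild(doc.createTextNode("Programming XML in Java"));
            course.appendChild(name);
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        Document doc = build();
        System.out.println("Root: " + doc.getDocumentElement().getNodeName());
        System.out.println("<name> elements: "
            + doc.getElementsByTagName("name").getLength());
    }
}
```

Note that created nodes are not part of the tree until they are appended to a parent; `createElement` alone only allocates the node.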
NodeList Interface Methods
- int getLength() - Gets the number of nodes in this list
- Node item(int index) - Gets the item at the specified index value in the collection
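A NodeList is typically walked with an index loop over `getLength()` and `item()`. The sketch below collects the text of every <name> element in a shortened version of the course file. It assumes the JDK's JAXP parser instead of Xerces, and `NodeListSketch`, `nameTexts` and the inline XML string are hypothetical names and data used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; not part of the original slides.
class NodeListSketch
{
    static final String XML =
        "<course>"
        + "<name id=\"csci_2962\">Programming XML in Java</name>"
        + "<teacher id=\"jp\"><name>John Punin</name></teacher>"
        + "<student id=\"js\"><name>John Smith</name></student>"
        + "</course>";

    // Returns the text of each <name> element, in document order.
    static List<String> nameTexts(String xml)
    {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList names = doc.getElementsByTagName("name");
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < names.getLength(); i++) {
                // Assumes each <name> holds exactly one text node.
                result.add(names.item(i).getFirstChild().getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(nameTexts(XML));
    }
}
```

`getElementsByTagName` matches elements at any depth, which is why the teacher's and student's <name> children are returned together with the course name.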
Using Xerces XML Parser (2)
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
public class DOMCountNames
{
    public static void main(String[] args)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(args[0]);
            Document doc = parser.getDocument();
            NodeList nodelist = doc.getElementsByTagName("student");
            System.out.println(args[0] + " has " + nodelist.getLength()
                + " <student> elements.");
        }
        catch(Exception e){
            e.printStackTrace(System.err);
        }
    }
}
Compiling and Running DOMCountNames example
javac -classpath ":xerces.jar" DOMCountNames.java
java -classpath ":xerces.jar" DOMCountNames students.xml
students.xml has 3 <student> elements.
- xerces.jar, DOMCountNames.java, DOMCountNames.class and students.xml are in the same directory
Node Interface Methods
NamedNodeMap getAttributes()
Gets a NamedNodeMap containing the attributes of this node
NodeList getChildNodes()
Gets a NodeList that contains all children of this node
String getLocalName()
Gets the local name of the node
String getNodeName()
Gets the name of this node
String getNodeValue()
Gets the value of this node
Node getParentNode()
Gets the parent of this node
short getNodeType()
Gets a code representing the type of the node
Node Types
static short ATTRIBUTE_NODE
static short CDATA_SECTION_NODE
static short COMMENT_NODE
static short DOCUMENT_FRAGMENT_NODE
static short DOCUMENT_NODE
static short DOCUMENT_TYPE_NODE
static short ELEMENT_NODE
static short ENTITY_NODE
static short ENTITY_REFERENCE_NODE
static short NOTATION_NODE
static short PROCESSING_INSTRUCTION_NODE
static short TEXT_NODE
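The constants above are what `getNodeType()` returns, so a switch over them is the usual way to tell nodes apart. As a rough sketch, the helper below maps a type code to its constant name and lists the types of a root element's children. It assumes the JDK's JAXP parser rather than Xerces; `NodeTypeSketch`, `typeName` and `childTypes` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; not part of the original slides.
class NodeTypeSketch
{
    // Maps a getNodeType() code to the name of the matching Node constant.
    static String typeName(short type)
    {
        switch (type) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN";
        }
    }

    // Lists the node types of the root element's direct children.
    static String childTypes(String xml)
    {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            NodeList children = root.getChildNodes();
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < children.getLength(); i++) {
                if (i > 0) out.append(", ");
                out.append(typeName(children.item(i).getNodeType()));
            }
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        // A comment, an empty element, and some text inside <a>.
        System.out.println(childTypes("<a><!--note--><b/>text</a>"));
    }
}
```

For the sample input in `main`, the three children of <a> are reported as a comment node, an element node and a text node, in that order.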
Traversing an Entire Document
Problem: Print a list of the Elements of this XML file
<?xml version="1.0"?>
<course>
<name id="csci_2962">Programming XML in Java</name>
<teacher id="jp">
<name>John Punin</name>
</teacher>
<student id="js">
<name>John Smith</name>
<hw1>30</hw1>
<hw2>70</hw2>
<project>80</project>
<final>85</final>
</student>
<student id="gl">
<name>George Lucas</name>
<hw1>80</hw1>
<hw2>90</hw2>
<project>100</project>
<final>40</final>
</student>
<student id="lr">
<name>Elizabeth Roberts</name>
<hw1>60</hw1>
<hw2>95</hw2>
<project>50</project>
<final>90</final>
</student>
</course>
Traversing an Entire Document (2)
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
class DisplayElements
{
    public static void displayDocument(String uri)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document doc = parser.getDocument();
            display_names(doc);
        }
        catch(Exception e){
            e.printStackTrace(System.err);
        }
    }
    public static void display_names(Node node)
    {
        .
        .
        .
    }
}
public class DOMNameElements
{
    public static void main(String[] args)
    {
        DisplayElements.displayDocument(args[0]);
    }
}
Traversing an Entire Document - Using Recursion (3)
Handling Document Nodes:
public static void display_names(Node node)
{
    if(node == null) {
        return;
    }
    int type = node.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE: {
            display_names(((Document)node).getDocumentElement());
            break;
        }
        case Node.ELEMENT_NODE: {
            .
            .
            .
            break;
        }
    }
}
Traversing an Entire Document - Using Recursion (4)
Handling Element Nodes:
case Node.ELEMENT_NODE: {
    System.out.println("Element : " + node.getNodeName());
    NodeList childNodes = node.getChildNodes();
    if(childNodes != null) {
        int length = childNodes.getLength();
        for (int loopIndex = 0; loopIndex < length ; loopIndex++)
        {
            display_names(childNodes.item(loopIndex));
        }
    }
    break;
}
Putting it all together
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
class DisplayElements
{
    public static void displayDocument(String uri)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document doc = parser.getDocument();
            display_names(doc);
        }
        catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }
    public static void display_names(Node node)
    {
        if(node == null) {
            return;
        }
        int type = node.getNodeType();
        switch (type) {
            case Node.DOCUMENT_NODE: {
                display_names(((Document)node).getDocumentElement());
                break;
            }
            case Node.ELEMENT_NODE: {
                System.out.println("Element : " + node.getNodeName());
                NodeList childNodes = node.getChildNodes();
                if(childNodes != null) {
                    int length = childNodes.getLength();
                    for (int loopIndex = 0; loopIndex < length ; loopIndex++)
                    {
                        display_names(childNodes.item(loopIndex));
                    }
                }
                break;
            }
        }
    }
}
public class DOMNameElements
{
    public static void main(String[] args)
    {
        DisplayElements.displayDocument(args[0]);
    }
}
Compiling and Running DOMNameElements example
javac -classpath ":xerces.jar" DOMNameElements.java
java -classpath ":xerces.jar" DOMNameElements students.xml
Element : course
Element : name
Element : teacher
Element : name
Element : student
Element : name
Element : hw1
Element : hw2
Element : project
Element : final
Element : student
Element : name
Element : hw1
Element : hw2
Element : project
Element : final
Element : student
Element : name
Element : hw1
Element : hw2
Element : project
Element : final
- xerces.jar, DOMNameElements.java, DOMNameElements.class
and students.xml are in the same directory
Working with DOM (Handling Nodes)
Problem: Print each student's average and the course average
<?xml version="1.0"?>
<course>
<name id="csci_2962">Programming XML in Java</name>
<teacher id="jp">
<name>John Punin</name>
</teacher>
<student id="js">
<name>John Smith</name>
<hw1>30</hw1>
<hw2>70</hw2>
<project>80</project>
<final>85</final>
</student>
<student id="gl">
<name>George Lucas</name>
<hw1>80</hw1>
<hw2>90</hw2>
<project>100</project>
<final>40</final>
</student>
<student id="lr">
<name>Elizabeth Roberts</name>
<hw1>60</hw1>
<hw2>95</hw2>
<project>50</project>
<final>90</final>
</student>
</course>
Creating DOMParser and Document
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
class Grades
{
    public static void computeGrades(String uri)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document doc = parser.getDocument();
            traverse_tree(doc);
            compute_final_grades();
        }
        catch(Exception e){
            e.printStackTrace(System.err);
        }
    }
    .
    .
    .
}
public class DOMGrades
{
    public static void main(String[] args)
    {
        Grades.computeGrades(args[0]);
    }
}
Handling Document Node, Element Node and Text Node
// grades[s][0..3] hold hw1, hw2, project and final; grades[s][4] holds the average
static float grades[][] = new float[100][5];
static int nstudent = 0;
// Column of the grade element currently being visited, or -1 if none
static int gi = -1;
private static void traverse_tree(Node node)
{
    if(node == null) {
        return;
    }
    int type = node.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE: {
            traverse_tree(((Document)node).getDocumentElement());
            break;
        }
        case Node.ELEMENT_NODE: {
            String elementName = node.getNodeName();
            gi = -1;
            if(elementName.equals("hw1"))
                gi = 0;
            else if(elementName.equals("hw2"))
                gi = 1;
            else if(elementName.equals("project"))
                gi = 2;
            else if(elementName.equals("final"))
                gi = 3;
            else if(elementName.equals("student"))
                nstudent++;
            NodeList childNodes = node.getChildNodes();
            if(childNodes != null) {
                int length = childNodes.getLength();
                for (int loopIndex = 0; loopIndex < length ; loopIndex++)
                {
                    traverse_tree(childNodes.item(loopIndex));
                }
            }
            break;
        }
        case Node.TEXT_NODE: {
            // Whitespace-only text between elements trims to "" and is skipped
            String chData = node.getNodeValue().trim();
            if(chData.indexOf("\n") < 0 && chData.length() > 0) {
                if(gi >= 0)
                    grades[nstudent-1][gi] = Integer.parseInt(chData);
            }
            break;
        }
    }
}
Computing grades after tree traversal
static float grades[][] = new float[100][5];
static int nstudent = 0;
private static void compute_final_grades()
{
    float Ave = 0;
    int i = 0, j = 0;
    System.out.println("Grades");
    for(i = 0; i < nstudent ; i++)
    {
        float total = 0;
        for(j = 0; j < 4; j++) {
            total += grades[i][j];
        }
        grades[i][4] = total/4;
        Ave += grades[i][4];
        System.out.println("Student " + i + "=" + grades[i][4]);
    }
    Ave /= nstudent;
    System.out.println("Class Average =" + Ave);
}
Putting it all together
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
class Grades
{
    // grades[s][0..3] hold hw1, hw2, project and final; grades[s][4] holds the average
    static float grades[][] = new float[100][5];
    static int nstudent = 0;
    // Column of the grade element currently being visited, or -1 if none
    static int gi = -1;
    public static void computeGrades(String uri)
    {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document doc = parser.getDocument();
            traverse_tree(doc);
            compute_final_grades();
        } catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }
    private static void compute_final_grades()
    {
        float Ave = 0;
        int i = 0, j = 0;
        System.out.println("Grades");
        for(i = 0; i < nstudent ; i++)
        {
            float total = 0;
            for(j = 0; j < 4; j++) {
                total += grades[i][j];
            }
            grades[i][4] = total/4;
            Ave += grades[i][4];
            System.out.println("Student " + i + "=" + grades[i][4]);
        }
        Ave /= nstudent;
        System.out.println("Class Average =" + Ave);
    }
    private static void traverse_tree(Node node)
    {
        if(node == null) {
            return;
        }
        int type = node.getNodeType();
        switch (type) {
            case Node.DOCUMENT_NODE: {
                traverse_tree(((Document)node).getDocumentElement());
                break;
            }
            case Node.ELEMENT_NODE: {
                String elementName = node.getNodeName();
                gi = -1;
                if(elementName.equals("hw1"))
                    gi = 0;
                else if(elementName.equals("hw2"))
                    gi = 1;
                else if(elementName.equals("project"))
                    gi = 2;
                else if(elementName.equals("final"))
                    gi = 3;
                else if(elementName.equals("student"))
                    nstudent++;
                NodeList childNodes = node.getChildNodes();
                if(childNodes != null) {
                    int length = childNodes.getLength();
                    for (int loopIndex = 0; loopIndex < length ; loopIndex++)
                    {
                        traverse_tree(childNodes.item(loopIndex));
                    }
                }
                break;
            }
            case Node.TEXT_NODE: {
                // Whitespace-only text between elements trims to "" and is skipped
                String chData = node.getNodeValue().trim();
                if(chData.indexOf("\n") < 0 && chData.length() > 0) {
                    if(gi >= 0)
                        grades[nstudent-1][gi] = Integer.parseInt(chData);
                }
                break;
            }
        }
    }
}
public class DOMGrades
{
    public static void main(String[] args)
    {
        Grades.computeGrades(args[0]);
    }
}
Compiling and Running DOMGrades example
javac -classpath ":xerces.jar" DOMGrades.java
java -classpath ":xerces.jar" DOMGrades students.xml
Grades
Student 0=66.25
Student 1=77.5
Student 2=73.75
Class Average =72.5
- xerces.jar, DOMGrades.java, DOMGrades.class and students.xml are in
the same directory
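The traversal above tracks the current grade column in the static field `gi`. An alternative sketch, shown here only for comparison, looks up each grade element directly under its <student> element, so no shared state is needed. It assumes the JDK's JAXP parser instead of Xerces, uses the DOM Level 3 method `getTextContent()`, and assumes every student has all four grade elements. `AverageSketch`, `averages` and the shortened inline XML are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; not part of the original slides.
class AverageSketch
{
    static final String[] PARTS = { "hw1", "hw2", "project", "final" };

    // Two of the students from students.xml, without the surrounding elements.
    static final String XML =
        "<course>"
        + "<student id=\"js\"><name>John Smith</name>"
        + "<hw1>30</hw1><hw2>70</hw2><project>80</project><final>85</final></student>"
        + "<student id=\"gl\"><name>George Lucas</name>"
        + "<hw1>80</hw1><hw2>90</hw2><project>100</project><final>40</final></student>"
        + "</course>";

    // Returns one average per <student>, in document order.
    static float[] averages(String xml)
    {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList students = doc.getElementsByTagName("student");
            float[] result = new float[students.getLength()];
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                float total = 0;
                for (String part : PARTS) {
                    // Assumes exactly one element of each kind per student.
                    String text = student.getElementsByTagName(part)
                        .item(0).getTextContent().trim();
                    total += Integer.parseInt(text);
                }
                result[i] = total / PARTS.length;
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        float[] avgs = averages(XML);
        for (int i = 0; i < avgs.length; i++) {
            System.out.println("Student " + i + "=" + avgs[i]);
        }
    }
}
```

The recursive version visits every node once, while this version runs a separate search per grade element; for small files like this one the difference is unlikely to matter.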
Working with DOM (Using Attributes)
Attr Interface Methods:
String getName()
Gets the name of this attribute
Element getOwnerElement()
Gets the Element node to which this attribute is attached
String getValue()
Gets the value of the attribute as a string
NamedNodeMap Interface Methods:
int getLength()
Returns the number of nodes in this map
Node getNamedItem(String name)
Gets a node indicated by name
Node item(int index)
Gets an item in the map by index
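The two interfaces above are usually used together: `getAttributes()` returns a NamedNodeMap whose items can be cast to Attr. The following sketch lists all attributes of a single element. It assumes the JDK's JAXP parser rather than Xerces, and `AttrSketch` and `attributes` are hypothetical names. The result is copied into a sorted map because the DOM does not define an order for the items of a NamedNodeMap.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; not part of the original slides.
class AttrSketch
{
    // Returns the root element's attributes as a name -> value map, sorted by name.
    static Map<String, String> attributes(String xml)
    {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            NamedNodeMap attrs = root.getAttributes();
            Map<String, String> result = new TreeMap<String, String>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                result.put(attr.getName(), attr.getValue());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(attributes("<rectangle x=\"-3\" y=\"4\" w=\"5\" h=\"36\"/>"));
    }
}
```

When looking up a single attribute instead, keep in mind that `getNamedItem(name)` returns null if the element has no attribute with that name, so the result should be checked before calling `getValue()` on it.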
Working with DOM (Using Attributes) (2)
Problem: Compute the area of each of these figures
<?xml version="1.0"?>
<figures>
<circle x="20" y="10" r="20"/>
<rectangle x="-3" y="4" w="5" h="36"/>
<ellipse x="-5" y="6" w="30" h="50"/>
<rectangle x="7" y="23" w="58" h="45"/>
<circle x="-2" y="5" r="35"/>
<ellipse x="-10" y="-8" w="45" h="30"/>
</figures>
Working with DOM (Using Attributes) (3)
case Node.ELEMENT_NODE:
{
    String elementName = node.getNodeName();
    NamedNodeMap attrs = node.getAttributes();
    if(elementName.equals("circle")) {
        Attr attrib = (Attr)attrs.getNamedItem("r");
        String sr = attrib.getValue();
        float radius = Float.valueOf(sr).floatValue();
        float area = (float)Math.PI*radius*radius;
        System.out.println("Circle : Radius = " + radius +
            " Area = " + area);
    }
}
Putting it all together
import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;
class Figures
{
    public static void computeArea(String uri)
    {
        try{
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document doc = parser.getDocument();
            traverse_tree(doc);
        }
        catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }
    private static void traverse_tree(Node node)
    {
        if(node == null) {
            return;
        }
        int type = node.getNodeType();
        switch (type) {
            case Node.DOCUMENT_NODE: {
                traverse_tree(((Document)node).getDocumentElement());
                break;
            }
            case Node.ELEMENT_NODE: {
                String elementName = node.getNodeName();
                NamedNodeMap attrs = node.getAttributes();
                if(elementName.equals("circle")) {
                    Attr attrib = (Attr)attrs.getNamedItem("r");
                    String sr = attrib.getValue();
                    float radius = Float.valueOf(sr).floatValue();
                    float area = (float)Math.PI*radius*radius;
                    System.out.println("Circle : Radius = " + radius + " Area = " + area);
                }
                else if(elementName.equals("rectangle")) {
                    Attr attrib = (Attr)attrs.getNamedItem("w");
                    String sw = attrib.getValue();
                    attrib = (Attr)attrs.getNamedItem("h");
                    String sh = attrib.getValue();
                    float width = Float.valueOf(sw).floatValue();
                    float height = Float.valueOf(sh).floatValue();
                    float area = width * height;
                    System.out.println("Rectangle : Width = " + width +
                        " Height = " + height + " Area = " + area);
                }
                else if(elementName.equals("ellipse")) {
                    Attr attrib = (Attr)attrs.getNamedItem("w");
                    String sw = attrib.getValue();
                    attrib = (Attr)attrs.getNamedItem("h");
                    String sh = attrib.getValue();
                    float width = Float.valueOf(sw).floatValue();
                    float height = Float.valueOf(sh).floatValue();
                    float area = (float)Math.PI*(width/2)*(height/2);
                    System.out.println("Ellipse : Width = " + width +
                        " Height = " + height + " Area = " + area);
                }
                NodeList childNodes = node.getChildNodes();
                if(childNodes != null) {
                    int length = childNodes.getLength();
                    for (int loopIndex = 0; loopIndex < length ; loopIndex++)
                    {
                        traverse_tree(childNodes.item(loopIndex));
                    }
                }
                break;
            }
        }
    }
}
public class DOMFigures
{
    public static void main(String[] args)
    {
        Figures.computeArea(args[0]);
    }
}
Compiling and Running DOMFigures example
javac -classpath ":xerces.jar" DOMFigures.java
java -classpath ":xerces.jar" DOMFigures figures.xml
Circle : Radius = 20.0 Area = 1256.6371
Rectangle : Width = 5.0 Height = 36.0 Area = 180.0
Ellipse : Width = 30.0 Height = 50.0 Area = 1178.0973
Rectangle : Width = 58.0 Height = 45.0 Area = 2610.0
Circle : Radius = 35.0 Area = 3848.4512
Ellipse : Width = 45.0 Height = 30.0 Area = 1060.2876
- xerces.jar, DOMFigures.java, DOMFigures.class and figures.xml are in
the same directory