Creating XML Documents
- HTML, about 100 elements
- XML, you define your own elements
- HTML Browsers try to fix bad HTML code
- XML Processors do not make any guess about the structure of the document
- Well-formed XML Document is the minimal requirement
- Valid XML Document (DTD or XML Schema)
W3C (World Wide Web Consortium)
- Group of member organizations interested in the WWW
- W3C is hosted at MIT
- Specifications for the Web
What is a Well-Formed XML Document?
A textual object is a well-formed XML Document if:
- Taken as a whole, it matches the production labeled document
- It meets all the well-formedness constraints given in the XML 1.0 specification:
http://www.w3.org/TR/REC-xml
- Each of the parsed entities that is referenced directly or indirectly
within the document is well-formed
document ::= prolog element Misc*
- Prolog:
<?xml version="1.0"?>
- Comments:
<!-- This is a Comment -->
- Processing Instructions:
<?xml-stylesheet href="JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="greeting.css" type="text/css"?>
- Element:
- Root Element contains more elements
- Exactly one root element
- Misc:
- Comments
- Processing Instructions
- Whitespace
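The production above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser as the XML processor; the helper name is_well_formed and the sample strings are made up for illustration. It tests well-formedness only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# prolog, a comment, then exactly one root element
good = '<?xml version="1.0"?><!-- a comment --><DOCUMENT><GREETING>Hi</GREETING></DOCUMENT>'
# two top-level elements: violates "exactly one root element"
two_roots = '<?xml version="1.0"?><A/><B/>'
# prolog and Misc only, no root element at all
no_root = '<?xml version="1.0"?><!-- only a comment -->'

for sample in (good, two_roots, no_root):
    print(is_well_formed(sample))
```

The processor does not try to repair the second and third samples; it simply rejects them.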
Entity
- Part of an XML Document
- Hold text or binary data
- May refer to other entities
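- Parsed entities contain character data that the processor parses as part of the document
- Unparsed entities hold data the processor does not parse (often binary, such as images)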
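To make the idea of parsed entities concrete, here is a small sketch, assuming Python's xml.etree.ElementTree. The entity names course and welcome are invented for this example; they are declared in an internal DTD subset, and one refers to the other.

```python
import xml.etree.ElementTree as ET

# Two internal parsed entities; "welcome" refers to "course".
doc = """<!DOCTYPE DOCUMENT [
  <!ENTITY course "Programming XML in Java">
  <!ENTITY welcome "Welcome to &course;">
]>
<DOCUMENT>&welcome;</DOCUMENT>"""

# The processor replaces each entity reference with its replacement text.
text = ET.fromstring(doc).text
print(text)
```

Unparsed entities are not shown here; they are declared with a notation and only referenced by name from attribute values.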
Tags and Elements
- XML Element consists of a start tag and an end tag
<document> ... </document>
- Tag Names
- Start with a letter <document>, an underscore <_record>,
or a colon (avoid colons; they are reserved for namespaces)
- Subsequent characters may be letters, digits, underscores, hyphens,
periods, and colons (but no whitespace)
- XML Processors are case sensitive
Different tags: <document>, <DOCUMENT>, <Document>
- Empty Elements have only one tag:
HTML : <img>, <br>, <hr>
XHTML : <img/>, <br/>, <hr/>
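The naming and case rules can be tried out directly. This is a sketch assuming Python's xml.etree.ElementTree; the helper parses and the sample tag names are illustrative only.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Names may start with a letter or an underscore, but not with a digit.
underscore_ok = parses("<_record/>")
digit_ok = parses("<1record/>")

# Start and end tag names must match exactly, including case.
case_mismatch_ok = parses("<document></Document>")

# An empty-element tag is equivalent to a start tag followed at once by an end tag.
short_form = ET.fromstring("<hr/>")
long_form = ET.fromstring("<hr></hr>")

print(underscore_ok, digit_ok, case_mismatch_ok)
print(short_form.tag, len(short_form), long_form.tag, len(long_form))
```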
Attributes
- Name-value pairs: {STATUS, "Good credit"}
- Specify additional data in start tags
<CUSTOMER STATUS="Good credit">
- Attribute Names follow the same rules as tag names
- Attribute Values are strings enclosed in single or double quotation marks
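A short sketch of how a program sees attributes, assuming Python's xml.etree.ElementTree. The CUSTOMER element is the one from the slide; the unquoted variant is added here only to show that it is rejected.

```python
import xml.etree.ElementTree as ET

# Attributes are exposed as name-value pairs on the element.
customer = ET.fromstring('<CUSTOMER STATUS="Good credit"/>')
status = customer.get("STATUS")

# Single quotes are equally acceptable around the value.
single_quoted = ET.fromstring("<CUSTOMER STATUS='Good credit'/>")

# A value without quotation marks is not well-formed.
try:
    ET.fromstring("<CUSTOMER STATUS=Good/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(status, single_quoted.attrib, unquoted_accepted)
```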
Elements vs Attributes
- Too many attributes make documents hard to read
- You can't specify document structure using attributes
- Attributes are good for simple information
Elements vs Attributes (2)
Too many attributes make documents hard to read:
<CUSTOMER LAST_NAME="Smith" FIRST_NAME="Sam"
DATE="October 15, 2001" PURCHASE="Tomatoes"
PRICE="$1.25" NUMBER="8" />
The same data is easier to read, and gains structure, as nested elements:
<CUSTOMER>
<NAME>
<LAST_NAME>Smith</LAST_NAME>
<FIRST_NAME>Sam</FIRST_NAME>
</NAME>
<DATE>October 15, 2001</DATE>
<ORDERS>
<ITEM>
<PRODUCT>Tomatoes</PRODUCT>
<NUMBER>8</NUMBER>
<PRICE>$1.25</PRICE>
</ITEM>
</ORDERS>
</CUSTOMER>
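One practical benefit of the element form is that a program can walk the structure. The sketch below assumes Python's xml.etree.ElementTree and reuses the CUSTOMER document from the slide (indentation added); the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

customer = ET.fromstring("""
<CUSTOMER>
  <NAME>
    <LAST_NAME>Smith</LAST_NAME>
    <FIRST_NAME>Sam</FIRST_NAME>
  </NAME>
  <DATE>October 15, 2001</DATE>
  <ORDERS>
    <ITEM>
      <PRODUCT>Tomatoes</PRODUCT>
      <NUMBER>8</NUMBER>
      <PRICE>$1.25</PRICE>
    </ITEM>
  </ORDERS>
</CUSTOMER>
""")

# Paths follow the nesting of the elements.
last_name = customer.findtext("NAME/LAST_NAME")

# ORDERS can hold any number of ITEM children; a flat attribute list cannot express that.
items = [
    (item.findtext("PRODUCT"), int(item.findtext("NUMBER")))
    for item in customer.findall("ORDERS/ITEM")
]

print(last_name, items)
```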
Elements vs Attributes (3)
- Too many attributes make documents hard to read
- You can't specify document structure using attributes
- Attributes are good for simple information:
<BOOK ID="B1">
Building Well-Formed XML Document Structure
- An XML Declaration should begin the document
<?xml version="1.0" standalone="yes"?>
- Include one or more elements:
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<GREETING>Hello from XML</GREETING>
<MESSAGE>Welcome to Programming XML in Java</MESSAGE>
</DOCUMENT>
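A document like this one can also be generated rather than typed. This is a minimal sketch assuming Python 3.8 or later, where xml.etree.ElementTree.tostring accepts the xml_declaration argument; the serializer may quote the declaration differently from the slide.

```python
import xml.etree.ElementTree as ET

# Build the root element and its two children.
root = ET.Element("DOCUMENT")
ET.SubElement(root, "GREETING").text = "Hello from XML"
ET.SubElement(root, "MESSAGE").text = "Welcome to Programming XML in Java"

# Serialize with an XML declaration at the start of the output.
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)

# Round trip: the generated bytes parse back into the same structure.
reparsed = ET.fromstring(xml_bytes)

print(xml_bytes.decode("utf-8"))
```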
Building Well-Formed XML Document Structure (2)
- Include Both Start and End Tags for Elements that aren't empty
<GREETING>Hello from XML</GREETING>
- Close Empty Tags with />
<SUBJECT name="XML"/>
- The Root Element Must Contain All Other Elements
<DOCUMENT> ..... </DOCUMENT>
Building Well-Formed XML Document Structure (3)
Nest Elements Correctly (the following is NOT well-formed; the end tags are crossed):
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<GREETING>Hello from XML</MESSAGE> <--- wrong end tag
<MESSAGE>
Welcome to Programming XML in Java
</GREETING> <--- wrong end tag
</DOCUMENT>
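A parser reports crossed end tags as an error and says where it stopped. The sketch below assumes Python's xml.etree.ElementTree, whose ParseError carries a (line, column) position; the sample is a shortened version of the document above without the arrow annotations.

```python
import xml.etree.ElementTree as ET

misnested = """<DOCUMENT>
<GREETING>Hello from XML</MESSAGE>
<MESSAGE>Welcome</GREETING>
</DOCUMENT>"""

try:
    ET.fromstring(misnested)
    error_line = None
except ET.ParseError as err:
    # position is a (line, column) tuple pointing at the offending tag
    error_line, error_column = err.position
    print("not well-formed:", err)
```

Parsing stops at the first mismatched end tag, so only the first of the two errors is reported.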
Building Well-Formed XML Document Structure (4)
- Use Unique Attribute Names (this start tag is NOT well-formed):
<PERSON LAST_NAME="Smith" LAST_NAME="Punin">
- Use Only the Five Predefined Entity References (unless you declare others in a DTD):
&amp; the & character
&lt; the < character
&gt; the > character
&apos; the ' character
&quot; the " character
Building Well-Formed XML Document Structure (5)
- Surround Attribute Values with Quotes:
<IMG SRC="image.jpg"/>
- Use < and & Only to Start Tags and Entities:
<TOUR CAPTION="The S&amp;O Railway"/>
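The rules on these two slides can be exercised together. This is a sketch assuming Python's xml.etree.ElementTree for parsing and the escape and quoteattr helpers from xml.sax.saxutils for producing safe text; the helper parses and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def parses(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The five predefined references expand to single characters.
decoded = ET.fromstring("<T>&amp; &lt; &gt; &apos; &quot;</T>").text

# escape() rewrites &, < and > for element content;
# quoteattr() escapes a value and wraps it in quotes for use as an attribute.
content = escape("budget < 0 & falling")
tour = "<TOUR CAPTION=%s/>" % quoteattr("The S&O Railway")
caption = ET.fromstring(tour).get("CAPTION")

# Each of these breaks one rule and is rejected.
raw_ampersand_ok = parses('<TOUR CAPTION="The S&O Railway"/>')
duplicate_ok = parses('<PERSON LAST_NAME="Smith" LAST_NAME="Punin"/>')
undeclared_ok = parses("<T>&nbsp;</T>")

print(decoded, content, tour, caption)
```

Escaping at the point where text is written into a document is usually more reliable than escaping by hand.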
CDATA Sections
- Hold character data that remains unparsed by the XML Processor
- Start a CDATA section:
<![CDATA[
- End a CDATA section:
]]>
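A quick way to see what a CDATA section does is to parse one. The sketch below assumes Python's xml.etree.ElementTree; the script text is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, < and & are ordinary characters and need no escaping.
doc = "<script><![CDATA[if (budget < 0 && debt > 0) { warn(); }]]></script>"
text = ET.fromstring(doc).text

# ElementTree does not keep the CDATA markers; on output it escapes instead.
serialized = ET.tostring(ET.fromstring(doc), encoding="unicode")

print(text)
print(serialized)
```

The application receives the same characters either way; CDATA is a convenience for the author, not a different kind of data.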
CDATA Sections (2)
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Using The if Statement In JavaScript</title>
</head>
<body>
<script language="javascript">
<![CDATA[
var budget;
budget = 234.77;
if (budget < 0) {
document.writeln("Uh oh.");
}
]]>
</script>
<center> <h1>Using The if Statement In JavaScript</h1> </center>
</body>
</html>