In our XML Basics Tutorial, we introduced XML syntax. We also presented an overview of technologies used to parse, validate and format XML documents. In this tutorial, we provide XML markup for an article and for a business letter. We also demonstrate tools for validating XML markup. This tutorial is intended for people who have read our XML Basics Tutorial or who are familiar with basic XML syntax.

[Notes: This tutorial is an excerpt (Section 19.3) of Chapter 19, XML, from our textbook Visual C# 2005 How to Program, 2/e (pages 934-940). This tutorial may refer to other chapters or sections of the book that are not included here. Permission Information: Deitel, Harvey M. and Paul J., Visual C# How to Program, 2/E ©2006. Electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.]

19.3   Structuring Data
In this section and throughout this chapter, we create our own XML markup. XML allows you to describe data precisely in a well-structured format.
XML Markup for an Article
In Fig. 19.2, we present an XML document that marks up a simple article. The line numbers shown are for reference only and are not part of the XML document.
Fig. 19.2 XML used to mark up an article.
1   <?xml version = "1.0"?>
2   <!-- Fig. 19.2: article.xml -->
3   <!-- Article structured with XML -->
4
5   <article>
6      <title>Simple XML</title>
7
8      <date>May 5, 2005</date>
9
10     <author>
11        <firstName>John</firstName>
12        <lastName>Doe</lastName>
13     </author>
14
15     <summary>XML is pretty easy.</summary>
16
17     <content>
18        In this chapter, we present a wide variety of examples that use XML.
19     </content>
20  </article>
This document begins with an XML declaration (line 1), which identifies the document as an XML document. The version attribute specifies the XML version to which the document conforms. The current XML standard is version 1.0. Though the W3C released a version 1.1 specification in February 2004, this newer version is not yet widely supported. The W3C may continue to release new versions as XML evolves to meet the requirements of different fields.
Portability Tip 19.1
Documents should include the XML declaration to identify the version of XML used. A document that lacks an XML declaration might be assumed to conform to the latest version of XML; when it does not, errors could result.
Common Programming Error 19.1
Placing whitespace characters before the XML declaration is an error.
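To make Common Programming Error 19.1 concrete, here is a minimal sketch that parses the same short document twice, once as written and once with a single space in front of the XML declaration. It uses Python's standard xml.etree.ElementTree module purely for illustration; this tutorial does not prescribe a particular parser, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened version of the article markup; the declaration is the
# very first thing in the string.
DOCUMENT = '<?xml version = "1.0"?>\n<article><title>Simple XML</title></article>'


def is_well_formed(text):
    """Return True if the parser accepts text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(DOCUMENT))        # declaration at the start of the document
print(is_well_formed(" " + DOCUMENT))  # one space precedes the declaration
```

With the parser that ships with CPython, the first call should be accepted and the second rejected, because the declaration is no longer at the very start of the document.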
XML comments (lines 2-3), which begin with <!-- and end with -->, can be placed almost anywhere in an XML document. A comment can span multiple lines; an end marker is not needed on each line, and the end marker can appear on a subsequent line as long as there is exactly one end marker (-->) for each begin marker (<!--). As in a C# program, comments are used in XML for documentation purposes. Line 4 is a blank line. As in a C# program, blank lines, whitespace and indentation are used in XML to improve readability. Later you will see that the blank lines are normally ignored by XML parsers.
Common Programming Error 19.2
In an XML document, each start tag must have a matching end tag; omitting either tag is an error. Soon, you will learn how such errors are detected.
Common Programming Error 19.3
XML is case sensitive. Using different cases for the start tag and end tag names for the same element is a syntax error.
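One way such errors can be detected is simply to hand the markup to a parser and see whether it complains. The sketch below, which assumes Python's standard xml.etree.ElementTree module (not a tool named in this tutorial), tries three fragments: one with matching tags, one with an omitted end tag, and one whose end tag differs from its start tag only in case. The helper name parse_error and the sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return None if text parses cleanly, otherwise the parser's message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


samples = {
    "matched tags": "<author><firstName>John</firstName></author>",
    "missing end tag": "<author><firstName>John</author>",
    "case mismatch": "<author><firstName>John</firstname></author>",
}

for label, text in samples.items():
    print(label, "->", parse_error(text))
```

Only the first fragment should parse without a message; the parser treats firstName and firstname as two different element names.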
In Fig. 19.2, article (lines 5-20) is the root element. The lines that precede the root element (lines 1-4) are the XML prolog. In an XML prolog, the XML declaration must appear before the comments and any other markup.
The elements we used in the example do not come from any specific markup language. Instead, we chose the element names and markup structure that best describe our particular data. You can invent elements to mark up your data. For example, element title (line 6) contains text that describes the article's title (e.g., Simple XML). Similarly, date (line 8), author (lines 10-13), firstName (line 11), lastName (line 12), summary (line 15) and content (lines 17-19) contain text that describes the date, author, the author's first name, the author's last name, a summary and the content of the document, respectively. XML element names can be of any length and may contain letters, digits, underscores, hyphens and periods. However, they must begin with either a letter or an underscore, and they should not begin with "xml" in any combination of uppercase and lowercase letters (e.g., XML, Xml, xMl) as this is reserved for use in the XML standards.
Common Programming Error 19.4
Using a whitespace character in an XML element name is an error.
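The naming rules described above can be approximated with a short check. The following sketch is a deliberate simplification: it accepts only ASCII letters, whereas the XML specification permits a much wider range of characters, and the function name is_acceptable_name is hypothetical. It is written in Python only for illustration.

```python
import re

# Simplified, ASCII-only pattern: a letter or underscore first, then any
# mix of letters, digits, underscores, periods and hyphens.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_acceptable_name(name):
    """Apply the simplified naming rules to a candidate element name."""
    if not _NAME_PATTERN.fullmatch(name):
        return False  # bad first character, whitespace, or other illegal character
    # Names starting with "xml" in any letter case are reserved.
    return not name.lower().startswith("xml")


for candidate in ["firstName", "_id", "last-name.v2", "1stName", "first name", "XmlData"]:
    print(candidate, "->", is_acceptable_name(candidate))
```

Under these simplified rules, the first three candidates are accepted. The others are rejected because they begin with a digit, contain a space, or begin with the reserved prefix.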
Good Programming Practice 19.1
XML element names should be meaningful to humans and should not use abbreviations.
XML elements are nested to form hierarchies, with the root element at the top of the hierarchy. This allows document authors to create parent/child relationships between data. For example, elements title, date, author, summary and content are nested within article. Elements firstName and lastName are nested within author. Figure 19.21 shows the hierarchy of Fig. 19.2.
Common Programming Error 19.5
Nesting XML tags improperly is a syntax error. For example, <x><y>hello</x></y> is an error, because the </y> tag must precede the </x> tag.
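A parser can also report where improper nesting was found. The sketch below feeds the fragment from Common Programming Error 19.5 and its corrected form to Python's standard xml.etree.ElementTree module (an assumption made for illustration; any conforming parser should reject the first fragment). The helper name locate_error is hypothetical.

```python
import xml.etree.ElementTree as ET


def locate_error(text):
    """Return (message, (line, column)) for the first error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position holds the line and column reported by the parser.
        return str(err), err.position
    return None


print(locate_error("<x><y>hello</x></y>"))  # </x> appears before </y>
print(locate_error("<x><y>hello</y></x>"))  # </y> correctly precedes </x>
```

The first call should return a message and a position on line 1; the second should return None, because the inner element is closed before the outer one.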
Any element that contains other elements (e.g., article or author) is a container element. Container elements also are called parent elements. Elements nested inside a container element are child elements (or children) of that container element.
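The parent/child relationships in Fig. 19.2 can be inspected programmatically. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module (chosen only for illustration), the code below parses the article markup and builds an indented outline of its elements. The names ARTICLE, outline and is_container are hypothetical.

```python
import xml.etree.ElementTree as ET

# The markup from Fig. 19.2, without the reference line numbers.
ARTICLE = """<?xml version = "1.0"?>
<!-- Fig. 19.2: article.xml -->
<!-- Article structured with XML -->

<article>
   <title>Simple XML</title>

   <date>May 5, 2005</date>

   <author>
      <firstName>John</firstName>
      <lastName>Doe</lastName>
   </author>

   <summary>XML is pretty easy.</summary>

   <content>
      In this chapter, we present a wide variety of examples that use XML.
   </content>
</article>
"""


def is_container(element):
    """A container (parent) element has at least one child element."""
    return len(element) > 0


def outline(element, depth=0):
    """Return the element names as lines, indented by nesting depth."""
    lines = ["   " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(ARTICLE)
print("\n".join(outline(root)))
print("containers:", [e.tag for e in root.iter() if is_container(e)])
```

The outline places article at the top, its five children one level in, and firstName and lastName one level further under author. Only article and author are reported as container elements.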