The Extensible Markup Language (XML) was developed in 1996 by the World Wide Web Consortium (W3C) XML Working Group. XML is a widely supported open technology for describing data that has become the standard format for data exchanged between applications over the Internet. In this tutorial, we introduce basic XML syntax. We also present an overview of technologies used to parse, validate and format XML documents.

[Notes: This tutorial is an excerpt (Section 19.2) of Chapter 19, XML, from our textbook Visual C# 2005 How to Program, 2/e (pages 931-934). This tutorial may refer to other chapters or sections of the book that are not included here. Permission Information: Deitel, Harvey M. and Paul J., Visual C# How to Program, 2/E ©2006. Electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.]

19.2   XML Basics
XML permits document authors to create markup (i.e., a text-based notation for describing data) for virtually any type of information. This enables document authors to create entirely new markup languages for describing any type of data, such as mathematical formulas, software-configuration instructions, chemical molecular structures, music, news, recipes and financial reports. XML describes data in a way that both human beings and computers can understand.
Figure 19.1 is a simple XML document that describes information for a baseball player. We focus on lines 5-11 to introduce basic XML syntax. You will learn about the other elements of this document in Section 19.3.
XML documents contain text that represents content (i.e., data), such as John (line 6 of Fig. 19.1), and elements that specify the document's structure, such as firstName (line 6 of Fig. 19.1). XML documents delimit elements with start tags and end tags. A start tag consists of the element name in angle brackets (e.g., <player> and <firstName> in lines 5 and 6, respectively). An end tag consists of the element name preceded by a forward slash (/) in angle brackets (e.g., </firstName> and </player> in lines 6 and 11, respectively). An element's start and end tags enclose text that represents a piece of data (e.g., the firstName of the player, John, in line 6, which is enclosed by the <firstName> start tag and </firstName> end tag). Every XML document must have exactly one root element that contains all the other elements. In Fig. 19.1, player (lines 5-11) is the root element.
Fig. 19.1 XML that describes a baseball player's information. 
1   <?xml version = "1.0"?>
2   <!-- Fig. 19.1: player.xml -->
3   <!-- Baseball player structured with XML -->
5   <player>
6      <firstName>John</firstName>
8      <lastName>Doe</lastName>
10     <battingAverage>0.375</battingAverage>
11  </player>
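To make the roles of the root element, start tags and end tags concrete, the following minimal sketch parses the document of Fig. 19.1 and lists its elements. It is an illustration only: it assumes Python and its standard xml.etree.ElementTree module (the book itself uses C#), and the names PLAYER_XML and is_well_formed are ours, not part of the original example.

```python
import xml.etree.ElementTree as ET

# The document from Fig. 19.1, without the figure's line numbers.
PLAYER_XML = """<?xml version = "1.0"?>
<!-- Fig. 19.1: player.xml -->
<!-- Baseball player structured with XML -->
<player>
   <firstName>John</firstName>
   <lastName>Doe</lastName>
   <battingAverage>0.375</battingAverage>
</player>
"""

# Parsing returns the single root element, which contains all the others.
root = ET.fromstring(PLAYER_XML)
print(root.tag)  # player

# Each child element's tag names the data; its text is the content
# enclosed by the start and end tags.
for child in root:
    print(child.tag, child.text)


def is_well_formed(text):
    """Return True if the parser accepts text as an XML document (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A document with two root elements, or with a mismatched end tag, is rejected.
print(is_well_formed("<player></player><player></player>"))  # False
print(is_well_formed("<firstName>John</lastName>"))          # False
```

The parser rejects the last two strings because the first has more than one root element and the second closes firstName with the wrong end tag.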
Some XML-based markup languages include XHTML (Extensible HyperText Markup Language, HTML's replacement for marking up Web content), MathML (for mathematics), VoiceXML (for speech), CML (Chemical Markup Language, for chemistry) and XBRL (Extensible Business Reporting Language, for financial data exchange). These markup languages are called XML vocabularies and provide a means for describing particular types of data in standardized, structured ways.
Massive amounts of data are currently stored on the Internet in a variety of formats (e.g., databases, Web pages, text files). Based on current trends, it is likely that much of this data, especially that which is passed between systems, will soon take the form of XML. Organizations see XML as the future of data encoding. Information technology groups are planning ways to integrate XML into their systems. Industry groups are developing custom XML vocabularies for most major industries that will allow computer-based business applications to communicate in common languages. For example, Web services, which we discuss in Chapter 22, allow Web-based applications to exchange data seamlessly through standard protocols based on XML.
The next generation of the Internet and World Wide Web will almost certainly be built on a foundation of XML, which will permit the development of more sophisticated Web-based applications. As is discussed in this chapter, XML allows you to assign meaning to what would otherwise be random pieces of data. As a result, programs can "understand" the data they manipulate. For example, a Web browser might view a street address listed on a simple HTML Web page as a string of characters without any real meaning. In an XML document, however, this data can be clearly identified (i.e., marked up) as an address. A program that uses the document can recognize this data as an address and provide links to a map of that location, driving directions from that location or other location-specific information. Likewise, an application can recognize names of people, dates, ISBN numbers and any other type of XML-encoded data. Based on this data, the application can present users with other related information, providing a richer, more meaningful user experience.
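The address scenario above might look like the following minimal sketch, which assumes Python's standard xml.etree.ElementTree module. The markup is hypothetical: the element names contact, name, address, street, city and state are chosen for illustration and do not come from any standard XML vocabulary, and find_addresses is our own helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the element names are illustrative only.
CONTACT_XML = """<contact>
   <name>John Doe</name>
   <address>
      <street>123 Main Street</street>
      <city>Anytown</city>
      <state>MA</state>
   </address>
</contact>"""


def find_addresses(xml_text):
    """Return each address element's parts joined into one string."""
    root = ET.fromstring(xml_text)
    addresses = []
    # Because the data is marked up, the program can ask for addresses by name
    # instead of guessing which characters in a page form an address.
    for address in root.iter("address"):
        parts = [child.text.strip() for child in address if child.text]
        addresses.append(", ".join(parts))
    return addresses


print(find_addresses(CONTACT_XML))  # ['123 Main Street, Anytown, MA']
```

An application could pass each returned string to a mapping or directions service of its choice; that step is omitted here because it depends on the service used.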
Viewing and Modifying XML Documents
XML documents are highly portable. Viewing or modifying an XML document (a text file whose name ends with the .xml filename extension) does not require special software, although many software tools exist, and new ones are frequently released, that make it more convenient to develop XML-based applications. Any text editor that supports ASCII/Unicode characters can open XML documents for viewing and editing. Also, most Web browsers can display XML documents in a formatted manner that makes it easier to see the XML's structure. We demonstrate this using Internet Explorer in Section 19.3. One important characteristic of XML is that it is both human readable and machine readable.
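As a rough illustration of displaying an XML document in a way that exposes its structure, the following sketch prints an indented outline of the player document. It assumes Python's standard xml.etree.ElementTree module; the outline function is our own and does not reproduce how any particular browser renders XML.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return one indented line per element, nesting children under parents."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# The same data as Fig. 19.1, written on one line to show that
# layout in the file does not affect the structure.
root = ET.fromstring(
    "<player><firstName>John</firstName><lastName>Doe</lastName>"
    "<battingAverage>0.375</battingAverage></player>"
)
print("\n".join(outline(root)))
# player
#   firstName: John
#   lastName: Doe
#   battingAverage: 0.375
```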