XML.

Comprehensive study notes, diagrams, and exam preparation for XML.

XML (Extensible Markup Language)

Definition

XML (Extensible Markup Language) is a markup language designed to store and transport data in a format that is both human-readable and machine-readable. Unlike HTML, which focuses on how to display data, XML focuses on what the data is by using user-defined tags to describe its structure.


Main Content

1. Syntax and Structure

  • XML is self-descriptive, meaning the tags are created by the author to describe the content (e.g., <name>, <price>).
  • It must have a single root element that contains all other elements, ensuring a hierarchical tree structure.
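As a small illustration of author-defined tags and the single-root rule, the sketch below uses Python's standard xml.etree.ElementTree module. The sample documents and the helper name root_tag are made up for this example and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one root element (<product>) wraps author-defined tags.
SINGLE_ROOT = "<product><name>Notebook</name><price>4.50</price></product>"

# Two top-level elements: rejected, because a document needs exactly one root.
TWO_ROOTS = "<name>Notebook</name><price>4.50</price>"


def root_tag(xml_text):
    """Return the root element's tag name, or None if the text cannot be parsed."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError:
        return None


root = ET.fromstring(SINGLE_ROOT)
# The tag names themselves describe the data they hold.
child_tags = [child.tag for child in root]

print(root_tag(SINGLE_ROOT), child_tags)
print(root_tag(TWO_ROOTS))
```

The second call returns None because the parser stops as soon as it meets a second top-level element.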

2. Well-Formed vs. Valid XML

  • A "well-formed" XML document follows all of the XML syntax rules: every opening tag has a matching closing tag, tag names are case-sensitive, elements are properly nested, and attribute values are quoted.
  • A "valid" XML document is well-formed and additionally conforms to the structure rules defined in a schema, such as a Document Type Definition (DTD) or an XML Schema (XSD).
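A well-formedness check can be sketched by simply asking a parser to read the text. The helper is_well_formed and the sample strings below are illustrative assumptions. Note that this only tests well-formedness: the standard library parser used here does not validate against a DTD or XSD, and a third-party library such as lxml is typically used for that step.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per syntax rule mentioned above.
samples = {
    "matched tags": "<note><to>Ann</to></note>",
    "missing closing tag": "<note><to>Ann</note>",
    "case mismatch": "<note><To>Ann</to></note>",
    "bad nesting": "<note><b><i>Ann</b></i></note>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample is accepted; the other three each break one of the rules listed above.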

3. XML Representation

  • Data is organized in a parent-child relationship, forming a tree-like hierarchy.
       [Root Element]
        /          \
  [Child 1]      [Child 2]
      |              |
 [Sub-child]    [Sub-child]
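The same hierarchy can be explored in code. The following sketch assumes a small document whose tag names (root, child1, child2, subchild) mirror the diagram, and a hypothetical helper outline that walks the tree recursively.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped like the diagram: a root, two children,
# and one sub-child under each child.
DOC = """<root>
  <child1><subchild>a</subchild></child1>
  <child2><subchild>b</subchild></child2>
</root>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before their children."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(DOC))
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one parent-child step in the tree.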

Working / Process

1. Data Definition

  • Define the data structure by choosing logical tag names that describe the information.
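  • Begin the document with the XML declaration: <?xml version="1.0" encoding="UTF-8"?>. The declaration is optional in XML 1.0 but recommended; if it is present, it must be the very first thing in the file.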

2. Document Creation

  • Construct the elements according to the hierarchy.
  • Example structure:

      <bookstore>
        <book>
          <title>Web Development</title>
          <author>John Doe</author>
        </book>
      </bookstore>
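The bookstore example can also be generated programmatically rather than typed by hand. This is a minimal sketch using xml.etree.ElementTree; the function name build_bookstore is an assumption for illustration, and the xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def build_bookstore():
    """Build the bookstore hierarchy element by element, root first."""
    bookstore = ET.Element("bookstore")            # root element
    book = ET.SubElement(bookstore, "book")        # child of the root
    ET.SubElement(book, "title").text = "Web Development"
    ET.SubElement(book, "author").text = "John Doe"
    return bookstore


root = build_bookstore()

# Serialize to bytes, asking ElementTree to prepend the XML declaration.
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

Building the tree through an API like this avoids mismatched or unclosed tags, since the serializer writes the closing tags itself.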

3. Parsing and Processing

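  • Use an XML parser to read the file into an application. The two common parsing models are DOM, which loads the whole document into an in-memory tree, and SAX, which streams through the document and reports events (such as the start and end of an element) as it reads.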
  • The parser interprets the nodes, allowing the web application to extract specific information for dynamic display.
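To make the difference between the two parsing models concrete, the sketch below extracts book titles twice, once with Python's xml.dom.minidom (DOM style) and once with xml.sax (SAX style). The document string, the TitleHandler class, and the two helper functions are illustrative assumptions, not a prescribed design.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical input, matching the bookstore example.
DOC = (
    "<bookstore><book><title>Web Development</title>"
    "<author>John Doe</author></book></bookstore>"
)


def titles_with_dom(xml_text):
    """DOM style: load the whole tree, then query it."""
    dom = xml.dom.minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("title")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX style: collect text while the parser streams through <title> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_with_sax(xml_text):
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


print(titles_with_dom(DOC))
print(titles_with_sax(DOC))
```

Both functions return the same list of titles. The DOM version is shorter but holds the entire document in memory, while the SAX version keeps only the state it needs, which tends to suit very large files.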

Advantages / Applications

  • Platform Independence: Since XML is plain text, it can be read on any operating system and parsed with libraries available in most programming languages (Java, Python, PHP).
  • Data Integration: It acts as a "lingua franca" for exchanging data between different systems, such as between a database and a web application.
  • Customizability: Users can define their own tags based on the specific requirements of their domain or industry.

Summary

XML is a flexible, text-based language used to describe, store, and transport structured data. It facilitates the exchange of information across diverse software and hardware environments by using custom, hierarchical tags. Key terms to remember include Root Element, Parser, DTD (Document Type Definition), and Schema.