XML 101
XML stands for Extensible Markup Language and is a markup language for structured data. In LEAF, you will encounter XML through TEI-XML, an XML vocabulary defined by the Text Encoding Initiative, which you can learn about here.
XML uses tags, which appear in angle brackets (<>), to represent and describe data in a machine-readable way. An XML element consists of an opening tag, its content, and a closing tag, which contains a forward slash (for example, <person>Elizabeth I</person>). An element with no content may instead be written as a single self-closing tag. XML elements can also be nested within other elements (<title><person>Antony</person> and <person>Cleopatra</person></title>).
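To make the tag and nesting rules concrete, here is a minimal sketch that parses the nested example with Python's standard-library xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for illustration; they are not part of LEAF or TEI.

```python
import xml.etree.ElementTree as ET

# The nested example from the text: two <person> elements inside one <title>.
snippet = "<title><person>Antony</person> and <person>Cleopatra</person></title>"

# fromstring() raises ParseError if the markup is not well-formed,
# for instance if a closing tag is missing.
root = ET.fromstring(snippet)

# The outermost element is <title>; its direct children are the <person> elements.
people = [child.text for child in root.findall("person")]

# itertext() walks all text inside the element, including the text between children.
full_title = "".join(root.itertext())

print(root.tag)
print(people)
print(full_title)
```

Because each name is wrapped in its own element, a program can pull out the people mentioned in the title without guessing where one name ends and the next begins.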
XML vs HTML
XML is related to HTML (HyperText Markup Language), which also uses angle-bracket tags to make text machine-readable. However, there are some key differences between HTML and XML, and discussing them will help you understand both.
While both HTML and XML use tags to describe the text within them (for example, <title>), HTML is a presentational language. HTML is used to structure and display text in a web browser, which is why most webpages are written in it. XML is not a presentational language and focuses on storing and describing data. Both HTML and XML can be styled with CSS (Cascading Style Sheets).
Another major difference is flexibility. HTML has a fixed set of tags, while XML is extensible, allowing users to define their own tags and standards (such as TEI). XML's syntax is also stricter: HTML written to XML's rules (known as XHTML) is well-formed XML, but ordinary HTML often is not, because it permits unclosed tags such as <br>. Conversely, XML that uses custom tags will rarely be valid HTML.
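How the two syntaxes relate can be checked directly. The sketch below, again using Python's standard-library xml.etree.ElementTree, tests three small strings for XML well-formedness. The helper name is_well_formed and the sample strings are hypothetical. The check covers syntax only, not validity against a schema such as TEI.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Typical HTML: <br> is left unclosed, which browsers accept but XML parsers do not.
html_style = "<p>Line one<br>Line two</p>"

# The same content with the empty element closed, as XML requires.
xml_style = "<p>Line one<br/>Line two</p>"

# A made-up vocabulary: XML accepts any tag names as long as they are well-formed.
custom = "<play><title>Antony and Cleopatra</title></play>"

results = {
    "html_style": is_well_formed(html_style),
    "xml_style": is_well_formed(xml_style),
    "custom": is_well_formed(custom),
}
print(results)
```

The unclosed <br> is rejected by the XML parser, while the closed version and the custom tags are accepted. A browser would display both paragraph versions, but it would not attach any special meaning to <play>.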