ITEC 325
2018fall
ibarland

DTDs
tags and attributes

These notes are influenced by XML Visual Quickstart Guide by Kevin Howard Goldberg, and by notes therefrom by Jack Davis (jcdavis@radford.edu).

We've seen quick examples of some standardized file formats that use their own variant of XML: iTunes collections, .svg files, Word documents. We also saw a made-up set of tags about children, as well as the book's made-up set of tags about ancient wonders. Actually, even the “standardized” file formats started out as somebody making up tags to describe the hierarchical information they wanted to represent. But if the tags are made up, who's to say whether people are using them correctly?

video: DTD: Introduction; Elements (22m59s)

When defining a new XML language, you must specify its grammar — what tags are allowed, what child-tags they may (or, must) contain. Similarly, for attributes — what attributes are required on certain tags (like “img” tags must have a “src” attribute and otherwise be empty), and what the allowed values are for attributes.
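To make the “img” rule concrete, here is a rough sketch in Python of what checking that one rule by hand would look like. A DTD lets you state such rules declaratively, so you would not normally write this yourself; the function name check_img_rule and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def check_img_rule(xml_text):
    """Return a list of complaints about <img> elements in xml_text.

    Hand-coded approximation of one grammar rule: every img must
    carry a src attribute and must have no content.
    """
    complaints = []
    root = ET.fromstring(xml_text)
    for img in root.iter("img"):
        if "src" not in img.attrib:
            complaints.append("img is missing required attribute src")
        # len(img) counts child elements; img.text is any text inside the tag
        if len(img) > 0 or img.text:
            complaints.append("img must be empty")
    return complaints

# A made-up sample: the second img lacks src, the third is not empty.
SAMPLE_PAGE = '<page><img src="a.png"/><img/><img src="b.png">hi</img></page>'
print(check_img_rule(SAMPLE_PAGE))
```

Writing one such function per rule gets tedious quickly, which is the motivation for stating the whole grammar in one place and letting a validator do the checking.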

In fact, you can compare any XML document to its corresponding grammar (its schema) to validate whether it conforms to the rules specified there. If an XML document is deemed valid, then its data is in the proper form as specified by the schema. (Of course, just as a syntax-checker can validate a Java program's syntax but not its meaning, a schema won't validate an XML document's logical content.) This isn't as much of a problem for XML documents (especially small ones) as for programs. For databases implemented as large XML files, people might build other ad hoc tools to run sanity checks on the meaning of an XML document's contents: check that a wonder's year-destroyed isn't earlier than its year-built, that links actually resolve, etc.
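An ad hoc sanity check of that kind might look like the following Python sketch. The tag names year_built and year_destroyed, the convention of writing years as signed integers (negative for BC), and the sample data are all assumptions made for this illustration, not taken from the textbook's files.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; the second wonder is deliberately inconsistent.
SAMPLE = """\
<ancient_wonders>
  <wonder>
    <name>Sample Wonder A</name>
    <year_built>-300</year_built>
    <year_destroyed>-250</year_destroyed>
  </wonder>
  <wonder>
    <name>Sample Wonder B</name>
    <year_built>100</year_built>
    <year_destroyed>50</year_destroyed>
  </wonder>
</ancient_wonders>
"""

def find_date_problems(xml_text):
    """Return the names of wonders whose year_destroyed precedes year_built."""
    root = ET.fromstring(xml_text)
    problems = []
    for wonder in root.iter("wonder"):
        built = wonder.findtext("year_built")
        destroyed = wonder.findtext("year_destroyed")
        if built is None or destroyed is None:
            continue  # nothing to compare, e.g. a wonder that still stands
        if int(destroyed) < int(built):
            problems.append(wonder.findtext("name"))
    return problems

print(find_date_problems(SAMPLE))  # prints ['Sample Wonder B']
```

Both sample wonders would pass a grammar check that only asks for a name and two years; only a check on the meaning of the numbers catches the second one.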

There are two common formats for specifying XML grammars: DTDs and XML Schema. A DTD, “Document Type Definition”, is an older but widely used system with a peculiar and limited syntax. However, DTDs are lightweight: compact and easily comprehended with a little study. Since they are relatively simple and still widely used, studying them is a good first step in understanding how an XML tag set is defined. A DTD is a plain-text file that is not itself written in XML, so it need not begin with the standard XML declaration.

The three things a DTD specifies

An example file: the textbook's ch06-wonders.dtd, as referenced in ch06-wonders.xml (do a view-source, line 3)

Where to put the DTD file

You can have the DTD as an external file, or in-document (similar to how CSS can be supplied either way).

  1. In-document, inside the same file as the XML:
    <!DOCTYPE ancient_wonders [
        <!ELEMENT ancient_wonders (wonder*)>
        <!ELEMENT wonder (name+)>
        <!ELEMENT name (#PCDATA)>
        <!-- declarations for any further child elements would go here -->
    ]>
    
    <ancient_wonders>
        <wonder>
            <name>Great Pyramid of Giza</name>
            
        </wonder>
    </ancient_wonders>    
                  
    This is the lightest-weight solution, and is suitable for the homework assignment.
  2. For an external DTD file on your own computer, you can use SYSTEM: start your XML file with a DOCTYPE specifying where the DTD is found, e.g. <!DOCTYPE ancient_wonders SYSTEM "wonders.dtd">. This is what the above ch06-wonders.xml does (remember to view-source, line 3).
  3. An external document on the interwebs, using PUBLIC: <!DOCTYPE ancient_wonders PUBLIC "-//ibarland//DTD Archaeological Wonders 1.0//EN" "https://php.radford.edu/~itec325/2016spring-ibarland/Lectures/wonders.dtd">. See here for the syntax of the word after “PUBLIC”. (And: on your homework, prefer the SYSTEM technique to this one.)
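To see how the three DOCTYPE forms differ from a program's point of view, here is a small Python sketch that reads each one back with the standard library's xml.dom.minidom. The helper name describe_doctype is made up for this example, and the in-document DTD is a minimal one written just for the sketch. Note that this parser does not fetch the external DTD or validate against it; it only reports what the DOCTYPE line declares.

```python
from xml.dom import minidom

def describe_doctype(xml_text):
    """Return (root name, public id, system id) from xml_text's DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    if doctype is None:
        return None  # the document has no DOCTYPE at all
    return (doctype.name, doctype.publicId, doctype.systemId)

# 1. In-document DTD: no public or system identifier.
IN_DOCUMENT = """<!DOCTYPE ancient_wonders [
    <!ELEMENT ancient_wonders (wonder*)>
    <!ELEMENT wonder (name+)>
    <!ELEMENT name (#PCDATA)>
]>
<ancient_wonders/>"""

# 2. SYSTEM: only a system identifier (a file name or URL).
WITH_SYSTEM = '<!DOCTYPE ancient_wonders SYSTEM "wonders.dtd"><ancient_wonders/>'

# 3. PUBLIC: a public identifier followed by a system identifier.
WITH_PUBLIC = (
    '<!DOCTYPE ancient_wonders PUBLIC '
    '"-//ibarland//DTD Archaeological Wonders 1.0//EN" '
    '"https://php.radford.edu/~itec325/2016spring-ibarland/Lectures/wonders.dtd">'
    '<ancient_wonders/>'
)

for label, text in [("in-document", IN_DOCUMENT),
                    ("SYSTEM", WITH_SYSTEM),
                    ("PUBLIC", WITH_PUBLIC)]:
    print(label, describe_doctype(text))
```

In all three cases the name right after DOCTYPE is the same, since it names the root element; only the identifiers that say where the DTD lives change.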

Defining your XML: Elements, Attributes, and Entities


This page licensed CC-BY 4.0 Ian Barland
Please mail any suggestions
(incl. typos, broken links)
to ibarlandradford.edu
Rendered by Racket.