Reading the DocBook Document Type Definition
This chapter explains how to read the DocBook 2.2.1 Document Type Definition
(DTD) and how to use it to create fully compliant Standard Generalized Markup
Language (SGML) help files.
Document Type Definition
A Document Type Definition (DTD) defines a set
of elements used to create a structured (or hierarchical) document. The DTD
specifies the syntax for each element and governs how and where elements can
be used in a document.
DocBook 2.2.1 DTD
The DocBook 2.2.1 DTD tag set and its associated rules are referred
to as formal markup. The DTD conforms to the Standard Generalized Markup Language
(SGML) ISO specification 8879:1986. This means that you can use formal markup
to create help files that are SGML compliant.
Appendix A contains the complete DTD specification. The DTD is also
available in the Developer's Toolkit. It is located in the /usr/dt/dthelp/dtdocbook/SGML directory and is named DocBook.dtd.
See Also
dtdocbookdtd(4) man page.
DTD Components
The DTD defines each of the DocBook elements described in previous chapters
in a technical notation. This section introduces some key terms and explains
how to read the syntax of the element notations. It does not attempt to fully
describe each section of the DTD.
Element Declarations
The DocBook DTD defines each DocBook element in an element
declaration. The declaration uses a precise notation to describe
an element, its required components, and any elements it can or cannot contain.
Each element also has its attributes and the values they can take defined
in an attribute declaration, which is discussed in the
next section.
In both its element declarations and its attribute declarations, the
DocBook DTD makes extensive use of entity references, which stand for entities
that represent groupings of elements or attributes. (In the DTD, these entity
declarations precede the element declarations and the attribute declarations.)
For example, the DTD declares an entity with the reference "%commonatts;"
to stand for the group of common attributes that so many of the DocBook elements
have: ID, Lang (language), Remap, Role, and XRefLabel. As another example,
the DTD declares an entity with the reference "%list.gp;" that stands for
ItemizedList, OrderedList, SegmentedList, VariableList, etc.
The syntax of an element declaration is as follows:
<!ELEMENT element_type minimization (content model)>
Where:
element_type
Specifies the element name, which is
also used as the tag name. For example, the tag for the element type Title
is <Title>.
minimization
A two-character entry that indicates
whether a start or an end tag is required. The first character represents
the start tag; the second character represents the end tag. A space separates
the two characters. The letter O means that the tag is
optional. A - (minus sign) indicates the tag is required.
For example, an entry like this, - -, indicates
that the element requires both start and end tags. The DTD for DocBook 2.2.1
requires both start and end tags for the great majority of its elements.
content model
Specifies a list of the required and
optional elements that the element type can contain. It defines the sequence
of elements and, if applicable, how many times each element may occur.
It also may specify the elements that cannot be contained within the element
in question.
The content model uses these notations:
|
A vertical bar represents “or”.
+
A plus sign after the name of the element
means the element must appear at least once, and that it can be repeated.
*
An asterisk after the name of the element
means the element can appear zero or more times.
?
A question mark after the name of the
element means the element can appear zero or one time.
,
A comma describes sequence, that is,
the element type before the comma must be followed by the element specified
after the comma.
+ (element_type(s))
The + (plus sign)
indicates that the listed element or elements enclosed within the parentheses
can be used within the element type or within any of the elements it contains.
This is called an inclusion.
- (element_type(s))
A - (minus sign)
indicates that the listed element or elements enclosed within the parentheses
cannot be used within this element, or within any of the elements it contains.
This is called an exclusion.
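To see these notations working together, consider the following sketch. The
element names (Recipe, Summary, Ingredient, Step, Note, Aside) are hypothetical
and are not part of the DocBook DTD; they are used here only to show how the
occurrence indicators, the connectors, and an inclusion combine in one
declaration.

```sgml
<!-- Hypothetical declarations for illustration only; not part of DocBook. -->

<!-- A Recipe requires both start and end tags. It contains, in order:
     one Title, an optional Summary, one or more Ingredients, and then
     zero or more Steps or Notes in any mix. The inclusion allows an
     Aside anywhere inside a Recipe or inside any element it contains. -->
<!ELEMENT Recipe     - - (Title, Summary?, Ingredient+, (Step | Note)*) +(Aside) >

<!-- The end tag of an Ingredient may be omitted (O); the start tag may not. -->
<!ELEMENT Ingredient - O (#PCDATA) >

<!-- The remaining elements hold only text and require both tags. -->
<!ELEMENT (Title | Summary | Step | Note | Aside) - - (#PCDATA) >
```

Under these assumed declarations, a Recipe with a Title and a single Ingredient
would be valid, while a Recipe with no Ingredient, or with a Summary placed
before the Title, would not.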
Examples
Each of the following examples shows an element declaration and explains
what it means.
This declares that the Appendix element requires
both starting and ending tags. It further declares that Appendix may contain
an optional DocInfo element, followed by a required Title, and an optional
TitleAbbrev, followed by one or more of the elements referred to by the entity
reference "%sect1.gp;" (namely, Sect1 and its permitted subcomponents). It
also declares that the elements referred to by the entity reference "%ubiq.gp;"
(namely, IndexTerms) can be included within an Appendix or within any of its
subcomponents.
<!ELEMENT Appendix - - (DocInfo?, Title, TitleAbbrev?, (%sect1.gp;)) +(%ubiq.gp;) >
This declares that the OrderedList element requires
both starting and ending tags, and that it must contain at least one ListItem.
<!ELEMENT OrderedList - - (ListItem+) >
This declares that the ListItem element requires
both starting and ending tags, and that it must contain at least one of the
group of elements referred to by the entity reference "%component.gp;", which
includes among other things Paragraphs, Lists, and Tables.
<!ELEMENT ListItem - - ((%component.gp;)+) >
This declares that the Sect1 element requires both
starting and ending tags. It further declares that Sect1 has a required Title
and an optional TitleAbbrev, followed by zero or more ToCs, LoTs, Indexes,
Glossaries, and Bibliographies (which are the elements referred to by the
entity reference "%nav.gp;"). The body of the Sect1 must then take one of
three forms: one or more of the group of elements referred to by the entity
reference "%component.gp;" (which includes among other things Paragraphs,
Lists, and Tables), optionally followed by either RefEntries or Sect2s; or
one or more RefEntries alone; or one or more Sect2s alone. The body may be
followed by zero or more further "%nav.gp;" elements. Finally, the elements
referred to by the entity reference "%ubiq.gp;" can be included within a
Sect1 or within any of its subcomponents.
<!ELEMENT Sect1 - - (Title, TitleAbbrev?, (%nav.gp;)*, (((%component.gp;)+, (RefEntry* | Sect2*)) | RefEntry+ | Sect2+), (%nav.gp;)*) +(%ubiq.gp;) >
This declares that the InformalTable element requires
both starting and ending tags. It further declares that InformalTable must
contain one or more Graphics or one or more TGroups (this is the meaning of
the string referred to by the entity reference "%tblcontent.gp;"). It also
declares that the InformalTable element cannot contain a Table or another
InformalTable.
<!ELEMENT InformalTable - - ((%tblcontent.gp;)) -(Table|InformalTable)>
This declares that the TGroup element requires
a start tag but not an end tag, and may contain the following elements in
the following order: zero or more ColSpecs, zero or more SpanSpecs, zero or
one THead, zero or one TFoot, and a required TBody.
<!ELEMENT TGroup - O (ColSpec*, SpanSpec*, THead?, TFoot?, TBody) >
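As a sketch of how the OrderedList and ListItem declarations above translate
into markup, the following fragment satisfies both content models: the
OrderedList contains at least one ListItem, and each ListItem contains at least
one member of "%component.gp;" (a Para is used here, on the basis that the
group includes paragraphs). The text inside the Paras is placeholder content.

```sgml
<OrderedList>
  <ListItem>
    <Para>Choose Open from the File menu.</Para>
  </ListItem>
  <ListItem>
    <Para>Press Return or click OK.</Para>
  </ListItem>
</OrderedList>
```

By contrast, an OrderedList with no ListItem, or a ListItem that contains text
directly rather than inside a Para or another component, would not match the
declarations shown.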
Element Declaration Keywords
Some elements include a keyword in the element declaration that describes
the data content of the element. Three keywords appear in the DTD: EMPTY, CDATA, and #PCDATA.
EMPTY
Specifies that the element has no data
content.
CDATA
Represents “character data.”
That is, the data content of the element is not recognized as markup.
#PCDATA
Represents “parsed character
data.” That is, the data content may include both text and markup characters
that the DocBook parser interprets accordingly.
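The following sketch shows where each keyword appears in a declaration. The
element names (PageBreak, Verbatim, Caption) are hypothetical and are not taken
from the DocBook DTD. Note that EMPTY and CDATA take the place of the
parenthesized content model, whereas #PCDATA appears inside it.

```sgml
<!-- Hypothetical declarations for illustration only; not part of DocBook. -->

<!-- EMPTY: the element has no content, so its end tag is omitted. -->
<!ELEMENT PageBreak - O EMPTY >

<!-- CDATA: everything up to the end tag is treated as plain characters. -->
<!ELEMENT Verbatim  - - CDATA >

<!-- #PCDATA: text in which markup is still recognized by the parser. -->
<!ELEMENT Caption   - - (#PCDATA) >
```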
Attribute List Declarations
An attribute list declares additional properties that further describe
an element. An attribute list declaration has the syntax:
<!ATTLIST element_type attribute_name attribute_values default_value>
Examples
Each of the following examples shows an attribute list declaration and
explains what it means.
This attribute list declaration means that the
element Para has the common attributes, and there are no default values for
them.
<!ATTLIST Para %commonatts; >
This attribute list declaration means that the
element Sect1 has the common attributes, and also a Label attribute and a
Renderas attribute. The Label attribute takes "character data" for its values;
it is declared #IMPLIED, so it may be omitted and the DTD supplies no default
value. The Renderas attribute (which can determine
how the Sect1 is displayed) can take the values Sect2, Sect3, Sect4, or Sect5.
For example, if Renderas="Sect2", the Sect1 will be displayed with the same
formatting as a Sect2.
<!ATTLIST Sect1
%commonatts;
Label CDATA #IMPLIED
Renderas (Sect2 | Sect3 | Sect4 | Sect5) #IMPLIED >
This attribute list declaration means that the
element TFoot has the common attributes, with no default values, and also
a VAlign attribute, which can take the values "Top", "Middle", and "Bottom",
with "Top" as the default value.
<!ATTLIST TFoot
%commonatts;
VAlign (Top | Middle | Bottom) "Top" >
This attribute list declaration means that the
element OrderedList has the common attributes, with no default values, and
also several other attributes.
The Numeration attribute determines how the ListItems in the OrderedList
will be numbered: it takes the values "Arabic" (arabic numbers), "Upperalpha"
(capital letters), "Loweralpha" (lower case letters), "Upperroman" (upper
case Roman numerals) and "Lowerroman" (lower case Roman numerals).
The InheritNum attribute determines whether the numeration of an OrderedList
embedded in another OrderedList will be embedded in the numeration of the
containing list (so that the items in a list embedded in item 2 of another
list might be numbered 2a, 2b, 2c, etc.). InheritNum takes the values "Inherit"
and "Ignore", with "Ignore" as the default.
The Continuation attribute determines whether the numeration of an
OrderedList will continue from the numeration of the preceding OrderedList,
or start anew. It takes the values "Continues" and "Restarts", with "Restarts"
as the default.
<!ATTLIST OrderedList
%commonatts;
Numeration (Arabic|Upperalpha|Loweralpha|Upperroman|Lowerroman)
#IMPLIED
InheritNum (Inherit|Ignore) Ignore
Continuation (Continues|Restarts) Restarts
>
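As a sketch of how these attributes might be set in a document, the fragment
below requests lowercase letters and asks the list to continue the numbering of
the preceding OrderedList. InheritNum is left out, so its declared default,
"Ignore", applies. The item text is placeholder content, and the exact rendering
depends on the formatter.

```sgml
<OrderedList Numeration="Loweralpha" Continuation="Continues">
  <ListItem>
    <Para>An item labeled with a lowercase letter.</Para>
  </ListItem>
  <ListItem>
    <Para>The next item in the same sequence.</Para>
  </ListItem>
</OrderedList>
```

A value outside the declared set, such as Numeration="Greek", would be rejected
by a validating parser because it is not one of the listed choices.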
Formal Markup
After you have learned the basic set of elements, using a structured
editor is the best approach for creating formal markup. With a structured
editor, the author creates formal markup by choosing elements from a menu.
In response, the structured editor generates all of the tags required for
each element. In addition, it verifies that the structural framework being
created conforms to the Document Type Definition.
Formal Markup Caveats
DocBook is a formal markup language. Nearly every element requires
a start and an end tag. If the start tag is <ElementName>,
the end tag will take the form </ElementName>, with the
/ (forward slash) marking it as the end tag.
In formal markup, each element, its component parts, and elements it
contains must be explicitly tagged. For example, here is a schematic formal
markup for a Row in a Table containing two Entries. (For ease of reading in
this and other markup examples, tags are indented. Indentation is not required
in actual markup.)
<row>
  <entry align="left" valign="top">
    <para>contents of first entry</para>
  </entry>
  <entry align="left" valign="top">
    <para>contents of second entry</para>
  </entry>
</row>
Notice that Entry and Para, the subcomponents of the Row, each have
their own start and end tags.
Explicit Hierarchy of Elements
Each element declaration in the DTD contributes to a set of rules that
governs how and where elements can be used. Because elements contain other
elements, which may contain other elements, a document is a hierarchy of
elements. At the top level, the Part element is the container for every other
element in the help volume.
To decide what markup is necessary to create a help topic, you need
to become familiar with the rules that govern the DocBook markup language.
One way to learn the markup language would be to study the element declarations
for the components you need to use. For example, suppose you want to create
a chapter. First, look at the declaration for the Chapter element listed below.
<!ELEMENT Chapter - - (DocInfo?, Title, TitleAbbrev?, (%sect1.gp;), (Index |
Glossary | Bibliography)*) +(%ubiq.gp;) >
This tells you a Chapter may have a DocInfo component. So next you look
at the declaration for DocInfo, to see how it is constructed.
<!ELEMENT DocInfo - - (Title, TitleAbbrev?, Subtitle?, AuthorGroup+, Abstract*, RevHistory?, LegalNotice*) -(%ubiq.gp;) >
This tells you that a DocInfo requires at least a Title and one or more
AuthorGroups, and may optionally contain various other elements. So next
you would have to check the declarations for the Title element and the AuthorGroup
element, to see how they are constructed.
<!ELEMENT Title - - ((%inlinechar.gp;)+) >
<!ELEMENT AuthorGroup - - ((Author | Editor | Collab | CorpAuthor |
OtherCredit)+) >
By continuing along in this fashion until you have investigated all
the subcomponents of a Chapter, and all the subcomponents of the subcomponents,
down to the innermost nested element, and mastered how they work, you could
learn how to construct a Chapter.
Fortunately, however, using a structured editor minimizes what an author
needs to know about the DTD and the syntax of the markup tags. The structured
editor application “reads” the DTD and creates each element's
required tags, many of which are intermediate structural tags.
Example
This formal markup sample is an excerpt from the desktop Text Editor
help volume. To view the corresponding online information, choose the Help
Viewer in the Front Panel. Select Common Desktop Environment and then choose
Text Editor Help from the listed volumes. In the Text Editor volume, choose
Text Editor Tasks and then To Open an Existing Document.
Indentation and extra white space are used in this example to make it
easier to read the text and corresponding element tags. Remember that using
indentation and extra white space is not necessary in actual markup.
<sect2 id="TOOPENANEXISTINGDOCUMENT">
  <title>To Open an Existing Document</title>
  <para>You can use Text Editor or File Manager to open an existing document.</para>
  <IndexTerm><primary>document <secondary>opening</secondary>
  </primary></IndexTerm>
  <IndexTerm><primary>opening
    <secondary>existing document</secondary>
  </primary></IndexTerm>
  <para>To open an existing document from the Text Editor:</para>
  <OrderedList>
    <ListItem>
      <para>Choose Open from the File menu.</para>
      <para>The Open a File dialog box lists files and folders on your system. You can browse the documents listed, or change to a new folder to locate other files on your system.</para>
    </ListItem>
    <ListItem>
      <para>Select the document you want to open in the Files list or type the file name in the Open a File field.</para>
      <para><emphasis>Or,</emphasis> if the document is not in the current folder, first change to the folder that contains your document. Then choose a name in the Folders list or type the path name of the folder you wish to change to in the Enter path or folder name field.</para>
    </ListItem>
    <ListItem>
      <para>Press Return or click OK.</para>
    </ListItem>
  </OrderedList>
  <graphic id="some-graphic-id" entityref="some-graphic-entity"></graphic>
  <para>To open an existing document from the File Manager:</para>
  <IndexTerm><primary>opening
    <secondary>document from File Manager</secondary>
  </primary></IndexTerm>
  <IndexTerm><primary>document
    <secondary>opening from File Manager</secondary>
  </primary></IndexTerm>
  <IndexTerm><primary>File Manager
    <secondary>opening documents</secondary>
  </primary></IndexTerm>
  <OrderedList>
    <ListItem>
      <para>Display the document's file icon in a File Manager Window.</para>
    </ListItem>
    <ListItem>
      <para>Do one of the following:</para>
      <InformalList>
        <ListItem>
          <para>Double-click the document's file icon.</para>
        </ListItem>
        <ListItem>
          <para>Select the document, then choose Open from the Selected menu.</para>
        </ListItem>
        <ListItem>
          <para>Drag the document to the Text Editor's control in the Front Panel.</para>
        </ListItem>
      </InformalList>
    </ListItem>
  </OrderedList>
  <sect3>
    <title>See Also</title>
    <InformalList>
      <ListItem>
        <para><xref linkend="some-sect-id" endterm="some-sects-title-id"></para>
      </ListItem>
      <ListItem>
        <para><xref linkend="another-sect-id" endterm="another-sects-title-id"></para>
      </ListItem>
      <ListItem>
        <para><xref linkend="some-other-sect-id" endterm="some-other-sects-title-id"></para>
      </ListItem>
    </InformalList>
  </sect3>
</sect2>
File Entity Declarations
To declare a file entity in formal markup, use this syntax:
<!entity entityname SYSTEM "filename">
where entityname is the name of the entity and filename is the name of the file. The keyword SYSTEM is required.
Example
Here are the entity declarations for a help volume that consists of
three text files and contains a graphic image.
<!entity MetaInformation SYSTEM "metainfo">
<!entity BasicTasks SYSTEM "basics">
<!entity AdvancedFeatures SYSTEM "advanced">
<!entity process_diagram SYSTEM "process.tif">
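Once declared, a file entity is used by referring to it by name. The fragment
below is a sketch of how the entities declared above might be referenced; it
assumes the text files are pulled in with ordinary entity references
(&entityname;) and that the graphic is named through the entityref attribute,
as in the graphic element of the earlier Text Editor example. The id value is a
hypothetical placeholder.

```sgml
<!-- Sketch only: include the three text files at this point in the volume. -->
&MetaInformation;
&BasicTasks;
&AdvancedFeatures;

<!-- Assumed usage: refer to the graphic entity by name, without the
     ampersand and semicolon, in the entityref attribute. -->
<graphic id="process-diagram-figure" entityref="process_diagram"></graphic>
```

Each reference must match the spelling of the declared entity name exactly,
including its capitalization.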