Reading the DocBook Document Type Definition

This chapter explains how to read the DocBook 2.2.1 Document Type Definition (DTD) and how to use it to create fully compliant Standard Generalized Markup Language (SGML) help files.

Document Type Definition

A Document Type Definition (DTD) defines a set of elements used to create a structured (or hierarchical) document. The DTD specifies the syntax for each element and governs how and where elements can be used in a document.

DocBook 2.2.1 DTD

The DocBook 2.2.1 DTD tag set and its associated rules are referred to as formal markup. The DTD conforms to the Standard Generalized Markup Language (SGML) specification, ISO 8879:1986. This means that you can use formal markup to create help files that are SGML compliant.

Appendix A contains the complete DTD specification. The DTD is also available in the Developer's Toolkit. It is located in the /usr/dt/dthelp/dtdocbook/SGML directory and is named DocBook.dtd.

See Also

dtdocbookdtd(4) man page.

DTD Components

The DTD defines each of the DocBook elements described in previous chapters in a technical notation. This section introduces some key terms and explains how to read the syntax of the element notations. It does not attempt to fully describe each section of the DTD.

Element Declarations

The DocBook DTD defines each DocBook element in an element declaration. The declaration uses a precise notation to describe an element, its required components, and any elements it can or cannot contain. The attributes of each element, and the values they can take, are defined in an attribute list declaration, which is discussed in the next section.

In both its element declarations and its attribute declarations, the DocBook DTD makes extensive use of entity references, which stand for entities that represent groupings of elements or attributes. (In the DTD, these entity declarations precede the element declarations and the attribute declarations.)
For example, the DTD declares an entity with the reference "%commonatts;" to stand for the group of common attributes that many of the DocBook elements share: ID, Lang (language), Remap, Role, and XRefLabel. As another example, the DTD declares an entity with the reference "%list.gp;" that stands for ItemizedList, OrderedList, SegmentedList, VariableList, etc.

The syntax of an element declaration is as follows:

<!ELEMENT element_type minimization (content model)>

Where:

element_type
    Specifies the element name, which is also used as the tag name. For example, the tag for the element type Title is <Title>.

minimization
    A two-character entry that indicates whether a start tag or an end tag is required. The first character represents the start tag; the second character represents the end tag. A space separates the two characters. The letter O means that the tag is optional. A - (minus sign) indicates that the tag is required. For example, the entry - - indicates that the element requires both start and end tags. The DTD for DocBook 2.2.1 requires both start and end tags for the great majority of its elements.

content model
    Specifies a list of the required and optional elements that the element type can contain. It defines the sequence of elements and, if applicable, the number of times each may occur. It may also specify elements that cannot be contained within the element in question.

The content model uses these notations:

|   A vertical bar represents “or”.
+   A plus sign after the name of the element means the element must appear at least once, and that it can be repeated.
*   An asterisk after the name of the element means the element can appear zero or more times.
?   A question mark after the name of the element means the element can appear zero or one time.
,   A comma describes sequence; that is, the element type before the comma must be followed by the element specified after the comma.
+ (element_type(s))   The + (plus sign) indicates that the listed element or elements enclosed within the parentheses can be used within the element type or within any of the elements it contains. This is called an inclusion.
- (element_type(s))   A - (minus sign) indicates that the listed element or elements enclosed within the parentheses cannot be used within this element, or within any of the elements it contains. This is called an exclusion.

Examples

Each of the following examples explains an element declaration and then shows it.

This declares that the Appendix element requires both starting and ending tags. It further declares that Appendix may contain an optional DocInfo element, followed by a required Title and an optional TitleAbbrev, followed by one or more of the elements referred to by the entity reference "%sect1.gp;" (namely, Sect1 and its permitted subcomponents). It also declares that the elements referred to by the entity reference "%ubiq.gp;" (namely, IndexTerms) can be included within an Appendix or within any of its subcomponents.

<!ELEMENT Appendix - - (DocInfo?, Title, TitleAbbrev?, (%sect1.gp;)) +(%ubiq.gp;) >

This declares that the OrderedList element requires both starting and ending tags, and that it must contain at least one ListItem.

<!ELEMENT OrderedList - - (ListItem+) >

This declares that the ListItem element requires both starting and ending tags, and that it must contain at least one of the group of elements referred to by the entity reference "%component.gp;", which includes, among other things, Paragraphs, Lists, and Tables.

<!ELEMENT ListItem - - ((%component.gp;)+) >

This declares that the Sect1 element requires both starting and ending tags. It further declares that Sect1 has a required Title and an optional TitleAbbrev. It next declares that Sect1 can have zero or more ToCs, LoTs, Indexes, Glossaries, and Bibliographies (which are the elements referred to by the entity reference "%nav.gp;").
It then declares that the Sect1 element must contain either at least one of the group of elements referred to by the entity reference "%component.gp;" (which includes, among other things, Paragraphs, Lists, and Tables), optionally followed by zero or more RefEntries or zero or more Sect2s, or else one or more RefEntries or one or more Sect2s on their own. Zero or more of the "%nav.gp;" elements may follow at the end. Finally, it declares that the elements referred to by the entity reference "%ubiq.gp;" can be included within a Sect1 or within any of its subcomponents.

<!ELEMENT Sect1 - - (Title, TitleAbbrev?, (%nav.gp;)*,
                     (((%component.gp;)+, (RefEntry* | Sect2*))
                      | RefEntry+ | Sect2+),
                     (%nav.gp;)*) +(%ubiq.gp;) >

This declares that the InformalTable element requires both starting and ending tags. It further declares that InformalTable must contain one or more Graphics or one or more TGroups (this is the meaning of the string referred to by the entity reference "%tblcontent.gp;"). It also declares that the InformalTable element cannot contain a Table or another InformalTable.

<!ELEMENT InformalTable - - ((%tblcontent.gp;)) -(Table|InformalTable)>

This declares that the TGroup element requires a start tag but not an end tag, and may contain the following elements in the following order: zero or more ColSpecs, zero or more SpanSpecs, zero or one THead, zero or one TFoot, and a required TBody.

<!ELEMENT TGroup - O (ColSpec*, SpanSpec*, THead?, TFoot?, TBody) >

Element Declaration Keywords

Some elements include a keyword in the element declaration that describes the data content of the element. Three keywords appear in the DTD: EMPTY, CDATA, and #PCDATA.

EMPTY
    Specifies that the element has no data content.

CDATA
    Represents “character data.” That is, the data content of the element is not recognized as markup.

#PCDATA
    Represents “parsed character data.” That is, the data content may include both text and markup characters that the DocBook parser interprets accordingly.

Attribute List Declarations

An attribute list declares additional properties that further describe an element.
An attribute list declaration has the syntax:

<!ATTLIST element_type attribute_values default_value>

Examples

Each of the following examples explains an attribute list declaration and then shows it.

This attribute list declaration means that the element Para has the common attributes, and there are no default values for them.

<!ATTLIST Para %commonatts; >

This attribute list declaration means that the element Sect1 has the common attributes, and also a Label attribute and a Renderas attribute. The Label attribute takes "character data" for its values, and the default value is implied. The Renderas attribute (which can determine how the Sect1 is displayed) can take the values Sect2, Sect3, Sect4, or Sect5. For example, if Renderas="Sect2", the Sect1 will be displayed with the same formatting as a Sect2.

<!ATTLIST Sect1
        %commonatts;
        Label CDATA #IMPLIED
        Renderas (Sect2 | Sect3 | Sect4 | Sect5) #IMPLIED >

This attribute list declaration means that the element TFoot has the common attributes, with no default values, and also a VAlign attribute, which can take the values "Top", "Middle", and "Bottom", with "Top" as the default value.

<!ATTLIST TFoot
        %commonatts;
        VAlign (Top | Middle | Bottom) "Top" >

This attribute list declaration means that the element OrderedList has the common attributes, with no default values, and also several other attributes. The Numeration attribute determines how the ListItems in the OrderedList will be numbered: it takes the values "Arabic" (arabic numbers), "Upperalpha" (capital letters), "Loweralpha" (lower case letters), "Upperroman" (upper case Roman numerals), and "Lowerroman" (lower case Roman numerals). The InheritNum attribute determines whether the numeration of an OrderedList embedded in another OrderedList will be embedded in the numeration of the containing list (so that the items in a list embedded in item 2 of another list might be numbered 2a, 2b, 2c, etc.).
InheritNum takes the values "Inherit" and "Ignore", with "Ignore" as the default. The Continuation attribute determines whether the numeration of an OrderedList will continue from the numeration of the preceding OrderedList, or start anew. It takes the values "Continues" and "Restarts", with "Restarts" as the default.

<!ATTLIST OrderedList
        %commonatts;
        Numeration (Arabic|Upperalpha|Loweralpha|Upperroman|Lowerroman) #IMPLIED
        InheritNum (Inherit|Ignore) Ignore
        Continuation (Continues|Restarts) Restarts >

Formal Markup

After you have learned the basic set of elements, using a structured editor is the best approach for creating formal markup. With a structured editor, the author creates formal markup by choosing elements from a menu. In response, the structured editor generates all of the tags required for each element. In addition, it verifies that the structural framework being created conforms to the Document Type Definition.

Formal Markup Caveats

DocBook is a formal markup language. Nearly every element requires a start and an end tag. If the start tag is <ElementName>, the end tag will take the form </ElementName>, with the / (forward slash) marking it as the end tag.

In formal markup, each element, its component parts, and the elements it contains must be explicitly tagged. For example, here is a schematic formal markup for a Row in a Table containing two Entries. (For ease of reading in this and other markup examples, tags are indented. Indentation is not required in actual markup.)

<row>
  <entry align="left" valign="top">
    <para>contents of first entry</para>
  </entry>
  <entry align="left" valign="top">
    <para>contents of second entry</para>
  </entry>
</row>

Notice that Entry and Para, the subcomponents of the Row, each have their own start and end tags.

Explicit Hierarchy of Elements

Each element declaration in the DTD contributes to a set of rules that governs how and where elements can be used.
Because elements contain other elements, which may contain other elements, a document is a hierarchy of elements. At the top level, the Part element is the container for every other element in the help volume.

To decide what markup is necessary to create a help topic, you need to become familiar with the rules that govern the DocBook markup language. One way to learn the markup language would be to study the element declarations for the components you need to use. For example, suppose you want to create a chapter. First, look at the declaration for the Chapter element listed below.

<!ELEMENT Chapter - - (DocInfo?, Title, TitleAbbrev?, (%sect1.gp;),
                       (Index | Glossary | Bibliography)*) +(%ubiq.gp;) >

This tells you that a Chapter may have a DocInfo component. So next you look at the declaration for DocInfo, to see how it is constructed.

<!ELEMENT DocInfo - - (Title, TitleAbbrev?, Subtitle?, AuthorGroup+, Abstract*,
                       RevHistory?, LegalNotice*) -(%ubiq.gp;) >

This tells you that a DocInfo requires at least a Title and one or more AuthorGroups, and may optionally contain various other elements. So next you would have to look at the declarations for the Title element and the AuthorGroup element, to see how they are constructed.

<!ELEMENT Title - - ((%inlinechar.gp;)+) >
<!ELEMENT AuthorGroup - - ((Author | Editor | Collab | CorpAuthor | OtherCredit)+) >

By continuing in this fashion until you have investigated all the subcomponents of a Chapter, and all the subcomponents of the subcomponents, down to the innermost nested element, and mastered how they work, you could learn how to construct a Chapter.

Fortunately, however, using a structured editor minimizes what an author needs to know about the DTD and the syntax of the markup tags. The structured editor application “reads” the DTD and creates each element's required tags, many of which are intermediate structural tags.

Example

This formal markup sample is an excerpt from the desktop Text Editor help volume.
To view the corresponding online information, choose the Help Viewer in the Front Panel. Select Common Desktop Environment and then choose Text Editor Help from the listed volumes. In the Text Editor volume, choose Text Editor Tasks and then To Open an Existing Document.

Indentation and extra white space are used in this example to make it easier to read the text and corresponding element tags. Remember that indentation and extra white space are not necessary in actual markup.

<sect2 id="TOOPENANEXISTINGDOCUMENT">
  <title>To Open an Existing Document</title>
  <para>You can use Text Editor or File Manager to open an existing document.</para>
  <IndexTerm><primary>document <secondary>opening</secondary> </primary></IndexTerm>
  <IndexTerm><primary>opening <secondary>existing document</secondary> </primary></IndexTerm>
  <para>To open an existing document from the Text Editor:</para>
  <OrderedList>
    <ListItem>
      <para> Choose Open from the File menu.</para>
      <para> The Open a File dialog box lists files and folders on your system. You can browse the documents listed, or change to a new folder to locate other files on your system.</para>
    </ListItem>
    <ListItem>
      <para> Select the document you want to open in the Files list or type the file name in the Open a File field. </para>
      <para><emphasis>Or,</emphasis> if the document is not in the current folder, first change to the folder that contains your document.
      Then choose a name in the Folders list or type the path name of the folder you wish to change to in the Enter path or folder name field.</para>
    </ListItem>
    <ListItem>
      <para> Press Return or click OK.</para>
    </ListItem>
  </OrderedList>
  <graphic id="some-graphic-id" entityref="some-graphic-entity"></graphic>
  <para>To open an existing document from the File Manager:</para>
  <IndexTerm><primary>opening <secondary>document from File Manager</secondary> </primary></IndexTerm>
  <IndexTerm><primary>document <secondary>opening from File Manager</secondary> </primary></IndexTerm>
  <IndexTerm><primary>File Manager <secondary>opening documents</secondary> </primary></IndexTerm>
  <OrderedList>
    <ListItem>
      <para>Display the document's file icon in a File Manager Window.</para>
    </ListItem>
    <ListItem>
      <para> Do one of the following: </para>
      <InformalList>
        <ListItem>
          <para>Double-click the document's file icon.</para>
        </ListItem>
        <ListItem>
          <para>Select the document, then choose Open from the Selected menu.</para>
        </ListItem>
        <ListItem>
          <para>Drag the document to the Text Editor's control in the Front Panel.</para>
        </ListItem>
      </InformalList>
    </ListItem>
  </OrderedList>
  <sect3>
    <title>See Also</title>
    <InformalList>
      <ListItem>
        <para><xref linkend="some-sect-id" endterm="some-sects-title-id"></para>
      </ListItem>
      <ListItem>
        <para><xref linkend="another-sect-id" endterm="another-sects-title-id"></para>
      </ListItem>
      <ListItem>
        <para><xref linkend="some-other-sect-id" endterm="some-other-sects-title-id"></para>
      </ListItem>
    </InformalList>
  </sect3>
</sect2>

File Entity Declarations

To declare a file entity in formal markup, use this syntax:

<!entity entityname SYSTEM "filename">

Where entityname is the name of the entity and filename is the name of the file. The keyword SYSTEM is required.

Example

Here are the entity declarations for a help volume that consists of three text files and contains a graphic image.
<!entity MetaInformation SYSTEM "metainfo">
<!entity BasicTasks SYSTEM "basics">
<!entity AdvancedFeatures SYSTEM "advanced">
<!entity process_diagram SYSTEM "process.tif">
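
The declarations above only associate a name with each file; a file's contents enter the volume at the point where its entity is referenced. The following is a minimal sketch of how these four entities might be used, not a definitive layout. It assumes the standard SGML general entity reference syntax (&entityname;), and it assumes that the three text files together supply valid content for a Part. The graphic is not pulled in with an &...; reference; it is named in the entityref attribute of a graphic element, as in the Text Editor sample earlier in this chapter. The id value process-figure is hypothetical.

```sgml
<!-- Sketch only. Assumes the four entity declarations above are in effect
     and that the three text files together form valid content for Part. -->
<Part>
&MetaInformation;
&BasicTasks;
&AdvancedFeatures;
</Part>

<!-- Inside one of the text files: the graphic is named through the
     entityref attribute rather than through an &...; reference.
     The id value is hypothetical. -->
<graphic id="process-figure" entityref="process_diagram"></graphic>
```

Under the SGML reference concrete syntax, entity names are case-sensitive, so each reference should match the spelling used in its declaration exactly (for example, &BasicTasks; rather than &basictasks;).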