Initial import of the CDE 2.1.30 sources from the Open Group.
405
cde/programs/nsgmls/doc/archform.htm
Normal file
@@ -0,0 +1,405 @@
|
||||
<!-- $XConsortium: archform.htm /main/1 1996/09/22 18:14:21 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>Architectural Form Processing</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Architectural Form Processing</H1>
|
||||
<P>
|
||||
The HyTime standard (ISO/IEC 10744) introduced the concept of
|
||||
architectural forms. This document assumes you are already familiar
|
||||
with this concept. The first Technical Corrigendum to HyTime, which is
|
||||
soon to be published, generalizes this, and makes it possible to have
|
||||
an <I>architecture engine</I> which can perform architectural form
|
||||
processing for arbitrary architectures. SP now includes such an
|
||||
architecture engine.
|
||||
<P>
|
||||
Non-markup-sensitive applications built using SP now support
|
||||
architectural form processing using the <SAMP>-A
|
||||
<VAR>archname</VAR></SAMP> option. When this option is specified, the
|
||||
document will be validated against all declared base architectures,
|
||||
and the output will be for the architectural document for that
|
||||
architecture: the element types, notations and attributes will be
|
||||
those defined in the meta-DTD.
|
||||
<P>
|
||||
This option is experimental and has not been subject to much testing.
|
||||
Please be sure to report any bugs or problems you encounter.
|
||||
<P>
|
||||
Although spam does not support the <SAMP>-A</SAMP> option because it
|
||||
works with the markup of your document, sgmlnorm does.
|
||||
|
||||
<H2>Architectural Support Attributes</H2>
|
||||
<P>
|
||||
To use the <SAMP>-A</SAMP> option with a document, you must add
|
||||
<UL>
|
||||
<LI>
|
||||
an architecture base declaration for <SAMP><VAR>archname</VAR></SAMP>,
|
||||
<LI>
|
||||
a notation declaration and associated attribute definition list
|
||||
declaration for <SAMP><VAR>archname</VAR></SAMP>;
|
||||
this is called the <I>architecture notation declaration</I>.
|
||||
</UL>
|
||||
<P>
|
||||
An architecture base declaration is a processing instruction of the form:
|
||||
<PRE>
|
||||
<?ArcBase <VAR>archname</VAR>>
|
||||
</PRE>
|
||||
<P>
|
||||
The processing instruction is recognized either in the DTD or in an
|
||||
active LPD.
|
||||
<P>
|
||||
The architecture notation declaration and associated attribute
|
||||
definition list declaration serve to declare a number of architectural
|
||||
support attributes which control the architecture engine. The value
|
||||
for each architecture support attribute is taken from the default
|
||||
value, if any, specified for that attribute in the attribute
|
||||
definition list declaration. It is an error to declare an
|
||||
architecture support attribute as <SAMP>#REQUIRED</SAMP>.
|
||||
<P>
|
||||
The following architectural support attributes are recognized:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>ArcDTD</SAMP>
|
||||
<DD>
|
||||
The name of an external entity that contains the meta-DTD.
|
||||
This attribute is required.
|
||||
If the name starts with the PERO delimiter <SAMP>%</SAMP>,
|
||||
the entity is a parameter entity,
|
||||
otherwise it is a general entity.
|
||||
<DT>
|
||||
<SAMP>ArcQuant</SAMP>
|
||||
<DD>
|
||||
A list of tokens that looks like what follows <SAMP>QUANTITY SGMLREF</SAMP>
|
||||
in the quantity set section of an SGML declaration.
|
||||
The quantities used for parsing the meta-DTD
|
||||
and validating the architectural document
|
||||
will be the maximum of the quantities in the document's concrete syntax
|
||||
and the quantities specified here.
|
||||
<DT>
|
||||
<SAMP>ArcDocF</SAMP>
|
||||
<DD>
|
||||
The name of the document element type in the meta-DTD.
|
||||
This would be <SAMP>HyDoc</SAMP> for HyTime.
|
||||
This defaults to <SAMP><VAR>archname</VAR></SAMP>.
|
||||
<DT>
|
||||
<SAMP>ArcFormA</SAMP>
|
||||
<DD>
|
||||
The name of the attribute that elements use to specify the
|
||||
corresponding element type, if any, in the meta-DTD.
|
||||
Data entities also use this attribute to specify the corresponding
|
||||
notation in the meta-DTD.
|
||||
This would be <SAMP>HyTime</SAMP> for HyTime.
|
||||
This defaults to <SAMP><VAR>archname</VAR></SAMP>.
|
||||
<DT>
|
||||
<SAMP>ArcNamrA</SAMP>
|
||||
<DD>
|
||||
The name of the attribute that elements use to specify substitutes for
|
||||
the names of attributes in the meta-DTD. A value of
|
||||
<SAMP>#DEFAULT</SAMP> is allowed for a substitute name; this inhibits
|
||||
mapping of an attribute to an architectural attribute, but specifies
|
||||
that the value of the architectural attribute should be defaulted
|
||||
rather than taken from the value of another attribute in the document.
|
||||
For HyTime the value of this attribute would be <SAMP>HyNames</SAMP>.
|
||||
By default no attribute name substitution is done.
|
||||
<DT>
|
||||
<SAMP>ArcSuprA</SAMP>
|
||||
<DD>
|
||||
The name of an attribute that elements may use to suppress processing
|
||||
of their descendants. This attribute is not recognized for data
|
||||
entities. The value of the attribute must be one of the following
|
||||
tokens:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>sArcAll</SAMP>
|
||||
<DD>
|
||||
Completely suppress all architectural processing of descendants.
|
||||
It is not possible to restore architectural processing
|
||||
for a descendant.
|
||||
<DT>
|
||||
<SAMP>sArcForm</SAMP>
|
||||
<DD>
|
||||
Suppress processing of the <SAMP>ArcFormA</SAMP> attribute of all
|
||||
descendants of this element, except for those elements that have a
|
||||
non-implied <SAMP>ArcSuprA</SAMP> attribute.
|
||||
<DT>
|
||||
<SAMP>sArcNone</SAMP>
|
||||
<DD>
|
||||
Don't suppress architectural processing for the descendants of
|
||||
this element.
|
||||
</DL>
|
||||
<P>
|
||||
The value may also be implied, in which case the state of
|
||||
architectural processing is inherited.
|
||||
<P>
|
||||
If an element has an ArcSuprA attribute that was processed, its
|
||||
ArcFormA attribute will always be processed. Otherwise its ArcFormA
|
||||
attribute will be processed unless its closest ancestor that has a
|
||||
non-implied value for the ArcSuprA attribute suppressed processing of
|
||||
the ArcFormA attribute. An element whose ArcFormA attribute is
|
||||
processed will not be treated as architectural if it has an implied
|
||||
value for the ArcFormA attribute.
|
||||
<DT>
|
||||
<SAMP>ArcSuprF</SAMP>
|
||||
<DD>
|
||||
The name of the element type in the meta-DTD that suppresses
|
||||
architectural processing in the same manner as does the
|
||||
<SAMP>sHyTime</SAMP> form in HyTime. By default, no element type
|
||||
does. This behaves like an element with an
|
||||
<SAMP>ArcSuprA</SAMP> attribute of <SAMP>sArcForm</SAMP>. The element
|
||||
type should be declared in the meta-DTD. You should not specify a
|
||||
value for this attribute if you specified a value for the
|
||||
<SAMP>ArcSuprA</SAMP> attribute.
|
||||
<P>
|
||||
This is a non-standardized extension.
|
||||
<DT>
|
||||
<SAMP>ArcIgnDA</SAMP>
|
||||
<DD>
|
||||
The name of an attribute that elements may use to control whether
|
||||
data is ignored.
|
||||
The value of the attribute must be one of the following values:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>nArcIgnD</SAMP>
|
||||
<DD>
|
||||
Data is not ignored.
|
||||
It is an error if data occurs where not allowed by the meta-DTD.
|
||||
<DT>
|
||||
<SAMP>cArcIgnD</SAMP>
|
||||
<DD>
|
||||
Data is conditionally ignored.
|
||||
Data will be ignored only when it occurs where the meta-DTD
|
||||
does not allow it.
|
||||
<DT>
|
||||
<SAMP>ArcIgnD</SAMP>
|
||||
<DD>
|
||||
Data is always ignored.
|
||||
</DL>
|
||||
<P>
|
||||
The value may also be implied, in which case the state of
|
||||
architectural processing is inherited.
|
||||
If the document element has no value specified,
|
||||
<SAMP>cArcIgnD</SAMP> will be used.
|
||||
<DT>
|
||||
<SAMP>ArcBridF</SAMP>
|
||||
<DD>
|
||||
The name of a default element type declared in a meta-DTD,
|
||||
to which elements in the document should be automatically mapped
|
||||
if they have an ID and would not otherwise be considered
|
||||
architectural.
|
||||
This would be <SAMP>HyBrid</SAMP> for HyTime.
|
||||
If your meta-DTD declares IDREF attributes, it will
|
||||
usually be appropriate to specify a value for
|
||||
<SAMP>ArcBridF</SAMP>, and to declare an ID attribute
|
||||
for that form in your meta-DTD.
|
||||
<DT>
|
||||
<SAMP>ArcDataF</SAMP>
|
||||
<DD>
|
||||
The name of a default notation declared in the meta-DTD,
|
||||
to which the external data entities in the document
|
||||
should be automatically mapped if they would
|
||||
not otherwise be considered architectural.
|
||||
If this attribute is defined,
|
||||
then general entities will be automatically architectural:
|
||||
any external data entity whose notation cannot otherwise be mapped
|
||||
into a notation in the meta-DTD will be automatically treated
|
||||
as an instance of the <SAMP>ArcDataF</SAMP> notation.
|
||||
This would be <SAMP>data</SAMP> for HyTime.
|
||||
If your meta-DTD declares entity attributes, it will usually
|
||||
be appropriate to specify a value for <SAMP>ArcDataF</SAMP>
|
||||
even if your meta-DTD declares no data attributes for the
|
||||
notation.
|
||||
<DT>
|
||||
<SAMP>ArcAuto</SAMP>
|
||||
<DD>
|
||||
This must have one of the following values:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>ArcAuto</SAMP>
|
||||
<DD>
|
||||
If an element does not have an <SAMP>ArcFormA</SAMP> attribute and the
|
||||
meta-DTD defines an element type with the same name as the element's
|
||||
type, the element will be automatically treated as being an instance
|
||||
of the meta-type. This rule does not apply to the
|
||||
document element type; this is automatically treated as being an
|
||||
instance of the meta-DTD's document element type.
|
||||
Note that this automatic mapping is prevented if
|
||||
the element has an <SAMP>ArcFormA</SAMP> attribute with an implied
|
||||
value. It is also prevented if processing of the
|
||||
<SAMP>ArcFormA</SAMP> attribute is suppressed. This applies equally
|
||||
to the notations of external data entities.
|
||||
The default element or notation specified with the
|
||||
<SAMP>ArcBridF</SAMP> or <SAMP>ArcDataF</SAMP> attribute
|
||||
is only considered after the mapping specified by <SAMP>ArcAuto</SAMP>.
|
||||
<DT>
|
||||
<SAMP>nArcAuto</SAMP>
|
||||
<DD>
|
||||
Automatic mapping is not performed.
|
||||
</DL>
|
||||
<P>
|
||||
The default value is <SAMP>ArcAuto</SAMP>.
|
||||
<DT>
|
||||
<SAMP>ArcOptSA</SAMP>
|
||||
<DD>
|
||||
A list of names of architectural support attributes,
|
||||
each of which is interpreted as a list of parameter entities
|
||||
to be defined with a replacement text of <SAMP>INCLUDE</SAMP>
|
||||
when parsing the meta-DTD.
|
||||
The default value is <SAMP>ArcOpt</SAMP>.
|
||||
</DL>
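<P>
As a minimal sketch of how these declarations fit together, the fragment
below declares a hypothetical architecture called <SAMP>myarch</SAMP> in a
document's DTD. The names <SAMP>myarch</SAMP>, <SAMP>mydtd</SAMP>,
<SAMP>mydoc</SAMP>, <SAMP>block</SAMP> and <SAMP>para</SAMP>, the public
identifier and the file name are illustrative assumptions, not part of SP.
Since <SAMP>ArcFormA</SAMP> is not declared here, it defaults to
<SAMP>myarch</SAMP>, so the <SAMP>para</SAMP> element uses an attribute of
that name to select the <SAMP>block</SAMP> form in the meta-DTD.
<PRE>
<!-- architecture base declaration -->
<?ArcBase myarch>
<!-- architecture notation declaration and its support attributes -->
<!NOTATION myarch PUBLIC "-//Example//NOTATION My Architecture//EN">
<!ATTLIST #NOTATION myarch
  ArcDTD  CDATA #FIXED "mydtd"
  ArcDocF NAME  #FIXED mydoc
>
<!-- general entity named by ArcDTD; it contains the meta-DTD -->
<!ENTITY mydtd SYSTEM "myarch.dtd">

<!-- an element mapped to the architectural form "block" -->
<!ELEMENT para - - (#PCDATA)>
<!ATTLIST para
  myarch NAME #FIXED block
>
</PRE>
<P>
All names are kept to eight characters or fewer so that the sketch stays
within the reference quantity set.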
|
||||
<H2>Meta-DTDs</H2>
|
||||
<P>
|
||||
A meta-DTD is allowed to use the following extensions:
|
||||
<UL>
|
||||
<LI>
|
||||
a single element type or notation is allowed to be an associated
|
||||
element type or associated notation name for multiple attribute
|
||||
definition lists.
|
||||
<LI>
|
||||
<SAMP>#ALL</SAMP> can be used as an associated element type
|
||||
or associated notation name in an attribute definition list
|
||||
to define attributes for all element types or notations
|
||||
in the meta-DTD
|
||||
</UL>
|
||||
<P>
|
||||
Before any of these extensions can be used, the meta-DTD must include a
|
||||
declaration
|
||||
<PRE>
|
||||
<!AFDR "ISO/IEC 10744:1992">
|
||||
</PRE>
|
||||
<P>
|
||||
This declaration should only be included if the extensions are used.
|
||||
<P>
|
||||
In all other respects a meta-DTD must be a valid SGML DTD.
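<P>
For illustration, a small meta-DTD that uses both extensions might look
like the following sketch. The element type and attribute names are
hypothetical. The <SAMP>block</SAMP> element type ends up with attributes
from two attribute definition lists: the <SAMP>#ALL</SAMP> list and its own.
<PRE>
<!AFDR "ISO/IEC 10744:1992">
<!ELEMENT mydoc - - (block+)>
<!ELEMENT block - - (#PCDATA)>
<!-- id is defined for every element type in the meta-DTD -->
<!ATTLIST #ALL  id    ID     #IMPLIED>
<!-- block has a second attribute definition list of its own -->
<!ATTLIST block level NUMBER #IMPLIED>
</PRE>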
|
||||
<P>
|
||||
A declared value of ENTITY for an attribute in a meta-DTD means that
|
||||
the value of the attribute must be an entity declared in
|
||||
the (non-meta) DTD that is architectural.
|
||||
An external data entity is architectural only if its notation can be
|
||||
mapped into a notation in the meta-DTD.
|
||||
All other kinds of data entities and subdoc entities are automatically
|
||||
architectural.
|
||||
<P>
|
||||
An IDREF attribute in the meta-document must have a corresponding ID
|
||||
in the meta-document. An attribute with a declared value of ID in the
|
||||
document will be automatically mapped to an attribute with a declared
|
||||
value of ID in the meta-DTD.
|
||||
<P>
|
||||
A declared value of NOTATION in the meta-DTD means that the value of
|
||||
the attribute must have one of the values specified in the name group and
|
||||
that it must be a notation in the meta-DTD.
|
||||
(Perhaps if the attribute also has a declared value of NOTATION
|
||||
in the non-meta-DTD, the value should be mapped in a similar
|
||||
way to the notation of an external data entity.)
|
||||
|
||||
<H2>Differences from HyTime</H2>
|
||||
<P>
|
||||
There are a number of differences from how architectural processing is
|
||||
defined in the pre-Corrigendum version of the HyTime standard.
|
||||
<UL>
|
||||
<LI>
|
||||
The <SAMP>ArcNamrA</SAMP> and <SAMP>ArcFormA</SAMP> attributes are not
|
||||
part of the meta-DTD. Rather they are used by the architecture engine
|
||||
in deriving the meta-document that is validated against the meta-DTD.
|
||||
<LI>
|
||||
The <SAMP>use:</SAMP> conventional comment is not recognized. Instead
|
||||
a single element type is allowed to be an associated element type for
|
||||
multiple attribute definition lists.
|
||||
<LI>
|
||||
The notation and data attributes of an external data entity are
|
||||
treated just like the element type and attributes of an element. The
|
||||
notation of an external data entity is mapped into a notation in the
|
||||
meta-DTD and the data attributes of the entity are mapped onto
|
||||
attributes defined for the meta-DTD notation.
|
||||
<LI>
|
||||
<SAMP>#FIXED</SAMP> has the same meaning in a meta-DTD that it does in
|
||||
a regular DTD: the value of the attribute must be the same as the
|
||||
default value of the attribute specified in the meta-DTD.
|
||||
</UL>
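<P>
To illustrate the treatment of external data entities described above,
here is a sketch of declarations in a document DTD. It assumes a
hypothetical architecture <SAMP>myarch</SAMP> whose <SAMP>ArcFormA</SAMP>
attribute has the default name <SAMP>myarch</SAMP>, and a meta-DTD that
declares a notation called <SAMP>image</SAMP>; the notation, entity and
file names are likewise made up.
<PRE>
<!NOTATION gif SYSTEM "gif">
<!-- the data attribute named by ArcFormA selects the meta-DTD notation -->
<!ATTLIST #NOTATION gif
  myarch NAME #FIXED image
>
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
</PRE>
<P>
Under these assumptions the entity <SAMP>logo</SAMP> would be treated as
architectural, with its notation mapped to <SAMP>image</SAMP>.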
|
||||
|
||||
<H2>Specifying architectural processing with an LPD</H2>
|
||||
<P>
|
||||
Link attributes defined by an implicit link process are treated in the
|
||||
same way as non-link attributes. The only complication is that SGML
|
||||
allows link attributes to have the same name as non-link attributes.
|
||||
If there is a link attribute and a non-link attribute with the same
|
||||
name, the architecture engine will only look at the link attribute,
|
||||
even if the value of the link attribute is implied. The only
|
||||
exception is the <SAMP>ArcNamrA</SAMP> attribute: the architecture
|
||||
engine will use both the link attribute and the non-link attribute,
|
||||
but the substitute names in the value of the non-link attribute cannot
|
||||
refer to link attribute names.
|
||||
<P>
|
||||
The <SAMP>-A <VAR>archname</VAR></SAMP> option automatically activates
|
||||
any link type <SAMP><VAR>archname</VAR></SAMP>.
|
||||
<P>
|
||||
The architecture notation declaration and associated attribute
|
||||
definition list declaration are allowed in the LPD. Although the
|
||||
productions of ISO 8879 do not allow a notation declaration in a link
|
||||
type declaration subset, it is clearly the intent of the standard that
|
||||
they be allowed. You can use a <SAMP>-wlpd-notation</SAMP> option to
|
||||
disallow them.
|
||||
|
||||
<H2>Notation set architecture</H2>
|
||||
<P>
|
||||
An architecture for which <VAR>archname</VAR> is declared
|
||||
as a notation with a public identifier of
|
||||
<PRE>
|
||||
"ISO/IEC 10744//NOTATION AFDR ARCBASE
|
||||
Notation Set Architecture Definition Document//EN"
|
||||
</PRE>
|
||||
<P>
|
||||
is special. The element types in the meta-DTD for this architecture
|
||||
are the notations of the document DTD and the attributes defined for
|
||||
the element types in the meta-DTD are the data attributes defined for
|
||||
the notations in the document DTD. For each element, the attribute
|
||||
with a declared value of NOTATION performs the function that the
|
||||
ArcFormA attribute performs for normal architectures. Only the
|
||||
<SAMP>ArcNamrA</SAMP> and <SAMP>ArcSuprA</SAMP> architectural support
|
||||
attributes can be used with this architecture.
|
||||
<P>
|
||||
The notation set architecture can also be declared using
|
||||
an architecture base declaration of the form:
|
||||
<PRE>
|
||||
<?ArcBase #NOTATION>
|
||||
</PRE>
|
||||
<P>
|
||||
In this case, no architecture support attributes can be declared;
|
||||
<SAMP>ArcNamrA</SAMP> will be defaulted to <SAMP>notnames</SAMP>,
|
||||
and <SAMP>ArcSuprA</SAMP> to <SAMP>notsupr</SAMP>.
|
||||
|
||||
<H2>Derived architectures</H2>
|
||||
<P>
|
||||
A meta-DTD can have one or more base architectures in the same way as
|
||||
a normal DTD. Multiple <SAMP>-A</SAMP> options can be used to exploit
|
||||
this. For example,
|
||||
<PRE>
|
||||
-A <VAR>arch1</VAR> -A <VAR>arch2</VAR>
|
||||
</PRE>
|
||||
<P>
|
||||
will perform architectural processing on the source document to
|
||||
produce an architectural document conforming to the architecture
|
||||
<SAMP><VAR>arch1</VAR></SAMP> declared in the source document, and
|
||||
will then perform architectural processing on this architectural
|
||||
document to produce an architectural document conforming to the
|
||||
<SAMP><VAR>arch2</VAR></SAMP> architecture declared in
|
||||
<SAMP><VAR>arch1</VAR></SAMP>'s meta-DTD.
|
||||
<P>
|
||||
A document that is validated against a meta-DTD will automatically
|
||||
be validated against any base architectures of that meta-DTD.
|
||||
|
||||
<H2>Not implemented</H2>
|
||||
<P>
|
||||
The following features in the current AFDR draft are not implemented:
|
||||
<UL>
|
||||
<LI>
|
||||
<SAMP>ArcIndr</SAMP> architectural support attribute with value
|
||||
other than <SAMP>nArcIndr</SAMP>.
|
||||
</UL>
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
104
cde/programs/nsgmls/doc/build.htm
Normal file
@@ -0,0 +1,104 @@
|
||||
<!-- $XConsortium: build.htm /main/1 1996/09/22 18:14:41 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>Building SP</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Building SP</H1>
|
||||
<P>
|
||||
You will need a C++ compiler with good template support to build this.
|
||||
Support for exceptions is not required.
|
||||
<P>
|
||||
In most cases you should be able to port to a new compiler just by
|
||||
editing <code>include/config.h</code>.
|
||||
|
||||
<H2>Unix</H2>
|
||||
<P>
|
||||
To build on Unix, edit the Makefile, and do a make. You can also
|
||||
build in a different directory. This requires GNU make or another
|
||||
make that implements VPATH. Copy or link the top-level Makefile to
|
||||
the build directory, change srcdir in the Makefile to point to the
|
||||
original directory, and do a make in the build directory.
|
||||
<P>
|
||||
<SAMP>make check</SAMP> runs some tests. You shouldn't get any reports
|
||||
of differences.
|
||||
<P>
|
||||
<SAMP>make install</SAMP> installs the programs; <SAMP>make install-man</SAMP>
|
||||
installs the man pages.
|
||||
<P>
|
||||
You can use the following compilers:
|
||||
<DL>
|
||||
<DT>
|
||||
gcc
|
||||
<DD>
|
||||
gcc 2.7.2 works (gcc 2.7.0 won't work at least on the sparc). You
|
||||
will also need an iostream library (e.g. as provided by libg++ 2.7). This
|
||||
distribution builds on Solaris 2.3 and on Linux 1.2. I expect it will
|
||||
build on SunOS 4 as well with little difficulty.
|
||||
<P>
|
||||
With gcc 2.6.3/SunOS 4, you'll need to compile with
|
||||
<CODE>-Dsig_atomic_t=int</CODE>, and, if you want to compile with
|
||||
-DSP_HAVE_SOCKET, you'll need to make netdb.h and arpa/inet.h C++
|
||||
compatible.
|
||||
<DT>
|
||||
Sun C++
|
||||
<DD>
|
||||
To compile with Sun C++ 4.0.1, first run sunfix.sh. Also in the
|
||||
top-level Makefile, set libMakefile to Makefile.lib.sun.
|
||||
This makes the library build use the -xar option.
|
||||
</DL>
|
||||
<P>
|
||||
Nelson Beebe has ported SP to a variety of other Unix systems and has
|
||||
produced some <A
|
||||
HREF="http://www.math.utah.edu/~beebe/sp-notes-1.0.1.html">notes</A>
|
||||
about his experiences.
|
||||
|
||||
<H2>DOS/Windows</H2>
|
||||
<P>
|
||||
You must use a compiler that generates 32-bit code.
|
||||
|
||||
|
||||
<P>
|
||||
The following compilers have been tested:
|
||||
<DL>
|
||||
<DT>
|
||||
Visual C++ 4.1
|
||||
<DD>
|
||||
Open SP.mak as a Makefile in the Developer Studio and build whatever
|
||||
you want.
|
||||
Don't use <SAMP>Batch Build</SAMP> or <SAMP>Rebuild All</SAMP>: these
|
||||
rebuild the library repeatedly.
|
||||
You can build all the targets in a particular configuration by
|
||||
building the all target.
|
||||
The <SAMP>sp-generate.mak</SAMP> makefile can be used to make
|
||||
all the .cxx and .h files that are automatically generated.
|
||||
(These are included in the distribution, so you don't need to do this
|
||||
unless you want to modify SP.)
|
||||
<P>
|
||||
To create a new program, make a new project in the SP project
|
||||
workspace using the <SAMP>Build>Subprojects</SAMP> command, and
|
||||
include <SAMP>lib</SAMP> and maybe <SAMP>generic</SAMP> as
|
||||
subprojects. You may also want to add your project as a subproject to
|
||||
<SAMP>all</SAMP>.
|
||||
Then, in <SAMP>Build>Settings</SAMP> under the <SAMP>C/C++</SAMP>
|
||||
tab in the <SAMP>Preprocessor</SAMP> category, copy the
|
||||
<SAMP>Preprocessor definitions</SAMP> and <SAMP>Additional include
|
||||
directories</SAMP> entries from the nsgmls subproject.
|
||||
In the <SAMP>Code Generation</SAMP> category make sure you've selected
|
||||
the same run-time library as that used by the corresponding configuration
|
||||
of <SAMP>lib</SAMP>.
|
||||
<DT>
|
||||
Watcom C++ 10.5a
|
||||
<DD>
|
||||
Use Makefile.wat.
|
||||
<P>
|
||||
You must compile on a platform that supports long filenames.
|
||||
</DL>
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
166
cde/programs/nsgmls/doc/catalog.htm
Normal file
@@ -0,0 +1,166 @@
|
||||
<!-- $XConsortium: catalog.htm /main/1 1996/09/22 18:14:58 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP - Catalogs</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Catalogs</H1>
|
||||
<P>
|
||||
The entity manager generates a system identifier for every external
|
||||
entity using catalog entry files in the format defined by <A
|
||||
HREF="http://www.sgmlopen.org/sgml/docs/library/9401.htm">SGML Open
|
||||
Technical Resolution TR9401:1995</A>. The entity manager will give an
|
||||
error if it is unable to generate a system identifier for an external
|
||||
entity. Normally if the external identifier for an entity includes a
|
||||
system identifier then the entity manager will use that as the
|
||||
effective system identifier for the entity; this behaviour can be
|
||||
changed using <CODE>OVERRIDE</CODE> or <CODE>SYSTEM</CODE> entries in
|
||||
a catalog entry file.
|
||||
<P>
|
||||
A catalog entry file contains a sequence of entries in one of the
|
||||
following forms:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>PUBLIC <VAR>pubid</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> should be used as
|
||||
the effective system identifier if the public identifier is
|
||||
<SAMP><VAR>pubid</VAR></SAMP>. <SAMP><VAR>Sysid</VAR></SAMP> is a
|
||||
system identifier as defined in ISO 8879 and
|
||||
<SAMP><VAR>pubid</VAR></SAMP> is a public identifier as defined in ISO
|
||||
8879.
|
||||
<DT>
|
||||
<SAMP>ENTITY <VAR>name</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <VAR>sysid</VAR> should be used as the effective
|
||||
system identifier if the entity is a general entity whose name is
|
||||
<VAR>name</VAR>.
|
||||
<DT>
|
||||
<SAMP>ENTITY %<VAR>name</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> should be used as
|
||||
the effective system identifier if the entity is a parameter entity
|
||||
whose name is <VAR>name</VAR>. Note that there is no space between
|
||||
the <SAMP>%</SAMP> and the <SAMP><VAR>name</VAR></SAMP>.
|
||||
<DT>
|
||||
<SAMP>DOCTYPE <VAR>name</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> should be used as
|
||||
the effective system identifier if the entity is an entity declared in
|
||||
a document type declaration whose document type name is <VAR>name</VAR>.
|
||||
<DT>
|
||||
<SAMP>LINKTYPE <VAR>name</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> should be used as the
|
||||
effective system identifier if the entity is an entity declared in a
|
||||
link type declaration whose link type name is <VAR>name</VAR>.
|
||||
<DT>
|
||||
<SAMP>NOTATION <VAR>name</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> should be used as
|
||||
the effective system identifier for a notation whose name is
|
||||
<SAMP><VAR>name</VAR></SAMP>. This is an extension to the SGML Open
|
||||
format. This is relevant only with the <SAMP>-n</SAMP> option.
|
||||
<DT>
|
||||
<SAMP>OVERRIDE <VAR>bool</VAR></SAMP>
|
||||
<DD>
|
||||
<SAMP><VAR>bool</VAR></SAMP> may be <SAMP>YES</SAMP> or
|
||||
<SAMP>NO</SAMP>. This sets the overriding mode for entries up to the
|
||||
next occurrence of OVERRIDE or the end of the catalog entry file. At
|
||||
the beginning of a catalog entry file the overriding mode will be NO.
|
||||
A PUBLIC, ENTITY, DOCTYPE, LINKTYPE or NOTATION entry with an
|
||||
overriding mode of YES will be used whether or not the external
|
||||
identifier has an explicit system identifier; those with an overriding
|
||||
mode of NO will be ignored if the external identifier has an explicit
|
||||
system identifier. This is an extension to the SGML Open format.
|
||||
<DT>
|
||||
<SAMP>SYSTEM <VAR>sysid1</VAR> <VAR>sysid2</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <VAR>sysid2</VAR> should be used as the effective
|
||||
system identifier if the system identifier specified in the external
|
||||
identifier was <SAMP><VAR>sysid1</VAR></SAMP>. This is an extension
|
||||
to the SGML Open format. <VAR>sysid2</VAR> should always be quoted to
|
||||
ensure that it is not misinterpreted when parsed by a system that does
|
||||
not support this extension.
|
||||
<DT>
|
||||
<A NAME="sgmldecl"><SAMP>SGMLDECL <VAR>sysid</VAR></SAMP></A>
|
||||
<DD>
|
||||
This specifies that if the document does not contain an SGML declaration,
|
||||
the SGML declaration in <SAMP><VAR>sysid</VAR></SAMP> should be implied.
|
||||
<DT>
|
||||
<SAMP>DOCUMENT <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that the document entity is <SAMP><VAR>sysid</VAR></SAMP>.
|
||||
This entry is used only with the <SAMP>-C</SAMP> option.
|
||||
<DT>
|
||||
<SAMP>CATALOG <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that <SAMP><VAR>sysid</VAR></SAMP> is the system
|
||||
identifier of an additional catalog entry file to be read after this
|
||||
one. Multiple <SAMP>CATALOG</SAMP> entries are allowed and will be
|
||||
read in order. This is an extension to the SGML Open format.
|
||||
<DT>
|
||||
<SAMP>BASE <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that relative storage object identifiers in system
|
||||
identifiers in the catalog entry file following this entry should be
|
||||
resolved using the first storage object identifier in
|
||||
<SAMP><VAR>sysid</VAR></SAMP> as the base, instead of the storage
|
||||
object identifiers of the storage objects comprising the catalog entry
|
||||
file. This is an extension to the SGML Open format. This extension
|
||||
is proposed in <A HREF=
|
||||
"ftp://ftp.internic.net/internet-drafts/draft-ietf-mimesgml-exch-02.txt">Using
|
||||
SGML Open Catalogs and MIME to Exchange SGML Documents</A>.
|
||||
Note that the <CODE><VAR>sysid</VAR></CODE> must exist.
|
||||
<DT>
|
||||
<SAMP>DELEGATE <VAR>pubid-prefix</VAR> <VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
This specifies that entities with a public identifier that has
|
||||
<SAMP><VAR>pubid-prefix</VAR></SAMP> as a prefix should be resolved
|
||||
using a catalog whose system identifier is
|
||||
<SAMP><VAR>sysid</VAR></SAMP>. For more details, see <A
|
||||
HREF="http://www.entmp.org/fpi-urn/delegate.html">A Proposal for
|
||||
Delegating SGML Open Catalogs</A>. This is an extension to the SGML
|
||||
Open format.
|
||||
</DL>
|
||||
<P>
|
||||
The delimiters can be omitted from the <SAMP><VAR>sysid</VAR></SAMP>
|
||||
provided it does not contain any white space. Comments are allowed
|
||||
between parameters delimited by <SAMP>--</SAMP> as in SGML.
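<P>
A hypothetical catalog entry file combining several of these entry types
might look like the following sketch. All public identifiers, names and
file names are made up for illustration.
<PRE>
-- SGML declaration implied when the document has none --
SGMLDECL "example.dcl"
PUBLIC "-//Example//DTD Memo//EN" "memo.dtd"
DOCTYPE report report.dtd  -- delimiters omitted: no white space in sysid --
ENTITY %chars "chars.ent"
OVERRIDE YES
-- the next entry applies even if the external identifier has a sysid --
PUBLIC "-//Example//DTD Letter//EN" "letter.dtd"
SYSTEM "old-memo.dtd" "memo.dtd"
CATALOG "more.cat"
</PRE>
<P>
The first <SAMP>PUBLIC</SAMP> entry is read with the initial overriding
mode of NO, so it is ignored for external identifiers that carry an
explicit system identifier; the one after <SAMP>OVERRIDE YES</SAMP> is not.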
|
||||
<P>
|
||||
The environment variable <SAMP>SGML_CATALOG_FILES</SAMP> contains a
|
||||
list of catalog entry files. The list is separated by colons under
|
||||
Unix and by semi-colons under MS-DOS and Windows. These will be
|
||||
searched after any catalog entry files specified using the
|
||||
<SAMP>-m</SAMP> option, and after the catalog entry file called
|
||||
<SAMP>catalog</SAMP> in the same place as the document entity. If
|
||||
this environment variable is not set, then a system dependent list of
|
||||
catalog entry files will be used. In fact catalog entry files are not
|
||||
restricted to being files: the name of a catalog entry file is
|
||||
interpreted as a system identifier.
|
||||
<P>
|
||||
A match in one catalog entry file will take precedence over any match
|
||||
in a later catalog entry file. A more specific matching entry in one
|
||||
catalog entry file will take priority over a less specific matching
|
||||
entry in the same catalog entry file. For this purpose, the order of
|
||||
specificity is (most specific first):
|
||||
<UL>
|
||||
<LI>
|
||||
<SAMP>SYSTEM</SAMP> entries;
|
||||
<LI>
|
||||
<SAMP>PUBLIC</SAMP> entries;
|
||||
<LI>
|
||||
<SAMP>DELEGATE</SAMP> entries ordered by the length of the prefix,
|
||||
longest first;
|
||||
<LI>
|
||||
<SAMP>ENTITY</SAMP>, <SAMP>DOCTYPE</SAMP>, <SAMP>LINKTYPE</SAMP> and
|
||||
<SAMP>NOTATION</SAMP> entries.
|
||||
</UL>
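<P>
As a sketch of these rules, consider the following hypothetical fragment
of a single catalog entry file (the identifiers and file names are
illustrative only):
<PRE>
DOCTYPE memo "generic.dtd"
DELEGATE "-//Example//" "example.cat"
DELEGATE "-//Example//DTD" "exdtd.cat"
PUBLIC "-//Example//DTD Memo//EN" "memo.dtd"
</PRE>
<P>
For a document type <SAMP>memo</SAMP> whose external identifier has the
public identifier <SAMP>"-//Example//DTD Memo//EN"</SAMP> and no system
identifier, all four entries match, and the <SAMP>PUBLIC</SAMP> entry
should win because it is the most specific, even though it comes last in
the file. If the public identifier were
<SAMP>"-//Example//DTD Letter//EN"</SAMP> instead, the <SAMP>PUBLIC</SAMP>
entry would not match, and the <SAMP>DELEGATE</SAMP> entry with the longer
prefix (<SAMP>exdtd.cat</SAMP>) would be preferred over the shorter one and
over the <SAMP>DOCTYPE</SAMP> entry.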
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
139
cde/programs/nsgmls/doc/features.htm
Normal file
@@ -0,0 +1,139 @@
|
||||
<!-- $XConsortium: features.htm /main/1 1996/09/22 18:15:17 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP - Features Summary</TITLE>
|
||||
<BODY>
|
||||
<H1>
|
||||
SP
|
||||
</H1>
|
||||
<H3>
|
||||
A free, object-oriented toolkit for SGML parsing and entity management
|
||||
</H3>
|
||||
<H2>
|
||||
Features summary
|
||||
</H2>
|
||||
<UL>
|
||||
<LI>
|
||||
Includes nsgmls
|
||||
<UL>
|
||||
<LI>
|
||||
Compatible with sgmls
|
||||
<LI>
|
||||
Also generates RAST (ISO/IEC 13673)
|
||||
</UL>
|
||||
<LI>
|
||||
Provides access to all information about SGML document
|
||||
<UL>
|
||||
<LI>
|
||||
Access to DTD and SGML declaration as well as document instance
|
||||
<LI>
|
||||
Access to markup as well as abstract document
|
||||
<LI>
|
||||
Sufficient to recreate character-for-character identical
|
||||
copy of any SGML document
|
||||
</UL>
|
||||
<LI>
|
||||
Supports almost all optional SGML features
|
||||
<UL>
|
||||
<LI>
|
||||
Arbitrary concrete syntaxes
|
||||
<LI>
|
||||
SHORTTAG, OMITTAG, RANK
|
||||
<LI>
|
||||
SUBDOC
|
||||
<LI>
|
||||
LINK (SIMPLE, IMPLICIT and EXPLICIT)
|
||||
<LI>
|
||||
Only DATATAG and CONCUR not supported
|
||||
</UL>
|
||||
<LI>
|
||||
Sophisticated entity manager
|
||||
<UL>
|
||||
<LI>
|
||||
Supports ISO/IEC 10744 Formal System Identifiers
|
||||
<LI>
|
||||
Supports SGML Open catalogs
|
||||
<LI>
|
||||
Supports WWW
|
||||
<LI>
|
||||
Can be used independently of parser
|
||||
</UL>
|
||||
<LI>
|
||||
Supports multi-byte character sets
|
||||
<UL>
|
||||
<LI>
|
||||
Parser can use 16-bit characters internally
|
||||
<LI>
|
||||
16-bit characters can be used in tag names and other markup
|
||||
<LI>
|
||||
Supports ISO/IEC 10646 (Unicode) using both UCS-2 and UTF-8
|
||||
<LI>
|
||||
Supports Japanese character sets (Shift-JIS, EUC)
|
||||
</UL>
|
||||
<LI>
|
||||
Object-oriented
|
||||
<LI>
|
||||
Written in C++ from scratch
|
||||
<UL>
|
||||
<LI>
|
||||
Not a modified version of a parser originally written in C
|
||||
<LI>
|
||||
Reentrant
|
||||
<LI>
|
||||
Sophisticated architecture
|
||||
</UL>
|
||||
<LI>
|
||||
Fast
|
||||
<UL>
|
||||
<LI>
|
||||
Up to twice as fast as sgmls on large documents
|
||||
</UL>
|
||||
<LI>
|
||||
Portable
|
||||
<UL>
|
||||
<LI>
|
||||
All major Unix variants
|
||||
<LI>
|
||||
MS-DOS
|
||||
<LI>
|
||||
Win32: Windows 95/Windows NT
|
||||
<LI>
|
||||
OS/2
|
||||
</UL>
|
||||
<LI>
|
||||
Production quality
|
||||
<UL>
|
||||
<LI>
|
||||
Version 1.0 recently released, after a year of test releases
|
||||
<LI>
|
||||
Tested using several SGML test suites
|
||||
<LI>
|
||||
Already used in several new commercial products
|
||||
<LI>
|
||||
Written by James Clark, previously responsible for turning arcsgml into sgmls
|
||||
</UL>
|
||||
<LI>
|
||||
Free
|
||||
<UL>
|
||||
<LI>
|
||||
Includes source code
|
||||
<LI>
|
||||
No restrictions on commercial use
|
||||
</UL>
|
||||
<LI>
|
||||
Disadvantages
|
||||
<UL>
|
||||
<LI>
|
||||
Programmer-level documentation only for generic API
|
||||
and not for native API.
|
||||
</UL>
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
1082
cde/programs/nsgmls/doc/generic.htm
Normal file
File diff suppressed because it is too large
Load Diff
489
cde/programs/nsgmls/doc/ideas.htm
Normal file
@@ -0,0 +1,489 @@
|
||||
<!-- $XConsortium: ideas.htm /main/1 1996/09/22 18:15:57 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>Ideas for improving SP</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Ideas for improving SP</H1>
|
||||
<H2>
|
||||
Parser
|
||||
</H2>
|
||||
<P>
|
||||
Have option (fixedDocCharset) in which document character set cannot
|
||||
be changed by SGML declaration; declared document character set used
|
||||
for character references, and to determine which characters are
|
||||
non-SGML. Would need separate event for non-SGML character.
|
||||
In Text would need separate TextItem for non-SGML data.
|
||||
Disallow non-SGML characters in internal entities.
|
||||
<P>
|
||||
Support caching across multiple runs of parser in single
|
||||
process.
|
||||
<P>
|
||||
Make Dtd copiable.
|
||||
<P>
|
||||
?Subdoc parser needs character set for system id (should be system
|
||||
character set).
|
||||
<P>
|
||||
Recover better from non-existent documents or subdocuments.
|
||||
<P>
|
||||
Think about entity declarations/references in inactive LPDs.
|
||||
<P>
|
||||
Don't allow name groups in parameter entity references in document
|
||||
type specifications in start-/end-tags.
|
||||
<P>
|
||||
With link, don't do a pass 2 unless we replace a referenced entity
|
||||
(what about default entity?).
|
||||
<P>
|
||||
Options to warn about things that HTML disallows: marked sections in
|
||||
instance, explicit subsets.
|
||||
<P>
|
||||
Option to warn about MDCs in comments in comment declarations.
|
||||
<P>
|
||||
Option to warn about omitted REFC.
|
||||
<P>
|
||||
Check that names of added functions are valid names in concrete syntax
|
||||
(both characters and lengths). Also need to do upper-case
|
||||
substitution on them?
|
||||
<P>
|
||||
Recover from nested doctype declaration intelligently.
|
||||
<P>
|
||||
Recover from missing doctype declaration intelligently.
|
||||
<P>
|
||||
Could optimize parsing of attribute literals using technique similar
|
||||
to extendData().
|
||||
<P>
|
||||
attributeValueLength error should give actual length of value.
|
||||
<P>
|
||||
Recover better from entity reference with name group in literal.
|
||||
<P>
|
||||
At start of pass 2 clear everything in pass1LPDs except entity sets.
|
||||
<P>
|
||||
Give an error if EXPLICIT > 1 and LPDs don't chain as required by
|
||||
436:5-7 and 436:18-20.
|
||||
<P>
|
||||
Handle quantity errors by reporting at the end of the prolog and the
|
||||
end of the instance any quantities that need to be increased.
|
||||
<P>
|
||||
Make noSuchReservedName error more helpful.
|
||||
<P>
|
||||
Function characters should perform their function even when markup
|
||||
recognition is suppressed. (I think I've handled this.)
|
||||
<P>
|
||||
Give a warning for notation attribute that is #CONREF.
|
||||
<P>
|
||||
Try to separate out Parser::compileModes().
|
||||
<P>
|
||||
In CompiledModelGroup have vector that gives an index for each element type
|
||||
that occurs in the model group. Then in each leaf content token have a
|
||||
vector that maps this index to a LeafContentToken *, if there
|
||||
is a simple transition (no and groups involved) to that element type.
|
||||
<P>
|
||||
MatchState::minAndDepth and MatchState::AndInfo should be separated
|
||||
off info object pointed to from MatchState; pointer would be null for
|
||||
elements with no AND groups.
|
||||
<P>
|
||||
What to do if we encounter USELINK or USEMAP declaration after DTD in
|
||||
prolog? Should stop prolog and start DTD. If we have SCOPE INSTANCE
|
||||
then if we get an unknown declaration type in prolog, don't give
|
||||
error, but unget token and start instance.
|
||||
<P>
|
||||
?Have separate version of reportNonSgml() for case where datachar is allowed.
|
||||
<P>
|
||||
Implement CONCUR.
|
||||
<P>
|
||||
AttributeDefinition constructors should have Owner<DeclaredValue> &
|
||||
arguments to avoid storage leaks when exceptions are thrown.
|
||||
<P>
|
||||
Create a list like IList but which keeps track of length. Then
|
||||
combine tagLevel into openElement stack, and inputLevel into
|
||||
inputStack.
|
||||
<P>
|
||||
AttributeDefinition::makeValue should return
|
||||
ConstResourcePointer<AttributeValue>.
|
||||
<P>
|
||||
Syntax member functions should use reference for result.
|
||||
<P>
|
||||
Have a LocationKey data structure that can be used to determine the
|
||||
relative order of locations in possibly different concurrent
|
||||
instances. Contains: offset in document instance; is it a replacement
|
||||
of named character reference; for each entity and numeric character
|
||||
reference: location in entity and index of dtd in which instance is
|
||||
declared.
|
||||
<P>
|
||||
On systems with fixed stacks, avoid unlimited stack growth: hard
|
||||
limits on number of SUBDOCS and GRPLVL.
|
||||
<P>
|
||||
With extendData and extendS don't extend more than some fixed amount
|
||||
(eg 1024), otherwise could overrun InputSource buffer on 16-bit
|
||||
system.
|
||||
<P>
|
||||
Have a location in ElementType saying where the first mention of the
|
||||
element name was. Useful for giving warnings about undefined
|
||||
elements.
|
||||
<P>
|
||||
How to detect 310:8-10?
|
||||
<P>
|
||||
AttributeSemantics should return const pointers rather than ResourcePointers.
|
||||
<P>
|
||||
Rename Parser -> ParserImpl, SgmlParser -> Parser,
|
||||
Syntax::isB -> Syntax::isBlank
|
||||
<P>
|
||||
What mode should be used for parsing other prolog after document element?
|
||||
<P>
|
||||
Flag out of context data.
|
||||
<P>
|
||||
Provide mechanism to allow character names to be mapped onto universal
|
||||
character numbers.
|
||||
<P>
|
||||
Provide mechanism to allow specification of what characters are
|
||||
control characters (for the purposes of SHUNCHAR controls).
|
||||
<P>
|
||||
With SCOPE INSTANCE, which syntax should be used for delimiters in
|
||||
bracketed text entities?
|
||||
<P>
|
||||
Better error messages for ambiguous delimiters.
|
||||
<P>
|
||||
Do we need both EndLpd and ComplexLink/SimpleLink events?
|
||||
<P>
|
||||
What to do about 457:19-21?
|
||||
<P>
|
||||
Rename lpd_ to activeLpd_; allLpd_ to lpd_.
|
||||
<P>
|
||||
Test for validity of character numbers in syn ref charset (perhaps
|
||||
unnecessary, because bad numbers won't be translateable into doc
|
||||
charset).
|
||||
<P>
|
||||
Option to read bootstrap character set from entity.
|
||||
<P>
|
||||
In AttributeDefinitionList have a flag that is true if any checking of
|
||||
unspecified values in attribute list is needed (ie CURRENT, REQUIRED,
|
||||
non-implied ENTITY, non-implied NOTATION). In this case can avoid
|
||||
running over attributes in AttributeList::finish, by computing value
|
||||
only when user calls Attribute::value().
|
||||
<P>
|
||||
Construct link attributes from definition if no applicable link rule.
|
||||
(RAST maybe doesn't want this. Make it a separate method in LinkProcess and
|
||||
use in SgmlsEventHandler. Very useful with ArcEngine.)
|
||||
<P>
|
||||
Shouldn't have OpenElementInfo in Message. Instead use RTTI.
|
||||
<P>
|
||||
noSuchAttribute: include gi in message; if element is undefined, don't
|
||||
give error at all
|
||||
<P>
|
||||
noSuchAttributeToken: say what element or entity
|
||||
<P>
|
||||
nonExistentEntityRef should say document/link type
|
||||
<P>
|
||||
Distinguish errors that are totally recoverable.
|
||||
<P>
|
||||
Find better way to unpack entity information in entity attribute.
|
||||
|
||||
<H2>
|
||||
Entity Manager
|
||||
</H2>
|
||||
<P>
|
||||
Avoid requiring that BASE sysid exist.
|
||||
<P>
|
||||
When FSI has only a single storage manager and that is a literal,
|
||||
return an InternalInputSource.
|
||||
<P>
|
||||
Allow user of InputSource to specify what bit combinations they
|
||||
want to see for RS and RE.
|
||||
<P>
|
||||
Have environment variable SP_INPUT_BCTF that overrides SP_BCTF for
|
||||
input.
|
||||
<P>
|
||||
Avoid using numeric character references for all characters in storage
|
||||
object identifier of literal storage manager in effective system
|
||||
identifier.
|
||||
<P>
|
||||
Instead of registering coding system pass CodingSystemKit that can create
|
||||
coding systems.
|
||||
<P>
|
||||
Need BCTF entry in catalog that specifies default BCTF.
|
||||
<P>
|
||||
Have catalog entry that describes internet charset as BCTF plus PUBLIC
|
||||
identifier of SGML character set; then have charset= storage attribute
|
||||
that does the translation.
|
||||
<P>
|
||||
An SOEntityCatalog should consist of a Vector<ConstPtr<EntityCatalog>
|
||||
> which can be shared between several catalogs. This would facilitate
|
||||
caching.
|
||||
<P>
|
||||
Maybe need to be able to specify two types of catalog entry file: one
|
||||
used for all documents; one used for this document alone.
|
||||
<P>
|
||||
Allow end-tags in FSIs. Support alternative SOSs.
|
||||
<P>
|
||||
Character sets in the catalog need rethinking. Also character set of
|
||||
ParsedSystemId::Map::publicId.
|
||||
<P>
|
||||
Allow for HTTP proxy.
|
||||
<P>
|
||||
Cache catalogs.
|
||||
<P>
|
||||
Use Microsoft ActiveX (formerly Sweeper) DLL on Win95 or NT.
|
||||
<P>
|
||||
Implement DTDDECL catalog entry.
|
||||
<P>
|
||||
Support FILE URLs.
|
||||
<P>
|
||||
Perhaps don't want to do searching for catalog files (and perhaps
|
||||
command line files).
|
||||
<P>
|
||||
Provide mechanism for specifying when (if at all) base dir is searched
|
||||
relative to other dirs.
|
||||
<P>
|
||||
Provide extension to catalog format to distinguish entities declared
|
||||
in non-base DTDs. Perhaps precede entity name by document type name
|
||||
surrounded by GRPO/GRPC delimiters.
|
||||
<P>
|
||||
URLStorageManager should use a DescriptorManager shared with
|
||||
PosixStorageManager.
|
||||
<P>
|
||||
URLStorageManager::resolveRelative should delete "xxx/../" and "./"
|
||||
components. Might also be a good idea to resolve host names.
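<P>
The component removal suggested above might look like the following. This is a sketch in Python for illustration only; <SAMP>remove_dot_segments</SAMP> is a hypothetical name, not a function in SP, and the sketch works on the path part of a URL only.

```python
def remove_dot_segments(path):
    """Drop "./" components and collapse "xxx/../" pairs in a URL path."""
    out = []
    for part in path.split("/"):
        if part == ".":
            # "./" contributes nothing to the path.
            continue
        if part == ".." and out and out[-1] not in ("", ".."):
            # "xxx/../" cancels the preceding real component.
            out.pop()
        else:
            # Ordinary component, or a ".." with nothing left to cancel.
            out.append(part)
    return "/".join(out)
```

A leading <SAMP>..</SAMP> that has no preceding component to cancel is kept as it is rather than silently discarded.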
|
||||
<P>
|
||||
Implement JIS encoding system (what should be done with half-width yen
|
||||
and overbar in JIS-Roman? translate to Unicode).
|
||||
<P>
|
||||
ExternalInfoImpl::convertOffset: when the position is the character
|
||||
past the last character and the last character was a newline, line
|
||||
number should be number of lines + 1.
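<P>
The desired behaviour can be illustrated by counting line starts rather than line ends. This is a Python sketch under the assumption that a line ends after its newline character; <SAMP>line_starts</SAMP> and <SAMP>line_number</SAMP> are hypothetical names and do not describe how ExternalInfoImpl is written.

```python
from bisect import bisect_right


def line_starts(text):
    """Offsets at which each line begins; a line ends after its newline."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def line_number(starts, offset):
    """1-based line number of offset, found by binary search."""
    # The number of line starts at or before offset is the line number.
    return bisect_right(starts, offset)
```

With the text <SAMP>"a\nb\n"</SAMP>, which has two lines, offset 4 is one past the last character and maps to line 3, that is, the number of lines plus one.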
|
||||
<P>
|
||||
Try harder to rewind in StdioStorageObject.
|
||||
<P>
|
||||
charset= storage attribute that infers BCTF from MIME charset assuming
|
||||
10646 document character set.
|
||||
|
||||
<H2>
|
||||
Generic
|
||||
</H2>
|
||||
<P>
|
||||
Provide mechanism to access data entities using generated system id.
|
||||
<P>
|
||||
Support IMPLICIT/SIMPLE LINK.
|
||||
<P>
|
||||
Character set information.
|
||||
<P>
|
||||
Need to know space character that separates token. Alternatively
|
||||
provide broken down view of tokens.
|
||||
<P>
|
||||
Need to know IDREF (and other declared values)?
|
||||
|
||||
<H2>
|
||||
nsgmls
|
||||
</H2>
|
||||
<P>
|
||||
Problem with "\#n;" escape sequence is that it might get used other
|
||||
than in data. Probably should get rid of this feature, and give
|
||||
a warning when there's an unencodable character.
|
||||
|
||||
<H2>
|
||||
Internal
|
||||
</H2>
|
||||
<P>
|
||||
Make all macros that occur in headers begin with SP.
|
||||
<P>
|
||||
Make sure all files use #pragma i/i.
|
||||
<P>
|
||||
Get rid of assumption that Vector<T>::size_type, String<T>::size_type
|
||||
is size_t.
|
||||
<P>
|
||||
Maybe align Owner with auto_ptr.
|
||||
<P>
|
||||
Get rid of uses of string as identifier.
|
||||
<P>
|
||||
?Maybe support non-const copy constructors for NCVector/Owner.
|
||||
<P>
|
||||
Get rid of asEntityOrigin (as far as possible). Make
|
||||
InputSourceOrigin::defLocation virtual on origin. Avoid excessive use
|
||||
of asInputSourceOrigin.
|
||||
<P>
|
||||
Hash should define Hash(String<unsigned char>),
|
||||
Hash(String<unsigned short>) etc.
|
||||
<P>
|
||||
Invert sense of SP_HAVE_BOOL define.
|
||||
<P>
|
||||
Get rid of OutputCharStream::open. Instead have
|
||||
OutputCharStream::setEncoding. (Perhaps make a friend so we can use
|
||||
ostream if we're not interested in encodings.) Allow use of ostream
|
||||
instead of OutputCharStream. Change ParserToolkit::errorStream_'s coding
|
||||
system when we change the coding system.
|
||||
<P>
|
||||
Support 32-bit Char. Need to fix XcharMap and SubstTable.
|
||||
Detemplatize SubstTable. Then support UTF-16.
|
||||
<P>
|
||||
Have a common version of Ptr for things that have a virtual
|
||||
destructor.
|
||||
<P>
|
||||
Have a common version of Owner for all things that have a virtual
|
||||
destructor.
|
||||
<P>
|
||||
Inheritance in AttributeSemantics unnecessary.
|
||||
<P>
|
||||
Rename ISet -> RangeSet.
|
||||
<P>
|
||||
ISet and RangeMap should use binary search.
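<P>
A binary-search membership test over a set of ranges could be sketched as below. This is Python for illustration, assuming the set is stored as sorted, non-overlapping, inclusive ranges; the class and its layout are hypothetical and are not taken from SP's ISet.

```python
from bisect import bisect_right


class RangeSet:
    """Set of integers held as sorted, non-overlapping (lo, hi) ranges."""

    def __init__(self, ranges):
        # Assumption: the caller supplies non-overlapping inclusive ranges.
        self._ranges = sorted(ranges)
        self._los = [lo for lo, _hi in self._ranges]

    def contains(self, n):
        # Find the last range whose lower bound is <= n, then check its
        # upper bound.
        i = bisect_right(self._los, n) - 1
        return i >= 0 and n <= self._ranges[i][1]
```

Lookup cost grows with the logarithm of the number of ranges instead of linearly.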
|
||||
<P>
|
||||
Better hash function for wide characters.
|
||||
<P>
|
||||
OutputCharStream should canonically use RS/RE and translate to system
|
||||
newline char with raw option that prevents this.
|
||||
<P>
|
||||
Avoid having Entity.h depend on ParserState, perhaps by double
|
||||
dispatching.
|
||||
<P>
|
||||
Add uses of explicit keyword.
|
||||
<P>
|
||||
When generating message.h file; if we don't have .cxx file and
|
||||
namespaces are supported, use anonymous namespace.
|
||||
|
||||
<H2>
|
||||
Application framework
|
||||
</H2>
|
||||
<P>
|
||||
Only use static programName for outOfMemory message.
|
||||
<P>
|
||||
Need to use AppChar *const * not AppChar ** in CmdLineApp.
|
||||
<P>
|
||||
When reporting message with MessageEventHandler need to be able to
|
||||
update error count.
|
||||
<P>
|
||||
Option argument names need to be internationalized.
|
||||
<P>
|
||||
Support response files for DOS.
|
||||
<P>
|
||||
Sort options in usage message.
|
||||
<P>
|
||||
StringMessageArg should be associated with a character set (in
|
||||
particular, need to distinguish parser character sets from
|
||||
StorageManager character sets).
|
||||
<P>
|
||||
Should translate StringMessageArg from document character set to
|
||||
system character set. Have MessageReporter::setDocumentCharacter
|
||||
function.
|
||||
<P>
|
||||
In MessageReporter, maybe distinguish messages coming from the parser.
|
||||
<P>
|
||||
Don't ever give a non-existent file as a location in an error message.
|
||||
<P>
|
||||
Text of messages should be able to specify that an open quote or close
|
||||
quote should be inserted at a particular point.
|
||||
<P>
|
||||
When outputting a StringMessageArg translate \r to \n.
|
||||
<P>
|
||||
Make sure wild cards work in VC++ and MS-DOS.
|
||||
|
||||
<H2>
|
||||
Win32
|
||||
</H2>
|
||||
<P>
|
||||
Compilers can typically eliminate unused templates. Reengineer Vector
|
||||
to reduce code size with such compilers.
|
||||
<P>
|
||||
Store messages in resources; requires numeric tags for messages.
|
||||
<P>
|
||||
Should automatically register all available code pages.
|
||||
<P>
|
||||
Make use of IsTextUnicode() API.
|
||||
<P>
|
||||
Have StorageManager that uses Win32 API directly. Would avoid limits
|
||||
on number of open files. Also use flag that says file is being
|
||||
accessed sequentially.
|
||||
<P>
|
||||
Allow DTDs to be compiled into binary by having storage manager that
|
||||
uses resource ids.
|
||||
|
||||
<H2>
|
||||
Architecture engine
|
||||
</H2>
|
||||
<P>
|
||||
Should give an error with -A if the specified arch does not exist.
|
||||
<P>
|
||||
Interpret APPINFO parameter, and automatically enable architectural
|
||||
processing based on this.
|
||||
<P>
|
||||
Handle derived architecture support attributes.
|
||||
<P>
|
||||
When doing architectural processing in link type, not possible to have
|
||||
notation declaration, so need some other way to specify public
|
||||
identifier for architecture.
|
||||
<P>
|
||||
Allow DOCTYPE to be declared inline (as with CONCUR or EXPLICIT LINK).
|
||||
<P>
|
||||
Grok conventional comments.
|
||||
<P>
|
||||
Make work automatically with EventHandlers that process subdoc. Make
|
||||
references to subdocs architectural.
|
||||
<P>
|
||||
Support different SGML declaration for meta-DTD.
|
||||
<P>
|
||||
Maybe should map internal sdata/cdata entities to copies in meta-DTD.
|
||||
<P>
|
||||
Perhaps when getting open element info should indicate that gis are
|
||||
architectural.
|
||||
<P>
|
||||
Think about references to SDATA entities in default values in meta-DTD.
|
||||
<P>
|
||||
Add default entity from real DTD to meta-DTD.
|
||||
<P>
|
||||
Tokenize ArcForm attribute appropriately.
|
||||
<P>
|
||||
Make special case for parsing DTD when entity can't be accessed.
|
||||
<P>
|
||||
Try to provide extension that would allow architecture elements be
|
||||
asynchronous with actual elements? This would provide CONCUR
|
||||
functionality.
|
||||
|
||||
<H2>
|
||||
sgmlnorm
|
||||
</H2>
|
||||
<P>
|
||||
Avoid bogus newline from invalid empty document.
|
||||
<P>
|
||||
Avoid always escaping >.
|
||||
<P>
|
||||
Option to say whether to use character references for 8-bit characters.
|
||||
<P>
|
||||
Option to output implied attributes.
|
||||
<P>
|
||||
Option to output all non-implied attributes.
|
||||
<P>
|
||||
Option to omit attribute name with name tokens.
|
||||
<P>
|
||||
Protect against recognition of short references.
|
||||
<P>
|
||||
Option to preserve CDATA entity references.
|
||||
<P>
|
||||
Option to output general entity declarations in DTD subset
|
||||
(but what about data attributes)?
|
||||
|
||||
<H2>
|
||||
spam
|
||||
</H2>
|
||||
<P>
|
||||
Option to normalize names.
|
||||
<P>
|
||||
Add comments round expanded entities to prevent false delimiter
|
||||
recognition.
|
||||
<P>
|
||||
Add newline at the end if last thing was omitted tag.
|
||||
<P>
|
||||
Option to warn about changes in internal entities when not expanding.
|
||||
|
||||
<H2>
|
||||
Documentation
|
||||
</H2>
|
||||
<P>
|
||||
Error message format.
|
||||
<P>
|
||||
<catalog> FSI tag.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
94
cde/programs/nsgmls/doc/index.htm
Normal file
@@ -0,0 +1,94 @@
|
||||
<!-- $XConsortium: index.htm /main/1 1996/09/22 18:16:18 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>SP</H1>
|
||||
<H3>
|
||||
An SGML System Conforming to International Standard ISO 8879 --
|
||||
Standard Generalized Markup Language
|
||||
</H3>
|
||||
<P>
|
||||
The following documents are available:
|
||||
<P>
|
||||
<UL>
|
||||
<LI>
|
||||
<A HREF="features.htm">Summary of SP's features</A>
|
||||
<LI>
|
||||
<A HREF="http://www.jclark.com/sp/howtoget.htm">How to get SP</A>
|
||||
<LI>
|
||||
<A HREF="build.htm">How to build and install SP from source</A>
|
||||
<LI>
|
||||
Using SP
|
||||
<UL>
|
||||
<LI>
|
||||
<A HREF="new.htm">What's new in SP?</A>
|
||||
<LI>
|
||||
<A HREF="nsgmls.htm">nsgmls</A>, a replacement for sgmls
|
||||
<LI>
|
||||
<A HREF="sgmlsout.htm">nsgmls output format</A>,
|
||||
an extension to the output format of sgmls
|
||||
<LI>
|
||||
<A HREF="spam.htm">spam</A>, a sophisticated normalizer,
|
||||
perhaps better thought of as a markup stream editor
|
||||
<LI>
|
||||
<A HREF="sgmlnorm.htm">sgmlnorm</A>, a simpler normalizer
|
||||
that focuses on producing the same ESIS rather than
|
||||
preserving details of the markup
|
||||
<LI>
|
||||
<A HREF="spent.htm">spent</A>, a program providing access
|
||||
to SP's entity manager
|
||||
<LI>
|
||||
<A HREF="sysdecl.htm">System declaration</A>
|
||||
<LI>
|
||||
<A HREF="sgmldecl.htm">Handling of SGML declarations</A>
|
||||
<LI>
|
||||
<A HREF="sysid.htm">System identifiers</A>
|
||||
<LI>
|
||||
<A HREF="catalog.htm">Using SGML Open catalogs to generate
|
||||
system identifiers</A>
|
||||
<LI>
|
||||
<A HREF="archform.htm">Architectural form support</A>
|
||||
<LI>
|
||||
<A HREF="winntu.htm">Notes on SP Unicode support under Windows NT</A>
|
||||
</UL>
|
||||
<LI>
|
||||
Programming with SP
|
||||
<UL>
|
||||
<LI>
|
||||
<A HREF="generic.htm">Generic API to SP</A>
|
||||
<LI>
|
||||
<A HREF="ideas.htm">Ideas for improving SP</A>
|
||||
</UL>
|
||||
</UL>
|
||||
<P>
|
||||
There is a mailing list for programmer-level discussions of SP. Mail
|
||||
subscription requests to <A
|
||||
HREF="mailto:sp-prog-request@jclark.com">sp-prog-request@jclark.com</A>.
|
||||
Messages for the list should go to <A
|
||||
HREF="mailto:sp-prog@jclark.com">sp-prog@jclark.com</A>.
|
||||
<P>
|
||||
For information about SGML, see
|
||||
<UL>
|
||||
<LI>
|
||||
<A
|
||||
HREF="http://www.sil.org/sgml/sgml.html">The SGML Web Page</A>.
|
||||
<LI>
|
||||
<A HREF="http://www.iso.ch/cate/d16387.html">ISO 8879:1986</A>
|
||||
<LI>
|
||||
The SGML Handbook, Charles F. Goldfarb
|
||||
</UL>
|
||||
<P>
|
||||
I would like to hear about any bugs you find in SP. When reporting a
|
||||
bug, please always include a complete self-contained file that will
|
||||
enable me to reproduce the bug. I am also interested in receiving
|
||||
suggestions for improvements to SP no matter how small.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
166
cde/programs/nsgmls/doc/new.htm
Normal file
@@ -0,0 +1,166 @@
|
||||
<!-- $XConsortium: new.htm /main/1 1996/09/22 18:16:37 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>What's new in SP?</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>What's new?</H1>
|
||||
<P>
|
||||
This document describes recent user-visible changes in SP. Bug fixes
|
||||
are not described.
|
||||
|
||||
<H2>Version 1.1</H2>
|
||||
<P>
|
||||
There is now generalized support for <A
|
||||
HREF="archform.htm">architectural form processing</A>.
|
||||
<P>
|
||||
Documentation is now in HTML format.
|
||||
<P>
|
||||
A BASE catalog entry can be used to specify a base system identifier
|
||||
for resolving relative storage object identifiers occurring in the
|
||||
catalog.
|
||||
<P>
|
||||
A LITERAL storage manager is now provided.
|
||||
<P>
|
||||
Programs have a -E option that sets the maximum number of errors.
|
||||
<P>
|
||||
A DELEGATE catalog entry allows distributed resolution of public
|
||||
identifiers.
|
||||
<P>
|
||||
nsgmls has a -B (batch mode) option that allows you to parse multiple
|
||||
documents with a single invocation of nsgmls.
|
||||
<P>
|
||||
In nsgmls the -c option now specifies a catalog as it does in spam and
|
||||
sgmlnorm, in addition to the -m option that previously did this.
|
||||
<P>
|
||||
The <SAMP>-n</SAMP> option has been replaced by a
|
||||
<SAMP>-onotation-sysid</SAMP> which applies to nsgmls only, and a
|
||||
<SAMP>-wnotation-sysid</SAMP> which applies generally.
|
||||
<P>
|
||||
SP can be built as a DLL under Win32.
|
||||
|
||||
<H2>Version 1.0</H2>
|
||||
<P>
|
||||
The syntax of system identifiers has completely changed. The new
|
||||
syntax is based on the syntax of formal system identifiers defined in
|
||||
ISO/IEC 10744 (HyTime) Technical Corrigendum 1, Annex D.
|
||||
<P>
|
||||
The NSGMLS_CODE environment variable has been renamed to SP_BCTF.
|
||||
nsgmls has a -b option to specify the bit combination transformation
|
||||
format to be used for output.
|
||||
<P>
|
||||
A list of directories in which files specified in system identifiers
|
||||
should be searched for can be specified using the environment variable
|
||||
SGML_SEARCH_PATH or the option -D.
|
||||
<P>
|
||||
Individual SYSTEM identifiers in external identifiers can be
|
||||
overridden using SYSTEM entries in the catalog.
|
||||
<P>
|
||||
The OVERRIDE catalog entry now takes a YES/NO argument. (This change
|
||||
was required for conformance to the SGML Open TR.) It applies to each
|
||||
entry individually rather than to the entire catalog.
|
||||
<P>
|
||||
The -w options of nsgmls and spam have been enhanced. In spam, the -w
|
||||
option takes an argument as with nsgmls. There are new warnings for
|
||||
minimized start and end tags (-wunclosed, -wempty, -wnet and
|
||||
-wmin-tag); for unused short reference maps (-wunused-maps); for
|
||||
unused parameter entities (-wunused-param). -wall now doesn't include
|
||||
those warnings that are about conditions that, in the opinion of the
|
||||
author, there is no reason to avoid. A warning can be turned off by
|
||||
using its name prefixed by no-; thus -wmin-tag -wno-net is equivalent
|
||||
to -wunclosed -wempty. The -w option is also used to turn off errors:
|
||||
-wno-idref replaces the -x option; -wno-significant replaces the -X
|
||||
option.
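<P>
The way warning names combine can be modelled roughly as follows. This is a Python sketch, not the option parser used by nsgmls or spam: the membership of <SAMP>min-tag</SAMP> is inferred from the equivalence stated above, left-to-right processing is an assumption, and <SAMP>GROUPS</SAMP> and <SAMP>enabled_warnings</SAMP> are hypothetical names.

```python
# Group membership inferred from the text: "-wmin-tag -wno-net" is
# equivalent to "-wunclosed -wempty". Other warning groups are omitted.
GROUPS = {"min-tag": {"unclosed", "empty", "net"}}


def enabled_warnings(options):
    """Apply -w arguments left to right and return the enabled warning set."""
    enabled = set()
    for opt in options:
        off = opt.startswith("no-")
        name = opt[3:] if off else opt
        # A group name expands to its members; any other name stands alone.
        members = GROUPS.get(name, {name})
        if off:
            enabled -= members
        else:
            enabled |= members
    return enabled
```

Under this model <SAMP>enabled_warnings(["min-tag", "no-net"])</SAMP> and <SAMP>enabled_warnings(["unclosed", "empty"])</SAMP> return the same set.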
|
||||
<P>
|
||||
In the output of nsgmls, characters that cannot be represented in the
|
||||
encoding translation specified by the SP_BCTF environment variable
|
||||
are represented using an escape sequence of the form \#N; where N is a
|
||||
decimal integer.
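<P>
The escaping rule can be illustrated with a short Python sketch. Python codec names stand in for BCTFs here, which is an assumption made for the example only, and <SAMP>escape_unencodable</SAMP> is a hypothetical helper, not part of nsgmls.

```python
def escape_unencodable(text, encoding):
    """Replace characters the encoding cannot represent with \\#N; escapes."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            # N is the decimal character number.
            out.append("\\#%d;" % ord(ch))
    return "".join(out)
```

For instance, with the <SAMP>latin-1</SAMP> codec the euro sign (character number 8364) cannot be encoded and is written as <SAMP>\#8364;</SAMP>.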
|
||||
<P>
|
||||
In the multi-byte versions of nsgmls there are new BCTFs is8859-N
|
||||
for N = 1,...,9.
|
||||
<P>
|
||||
There is a -o option to nsgmls which makes it output additional
|
||||
information: -oentity outputs information about all entities; -oid
|
||||
distinguishes attributes with a declared value of id; -oincluded
|
||||
distinguishes included subelements.
|
||||
<P>
|
||||
nsgmls now automatically searches for a catalog entry file called
|
||||
"catalog" in the same place as the document entity. Note that when
|
||||
the document entity is specified with a URL, this matches the
|
||||
behaviour of Panorama.
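<P>
Locating this sibling catalog might be sketched as below. This is a Python illustration that assumes the document entity is given either as a plain file name or as an http URL; <SAMP>sibling_catalog</SAMP> is a hypothetical name and the real lookup in SP works on formal system identifiers.

```python
import os
from urllib.parse import urljoin, urlparse


def sibling_catalog(document_sysid):
    """Name of the "catalog" file next to the given document entity."""
    if urlparse(document_sysid).scheme in ("http", "https"):
        # Resolve "catalog" relative to the document's URL.
        return urljoin(document_sysid, "catalog")
    # Otherwise treat the identifier as a file name.
    return os.path.join(os.path.dirname(document_sysid), "catalog")
```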
|
||||
<P>
|
||||
A catalog entry file can contain CATALOG entries specifying additional
|
||||
catalog entry files. This matches the behaviour of Panorama.
|
||||
<P>
|
||||
The parser can now make available to an application complete
|
||||
information about the markup of prologs and SGML declarations. It
|
||||
would now be possible, for example, to use SP to write a DTD editor.
|
||||
spam exploits this to a limited extent: if the -p option is specified
|
||||
twice, then parameter entity references between declarations will be
|
||||
expanded; the -mreserved option puts all reserved names in upper-case;
|
||||
with the -mshortref option short reference use declarations and short
|
||||
reference mapping declarations will be removed; attribute
|
||||
specification lists in data attribute specifications in entity
|
||||
declarations can be normalized like attribute specification lists in
|
||||
start-tags; with -mms it resolves IGNORE/INCLUDE marked sections.
|
||||
<P>
|
||||
nsgmls has a -C option which causes the command line filenames to be
|
||||
treated as a catalog whose DOCUMENT entry specifies the document
|
||||
entity.
|
||||
<P>
|
||||
nsgmls has a -n option which causes it to generate system identifiers
|
||||
for notations in the same way as it does for entities.
|
||||
<P>
|
||||
spam now has a -f option like nsgmls.
|
||||
<P>
|
||||
The interface between the parser and entity manager has been
|
||||
redesigned so that the entity manager can be used independently of the
|
||||
parser. This is exploited by a new program called spent that prints
|
||||
an entity with a specified system identifier on the standard output.
|
||||
<P>
|
||||
In most cases, a Control-Z occurring as the last byte in a file will
|
||||
be stripped. This is controlled by the zapeof attribute in formal
|
||||
system identifiers.
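<P>
In outline, the stripping amounts to the following. This is a Python sketch only: the boolean <SAMP>zapeof</SAMP> parameter is a simplified stand-in for the storage attribute of the same name, and <SAMP>strip_trailing_ctrl_z</SAMP> is a hypothetical helper.

```python
def strip_trailing_ctrl_z(data, zapeof=True):
    """Remove a Control-Z (0x1A) only when it is the last byte of the data."""
    if zapeof and data.endswith(b"\x1a"):
        return data[:-1]
    return data
```

A Control-Z anywhere other than the final byte is left untouched.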
|
||||
|
||||
<H2>Version 0.4</H2>
|
||||
<P>
|
||||
External concrete syntaxes, character sets and capacity sets are
|
||||
supported using PUBLIC entries in catalog files. The multicode
|
||||
core and reference syntaxes are no longer built-in. Only a few
|
||||
character sets are now built-in.
|
||||
<P>
|
||||
Within external concrete syntaxes, various useful extensions are
|
||||
permitted. In particular, an ellipsis syntax is allowed for the
|
||||
specification of name characters and single character short
|
||||
references. It is now practical to specify tens of thousands of
|
||||
additional name characters.
|
||||
<P>
|
||||
The default SGML declaration is more permissive.
|
||||
<P>
|
||||
nsgmls has a -x option that inhibits checking of idrefs.
|
||||
<P>
|
||||
nsgmls has a -w option that can enable additional warnings. In
|
||||
particular, -wmixed will warn about mixed content models that do not
|
||||
allow #pcdata everywhere.
|
||||
<P>
|
||||
The meaning of the f command in the output of nsgmls has changed
|
||||
slightly. It now gives the effective system identifier of the entity.
|
||||
<P>
|
||||
The functionality of the rast program has been merged into the nsgmls
|
||||
program and the rast program has been removed. The -t option makes
|
||||
nsgmls generate a RAST result.
|
||||
<P>
|
||||
spam has a -l option that uses lower-case for added names that were
|
||||
subject to upper-case substitution.
|
||||
<P>
|
||||
spam has a -mcurrent option that adds omitted attribute specifications
|
||||
for current attributes.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
448
cde/programs/nsgmls/doc/nsgmls.htm
Normal file
@@ -0,0 +1,448 @@
|
||||
<!-- $XConsortium: nsgmls.htm /main/1 1996/09/22 18:16:58 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>NSGMLS</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>NSGMLS</H1>
|
||||
<H4>
|
||||
An SGML System Conforming to
|
||||
International Standard ISO 8879 --<BR>
|
||||
Standard Generalized Markup Language
|
||||
</H4>
|
||||
<H2>
|
||||
SYNOPSIS
|
||||
</H2>
|
||||
<P>
|
||||
<SAMP>nsgmls</SAMP>
|
||||
[
|
||||
<SAMP>-BCdeglprsuv</SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-a<VAR>linktype</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-b<VAR>bctf</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-c<VAR>sysid</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-D<VAR>directory</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-E<VAR>max_errors</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-f<VAR>file</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-i<VAR>name</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-o<VAR>output_option</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-t<VAR>file</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-w<VAR>warning_type</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP><VAR>sysid</VAR>...</SAMP>
|
||||
]
|
||||
<H2>DESCRIPTION</H2>
|
||||
<P>
|
||||
Nsgmls parses and validates
|
||||
the SGML document whose document entity is specified by the
|
||||
<A HREF="sysid.htm">system identifiers</A>
|
||||
<SAMP><VAR>sysid</VAR>...</SAMP>
|
||||
and prints on the standard output a simple text representation of its
|
||||
Element Structure Information Set.
|
||||
(This is the information set which a structure-controlled
|
||||
conforming SGML application should act upon.)
|
||||
If more than one system identifier is specified,
|
||||
then the corresponding entities will be concatenated to form
|
||||
the document entity.
|
||||
Thus the document entity may be spread amongst several files;
|
||||
for example, the SGML declaration, prolog and document
|
||||
instance set could each be in a separate file.
|
||||
If no system identifiers are specified, then
|
||||
nsgmls
|
||||
will read the document entity from the standard input.
|
||||
A command line system identifier of
|
||||
<SAMP>-</SAMP>
|
||||
can be used to refer to the standard input.
|
||||
(Normally in a system identifier,
|
||||
<SAMP>&lt;osfd&gt;0</SAMP>
|
||||
is used to refer to standard input.)
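<P>
The way several system identifiers combine on one command line can be
sketched in Python. The helper below only assembles an argument list
from the options shown in the synopsis; it does not run nsgmls. The
function name <CODE>build_nsgmls_argv</CODE>, its parameters and the
file names in the usage note are assumptions made for illustration.

```python
def build_nsgmls_argv(sysids, catalogs=(), warnings=(),
                      suppress_output=False, max_errors=None):
    """Assemble an nsgmls command line (hypothetical helper).

    Only options listed in the synopsis are used. The system
    identifiers are appended in order; nsgmls concatenates the
    corresponding entities to form the document entity. An empty
    sysid list means the document entity is read from standard input.
    """
    argv = ["nsgmls"]
    if suppress_output:
        argv.append("-s")                 # suppress normal output
    if max_errors is not None:
        argv.append("-E%d" % max_errors)  # 0 means no limit
    for catalog in catalogs:
        argv.append("-c" + catalog)       # multiple -c options are allowed
    for warning in warnings:
        argv.append("-w" + warning)       # multiple -w options are allowed
    argv.extend(sysids)
    return argv
```

<P>
For example, <CODE>build_nsgmls_argv(["decl.sgm", "doc.sgm"],
suppress_output=True)</CODE> describes a validation-only run in which
the SGML declaration and the rest of the document live in separate
(hypothetical) files.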
|
||||
<H2>OPTIONS</H2>
|
||||
<P>
|
||||
The following options are available:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>-a<VAR>linktype</VAR></SAMP>
|
||||
<DD>
|
||||
Make link type
|
||||
<SAMP><VAR>linktype</VAR></SAMP>
|
||||
active.
|
||||
Not all ESIS information is output in this case:
|
||||
the active LPDs are not explicitly reported,
|
||||
although each link attribute is qualified with
|
||||
its link type name;
|
||||
there is no information about result elements;
|
||||
when there are multiple link rules applicable to the
|
||||
current element,
|
||||
nsgmls
|
||||
always chooses the first.
|
||||
<DT>
|
||||
<SAMP>-b<VAR>bctf</VAR></SAMP>
|
||||
<DD>
|
||||
Use the <A HREF="sysid.htm#bctf">BCTF</A> named
|
||||
<SAMP><VAR>bctf</VAR></SAMP>
|
||||
for output.
|
||||
<DT>
|
||||
<SAMP>-B</SAMP>
|
||||
<DD>
|
||||
Batch mode.
|
||||
Parse each <SAMP><VAR>sysid...</VAR></SAMP> specified on the command
|
||||
line separately, rather than concatenating them.
|
||||
This is useful mainly with <SAMP>-s</SAMP>.
|
||||
<P>
|
||||
If <SAMP>-t<VAR>filename</VAR></SAMP> is also specified, then
|
||||
the specified <SAMP><VAR>filename</VAR></SAMP> will be prefixed
|
||||
to the <SAMP><VAR>sysid</VAR></SAMP> to make the filename
|
||||
for the RAST result for each <SAMP><VAR>sysid</VAR></SAMP>.
|
||||
<DT>
|
||||
<SAMP>-c<VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
Map public identifiers and entity names to system identifiers
|
||||
using the catalog entry file whose system identifier is
|
||||
<SAMP><VAR>sysid</VAR></SAMP>.
|
||||
Multiple
|
||||
<SAMP>-c</SAMP>
|
||||
options are allowed.
|
||||
If there is a catalog entry file called
|
||||
<SAMP>catalog</SAMP>
|
||||
in the same place as the document entity,
|
||||
it will be searched for immediately after those specified by
|
||||
<SAMP>-c</SAMP>.
|
||||
<DT>
|
||||
<A NAME="optC"><SAMP>-C</SAMP></A>
|
||||
<DD>
|
||||
The
|
||||
<SAMP><VAR>filename</VAR>...</SAMP>
|
||||
arguments specify catalog files rather than the document entity.
|
||||
The document entity is specified by the first
|
||||
<SAMP>DOCUMENT</SAMP>
|
||||
entry in the catalog files.
|
||||
<DT>
|
||||
<A NAME="optD"><SAMP>-D<VAR>directory</VAR></SAMP></A>
|
||||
<DD>
|
||||
Search
|
||||
<SAMP><VAR>directory</VAR></SAMP>
|
||||
for files specified in system identifiers.
|
||||
Multiple
|
||||
<SAMP>-D</SAMP> options
|
||||
are allowed.
|
||||
See the description of the
|
||||
<SAMP>osfile</SAMP>
|
||||
storage manager for more information about file searching.
|
||||
<DT>
|
||||
<SAMP>-e</SAMP>
|
||||
<DD>
|
||||
Describe open entities in error messages.
|
||||
Error messages always include the position of the most recently
|
||||
opened external entity.
|
||||
<DT>
|
||||
<SAMP>-E<VAR>max_errors</VAR></SAMP>
|
||||
<DD>
|
||||
Nsgmls
|
||||
will exit after
|
||||
<SAMP><VAR>max_errors</VAR></SAMP>
|
||||
errors.
|
||||
If
|
||||
<SAMP><VAR>max_errors</VAR></SAMP>
|
||||
is 0, there is no limit on the number of errors.
|
||||
The default is 200.
|
||||
<DT>
|
||||
<SAMP>-f<VAR>file</VAR></SAMP>
|
||||
<DD>
|
||||
Redirect errors to
|
||||
<SAMP><VAR>file</VAR></SAMP>.
|
||||
This is useful mainly with shells that do not support redirection
|
||||
of stderr.
|
||||
<DT>
|
||||
<SAMP>-g</SAMP>
|
||||
<DD>
|
||||
Show the generic identifiers of open elements in error messages.
|
||||
<DT>
|
||||
<A NAME="opti"><SAMP>-i<VAR>name</VAR></SAMP></A>
|
||||
<DD>
|
||||
Pretend that
|
||||
<PRE>
|
||||
&lt;!ENTITY % <VAR>name</VAR> "INCLUDE"&gt;
|
||||
</PRE>
|
||||
<P>
|
||||
occurs at the start of the document type declaration subset
|
||||
in the SGML document entity.
|
||||
Since repeated definitions of an entity are ignored,
|
||||
this definition will take precedence over any other definitions
|
||||
of this entity in the document type declaration.
|
||||
Multiple
|
||||
<SAMP>-i</SAMP>
|
||||
options are allowed.
|
||||
If the SGML declaration replaces the reserved name
|
||||
<SAMP>INCLUDE</SAMP>
|
||||
then the new reserved name will be the replacement text of the entity.
|
||||
Typically the document type declaration will contain
|
||||
<PRE>
|
||||
&lt;!ENTITY % <VAR>name</VAR> "IGNORE"&gt;
|
||||
</PRE>
|
||||
<P>
|
||||
and will use
|
||||
<SAMP>%<VAR>name</VAR>;</SAMP>
|
||||
in the status keyword specification of a marked section declaration.
|
||||
In this case the effect of the option will be to cause the marked
|
||||
section not to be ignored.
|
||||
<DT>
|
||||
<SAMP>-o<VAR>output_option</VAR></SAMP>
|
||||
<DD>
|
||||
Output additional information according to
|
||||
<SAMP><VAR>output_option</VAR></SAMP>:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>entity</SAMP>
|
||||
<DD>
|
||||
Output definitions of all general entities,
|
||||
not just for data or subdoc entities that are referenced or named in an
|
||||
ENTITY or ENTITIES attribute.
|
||||
<DT>
|
||||
<SAMP>id</SAMP>
|
||||
<DD>
|
||||
Distinguish attributes whose declared value is ID.
|
||||
<DT>
|
||||
<SAMP>line</SAMP>
|
||||
<DD>
|
||||
Output
|
||||
<SAMP>L</SAMP>
|
||||
commands giving the current line number and filename.
|
||||
<DT>
|
||||
<SAMP>included</SAMP>
|
||||
<DD>
|
||||
Output an
|
||||
<SAMP>i</SAMP>
|
||||
command for included subelements.
|
||||
<DT>
|
||||
<SAMP>notation-sysid</SAMP>
|
||||
<DD>
|
||||
Output an <SAMP>f</SAMP> command before an <SAMP>N</SAMP> command,
|
||||
if a system identifier could be generated for that notation.
|
||||
</DL>
|
||||
<P>
|
||||
Multiple
|
||||
<SAMP>-o</SAMP>
|
||||
options are allowed.
|
||||
<DT>
|
||||
<SAMP>-p</SAMP>
|
||||
<DD>
|
||||
Parse only the prolog.
|
||||
Nsgmls
|
||||
will exit after parsing the document type declaration.
|
||||
Implies
|
||||
<SAMP>-s</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-s</SAMP>
|
||||
<DD>
|
||||
Suppress output.
|
||||
Error messages will still be printed.
|
||||
<DT>
|
||||
<SAMP>-t<VAR>file</VAR></SAMP>
|
||||
<DD>
|
||||
Output to
|
||||
<SAMP><VAR>file</VAR></SAMP>
|
||||
the RAST result as defined by
|
||||
ISO/IEC 13673:1995 (actually this isn't quite an IS yet;
|
||||
this implements the Intermediate Editor's Draft of 1994/08/29,
|
||||
with changes to implement ISO/IEC JTC1/SC18/WG8 N1777).
|
||||
The normal output is not produced.
|
||||
<DT>
|
||||
<SAMP>-v</SAMP>
|
||||
<DD>
|
||||
Print the version number.
|
||||
<DT>
|
||||
<A NAME="optw"><SAMP>-w<VAR>type</VAR></SAMP></A>
|
||||
<DD>
|
||||
Control warnings and errors.
|
||||
Multiple
|
||||
<SAMP>-w</SAMP>
|
||||
options are allowed.
|
||||
The following values of
|
||||
<SAMP><VAR>type</VAR></SAMP>
|
||||
enable warnings:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>mixed</SAMP>
|
||||
<DD>
|
||||
Warn about mixed content models that do not allow #pcdata everywhere.
|
||||
<DT>
|
||||
<SAMP>sgmldecl</SAMP>
|
||||
<DD>
|
||||
Warn about various dubious constructions in the SGML declaration.
|
||||
<DT>
|
||||
<SAMP>should</SAMP>
|
||||
<DD>
|
||||
Warn about various recommendations made in ISO 8879 that the document
|
||||
does not comply with.
|
||||
(Recommendations are expressed with ``should'', as distinct from
|
||||
requirements which are usually expressed with ``shall''.)
|
||||
<DT>
|
||||
<SAMP>default</SAMP>
|
||||
<DD>
|
||||
Warn about defaulted references.
|
||||
<DT>
|
||||
<SAMP>duplicate</SAMP>
|
||||
<DD>
|
||||
Warn about duplicate entity declarations.
|
||||
<DT>
|
||||
<SAMP>undefined</SAMP>
|
||||
<DD>
|
||||
Warn about undefined elements: elements used in the DTD but not defined.
|
||||
<DT>
|
||||
<SAMP>unclosed</SAMP>
|
||||
<DD>
|
||||
Warn about unclosed start and end-tags.
|
||||
<DT>
|
||||
<SAMP>empty</SAMP>
|
||||
<DD>
|
||||
Warn about empty start and end-tags.
|
||||
<DT>
|
||||
<SAMP>net</SAMP>
|
||||
<DD>
|
||||
Warn about net-enabling start-tags and null end-tags.
|
||||
<DT>
|
||||
<SAMP>min-tag</SAMP>
|
||||
<DD>
|
||||
Warn about minimized start and end-tags.
|
||||
Equivalent to the combination of
|
||||
<SAMP>unclosed</SAMP>,
|
||||
<SAMP>empty</SAMP>
|
||||
and
|
||||
<SAMP>net</SAMP>
|
||||
warnings.
|
||||
<DT>
|
||||
<SAMP>unused-map</SAMP>
|
||||
<DD>
|
||||
Warn about unused short reference maps: maps that are declared with a
|
||||
short reference mapping declaration but never used in a short
|
||||
reference use declaration in the DTD.
|
||||
<DT>
|
||||
<SAMP>unused-param</SAMP>
|
||||
<DD>
|
||||
Warn about parameter entities that are defined but not used in a DTD.
|
||||
Unused internal parameter entities whose text is
|
||||
<SAMP>INCLUDE</SAMP>
|
||||
or
|
||||
<SAMP>IGNORE</SAMP>
|
||||
won't get the warning.
|
||||
<DT>
|
||||
<SAMP>notation-sysid</SAMP>
|
||||
<DD>
|
||||
Warn about notations for which no system identifier could be generated.
|
||||
<DT>
|
||||
<SAMP>all</SAMP>
|
||||
<DD>
|
||||
Warn about conditions that should usually be avoided
|
||||
(in the opinion of the author).
|
||||
Equivalent to:
|
||||
<SAMP>mixed</SAMP>,
|
||||
<SAMP>should</SAMP>,
|
||||
<SAMP>default</SAMP>,
|
||||
<SAMP>undefined</SAMP>,
|
||||
<SAMP>sgmldecl</SAMP>,
|
||||
<SAMP>unused-map</SAMP>,
|
||||
<SAMP>unused-param</SAMP>,
|
||||
<SAMP>empty</SAMP>
|
||||
and
|
||||
<SAMP>unclosed</SAMP>.
|
||||
</DL>
|
||||
<P>
|
||||
A warning can be disabled by using its name prefixed with
|
||||
<SAMP>no-</SAMP>.
|
||||
Thus
|
||||
<SAMP>-wall -wno-duplicate</SAMP>
|
||||
will enable all warnings except those about duplicate entity
|
||||
declarations.
|
||||
<P>
|
||||
The following values for
|
||||
<SAMP><VAR>type</VAR></SAMP>
|
||||
disable errors:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>no-idref</SAMP>
|
||||
<DD>
|
||||
Do not give an error for an ID reference value
|
||||
which no element has as its ID.
|
||||
The effect will be as if each attribute declared as
|
||||
an ID reference value had been declared as a name.
|
||||
<DT>
|
||||
<SAMP>no-significant</SAMP>
|
||||
<DD>
|
||||
Do not give an error when a character that is not a significant
|
||||
character in the reference concrete syntax occurs in a literal in the
|
||||
SGML declaration. This may be useful in conjunction with certain
|
||||
buggy test suites.
|
||||
</DL>
|
||||
</DL>
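<P>
The interaction between warning groups and the <SAMP>no-</SAMP> prefix
described under <SAMP>-w</SAMP> can be sketched as follows. This is an
illustrative model, not the SP implementation: it assumes that the
<SAMP>-w</SAMP> options are applied from left to right and that
<SAMP>no-</SAMP> applied to a group name removes every member of the
group. The function name <CODE>enabled_warnings</CODE> is hypothetical.

```python
# Members of the "all" group, exactly as listed in the description of -wall.
ALL = ("mixed", "should", "default", "undefined", "sgmldecl",
       "unused-map", "unused-param", "empty", "unclosed")

# Group names and the individual warnings they stand for.
GROUPS = {
    "all": ALL,
    "min-tag": ("unclosed", "empty", "net"),
}


def enabled_warnings(types):
    """Return the set of warnings enabled by a sequence of -w values.

    Assumption: values are processed in order, so a later "no-x"
    switches off something an earlier value switched on.
    """
    enabled = set()
    for value in types:
        negate = value.startswith("no-")
        name = value[3:] if negate else value
        members = GROUPS.get(name, (name,))
        if negate:
            # Removing a name that is not a warning (for example the
            # error switch "no-idref") simply has no effect here.
            enabled.difference_update(members)
        else:
            enabled.update(members)
    return enabled
```

<P>
Under these assumptions, <CODE>enabled_warnings(["all", "no-empty"])</CODE>
yields every member of the <SAMP>all</SAMP> group except
<SAMP>empty</SAMP>.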
|
||||
<P>
|
||||
The following options are also supported for backwards compatibility
|
||||
with sgmls:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>-d</SAMP>
|
||||
<DD>
|
||||
Same as
|
||||
<SAMP>-wduplicate</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-l</SAMP>
|
||||
<DD>
|
||||
Same as
|
||||
<SAMP>-oline</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-m<VAR>sysid</VAR></SAMP>
|
||||
<DD>
|
||||
Same as <SAMP>-c</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-r</SAMP>
|
||||
<DD>
|
||||
Same as
|
||||
<SAMP>-wdefault</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-u</SAMP>
|
||||
<DD>
|
||||
Same as
|
||||
<SAMP>-wundefined</SAMP>.
|
||||
</DL>
|
||||
<H2>ENVIRONMENT</H2>
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>SP_BCTF</SAMP>
|
||||
<DD>
|
||||
If this is set to one of
|
||||
<SAMP>identity</SAMP>,
|
||||
<SAMP>utf-8</SAMP>,
|
||||
<SAMP>euc-jp</SAMP> and <SAMP>sjis</SAMP>, then that BCTF will be used as the
|
||||
default BCTF for everything (including file input, file output,
|
||||
message output, filenames, environment variable names, environment
|
||||
variable values and command line arguments). Note that setting
|
||||
<SAMP>SP_BCTF</SAMP> to <SAMP>unicode</SAMP>
|
||||
will not work.
|
||||
</DL>
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
273
cde/programs/nsgmls/doc/sgmldecl.htm
Normal file
@@ -0,0 +1,273 @@
|
||||
<!-- $XConsortium: sgmldecl.htm /main/1 1996/09/22 18:17:17 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP - SGML declaration</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Handling of the SGML declaration in SP</H1>
|
||||
<H2>Default SGML declaration</H2>
|
||||
<P>
|
||||
If the SGML declaration is omitted
|
||||
and there is no applicable
|
||||
<A HREF="catalog.htm#sgmldecl"><SAMP>SGMLDECL</SAMP></A>
|
||||
entry in a catalog,
|
||||
the following declaration will be implied:
|
||||
<PRE>
|
||||
&lt;!SGML "ISO 8879:1986"
|
||||
CHARSET
|
||||
BASESET "ISO 646-1983//CHARSET
|
||||
International Reference Version (IRV)//ESC 2/5 4/0"
|
||||
DESCSET 0 9 UNUSED
|
||||
9 2 9
|
||||
11 2 UNUSED
|
||||
13 1 13
|
||||
14 18 UNUSED
|
||||
32 95 32
|
||||
127 1 UNUSED
|
||||
CAPACITY PUBLIC "ISO 8879:1986//CAPACITY Reference//EN"
|
||||
SCOPE DOCUMENT
|
||||
SYNTAX
|
||||
SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
|
||||
18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
|
||||
BASESET "ISO 646-1983//CHARSET International Reference Version
|
||||
(IRV)//ESC 2/5 4/0"
|
||||
DESCSET 0 128 0
|
||||
FUNCTION RE 13
|
||||
RS 10
|
||||
SPACE 32
|
||||
TAB SEPCHAR 9
|
||||
NAMING LCNMSTRT ""
|
||||
UCNMSTRT ""
|
||||
LCNMCHAR "-."
|
||||
UCNMCHAR "-."
|
||||
NAMECASE GENERAL YES
|
||||
ENTITY NO
|
||||
DELIM GENERAL SGMLREF
|
||||
SHORTREF SGMLREF
|
||||
NAMES SGMLREF
|
||||
QUANTITY SGMLREF
|
||||
ATTCNT 99999999
|
||||
ATTSPLEN 99999999
|
||||
DTEMPLEN 24000
|
||||
ENTLVL 99999999
|
||||
GRPCNT 99999999
|
||||
GRPGTCNT 99999999
|
||||
GRPLVL 99999999
|
||||
LITLEN 24000
|
||||
NAMELEN 99999999
|
||||
PILEN 24000
|
||||
TAGLEN 99999999
|
||||
TAGLVL 99999999
|
||||
FEATURES
|
||||
MINIMIZE DATATAG NO
|
||||
OMITTAG YES
|
||||
RANK YES
|
||||
SHORTTAG YES
|
||||
LINK SIMPLE YES 1000
|
||||
IMPLICIT YES
|
||||
EXPLICIT YES 1
|
||||
OTHER CONCUR NO
|
||||
SUBDOC YES 99999999
|
||||
FORMAL YES
|
||||
APPINFO NONE&gt;
|
||||
</PRE>
|
||||
<P>
|
||||
with the exception that all characters that are neither significant
|
||||
nor shunned will be assigned to DATACHAR.
|
||||
<H2>Character sets</H2>
|
||||
<P>
|
||||
A character in a base character set is described either by giving its
|
||||
number in a universal character set, or by specifying a minimum
|
||||
literal. The constraints on the choice of universal character set are
|
||||
that characters that are significant in the SGML reference concrete
|
||||
syntax must be in the universal character set and must have the same
|
||||
number in the universal character set as in ISO 646 and that each
|
||||
character in the character set must be represented by exactly one
|
||||
number; and that character numbers in the range 0 to 31 and 127 to 159 are
|
||||
control characters (for the purpose of enforcing SHUNCHAR CONTROLS).
|
||||
It is recommended that ISO 10646 (Unicode) be used as the universal
|
||||
character set, except in environments where the normal document
|
||||
character sets are large character sets which cannot be compactly
|
||||
described in terms of ISO 10646.
|
||||
The public identifier of a base character set can be associated
|
||||
with an entity that describes it by using a
|
||||
<SAMP>PUBLIC</SAMP>
|
||||
entry in the catalog entry file.
|
||||
The entity must be a fragment
|
||||
of an SGML declaration
|
||||
consisting of the
|
||||
portion of a character set description,
|
||||
following the DESCSET keyword,
|
||||
that is, it must be a sequence of character descriptions,
|
||||
where each character description specifies a described character
|
||||
number, the number of characters and
|
||||
either a character number in the universal character set, a minimum literal
|
||||
or the keyword
|
||||
<SAMP>UNUSED</SAMP>.
|
||||
Character numbers in the universal character set can be as big as
|
||||
99999999.
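<P>
As an illustration of the fragment format just described, the
following Python sketch reads a sequence of character descriptions and
looks up the universal character number for a described character. It
is a simplified model, not SP's parser: it assumes the fragment
contains no SGML comments, and the names <CODE>parse_descset</CODE>
and <CODE>universal_number</CODE> are hypothetical.

```python
import re

# A token is a quoted minimum literal or a run of non-blank characters.
_TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|\S+')


def parse_descset(text):
    """Parse character descriptions into (start, count, target) triples.

    target is an int (number in the universal character set), a str
    (the content of a minimum literal) or None (keyword UNUSED).
    """
    tokens = _TOKEN.findall(text)
    if len(tokens) % 3 != 0:
        raise ValueError("character descriptions come in groups of three")
    triples = []
    for i in range(0, len(tokens), 3):
        start = int(tokens[i])
        count = int(tokens[i + 1])
        raw = tokens[i + 2]
        if raw.upper() == "UNUSED":
            target = None
        elif raw[0] in "\"'":
            target = raw[1:-1]          # minimum literal, quotes removed
        else:
            target = int(raw)
        triples.append((start, count, target))
    return triples


def universal_number(triples, number):
    """Map a described character number to its description.

    Returns an int for numbered characters, a str for characters
    described by a minimum literal, and None for unused or
    undescribed characters.
    """
    for start, count, target in triples:
        if start <= number < start + count:
            if isinstance(target, int):
                return target + (number - start)
            return target
    return None
```

<P>
Applied to the descriptions in the default SGML declaration shown
earlier, character 10 is covered by the description
<SAMP>9 2 9</SAMP> and maps to 10, while character 11 is unused.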
|
||||
<P>
|
||||
In addition, SP has built-in knowledge of a few character sets.
|
||||
These are identified using the designating sequence in the
|
||||
public identifier. The following designating sequences are
|
||||
recognized:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>ESC 2/5 4/0</SAMP>
|
||||
<DD>
|
||||
The full set of ISO 646 IRV.
|
||||
This is not a registered character set,
|
||||
but is recommended by ISO 8879 (clause 10.2.2.4).
|
||||
<DT>
|
||||
<SAMP>ESC 2/8 4/0</SAMP>
|
||||
<DD>
|
||||
G0 set of ISO 646 IRV,
|
||||
ISO Registration Number 2.
|
||||
<DT>
|
||||
<SAMP>ESC 2/8 4/2</SAMP>
|
||||
<DD>
|
||||
G0 set of ASCII,
|
||||
ISO Registration Number 6.
|
||||
<DT>
|
||||
<SAMP>ESC 2/1 4/0</SAMP>
|
||||
<DD>
|
||||
C0 set of ISO 646,
|
||||
ISO Registration Number 1.
|
||||
</DL>
|
||||
<P>
|
||||
All the above character sets will be treated as mapping character numbers
|
||||
0 to 127 inclusive as in ISO 646.
|
||||
<P>
|
||||
It is not necessary for every character set used in the SGML
|
||||
declaration to be known to SP
|
||||
provided that characters in the document character set that are
|
||||
significant both in the reference concrete syntax and in the described
|
||||
concrete syntax are described using known base character sets and that
|
||||
characters that are significant in the described concrete syntax are
|
||||
described using the same base character sets or the same minimum
|
||||
literals in both the document character set description and the syntax
|
||||
reference character set description.
|
||||
|
||||
<H2>Concrete syntaxes</H2>
|
||||
<P>
|
||||
The public identifier for a public concrete syntax can be associated
|
||||
with an entity that describes it using a
|
||||
<SAMP>PUBLIC</SAMP>
|
||||
entry in the catalog entry file.
|
||||
The entity must be a fragment of an SGML declaration
|
||||
consisting of a concrete syntax description
|
||||
starting with the
|
||||
<SAMP>SHUNCHAR</SAMP>
|
||||
keyword
|
||||
as in an SGML declaration.
|
||||
The entity can also make use of the following extensions:
|
||||
<UL>
|
||||
<LI>
|
||||
An
|
||||
<I>added function</I>
|
||||
can be expressed as a parameter literal
|
||||
instead of a name.
|
||||
<LI>
|
||||
The replacement for a reference reserved name
|
||||
can be expressed as a parameter literal instead of a name.
|
||||
<LI>
|
||||
The
|
||||
<SAMP>LCNMSTRT</SAMP>,
|
||||
<SAMP>UCNMSTRT</SAMP>,
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
and
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
keywords may each be followed by more than one parameter literal. A
|
||||
sequence of parameter literals has the same meaning as a single
|
||||
parameter literal whose content is the concatenation of the content of
|
||||
each of the literals in the sequence. This extension is useful
|
||||
because of the restriction on the length of a parameter literal in the
|
||||
SGML declaration to 240 characters.
|
||||
<LI>
|
||||
The total number of characters specified for
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
or
|
||||
<SAMP>UCNMSTRT</SAMP>
|
||||
may exceed the total number of characters specified for
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
or
|
||||
<SAMP>LCNMSTRT</SAMP>
|
||||
respectively.
|
||||
Each character in
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
or
|
||||
<SAMP>UCNMSTRT</SAMP>
|
||||
which does not have a corresponding character in the same position in
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
or
|
||||
<SAMP>LCNMSTRT</SAMP>
|
||||
is simply assigned to <SAMP>UCNMCHAR</SAMP> or <SAMP>UCNMSTRT</SAMP>
|
||||
without making it the upper-case form of any character.
|
||||
<LI>
|
||||
A parameter literal following any of the
|
||||
<SAMP>LCNMSTRT</SAMP>,
|
||||
<SAMP>UCNMSTRT</SAMP>,
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
and
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
keywords may be followed by
|
||||
the name token <SAMP>...</SAMP>
|
||||
(three periods) and another parameter literal.
|
||||
This has the same meaning as the two parameter literals
|
||||
with a parameter literal in between
|
||||
containing in order each character whose number
|
||||
is greater than the number of the last character in
|
||||
the first parameter literal and less than the
|
||||
number of the first character in the second
|
||||
parameter literal.
|
||||
A parameter literal must contain at least one character for each
|
||||
<SAMP>...</SAMP>
|
||||
to which it is adjacent.
|
||||
<LI>
|
||||
A number may be used as a parameter following the
|
||||
<SAMP>LCNMSTRT</SAMP>,
|
||||
<SAMP>UCNMSTRT</SAMP>,
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
and
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
keywords or as a delimiter in the
|
||||
<SAMP>DELIM</SAMP>
|
||||
section with the same meaning as a parameter literal
|
||||
containing just a numeric character reference with that number.
|
||||
<LI>
|
||||
The parameters following the
|
||||
<SAMP>LCNMSTRT</SAMP>,
|
||||
<SAMP>UCNMSTRT</SAMP>,
|
||||
<SAMP>LCNMCHAR</SAMP>
|
||||
and
|
||||
<SAMP>UCNMCHAR</SAMP>
|
||||
keywords may be omitted.
|
||||
This has the same meaning as specifying
|
||||
an empty parameter literal.
|
||||
<LI>
|
||||
Within the specification of the short reference delimiters,
|
||||
a parameter literal containing exactly one character
|
||||
may be followed by the name token <SAMP>...</SAMP>
|
||||
and another parameter literal containing exactly one character.
|
||||
This has the same meaning as a sequence of parameter literals
|
||||
one for each character number that is greater than or equal
|
||||
to the number of the character in the first parameter literal
|
||||
and less than or equal to the number of the character in the
|
||||
second parameter literal.
|
||||
</UL>
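<P>
The concatenation, ellipsis and case-pairing extensions listed above
can be modelled with a short Python sketch. It is not SP's code. It
assumes that character numbers can be represented by Python code
points, and it uses Python's <CODE>...</CODE> object to stand for the
<SAMP>...</SAMP> name token. All function names are hypothetical.

```python
def expand_name_chars(params):
    """Expand the parameters following LCNMSTRT, UCNMSTRT, LCNMCHAR or UCNMCHAR.

    params is a list of literal contents (str) and ... markers.
    Adjacent literals are concatenated. A ... marker stands for every
    character strictly between the last character of the literal
    before it and the first character of the literal after it.
    """
    pieces = []
    for i, param in enumerate(params):
        if param is not Ellipsis:
            pieces.append(param)
            continue
        before = params[i - 1] if i > 0 else None
        after = params[i + 1] if i + 1 < len(params) else None
        if not (isinstance(before, str) and before
                and isinstance(after, str) and after):
            raise ValueError("... needs a non-empty literal on each side")
        low, high = ord(before[-1]), ord(after[0])
        pieces.append("".join(chr(c) for c in range(low + 1, high)))
    return "".join(pieces)


def expand_short_ref_range(first, last):
    """Expand a short reference range; unlike name characters, both ends are included."""
    if len(first) != 1 or len(last) != 1:
        raise ValueError("each literal must contain exactly one character")
    return [chr(c) for c in range(ord(first), ord(last) + 1)]


def upper_case_map(lower, upper):
    """Pair lower-case and upper-case name characters by position.

    Returns (mapping, extras). extras holds upper-case characters with
    no counterpart; they are name characters but not the upper-case
    form of anything. The text only describes the case where the
    upper-case list is the longer one, so the reverse is rejected here.
    """
    if len(lower) > len(upper):
        raise ValueError("more lower-case than upper-case characters")
    return dict(zip(lower, upper)), upper[len(lower):]
```

<P>
For instance, <CODE>expand_name_chars(["a", ..., "e"])</CODE> gives
the same result as the single literal <SAMP>"abcde"</SAMP>.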
|
||||
<H2>Capacity sets</H2>
|
||||
<P>
|
||||
The public identifier for a public capacity set can be associated
|
||||
with an entity that describes it using a
|
||||
<SAMP>PUBLIC</SAMP>
|
||||
entry in the catalog entry file.
|
||||
The entity must be a fragment of an SGML declaration
|
||||
consisting of a sequence of capacity names and numbers.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
151
cde/programs/nsgmls/doc/sgmlnorm.htm
Normal file
@@ -0,0 +1,151 @@
|
||||
<!-- $XConsortium: sgmlnorm.htm /main/1 1996/09/22 18:17:36 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SGMLNORM</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>SGMLNORM</H1>
|
||||
<H4>
|
||||
An SGML System Conforming to
|
||||
International Standard ISO 8879 --<BR>
|
||||
Standard Generalized Markup Language
|
||||
</H4>
|
||||
<H2>SYNOPSIS</H2>
|
||||
<P>
|
||||
<SAMP>sgmlnorm</SAMP>
|
||||
[
|
||||
<SAMP>-Cdemnrv</SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-b<VAR>bctf</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-c<VAR>catalog</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-D<VAR>dir</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-i<VAR>name</VAR></SAMP>
|
||||
]
|
||||
[
|
||||
<SAMP>-w<VAR>warning</VAR></SAMP>
|
||||
]
|
||||
<SAMP><VAR>sysid...</VAR></SAMP>
|
||||
|
||||
<H2>DESCRIPTION</H2>
|
||||
<P>
|
||||
Sgmlnorm prints on the standard output a <I>normalized</I> document instance
|
||||
for the SGML document contained in the concatenation of the entities
|
||||
with <A HREF="sysid.htm">system identifiers</A>
|
||||
<SAMP><VAR>sysid...</VAR></SAMP>.
|
||||
<P>
|
||||
When the normalized instance is prefixed with the original SGML declaration
|
||||
and prolog, it will have the same ESIS as the original SGML document,
|
||||
with the following exceptions:
|
||||
<UL>
|
||||
<LI>
|
||||
The output of sgmlnorm does not protect against the recognition of
|
||||
short reference delimiters, so any <SAMP>USEMAP</SAMP> declarations
|
||||
must be removed from the DTD.
|
||||
<LI>
|
||||
The normalized instance will use the reference delimiters, even if the
|
||||
original instance did not.
|
||||
<LI>
|
||||
If marked sections are included in the output using the
|
||||
<SAMP>-m</SAMP> option, the reference reserved names will be used for
|
||||
the status keywords even if the original instance did not.
|
||||
<LI>
|
||||
Any ESIS information relating to the SGML LINK feature will be lost.
|
||||
</UL>
|
||||
<P>
|
||||
The normalized instance will not use any markup minimization features
|
||||
except that:
|
||||
<UL>
|
||||
<LI>
|
||||
Any attributes that were not specified in the original instance
|
||||
will not be included in the normalized instance.
|
||||
(Current attributes will be included.)
|
||||
<LI>
|
||||
If the declared value of an attribute was a name token group,
|
||||
and a value was specified that was the same as the name of
|
||||
the attribute, then the attribute name and value indicator will be
|
||||
omitted.
|
||||
For example, with HTML sgmlnorm would output <CODE>&lt;DL COMPACT&gt;</CODE>
|
||||
rather than <CODE>&lt;DL COMPACT="COMPACT"&gt;</CODE>.
|
||||
</UL>
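<P>
The second exception can be expressed as a small Python sketch. It
only models the rule stated above and is not how sgmlnorm is written.
The function name <CODE>format_attribute</CODE> is hypothetical, and
the sketch assumes the value contains no quotation mark that would
need escaping.

```python
def format_attribute(name, value, declared_as_name_token_group):
    """Format one attribute specification for a normalized start-tag.

    When the declared value is a name token group and the specified
    value equals the attribute name, the name and the value indicator
    are omitted and only the value is written.
    """
    if declared_as_name_token_group and value == name:
        return value
    return '%s="%s"' % (name, value)
```

<P>
With these assumptions, <CODE>format_attribute("COMPACT", "COMPACT",
True)</CODE> produces the bare token used in the HTML example above.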
|
||||
<P>
|
||||
The following options are available:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>-b<VAR>bctf</VAR></SAMP>
|
||||
<DD>
|
||||
Use the <A HREF="sysid.htm#bctf">BCTF</A> with name
|
||||
<SAMP><VAR>bctf</VAR></SAMP>
|
||||
for output.
|
||||
<DT>
|
||||
<SAMP>-c<VAR>file</VAR></SAMP>
|
||||
<DD>
|
||||
Use the catalog entry file
|
||||
<SAMP><VAR>file</VAR></SAMP>.
|
||||
<DT>
|
||||
<SAMP>-C</SAMP>
|
||||
<DD>
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optC">nsgmls</A>.
|
||||
<DT>
|
||||
<SAMP>-d</SAMP>
|
||||
<DD>
|
||||
Output a document type declaration with the same external
|
||||
identifier as the input document, and with no
|
||||
internal declaration subset.
|
||||
No check is performed that the document instance is valid
|
||||
with respect to this DTD.
|
||||
<DT>
|
||||
<SAMP>-D<VAR>directory</VAR></SAMP>
|
||||
<DD>
|
||||
Search
|
||||
<SAMP><VAR>directory</VAR></SAMP>
|
||||
for files specified in system identifiers.
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optD">nsgmls</A>.
|
||||
<DT>
|
||||
<SAMP>-e</SAMP>
|
||||
<DD>
|
||||
Describe open entities in error messages.
|
||||
<DT>
|
||||
<SAMP>-i<VAR>name</VAR></SAMP>
|
||||
<DD>
|
||||
This has the same effect as in <A HREF="nsgmls.htm#opti">nsgmls</A>.
|
||||
<DT>
|
||||
<SAMP>-m</SAMP>
|
||||
<DD>
|
||||
Output any marked sections that were in the input document instance.
|
||||
<DT>
|
||||
<SAMP>-n</SAMP>
|
||||
<DD>
|
||||
Output any comments that were in the input document instance.
|
||||
<DT>
|
||||
<SAMP>-r</SAMP>
|
||||
<DD>
|
||||
Raw output.
|
||||
Don't perform any conversion on RSs and REs when printing the entity.
|
||||
The entity would typically have the storage manager attribute
|
||||
<SAMP>records=asis</SAMP>.
|
||||
<DT>
|
||||
<SAMP>-v</SAMP>
|
||||
<DD>
|
||||
Print the version number.
|
||||
<DT>
|
||||
<SAMP>-w<VAR>type</VAR></SAMP>
|
||||
<DD>
|
||||
Control warnings and errors according to
|
||||
<SAMP><VAR>type</VAR></SAMP>.
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optw">nsgmls</A>.
|
||||
</DL>
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
418
cde/programs/nsgmls/doc/sgmlsout.htm
Normal file
@@ -0,0 +1,418 @@
|
||||
<!-- $XConsortium: sgmlsout.htm /main/1 1996/09/22 18:17:55 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>Nsgmls Output Format</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Nsgmls Output Format</H1>
|
||||
<P>
|
||||
The output is a series of lines.
|
||||
Lines can be arbitrarily long.
|
||||
Each line consists of an initial command character
|
||||
and one or more arguments.
|
||||
Arguments are separated by a single space,
|
||||
but when a command takes a fixed number of arguments
|
||||
the last argument can contain spaces.
|
||||
There is no space between the command character and the first argument.
|
||||
Arguments can contain the following escape sequences:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>\\</CODE>
|
||||
<DD>
|
||||
A
|
||||
<CODE>\</CODE>.
|
||||
<DT>
|
||||
<CODE>\n</CODE>
|
||||
<DD>
|
||||
A record end character.
|
||||
<DT>
|
||||
<CODE>\|</CODE>
|
||||
<DD>
|
||||
Internal SDATA entities are bracketed by these.
|
||||
<DT>
|
||||
<CODE>\<VAR>nnn</VAR></CODE>
|
||||
<DD>
|
||||
The character whose code is
|
||||
<CODE><VAR>nnn</VAR></CODE>
|
||||
octal.
|
||||
<P>
|
||||
A record start character will be represented by
|
||||
<CODE>\012</CODE>.
|
||||
Most applications will need to ignore
|
||||
<CODE>\012</CODE>
|
||||
and translate
|
||||
<CODE>\n</CODE>
|
||||
into newline.
|
||||
<DT>
|
||||
<CODE>\#<VAR>n</VAR>;</CODE>
|
||||
<DD>
|
||||
The character whose number is
|
||||
<CODE><VAR>n</VAR></CODE>
|
||||
in decimal.
|
||||
<CODE><VAR>n</VAR></CODE>
|
||||
can have any number of digits.
|
||||
This is used for characters that are not representable by the
|
||||
encoding translation used for output
|
||||
(as specified by the
|
||||
<CODE>SP_BCTF</CODE>
|
||||
environment variable).
|
||||
This will only occur with the multibyte version of nsgmls.
|
||||
</DL>
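<P>
A consumer of this output has to undo the escape sequences listed
above. The following Python sketch shows one way to do that. The
function name <CODE>unescape</CODE> and its keyword parameters are
assumptions made for illustration. By default it drops record start
characters and turns record ends into newlines, as suggested above
for most applications.

```python
import re

# One escape: backslash followed by \, n, |, three octal digits, or #digits;
_ESCAPE = re.compile(r"\\(\\|n|\||[0-7]{3}|#[0-9]+;)")


def unescape(arg, record_end="\n", sdata_bracket="", drop_record_start=True):
    """Decode the escape sequences in one nsgmls output argument.

    record_end        replacement for \\n (a record end character)
    sdata_bracket     replacement for \\| (brackets internal SDATA entities)
    drop_record_start if true, \\012 (record start) is removed
    """
    def replace(match):
        body = match.group(1)
        if body == "\\":
            return "\\"
        if body == "n":
            return record_end
        if body == "|":
            return sdata_bracket
        if body[0] == "#":
            return chr(int(body[1:-1]))   # decimal character number
        if drop_record_start and body == "012":
            return ""
        return chr(int(body, 8))          # octal character code

    return _ESCAPE.sub(replace, arg)
```

<P>
A backslash that does not start one of the listed sequences is left
unchanged by this sketch.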
|
||||
<P>
|
||||
The possible command characters and arguments are as follows:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>(<VAR>gi</VAR></CODE>
|
||||
<DD>
|
||||
The start of an element whose generic identifier is
|
||||
<CODE><VAR>gi</VAR></CODE>.
|
||||
Any attributes for this element
|
||||
will have been specified with
|
||||
<CODE>A</CODE>
|
||||
commands.
|
||||
<DT>
|
||||
<CODE>)<VAR>gi</VAR></CODE>
|
||||
<DD>
|
||||
The end of an element whose generic identifier is
|
||||
<CODE><VAR>gi</VAR></CODE>.
|
||||
<DT>
|
||||
<CODE>-<VAR>data</VAR></CODE>
|
||||
<DD>
|
||||
Data.
|
||||
<DT>
|
||||
<CODE>&amp;<VAR>name</VAR></CODE>
|
||||
<DD>
|
||||
A reference to an external data entity
|
||||
<CODE><VAR>name</VAR></CODE>;
|
||||
<CODE><VAR>name</VAR></CODE>
|
||||
will have been defined using an
|
||||
<CODE>E</CODE>
|
||||
command.
|
||||
<DT>
|
||||
<CODE>?<VAR>pi</VAR></CODE>
|
||||
<DD>
|
||||
A processing instruction with data
|
||||
<CODE><VAR>pi</VAR></CODE>.
|
||||
<DT>
|
||||
<CODE>A<VAR>name</VAR> <VAR>val</VAR></CODE>
|
||||
<DD>
|
||||
The next element to start has an attribute
|
||||
<CODE><VAR>name</VAR></CODE>
|
||||
with value
|
||||
<CODE><VAR>val</VAR></CODE>
|
||||
which takes one of the following forms:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>IMPLIED</CODE>
|
||||
<DD>
|
||||
The value of the attribute is implied.
|
||||
<DT>
|
||||
<CODE>CDATA <VAR>data</VAR></CODE>
|
||||
<DD>
|
||||
The attribute is character data.
|
||||
This is used for attributes whose declared value is
|
||||
<CODE>CDATA</CODE>.
|
||||
<DT>
|
||||
<CODE>NOTATION <VAR>nname</VAR></CODE>
|
||||
<DD>
|
||||
The attribute is a notation name;
|
||||
<CODE><VAR>nname</VAR></CODE>
|
||||
will have been defined using an
|
||||
<CODE>N</CODE>
|
||||
command.
|
||||
This is used for attributes whose declared value is
|
||||
<CODE>NOTATION</CODE>.
|
||||
<DT>
|
||||
<CODE>ENTITY <VAR>name...</VAR></CODE>
|
||||
<DD>
|
||||
The attribute is a list of general entity names.
|
||||
Each entity name will have been defined using an
|
||||
<CODE>I</CODE>,
|
||||
<CODE>E</CODE>
|
||||
or
|
||||
<CODE>S</CODE>
|
||||
command.
|
||||
This is used for attributes whose declared value is
|
||||
<CODE>ENTITY</CODE>
|
||||
or
|
||||
<CODE>ENTITIES</CODE>.
|
||||
<DT>
|
||||
<CODE>TOKEN <VAR>token...</VAR></CODE>
|
||||
<DD>
|
||||
The attribute is a list of tokens.
|
||||
This is used for attributes whose declared value is anything else.
|
||||
<DT>
|
||||
<CODE>ID <VAR>token</VAR></CODE>
|
||||
<DD>
|
||||
The attribute is an ID value.
|
||||
This will be output only if the
|
||||
<CODE>-oid</CODE>
|
||||
option is specified.
|
||||
Otherwise
|
||||
<CODE>TOKEN</CODE>
|
||||
will be used for ID values.
|
||||
</DL>
|
||||
<DT>
|
||||
<CODE>D<VAR>ename</VAR> <VAR>name</VAR> <VAR>val</VAR></CODE>
|
||||
<DD>
|
||||
This is the same as the
|
||||
<CODE>A</CODE>
|
||||
command, except that it specifies a data attribute for an
|
||||
external entity named
|
||||
<CODE><VAR>ename</VAR></CODE>.
|
||||
Any
|
||||
<CODE>D</CODE>
|
||||
commands will come after the
|
||||
<CODE>E</CODE>
|
||||
command that defines the entity to which they apply, but
|
||||
before any
|
||||
<CODE>&</CODE>
|
||||
or
|
||||
<CODE>A</CODE>
|
||||
commands that reference the entity.
|
||||
<DT>
|
||||
<CODE>a<VAR>type</VAR> <VAR>name</VAR> <VAR>val</VAR></CODE>
|
||||
<DD>
|
||||
The next element to start has a link attribute with link type
|
||||
<CODE><VAR>type</VAR></CODE>,
|
||||
name
|
||||
<CODE><VAR>name</VAR></CODE>,
|
||||
and value
|
||||
<CODE><VAR>val</VAR></CODE>,
|
||||
which takes the same form as with the
|
||||
<CODE>A</CODE>
|
||||
command.
|
||||
<DT>
|
||||
<CODE>N<VAR>nname</VAR></CODE>
|
||||
<DD>
|
||||
Define a notation <CODE><VAR>nname</VAR></CODE>.
|
||||
This command will be preceded by a
|
||||
<CODE>p</CODE>
|
||||
command if the notation was declared with a public identifier,
|
||||
and by an
|
||||
<CODE>s</CODE>
|
||||
command if the notation was declared with a system identifier.
|
||||
If the
|
||||
<CODE>-onotation-sysid</CODE>
|
||||
option was specified,
|
||||
this command will also be preceded by an
|
||||
<CODE>f</CODE>
|
||||
command giving the system identifier generated by the entity manager
|
||||
(unless it was unable to generate one).
|
||||
A notation will only be defined if it is to be referenced
|
||||
in an
|
||||
<CODE>E</CODE>
|
||||
command or in an
|
||||
<CODE>A</CODE>
|
||||
command for an attribute with a declared value of
|
||||
<CODE>NOTATION</CODE>.
|
||||
<DT>
|
||||
<CODE>E<VAR>ename</VAR> <VAR>typ</VAR> <VAR>nname</VAR></CODE>
|
||||
<DD>
|
||||
Define an external data entity named
|
||||
<CODE><VAR>ename</VAR></CODE>
|
||||
with type
|
||||
<CODE><VAR>typ</VAR></CODE>
|
||||
(<CODE>CDATA</CODE>, <CODE>NDATA</CODE> or <CODE>SDATA</CODE>)
|
||||
and notation <CODE><VAR>nname</VAR></CODE>.
|
||||
This command will be preceded by an
|
||||
<CODE>f</CODE>
|
||||
command giving the system identifier generated by the entity manager
|
||||
(unless it was unable to generate one),
|
||||
by a
|
||||
<CODE>p</CODE>
|
||||
command if a public identifier was declared for the entity,
|
||||
and by an
|
||||
<CODE>s</CODE>
|
||||
command if a system identifier was declared for the entity.
|
||||
<CODE><VAR>nname</VAR></CODE>
|
||||
will have been defined using an
|
||||
<CODE>N</CODE>
|
||||
command.
|
||||
Data attributes may be specified for the entity using
|
||||
<CODE>D</CODE>
|
||||
commands.
|
||||
If the
|
||||
<CODE>-oentity</CODE>
|
||||
option is not specified,
|
||||
an external data entity will only be defined if it is to be referenced in a
|
||||
<CODE>&</CODE>
|
||||
command or in an
|
||||
<CODE>A</CODE>
|
||||
command for an attribute whose declared value is
|
||||
<CODE>ENTITY</CODE>
|
||||
or
|
||||
<CODE>ENTITIES</CODE>.
|
||||
<DT>
|
||||
<CODE>I<VAR>ename</VAR> <VAR>typ</VAR> <VAR>text</VAR></CODE>
|
||||
<DD>
|
||||
Define an internal data entity named
|
||||
<CODE><VAR>ename</VAR></CODE>
|
||||
with type
|
||||
<CODE><VAR>typ</VAR></CODE>
|
||||
and entity text
|
||||
<CODE><VAR>text</VAR></CODE>.
|
||||
The
|
||||
<CODE><VAR>typ</VAR></CODE>
|
||||
will be
|
||||
<CODE>CDATA</CODE>
|
||||
or
|
||||
<CODE>SDATA</CODE>
|
||||
unless the
|
||||
<CODE>-oentity</CODE>
|
||||
option was specified,
|
||||
in which case it can also be
|
||||
<CODE>PI</CODE>
|
||||
or
|
||||
<CODE>TEXT</CODE>
|
||||
(for an SGML text entity).
|
||||
If the
|
||||
<CODE>-oentity</CODE>
|
||||
option is not specified,
|
||||
an internal data entity will only be defined if it is referenced in an
|
||||
<CODE>A</CODE>
|
||||
command for an attribute whose declared value is
|
||||
<CODE>ENTITY</CODE>
|
||||
or
|
||||
<CODE>ENTITIES</CODE>.
|
||||
<DT>
|
||||
<CODE>S<VAR>ename</VAR></CODE>
|
||||
<DD>
|
||||
Define a subdocument entity named
|
||||
<CODE><VAR>ename</VAR></CODE>.
|
||||
This command will be preceded by an
|
||||
<CODE>f</CODE>
|
||||
command giving the system identifier generated by the entity manager
|
||||
(unless it was unable to generate one),
|
||||
by a
|
||||
<CODE>p</CODE>
|
||||
command if a public identifier was declared for the entity,
|
||||
and by an
|
||||
<CODE>s</CODE>
|
||||
command if a system identifier was declared for the entity.
|
||||
If the
|
||||
<CODE>-oentity</CODE>
|
||||
option is not specified,
|
||||
a subdocument entity will only be defined if it is referenced
|
||||
in a
|
||||
<CODE>{</CODE>
|
||||
command
|
||||
or in an
|
||||
<CODE>A</CODE>
|
||||
command for an attribute whose declared value is
|
||||
<CODE>ENTITY</CODE>
|
||||
or
|
||||
<CODE>ENTITIES</CODE>.
|
||||
<DT>
|
||||
<CODE>T<VAR>ename</VAR></CODE>
|
||||
<DD>
|
||||
Define an external SGML text entity named
|
||||
<CODE><VAR>ename</VAR></CODE>.
|
||||
This command will be preceded by an
|
||||
<CODE>f</CODE>
|
||||
command giving the system identifier generated by the entity manager
|
||||
(unless it was unable to generate one),
|
||||
by a
|
||||
<CODE>p</CODE>
|
||||
command if a public identifier was declared for the entity,
|
||||
and by an
|
||||
<CODE>s</CODE>
|
||||
command if a system identifier was declared for the entity.
|
||||
This command will be output only if the
|
||||
<CODE>-oentity</CODE>
|
||||
option is specified.
|
||||
<DT>
|
||||
<CODE>s<VAR>sysid</VAR></CODE>
|
||||
<DD>
|
||||
This command applies to the next
|
||||
<CODE>E</CODE>,
|
||||
<CODE>S</CODE>,
|
||||
<CODE>T</CODE>
|
||||
or
|
||||
<CODE>N</CODE>
|
||||
command and specifies the associated system identifier.
|
||||
<DT>
|
||||
<CODE>p<VAR>pubid</VAR></CODE>
|
||||
<DD>
|
||||
This command applies to the next
|
||||
<CODE>E</CODE>,
|
||||
<CODE>S</CODE>,
|
||||
<CODE>T</CODE>
|
||||
or
|
||||
<CODE>N</CODE>
|
||||
command and specifies the associated public identifier.
|
||||
<DT>
|
||||
<CODE>f<VAR>sysid</VAR></CODE>
|
||||
<DD>
|
||||
This command applies to the next
|
||||
<CODE>E</CODE>,
|
||||
<CODE>S</CODE>,
|
||||
<CODE>T</CODE>
|
||||
or, if the
|
||||
<CODE>-onotation-sysid</CODE>
|
||||
option was specified,
|
||||
<CODE>N</CODE>
|
||||
command and specifies the system identifier
|
||||
generated by the entity manager from the specified external identifier
|
||||
and other information about the entity or notation.
|
||||
<DT>
|
||||
<CODE>{<VAR>ename</VAR></CODE>
|
||||
<DD>
|
||||
The start of the SGML subdocument entity
|
||||
<CODE><VAR>ename</VAR></CODE>;
|
||||
<CODE><VAR>ename</VAR></CODE>
|
||||
will have been defined using an
|
||||
<CODE>S</CODE>
|
||||
command.
|
||||
<DT>
|
||||
<CODE>}<VAR>ename</VAR></CODE>
|
||||
<DD>
|
||||
The end of the SGML subdocument entity
|
||||
<CODE><VAR>ename</VAR></CODE>.
|
||||
<DT>
|
||||
<CODE>L<VAR>lineno</VAR> <VAR>file</VAR></CODE>
|
||||
<DT>
|
||||
<CODE>L<VAR>lineno</VAR></CODE>
|
||||
<DD>
|
||||
Set the current line number and filename.
|
||||
The
|
||||
<CODE><VAR>file</VAR></CODE>
|
||||
argument will be omitted if only the line number has changed.
|
||||
This will be output only if the
|
||||
<CODE>-l</CODE>
|
||||
option has been given.
|
||||
<DT>
|
||||
<CODE>#<VAR>text</VAR></CODE>
|
||||
<DD>
|
||||
An APPINFO parameter of
|
||||
<CODE><VAR>text</VAR></CODE>
|
||||
was specified in the SGML declaration.
|
||||
This is not strictly part of the ESIS, but a structure-controlled
|
||||
application is permitted to act on it.
|
||||
No
|
||||
<CODE>#</CODE>
|
||||
command will be output if
|
||||
<CODE>APPINFO NONE</CODE>
|
||||
was specified.
|
||||
A
|
||||
<CODE>#</CODE>
|
||||
command will occur at most once,
|
||||
and may be preceded only by a single
|
||||
<CODE>L</CODE>
|
||||
command.
|
||||
<DT>
|
||||
<CODE>C</CODE>
|
||||
<DD>
|
||||
This command indicates that the document was a conforming SGML document.
|
||||
If this command is output, it will be the last command.
|
||||
An SGML document is not conforming if it references a subdocument entity
|
||||
that is not conforming.
|
||||
</DL>
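<P>
As an illustration of how the command characters above fit together, here is a minimal Python sketch of a reader for this output. It is not part of SP; <CODE>parse_esis</CODE> and the event names are hypothetical. It relies only on two facts stated above: each line begins with a command character, and <CODE>A</CODE> commands precede the <CODE>(</CODE> command of the element to which they apply. Escape sequences in data are left undecoded.

```python
def parse_esis(lines):
    """Yield (event, payload) pairs from nsgmls output lines.

    Event names ("start", "end", "data", ...) are invented for this sketch.
    """
    pending = []  # attributes from A commands, waiting for the next "("
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        command, rest = line[0], line[1:]
        if command == "A":
            # Aname val, where val is IMPLIED, CDATA data, TOKEN token..., etc.
            name, _, value = rest.partition(" ")
            kind, _, data = value.partition(" ")
            pending.append((name, kind, data))
        elif command == "(":
            yield ("start", (rest, pending))
            pending = []
        elif command == ")":
            yield ("end", rest)
        elif command == "-":
            yield ("data", rest)
        elif command == "C":
            yield ("conforming", None)
        else:
            # Entity, notation, location and other commands pass through as-is.
            yield ("other", (command, rest))
```

Collecting the attributes before emitting the start event lets a consumer see an element and its attributes as one unit, even though the output lists the attributes first.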
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
272
cde/programs/nsgmls/doc/spam.htm
Normal file
@@ -0,0 +1,272 @@
|
||||
<!-- $XConsortium: spam.htm /main/1 1996/09/22 18:18:13 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SPAM</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>SPAM</H1>
|
||||
<H4>
|
||||
An SGML System Conforming to
|
||||
International Standard ISO 8879 --<BR>
|
||||
Standard Generalized Markup Language
|
||||
</H4>
|
||||
<H2>
|
||||
SYNOPSIS
|
||||
</H2>
|
||||
<P>
|
||||
<CODE>spam</CODE>
|
||||
[
|
||||
<CODE>-Cehilprvx</CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-c<VAR>catalog_file</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-D<VAR>directory</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-f<VAR>file</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-m<VAR>markup_option</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-o<VAR>entity_name</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-w<VAR>warning_type</VAR></CODE>
|
||||
]
|
||||
<CODE><VAR>sysid...</VAR></CODE>
|
||||
<H2>DESCRIPTION</H2>
|
||||
<P>
|
||||
Spam (SP Add Markup)
|
||||
is an SGML markup stream editor implemented using the SP parser.
|
||||
Spam parses the SGML document contained in
|
||||
<CODE><VAR>sysid...</VAR></CODE>
|
||||
and copies to the standard output
|
||||
the portion of the document entity containing the document
|
||||
instance, adding or changing markup as specified by the
|
||||
<CODE>-m</CODE> options.
|
||||
The <CODE>-p</CODE>
|
||||
option can be used to include the SGML declaration and prolog
|
||||
in the output.
|
||||
The <CODE>-o</CODE>
|
||||
option can be used to output other entities.
|
||||
The
|
||||
<CODE>-x</CODE>
|
||||
option can be used to expand entity references.
|
||||
<P>
|
||||
The following options are available:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>-c<VAR>file</VAR></CODE>
|
||||
<DD>
|
||||
Use the catalog entry file
|
||||
<CODE><VAR>file</VAR></CODE>.
|
||||
<DT>
|
||||
<CODE>-C</CODE>
|
||||
<DD>
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optC">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-D<VAR>directory</VAR></CODE>
|
||||
<DD>
|
||||
Search
|
||||
<CODE><VAR>directory</VAR></CODE>
|
||||
for files specified in system identifiers.
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optD">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-e</CODE>
|
||||
<DD>
|
||||
Describe open entities in error messages.
|
||||
<DT>
|
||||
<CODE>-f<VAR>file</VAR></CODE>
|
||||
<DD>
|
||||
Redirect errors to
|
||||
<CODE><VAR>file</VAR></CODE>.
|
||||
This is useful mainly with shells that do not support redirection
|
||||
of stderr.
|
||||
<DT>
|
||||
<CODE>-h</CODE>
|
||||
<DD>
|
||||
Hoist omitted tags out from the start of internal entities.
|
||||
If the text at the beginning of an internal entity causes
|
||||
a tag to be implied,
|
||||
the tag will usually be treated as being in that internal entity;
|
||||
this option will instead cause it to be treated as being in the entity
|
||||
that referenced the internal entity.
|
||||
This option makes a difference in conjunction with
|
||||
<CODE>-momittag</CODE>
|
||||
or
|
||||
<CODE>-x -x</CODE>.
|
||||
<DT>
|
||||
<CODE>-i<VAR>name</VAR></CODE>
|
||||
<DD>
|
||||
This has the same effect as in <A HREF="nsgmls.htm#opti">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-l</CODE>
|
||||
<DD>
|
||||
Prefer lower-case.
|
||||
Added names that were subject to upper-case substitution
|
||||
will be converted to lower-case.
|
||||
<DT>
|
||||
<CODE>-m<VAR>markup_option</VAR></CODE>
|
||||
<DD>
|
||||
Change the markup in the output according to the value
|
||||
of
|
||||
<CODE><VAR>markup_option</VAR></CODE>
|
||||
as follows:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>omittag</CODE>
|
||||
<DD>
|
||||
Add tags that were omitted using omitted tag minimization.
|
||||
End tags that were omitted because the element has
|
||||
a declared content of <SAMP>EMPTY</SAMP>
|
||||
or an explicit content reference
|
||||
will not be added.
|
||||
<DT>
|
||||
<CODE>shortref</CODE>
|
||||
<DD>
|
||||
Replace short references by named entity references.
|
||||
<DT>
|
||||
<CODE>net</CODE>
|
||||
<DD>
|
||||
Change null end-tags
|
||||
into unminimized end-tags,
|
||||
and change net-enabling start-tags
|
||||
into unminimized start-tags.
|
||||
<DT>
|
||||
<CODE>emptytag</CODE>
|
||||
<DD>
|
||||
Change empty tags into unminimized tags.
|
||||
<DT>
|
||||
<CODE>unclosed</CODE>
|
||||
<DD>
|
||||
Change unclosed tags into unminimized tags.
|
||||
<DT>
|
||||
<CODE>attname</CODE>
|
||||
<DD>
|
||||
Add omitted attribute names and
|
||||
<CODE>vi</CODE>s.
|
||||
<DT>
|
||||
<CODE>attvalue</CODE>
|
||||
<DD>
|
||||
Add literal delimiters omitted from attribute values.
|
||||
<DT>
|
||||
<CODE>attspec</CODE>
|
||||
<DD>
|
||||
Add omitted attribute specifications.
|
||||
<DT>
|
||||
<CODE>current</CODE>
|
||||
<DD>
|
||||
Add omitted attribute specifications for current attributes.
|
||||
This option is implied by the
|
||||
<CODE>attspec</CODE>
|
||||
option.
|
||||
<DT>
|
||||
<CODE>shorttag</CODE>
|
||||
<DD>
|
||||
Equivalent to the combination of the
|
||||
<CODE>net</CODE>,
|
||||
<CODE>emptytag</CODE>,
|
||||
<CODE>unclosed</CODE>,
|
||||
<CODE>attname</CODE>,
|
||||
<CODE>attvalue</CODE>
|
||||
and
|
||||
<CODE>attspec</CODE>
|
||||
options.
|
||||
<DT>
|
||||
<CODE>rank</CODE>
|
||||
<DD>
|
||||
Add omitted rank suffixes.
|
||||
<DT>
|
||||
<CODE>reserved</CODE>
|
||||
<DD>
|
||||
Put reserved names in upper-case.
|
||||
<DT>
|
||||
<CODE>ms</CODE>
|
||||
<DD>
|
||||
Remove marked section declarations whose effective status
|
||||
is IGNORE, and replace each marked section declaration
|
||||
whose effective status is INCLUDE by its marked section.
|
||||
In the document instance, empty comments will be added
|
||||
before or after the marked section declaration to ensure
|
||||
that ignored record ends remain ignored.
|
||||
</DL>
|
||||
<P>
|
||||
Multiple
|
||||
<CODE>-m</CODE>
|
||||
options are allowed.
|
||||
<DT>
|
||||
<CODE>-o<VAR>name</VAR></CODE>
|
||||
<DD>
|
||||
Output the general entity
|
||||
<CODE><VAR>name</VAR></CODE>
|
||||
instead of the document entity.
|
||||
The output will correspond to the first time
|
||||
that the entity is referenced in content.
|
||||
<DT>
|
||||
<CODE>-p</CODE>
|
||||
<DD>
|
||||
Output the part of the document entity containing the SGML declaration
|
||||
(if it was explicitly present in the document entity)
|
||||
and the prolog before anything else.
|
||||
If this option is specified two or more times,
|
||||
then all entity references occurring between declarations
|
||||
in the prolog will be expanded;
|
||||
this includes the implicit reference to the entity
|
||||
containing the external subset of the DTD, if there is one.
|
||||
Note that the SGML declaration will not be included if it was
|
||||
specified by an SGMLDECL entry in a catalog.
|
||||
<DT>
|
||||
<CODE>-r</CODE>
|
||||
<DD>
|
||||
Don't perform any conversion on RSs and REs when outputting the entity.
|
||||
The entity would typically have the storage manager attribute
|
||||
<CODE>records=asis</CODE>.
|
||||
<DT>
|
||||
<CODE>-v</CODE>
|
||||
<DD>
|
||||
Print the version number.
|
||||
<DT>
|
||||
<CODE>-w<VAR>type</VAR></CODE>
|
||||
<DD>
|
||||
Control warnings and errors according to
|
||||
<CODE><VAR>type</VAR></CODE>.
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optw">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-x</CODE>
|
||||
<DD>
|
||||
Expand references to entities that are changed.
|
||||
If this option is specified two or more times,
|
||||
then all references to entities that contain tags
|
||||
will be expanded.
|
||||
</DL>
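<P>
To show how the options above combine, the following Python sketch assembles a spam command line as an argument list. It is illustrative only: the helper <CODE>spam_command</CODE> and its parameter names are hypothetical, and it only builds the list without running anything. The flags themselves (<CODE>-m</CODE>, <CODE>-x</CODE>, <CODE>-p</CODE>) are the ones described above, with <CODE>-x</CODE> and <CODE>-p</CODE> repeated when the stronger behaviour is wanted.

```python
def spam_command(sysids, markup_options=(), expand=0, prolog=0):
    """Build an argument list for spam (hypothetical helper).

    expand and prolog are the number of times -x and -p are given.
    """
    argv = ["spam"]
    argv += ["-m" + option for option in markup_options]  # one -m per option
    argv += ["-x"] * expand
    argv += ["-p"] * prolog
    argv += list(sysids)
    return argv
```

The resulting list could be handed to a process-spawning call such as <CODE>subprocess.run</CODE> if spam is installed.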
|
||||
|
||||
<H2>BUGS</H2>
|
||||
<P>
|
||||
Omitted tags are added at the point where they are
|
||||
implied by the SGML parser (except as modified
|
||||
by the
|
||||
<CODE>-h</CODE>
|
||||
option); this is often not quite where they are wanted.
|
||||
<P>
|
||||
The case of general delimiters is not preserved.
|
||||
<P>
|
||||
Incorrect results may be produced with a variant concrete syntax
|
||||
in which a delimiter in the markup to be added has a
|
||||
prefix that is a proper suffix of some other delimiter.
|
||||
<P>
|
||||
If an entity reference in a default value uses the default entity and
|
||||
an entity with that name is subsequently defined and that default
|
||||
value is added to the document instance, then the resulting document
|
||||
may not be equivalent to the original document.
|
||||
Spam will give a warning when the first two conditions are met.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
68
cde/programs/nsgmls/doc/spent.htm
Normal file
@@ -0,0 +1,68 @@
|
||||
<!-- $XConsortium: spent.htm /main/1 1996/09/22 18:18:33 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SPENT</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>SPENT</H1>
|
||||
<H2>SYNOPSIS</H2>
|
||||
<P>
|
||||
<CODE>spent</CODE>
|
||||
[
|
||||
<CODE>-Crv</CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-b<VAR>bctf</VAR></CODE>
|
||||
]
|
||||
[
|
||||
<CODE>-D<VAR>directory</VAR></CODE>
|
||||
]
|
||||
<CODE><VAR>sysid...</VAR></CODE>
|
||||
|
||||
<H2>DESCRIPTION</H2>
|
||||
<P>
|
||||
Spent (SGML print entity)
|
||||
prints the concatenation of the entities with
|
||||
<A HREF="sysid.htm">system identifiers</A>
|
||||
<CODE><VAR>sysid...</VAR></CODE>
|
||||
on the standard output.
|
||||
<P>
|
||||
The following options are available:
|
||||
<DL>
|
||||
<DT>
|
||||
<CODE>-b<VAR>bctf</VAR></CODE>
|
||||
<DD>
|
||||
Use the <A HREF="sysid.htm#bctf">BCTF</A> with name
|
||||
<CODE><VAR>bctf</VAR></CODE>
|
||||
for output.
|
||||
<DT>
|
||||
<CODE>-C</CODE>
|
||||
<DD>
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optC">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-D<VAR>directory</VAR></CODE>
|
||||
<DD>
|
||||
Search
|
||||
<CODE><VAR>directory</VAR></CODE>
|
||||
for files specified in system identifiers.
|
||||
This has the same effect as in <A HREF="nsgmls.htm#optD">nsgmls</A>.
|
||||
<DT>
|
||||
<CODE>-r</CODE>
|
||||
<DD>
|
||||
Raw output.
|
||||
Don't perform any conversion on RSs and REs when printing the entity.
|
||||
The entity would typically have the storage manager attribute
|
||||
<CODE>records=asis</CODE>.
|
||||
<DT>
|
||||
<CODE>-v</CODE>
|
||||
<DD>
|
||||
Print the version number.
|
||||
</DL>
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
41
cde/programs/nsgmls/doc/sysdecl.htm
Normal file
@@ -0,0 +1,41 @@
|
||||
<!-- $XConsortium: sysdecl.htm /main/1 1996/09/22 18:18:52 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP - System declaration</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>SP System Declaration</H1>
|
||||
<P>
|
||||
The system declaration for SP is as follows:
|
||||
<PRE>
|
||||
<!SYSTEM "ISO 8879:1986"
|
||||
CHARSET
|
||||
BASESET "ISO 646-1983//CHARSET
|
||||
International Reference Version (IRV)//ESC 2/5 4/0"
|
||||
DESCSET 0 128 0
|
||||
CAPACITY PUBLIC "ISO 8879:1986//CAPACITY Reference//EN"
|
||||
FEATURES
|
||||
MINIMIZE DATATAG NO OMITTAG YES RANK YES SHORTTAG YES
|
||||
LINK SIMPLE YES 65535 IMPLICIT YES EXPLICIT YES 1
|
||||
OTHER CONCUR NO SUBDOC YES 100 FORMAL YES
|
||||
SCOPE DOCUMENT
|
||||
SYNTAX PUBLIC "ISO 8879:1986//SYNTAX Reference//EN"
|
||||
SYNTAX PUBLIC "ISO 8879:1986//SYNTAX Core//EN"
|
||||
VALIDATE
|
||||
GENERAL YES MODEL YES EXCLUDE YES CAPACITY NO
|
||||
NONSGML YES SGML YES FORMAL YES
|
||||
SDIF
|
||||
PACK NO UNPACK NO>
|
||||
</PRE>
|
||||
<P>
|
||||
The limit for the SUBDOC parameter is memory dependent.
|
||||
<P>
|
||||
Any legal concrete syntax may be used.
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
305
cde/programs/nsgmls/doc/sysid.htm
Normal file
@@ -0,0 +1,305 @@
|
||||
<!-- $XConsortium: sysid.htm /main/1 1996/09/22 18:19:13 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP - System identifiers</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>System identifiers</H1>
|
||||
<P>
|
||||
There are two kinds of system identifier: formal system identifiers
|
||||
and simple system identifiers. A system identifier that does not
|
||||
start with <SAMP><</SAMP> will always be interpreted as a simple
|
||||
system identifier. A simple system identifier will always be
|
||||
interpreted either as a filename or as a URL.
|
||||
|
||||
<H2>Formal system identifiers</H2>
|
||||
<P>
|
||||
Formal system identifiers are based on the
|
||||
System Identifier facility defined in ISO/IEC 10744 (HyTime) Technical
|
||||
Corrigendum 1, Annex D.
|
||||
A system identifier that is a formal system
|
||||
identifier consists of a sequence of one or more storage object
|
||||
specifications. The objects specified by the storage object
|
||||
specifications are concatenated to form the entity. A storage object
|
||||
specification consists of an SGML start-tag in the reference concrete
|
||||
syntax followed by character data content. The generic identifier of
|
||||
the start-tag is the name of a storage manager. The content is a
|
||||
storage object identifier which identifies the storage object in a
|
||||
manner dependent on the storage manager. The start-tag can also
|
||||
specify attributes giving additional information about the storage
|
||||
object. Numeric character references are recognized in storage object
|
||||
identifiers and attribute value literals in the start-tag. Record
|
||||
ends are ignored in the storage object identifier as with SGML. A
|
||||
system identifier will be interpreted as a formal system identifier if
|
||||
it starts with a <SAMP><</SAMP> followed by a storage manager name,
|
||||
followed by either <SAMP>></SAMP> or white-space; otherwise it will be
|
||||
interpreted as a simple system identifier. A storage object
|
||||
identifier extends until the end of the system identifier or until the
|
||||
first occurrence of <SAMP><</SAMP> followed by a storage manager
|
||||
name, followed by either <SAMP>></SAMP> or white-space.
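<P>
The recognition rule just described can be sketched in Python as follows. This is an illustration, not SP's parser: <CODE>split_formal_sysid</CODE> is a hypothetical name, the storage manager names are those documented on this page, and the sketch assumes that attribute values contain no <CODE>></CODE> character and that storage manager names match without regard to case. It does not resolve numeric character references or remove record ends.

```python
import re

# Storage manager names documented on this page.
STORAGE_MANAGERS = ("osfile", "osfd", "url", "neutral", "literal")

# "<" + storage manager name, then either ">" or white-space and
# attribute text up to ">" (assumes no ">" inside attribute values).
_TAG = re.compile(
    r"<(%s)(\s[^>]*)?>" % "|".join(STORAGE_MANAGERS), re.IGNORECASE
)


def split_formal_sysid(sysid):
    """Split a formal system identifier into storage object specifications.

    Returns a list of (manager, attribute_text, object_id) tuples, or None
    when the identifier must be treated as a simple system identifier.
    """
    if not _TAG.match(sysid):
        return None
    matches = list(_TAG.finditer(sysid))
    specs = []
    for index, match in enumerate(matches):
        # The object identifier runs up to the next start-tag or the end.
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(sysid)
        attributes = (match.group(2) or "").strip()
        specs.append((match.group(1).lower(), attributes, sysid[match.end():end]))
    return specs
```

Returning <CODE>None</CODE> for anything that does not begin with a recognized start-tag mirrors the fallback to a simple system identifier.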
|
||||
<P>
|
||||
The following storage managers are available:
|
||||
<DL>
|
||||
<DT>
|
||||
<A NAME="osfile"><SAMP>osfile</SAMP></A>
|
||||
<DD>
|
||||
The storage object identifier is a filename. If the filename is
|
||||
relative it is resolved using a base filename. Normally the base
|
||||
filename is the name of the file in which the storage object
|
||||
identifier was specified, but this can be changed using the
|
||||
<SAMP>base</SAMP> attribute. The filename will be searched for first
|
||||
in the directory of the base filename. If it is not found there, then
|
||||
it will be searched for in directories specified with the
|
||||
<SAMP>-D</SAMP> option in the order in which they were specified on
|
||||
the command line, and then in the list of directories specified by the
|
||||
environment variable <SAMP>SGML_SEARCH_PATH</SAMP>. The list
|
||||
is separated by colons under Unix and by semi-colons under MSDOS.
|
||||
<DT>
|
||||
<SAMP>osfd</SAMP>
|
||||
<DD>
|
||||
The storage object identifier is an integer specifying a file
|
||||
descriptor. Thus a system identifier of <SAMP><osfd>0</SAMP> will
|
||||
refer to the standard input.
|
||||
<DT>
|
||||
<SAMP>url</SAMP>
|
||||
<DD>
|
||||
The storage object identifier is a URL. Only the <SAMP>http</SAMP>
|
||||
scheme is currently supported, and not on all systems.
|
||||
<DT>
|
||||
<SAMP>neutral</SAMP>
|
||||
<DD>
|
||||
The storage manager is the storage manager of the storage object in which
|
||||
the system identifier was specified (the <I>underlying storage
|
||||
manager</I>). However if the underlying storage manager does not
|
||||
support named storage objects (i.e. it is <SAMP>osfd</SAMP>), then the
|
||||
storage manager will be <SAMP>osfile</SAMP>. The storage object
|
||||
identifier is treated as a relative, hierarchical name separated by
|
||||
slashes (<SAMP>/</SAMP>) and will be transformed as appropriate for
|
||||
the underlying storage manager.
|
||||
<DT>
|
||||
<SAMP>literal</SAMP>
|
||||
<DD>
|
||||
The bit combinations of the storage object identifier are
|
||||
the contents of the storage object.
|
||||
</DL>
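<P>
The search order described above for the <SAMP>osfile</SAMP> storage manager can be sketched as follows. This is an illustration under stated assumptions, not SP's implementation: <CODE>candidate_paths</CODE> and its parameters are hypothetical, Unix-style paths are assumed, and the sketch only lists the names that would be tried, in order, without touching the file system.

```python
import os
import posixpath


def candidate_paths(filename, base_file, d_dirs=(), search_path=None, sep=":"):
    """List, in order, the paths tried for a relative osfile identifier.

    d_dirs stands for the -D directories in command-line order; search_path
    defaults to the SGML_SEARCH_PATH environment variable.  sep is ":" under
    Unix and would be ";" under MSDOS.
    """
    if posixpath.isabs(filename):
        return [filename]           # absolute names are not searched for
    if search_path is None:
        search_path = os.environ.get("SGML_SEARCH_PATH", "")
    directories = [posixpath.dirname(base_file)]   # directory of the base filename
    directories += list(d_dirs)                     # then the -D directories
    directories += [d for d in search_path.split(sep) if d]  # then the search path
    return [posixpath.join(directory, filename) for directory in directories]
```

The first candidate that exists would be the one used.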
|
||||
<P>
|
||||
The following attributes are supported:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>records</SAMP>
|
||||
<DD>
|
||||
This describes how records are delimited in the storage object:
|
||||
<DL>
|
||||
<DT><SAMP>cr</SAMP>
|
||||
<DD>
|
||||
Records are terminated by a carriage return.
|
||||
<DT>
|
||||
<SAMP>lf</SAMP>
|
||||
<DD>
|
||||
Records are terminated by a line feed.
|
||||
<DT>
|
||||
<SAMP>crlf</SAMP>
|
||||
<DD>
|
||||
Records are terminated by a carriage return followed by a line feed.
|
||||
<DT>
|
||||
<SAMP>find</SAMP>
|
||||
<DD>
|
||||
Records are terminated by whichever of
|
||||
<SAMP>cr</SAMP>,
|
||||
<SAMP>lf</SAMP>
|
||||
or
|
||||
<SAMP>crlf</SAMP>
|
||||
is first encountered in the storage object.
|
||||
<DT>
|
||||
<SAMP>asis</SAMP>
|
||||
<DD>
|
||||
No recognition of records is performed.
|
||||
</DL>
|
||||
<P>
|
||||
The default is <SAMP>find</SAMP> except for NDATA entities for which
|
||||
the default is <SAMP>asis</SAMP>. This attribute is not applicable to
|
||||
the <SAMP>literal</SAMP> storage manager.
|
||||
<P>
|
||||
When records are recognized in a storage object, a record start is
|
||||
inserted at the beginning of each record, and a record end at the end
|
||||
of each record. If there is a partial record (a record that doesn't
|
||||
end with the record terminator) at the end of the entity, then a
|
||||
record start will be inserted before it but no record end will be
|
||||
inserted after it.
|
||||
<P>
|
||||
The attribute name and <SAMP>=</SAMP> can be omitted for this attribute.
|
||||
<DT>
|
||||
<SAMP>zapeof</SAMP>
|
||||
<DD>
|
||||
This specifies whether a Control-Z character that occurs as the final byte
|
||||
in the storage object should be stripped.
|
||||
The following values are allowed:
|
||||
<DL>
|
||||
<DT><SAMP>zapeof</SAMP>
|
||||
<DD>
|
||||
A final Control-Z should be stripped.
|
||||
<DT><SAMP>nozapeof</SAMP>
|
||||
<DD>
|
||||
A final Control-Z should not be stripped.
|
||||
</DL>
|
||||
<P>
|
||||
The default is <SAMP>zapeof</SAMP> except for NDATA entities, entities
|
||||
declared in storage objects with <SAMP>zapeof=nozapeof</SAMP> and
|
||||
storage objects with <SAMP>records=asis</SAMP>. This attribute is not
|
||||
applicable to the <SAMP>literal</SAMP> storage manager.
|
||||
<P>
|
||||
The attribute name and <SAMP>=</SAMP> can be omitted for this
|
||||
attribute.
|
||||
<DT>
|
||||
<A NAME="bctf"><SAMP>bctf</SAMP></A>
|
||||
<DD>
|
||||
The bctf (bit combination transformation format) attribute describes
|
||||
how the bit combinations of the storage object are transformed into
|
||||
the sequence of bytes that are contained in the object identified by
|
||||
the storage object identifier. The inverse of this transformation is
|
||||
performed when the entity manager reads the storage object. It has
|
||||
one of the following values:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>identity</SAMP>
|
||||
<DD>
|
||||
Each bit combination is represented by a single byte.
|
||||
<DT>
|
||||
<SAMP>fixed-2</SAMP>
|
||||
<DD>
|
||||
Each bit combination is represented by exactly 2
|
||||
bytes, with the more significant byte first.
|
||||
<DT>
|
||||
<SAMP>utf-8</SAMP>
|
||||
<DD>
|
||||
Each bit combination is represented by a variable number of bytes
|
||||
according to UCS Transformation Format 8 defined in Annex P to be
|
||||
added by the first proposed draft amendment (PDAM 1) to ISO/IEC
|
||||
10646-1:1993.
|
||||
<DT>
|
||||
<SAMP>euc-jp</SAMP>
|
||||
<DD>
|
||||
Each bit combination is treated as a pair of bytes, most significant
|
||||
byte first, encoding a character using the
|
||||
Extended_UNIX_Code_Fixed_Width_for_Japanese Internet charset, and is
|
||||
transformed into the variable length sequence of octets that would
|
||||
encode that character using the
|
||||
Extended_UNIX_Code_Packed_Format_for_Japanese Internet charset.
|
||||
<DT>
|
||||
<SAMP>sjis</SAMP>
|
||||
<DD>
|
||||
Each bit combination is treated as a pair of bytes, most significant
|
||||
byte first, encoding a character using the
|
||||
Extended_UNIX_Code_Fixed_Width_for_Japanese Internet charset, and is
|
||||
transformed into the variable length sequence of bytes that would
|
||||
encode that character using the Shift_JIS Internet charset.
|
||||
<DT>
|
||||
<SAMP>unicode</SAMP>
|
||||
<DD>
|
||||
Each bit combination is represented by 2 bytes. The bytes
|
||||
representing the entire storage object may be preceded by a pair of
|
||||
bytes representing the byte order mark character (0xFEFF). The bytes
|
||||
representing each bit combination are in the system byte order, unless
|
||||
the byte order mark character is present, in which case the order of
|
||||
its bytes determines the byte order. When the storage object is read,
|
||||
any byte order mark character is discarded.
|
||||
<DT>
|
||||
<SAMP>is8859-<VAR>n</VAR></SAMP>
|
||||
<DD>
|
||||
<SAMP><VAR>n</VAR></SAMP> can be any single digit other than 0. Each
|
||||
bit combination is interpreted as the number of a character in ISO/IEC
|
||||
10646 and is represented by the single byte that would encode that
|
||||
character in ISO 8859-<VAR>n</VAR>. These values are not supported
|
||||
with the <SAMP>-b</SAMP> option.
|
||||
</DL>
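<P>
The byte order rule for the <SAMP>unicode</SAMP> BCTF above can be
illustrated with a minimal sketch. It is not SP's implementation: the
function name <SAMP>decode_unicode_bctf</SAMP> and its
<SAMP>system_order</SAMP> parameter are hypothetical, and the sketch
assumes that a byte order mark in either byte order is honoured and
then discarded.

```python
import sys


def decode_unicode_bctf(data: bytes, system_order: str = sys.byteorder) -> list[int]:
    """Return the 16-bit bit combinations stored in data.

    Hypothetical helper: a leading byte order mark (0xFEFF) selects the
    byte order and is discarded; without one, system_order is used.
    """
    if len(data) % 2 != 0:
        raise ValueError("expected an even number of bytes")
    order, start = system_order, 0
    if data[:2] == b"\xfe\xff":    # mark stored most significant byte first
        order, start = "big", 2
    elif data[:2] == b"\xff\xfe":  # mark stored least significant byte first
        order, start = "little", 2
    return [int.from_bytes(data[i:i + 2], order) for i in range(start, len(data), 2)]
```

<P>
With a byte order mark present the result does not depend on the
system byte order; without one, the same bytes decode differently on
big-endian and little-endian systems.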
|
||||
<P>
|
||||
Values other than <SAMP>identity</SAMP> are supported only with the
|
||||
multi-byte version of nsgmls. This attribute is not applicable to the
|
||||
<SAMP>literal</SAMP> storage manager.
|
||||
<DT>
|
||||
<SAMP>tracking</SAMP>
|
||||
<DD>
|
||||
This specifies whether line boundaries should be tracked for this
|
||||
object: a value of <SAMP>track</SAMP> specifies that they should; a
|
||||
value of <SAMP>notrack</SAMP> specifies that they should not. The
|
||||
default value is <SAMP>track</SAMP>. Keeping track of where line
|
||||
boundaries occur in a storage object requires approximately one byte
|
||||
of storage per line and it may be desirable to disable this for very
|
||||
large storage objects.
|
||||
<P>
|
||||
The attribute name and
|
||||
<SAMP>=</SAMP>
|
||||
can be omitted for this attribute.
|
||||
<DT>
|
||||
<SAMP>base</SAMP>
|
||||
<DD>
|
||||
When the storage object identifier specified in the content of the
|
||||
storage object specification is relative, this specifies the base
|
||||
storage object identifier relative to which that storage object
|
||||
identifier should be resolved.
|
||||
When not specified, a storage object identifier is interpreted
|
||||
relative to the storage object in which it is specified,
|
||||
provided that this has the same storage manager.
|
||||
This applies both to system identifiers specified in SGML
|
||||
documents and to system identifiers specified in the catalog entry
|
||||
files.
|
||||
<DT>
|
||||
<SAMP>smcrd</SAMP>
|
||||
<DD>
|
||||
The value is a single character that will be recognized in storage
|
||||
object identifiers (both in the content of storage object
|
||||
specifications and in the value of <SAMP>base</SAMP> attributes) as a
|
||||
storage manager character reference delimiter when followed by a
|
||||
digit. A storage manager character reference is like an SGML numeric
|
||||
character reference except that the number is interpreted as a
|
||||
character number in the inherent character set of the storage manager
|
||||
rather than the document character set. The default is for no
|
||||
character to be recognized as a storage manager character reference
|
||||
delimiter. Numeric character references cannot be used to prevent
|
||||
recognition of storage manager character reference delimiters.
|
||||
<DT>
|
||||
<SAMP>fold</SAMP>
|
||||
<DD>
|
||||
This applies only to the <SAMP>neutral</SAMP> storage manager. It
|
||||
specifies whether the storage object identifier should be folded to
|
||||
the customary case of the underlying storage manager if storage object
|
||||
identifiers for the underlying storage manager are case sensitive.
|
||||
The following values are allowed:
|
||||
<DL>
|
||||
<DT><SAMP>fold</SAMP>
|
||||
<DD>
|
||||
The storage object identifier will be folded.
|
||||
<DT>
|
||||
<SAMP>nofold</SAMP>
|
||||
<DD>
|
||||
The storage object identifier will not be folded.
|
||||
</DL>
|
||||
<P>
|
||||
The default value is <SAMP>fold</SAMP>. The attribute name and
|
||||
<SAMP>=</SAMP> can be omitted for this attribute.
|
||||
<P>
|
||||
For example, on Unix, filenames are case-sensitive and the customary
case is lower-case. So if the underlying storage manager were
<SAMP>osfile</SAMP> and the system were a Unix system, then
<SAMP>&lt;neutral&gt;FOO.SGM</SAMP> would be equivalent to
<SAMP>&lt;osfile&gt;foo.sgm</SAMP>.
|
||||
</DL>
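<P>
The effect of the <SAMP>fold</SAMP> attribute described above can be
sketched as follows. This is only an illustration under stated
assumptions: the function <SAMP>resolve_neutral</SAMP> and its
parameters are hypothetical, and the customary case is assumed to be
lower-case, as in the Unix example.

```python
def resolve_neutral(ident, underlying="osfile", case_sensitive=True, fold=True):
    """Map a neutral storage object identifier to (storage manager, identifier).

    Hypothetical helper. Folding happens only when it is requested and
    the underlying storage manager treats identifiers as case sensitive.
    """
    if fold and case_sensitive:
        # Assumption: the customary case is lower-case, as on Unix.
        ident = ident.lower()
    return (underlying, ident)
```

<P>
Under these assumptions, <SAMP>resolve_neutral("FOO.SGM")</SAMP>
yields the <SAMP>osfile</SAMP> identifier <SAMP>foo.sgm</SAMP>, while
passing <SAMP>fold=False</SAMP> leaves the identifier unchanged.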
|
||||
<H2>Simple system identifiers</H2>
|
||||
<P>
|
||||
A simple system identifier is interpreted as a storage object
|
||||
identifier with a storage manager that depends on where the system
|
||||
identifier was specified: if it was specified in a storage object
|
||||
whose storage manager was <SAMP>url</SAMP> or if the system identifier
|
||||
looks like an absolute URL in a supported scheme, the storage manager
|
||||
will be <SAMP>url</SAMP>; otherwise the storage manager will be
|
||||
<SAMP>osfile</SAMP>. The storage manager attributes are defaulted as
|
||||
for a formal system identifier. Numeric character references are not
|
||||
recognized in simple system identifiers.
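<P>
The choice of storage manager for a simple system identifier can be
sketched as below. The names are hypothetical, the list of supported
schemes is an assumption (the text does not enumerate them), and the
test for "looks like an absolute URL" is a simple heuristic rather
than SP's actual check.

```python
import re

# Assumption: the text does not say which URL schemes are supported.
SUPPORTED_SCHEMES = ("http",)


def storage_manager_for(sysid, containing_manager="osfile"):
    """Pick the storage manager for a simple system identifier.

    containing_manager is the storage manager of the storage object in
    which the system identifier was specified.
    """
    if containing_manager == "url":
        return "url"
    # Heuristic: a scheme name followed by "://" marks an absolute URL.
    match = re.match(r"([A-Za-z][A-Za-z0-9+.-]*)://", sysid)
    if match and match.group(1).lower() in SUPPORTED_SCHEMES:
        return "url"
    return "osfile"
```

<P>
A relative identifier such as <SAMP>doc.sgm</SAMP> therefore inherits
<SAMP>url</SAMP> only when it appears inside a storage object that was
itself fetched with the <SAMP>url</SAMP> storage manager.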
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||
53
cde/programs/nsgmls/doc/winntu.htm
Normal file
@@ -0,0 +1,53 @@
|
||||
<!-- $XConsortium: winntu.htm /main/1 1996/09/22 18:19:32 rws $ -->
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>SP Unicode support under Windows NT</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>Notes on SP Unicode support under Windows NT</H1>
|
||||
<P>
|
||||
When compiled with the appropriate preprocessor definition
|
||||
(<CODE>UNICODE</CODE>), SP now uses Unicode interfaces to NT. This
|
||||
means that the <SAMP>SP_BCTF</SAMP> environment variable applies only
|
||||
to file input and output, and so <CODE>unicode</CODE> is allowed as
|
||||
the value of <SAMP>SP_BCTF</SAMP>.
|
||||
<P>
|
||||
In order for non-ASCII characters to be correctly displayed on your
|
||||
console you must select a TrueType font, such as Lucida Console, as your
|
||||
console font.
|
||||
<P>
|
||||
If you define your own public character sets, you should use Unicode
|
||||
(or a superset of Unicode) as your universal character set.
|
||||
<P>
|
||||
The following additional BCTFs are supported:
|
||||
<DL>
|
||||
<DT>
|
||||
<SAMP>windows</SAMP>
|
||||
<DD>
|
||||
Specify this BCTF when a storage object is encoded using your
|
||||
system's default Windows character set, and your document character
|
||||
set is declared as Unicode. This uses the so-called ANSI code page.
|
||||
<DT>
|
||||
<SAMP>wunicode</SAMP>
|
||||
<DD>
|
||||
This uses the <SAMP>unicode</SAMP> BCTF if the storage object starts
|
||||
with a byte order mark and otherwise the <SAMP>windows</SAMP> BCTF.
|
||||
If you are working with Unicode, this is probably the best value
|
||||
for <SAMP>SP_BCTF</SAMP>.
|
||||
<DT>
|
||||
<SAMP>ms-dos</SAMP>
|
||||
<DD>
|
||||
Specify this BCTF when a storage object (file) uses the OEM code page,
|
||||
and your document character set is declared as Unicode.
|
||||
The OEM code page for a particular
machine is the code page used by FAT file systems on that machine and
is the default code page for MS-DOS consoles.
|
||||
</DL>
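<P>
The selection made by the <SAMP>wunicode</SAMP> BCTF described above
can be sketched in a few lines. The function name
<SAMP>wunicode_choice</SAMP> is hypothetical, and the sketch assumes
that a byte order mark in either byte order counts as "starts with a
byte order mark".

```python
def wunicode_choice(data: bytes) -> str:
    """Return the name of the BCTF that wunicode would apply to data.

    Hypothetical helper: assumes the byte order mark 0xFEFF may be
    stored in either byte order.
    """
    if data[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return "unicode"
    return "windows"
```

<P>
A file saved without a byte order mark is thus read using the ANSI
code page, which is why this value suits a mix of Unicode and legacy
files.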
|
||||
<P>
|
||||
<ADDRESS>
|
||||
James Clark<BR>
|
||||
jjc@jclark.com
|
||||
</ADDRESS>
|
||||
</BODY>
|
||||
</HTML>
|
||||