Initial import of the CDE 2.1.30 sources from the Open Group.

Peter Howkins
2012-03-10 18:21:40 +00:00
commit 83b6996daa
18978 changed files with 3945623 additions and 0 deletions


@@ -0,0 +1,21 @@
<!-- $XConsortium: BEntity.sgm /main/7 1996/10/29 17:43:13 drk $ -->
<!ENTITY IPG.intro.fig.1 SYSTEM "./i18nGuide/graphics/inint1.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.intro.fig.2 SYSTEM "./i18nGuide/graphics/inint2.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.intro.fig.3 SYSTEM "./i18nGuide/graphics/inint3.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.4 SYSTEM "./i18nGuide/graphics/inint4.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.5 SYSTEM "./i18nGuide/graphics/inint5.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.6 SYSTEM "./i18nGuide/graphics/inint6.tif" NDATA TIFF>
<!ENTITY IPG.distr.fig.1 SYSTEM "./i18nGuide/graphics/ind1.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.motif.fig.3 SYSTEM "./i18nGuide/graphics/inmot3.tif" NDATA TIFF>
<!ENTITY IPG.motif.fig.4 SYSTEM "./i18nGuide/graphics/inmot4.tif" NDATA TIFF>


@@ -0,0 +1,3 @@
/* $XConsortium: Title.tmpl /main/2 1996/06/19 16:03:55 drk $ */
/* TOC title, only what's between quotes should be modified. */
title = "Internationalization Programmer's Guide"


@@ -0,0 +1,54 @@
<!-- $XConsortium: adbook.sgm /main/10 1996/08/17 18:30:51 rws $ -->
<!DOCTYPE DocBook PUBLIC "-//HaL and O'Reilly//DTD DocBook V2.2.1//EN"[
<!ENTITY IPG.intro.fig.1 SYSTEM "./graphics/inint1.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.intro.fig.2 SYSTEM "./graphics/inint2.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.intro.fig.3 SYSTEM "./graphics/inint3.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.4 SYSTEM "./graphics/inint4.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.5 SYSTEM "./graphics/inint5.tif" NDATA TIFF>
<!ENTITY IPG.intro.fig.6 SYSTEM "./graphics/inint6.tif" NDATA TIFF>
<!ENTITY IPG.distr.fig.1 SYSTEM "./graphics/ind1.cgm" NDATA CGM-BINARY>
<!ENTITY IPG.motif.fig.3 SYSTEM "./graphics/inmot3.tif" NDATA TIFF>
<!ENTITY IPG.motif.fig.4 SYSTEM "./graphics/inmot4.tif" NDATA TIFF>
<!ENTITY MotifProgGd "<Emphasis>Motif Programmer's Guide</Emphasis>">
<!ENTITY Pref SYSTEM "./preface.sgm">
<!ENTITY intro SYSTEM "./ch01.sgm">
<!ENTITY deskt SYSTEM "./ch02.sgm">
<!ENTITY distr SYSTEM "./ch03.sgm">
<!ENTITY motif SYSTEM "./ch04.sgm">
<!ENTITY msgs SYSTEM "./appa.sgm">
]>
<!-- ____________________________________________________________________________ -->
<DocBook>
<Book>
<Title>Common Desktop Environment: Internationalization Programmer's Guide</Title>
&Pref;
&intro;
&deskt;
&distr;
&motif;
&msgs;
</Book>
</DocBook>


@@ -0,0 +1,443 @@
<!-- $XConsortium: appa.sgm /main/10 1996/10/30 14:56:04 rws $ -->
<!-- (c) Copyright 1995 Digital Equipment Corporation. -->
<!-- (c) Copyright 1995 Hewlett-Packard Company. -->
<!-- (c) Copyright 1995 International Business Machines Corp. -->
<!-- (c) Copyright 1995 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1995 Novell, Inc. -->
<!-- (c) Copyright 1995 FUJITSU LIMITED. -->
<!-- (c) Copyright 1995 Hitachi. -->
<appendix id="IPG.msgs.div.1">
<title id="IPG.msgs.mkr.1">Message Guidelines</title>
<para>Refer to the information in this appendix to write messages that are
easily internationalized.</para>
<informaltable id="IPG.msgs.itbl.1" frame="All">
<tgroup cols="1">
<colspec colname="1" colwidth="4.0 in">
<tbody>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Refer to the information in this
appendix to write messages that are easily internationlized.127'--><xref
role="JumpText" linkend="IPG.msgs.mkr.2"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Cause and Recovery Information128'--><xref
role="JumpText" linkend="IPG.msgs.mkr.3"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Comment Lines for Translators128'--><xref
role="JumpText" linkend="IPG.msgs.mkr.4"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Writing Style129'--><xref role="JumpText"
linkend="IPG.msgs.mkr.5"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Usage Statements131'--><xref role="JumpText"
linkend="IPG.msgs.mkr.6"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Regular Expression Standard Messages134'--><xref
role="JumpText" linkend="IPG.msgs.mkr.7"></para></entry></row>
<row rowsep="1">
<entry><para><!--Original XRef content: 'Sample Messages135'--><xref role="JumpText"
linkend="IPG.msgs.mkr.8"></para></entry></row></tbody></tgroup>
</informaltable>
<para id="IPG.msgs.mkr.2"></para>
<sect1 id="IPG.msgs.div.2">
<title>File-Naming<indexterm><primary>file, naming conventions</primary>
</indexterm> Conventions</title>
<para>The conventions<indexterm><primary>messages</primary><secondary>file-naming
conventions</secondary></indexterm> used in naming files with user messages
are discussed here. Usually, the message source file has the suffix <computeroutput>.msg</computeroutput>; the generated message catalog has the suffix <computeroutput>.cat</computeroutput>. There may be other such files related to messages.
The following criteria must be met for a file to have these suffixes:</para>
<itemizedlist remap="Bullet1"><listitem><para>It is X/Open-compliant.</para>
</listitem><listitem><para>It becomes a <computeroutput>*.cat</computeroutput>
file through the use of the <computeroutput>gencat</computeroutput> command.
</para>
</listitem></itemizedlist>
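<para>As a hedged illustration of how a program might read a message from a
generated <computeroutput>.cat</computeroutput> file, the following C sketch uses
the X/Open <computeroutput>catopen</computeroutput>, <computeroutput>catgets</computeroutput>,
and <computeroutput>catclose</computeroutput> functions. The function name <computeroutput>foo_message</computeroutput>,
the set number, and the message number are assumptions made for this example;
they do not come from the desktop sources.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <nl_types.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical set and message numbers; a real foo.msg file would define them. */
#define FOO_SET      1
#define FOO_MSG_OPEN 1

/*
 * Copy message msg_id from the named catalog into buf. When the catalog
 * or the message cannot be found, the default text is copied instead.
 */
void foo_message(const char *catalog_name, int msg_id,
                 const char *default_text, char *buf, size_t buflen)
{
    assert(catalog_name != NULL && default_text != NULL && buf != NULL);

    nl_catd catd = catopen(catalog_name, NL_CAT_LOCALE);
    const char *text = default_text;

    if (catd != (nl_catd)-1)
        text = catgets(catd, FOO_SET, msg_id, default_text);

    /* Copy before closing: the catgets() result may not survive catclose(). */
    snprintf(buf, buflen, "%s", text);

    if (catd != (nl_catd)-1)
        catclose(catd);
}
```
]]></programlisting>
<para>Passing the English text as the default keeps the program usable when no
catalog is installed for the current locale.</para>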
</sect1>
<sect1 id="IPG.msgs.div.3">
<title id="IPG.msgs.mkr.3">Cause and Recovery<indexterm><primary>messages</primary><secondary>cause and recovery information</secondary></indexterm> Information</title>
<para>Whenever possible, explain to users exactly what has happened and what
they can do to remedy the situation.</para>
<para>The message <command>Bad arg</command> is not very helpful. However,
the following message tells users exactly what to do to make the command
work:</para>
<programlisting>Do not specify more than 2 files on the command line</programlisting>
<para>Similarly, the message <command>Line too long</command> does not give
users recovery information. However, the following message gives users more
specific recovery information:</para>
<programlisting>Line cannot exceed 20 characters</programlisting>
<para>If detailed recovery information is necessary for a given error message,
add it to the appropriate place in online information or help.</para>
<para>See <!--Original XRef content: '&xd2;Sample Messages&xd3; on page&numsp;135'--><xref
role="SecTitleAndPageNum" linkend="IPG.msgs.mkr.8"> for samples of original
and rewritten messages.</para>
</sect1>
<sect1 id="IPG.msgs.div.4">
<title id="IPG.msgs.mkr.4">Comment Lines for Translators</title>
<para>A message<indexterm><primary>messages</primary><secondary>comment
lines for translators</secondary></indexterm> source file should contain comments
to help the translator in the process of translation. These comments are
not part of the generated message catalog. They are similar to the
C language comments that help document a program. The translation tool and
the <computeroutput>gencat</computeroutput> command interpret a line that begins with a dollar sign ($) followed
by a space as a comment. The following is an example
of a comment line in a message source file.</para>
<programlisting>$ This is a comment</programlisting>
<para>Use comment lines to tell translators and writers what variables, such
as <emphasis>%s</emphasis>, <emphasis>%c</emphasis>, and <emphasis>%d</emphasis>,
represent. For example, note whether the variable refers to such things as
a user, file, directory, or flag.</para>
<para>Place the comment line directly beneath the message to which it refers,
rather than at the bottom of the message catalog. Global comments for an
entire set can be placed directly below the $set directive in the source
file.</para>
<para>Specify in a comment line any messages within the message catalog that
are obsolete.</para>
</sect1>
<sect1 id="IPG.msgs.div.5">
<title>Programming Format<indexterm><primary>messages</primary><secondary>programming format</secondary></indexterm></title>
<para>For the programming format of messages, see the following list.</para>
<itemizedlist remap="Bullet1"><listitem><para>Do not construct messages from
clauses. Use flags or other means within the program to pass information
so that a complete message can be issued at the proper time.</para>
</listitem><listitem><para>Do not use hardcoded English text as a variable
for a <emphasis>%s</emphasis> string in an existing message. This is also
a form of message construction and is not translatable.</para>
</listitem><listitem><para>Capitalize the first word of the sentence, and
use a period at the end of the sentence or phrase.</para>
</listitem><listitem><para>End the last line of the message with <computeroutput>\n</computeroutput>
(backslash followed by a lowercase n, indicating a new line). This also applies
to one-line messages.</para>
</listitem><listitem><para>Begin the second and remaining lines of a message
with <computeroutput>\t</computeroutput> (backslash
followed by a lowercase t, indicating a tab).</para>
</listitem><listitem><para>End all other lines with <computeroutput>\n\</computeroutput>
(backslash followed by a lowercase n, followed by another backslash, indicating
a new line).</para>
</listitem><listitem><para>If, for some reason, the message should not end
with a new line, use a comment to tell the writers.</para>
</listitem><listitem><para>Precede each message with the name of the command
that called the message, followed by a colon. The command name should precede
the component number in error messages. The command name is shown in the
following example as it should appear in a message:</para>
<programlisting>OPIE &ldquo;foo: Opening the file.&rdquo;
</programlisting>
</listitem></itemizedlist>
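<para>The first guideline in the preceding list can be sketched in C as follows.
Rather than assembling a sentence from clauses, the program passes a flag and
selects one complete message. The enumeration, the table, and the function
name <computeroutput>action_message</computeroutput> are hypothetical names chosen
for this sketch; in a real program each string would normally be retrieved
from the message catalog.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag that records what the program is doing. */
enum file_action { ACTION_OPEN, ACTION_CLOSE };

/*
 * Each entry is a complete sentence that a translator can handle as one
 * unit. Every message starts with the command name and ends with \n.
 */
static const char *const complete_messages[] = {
    "foo: Opening the file.\n",
    "foo: Closing the file.\n"
};

/* Return the whole message for the action; nothing is built from clauses. */
const char *action_message(enum file_action action)
{
    assert(action == ACTION_OPEN || action == ACTION_CLOSE);
    return complete_messages[action];
}
```
]]></programlisting>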
</sect1>
<sect1 id="IPG.msgs.div.6">
<title id="IPG.msgs.mkr.5">Writing Style</title>
<para>The following guidelines on the writing style of messages include terminology,
punctuation, mood, voice, tense, capitalization, and other usage questions.<indexterm>
<primary>messages</primary><secondary>writing style in</secondary></indexterm></para>
<itemizedlist remap="Bullet1"><listitem><para>Use sentence format. One-line
and one-sentence messages are preferable.</para>
</listitem><listitem><para>Add articles (<emphasis>a</emphasis>, <emphasis>an</emphasis>, <emphasis>the</emphasis>) when necessary to eliminate ambiguity.
</para>
</listitem><listitem><para>Capitalize the first word of the sentence and use
a period at the end.</para>
</listitem><listitem><para>Use the present tense. Do not allow future tense
in a message. For example, use the sentence:</para>
<para remap="CodeIndent1"><computeroutput>The foo command displays a calendar.</computeroutput></para>
<para>Instead of:</para>
<para remap="CodeIndent1"><computeroutput>The foo command will display a calendar.</computeroutput></para>
</listitem><listitem><para>Do not use the first person (<emphasis>I</emphasis>
or <emphasis>we</emphasis>) anywhere in messages.</para>
</listitem><listitem><para>Avoid using the second person.</para>
<para>Do not use the word <emphasis>you</emphasis> except in help and interactive
text.</para>
</listitem><listitem><para>Use active voice. The first line is the original
message. The second line is the preferred wording.</para>
<para remap="CodeIndent1"><computeroutput>MYNUM &ldquo;Month and year must
be entered as numbers.&rdquo;</computeroutput></para>
<para remap="CodeIndent1"><computeroutput>MYNUM &ldquo;foo: 7777-222 Enter month and year
as numbers.\n&rdquo;</computeroutput></para>
<para>7777-222 is the message ID.</para>
</listitem><listitem><para>Use the imperative mood (command phrase) and active
verbs: <emphasis>specify</emphasis>, <emphasis>use</emphasis>, <emphasis>check</emphasis>, <emphasis>choose</emphasis>, and <emphasis>wait</emphasis>
are examples.</para>
</listitem><listitem><para>State messages in a positive tone. The first line
is the original message. The second line is the preferred wording.</para>
<para remap="CodeIndent1"><computeroutput>BADL &ldquo;Don't use the f option
more than once.&rdquo;</computeroutput></para>
<para remap="CodeIndent1"><computeroutput>BADL &ldquo;foo: 7777-009 Use the -f flag only once.\n&rdquo;</computeroutput></para>
</listitem><listitem><para>Do not use nouns as verbs. Use words only in the
grammatical categories shown in the dictionary. If a word is shown only as
a noun, do not use it as a verb. For example, do not <emphasis>solution</emphasis>
a problem (or, for that matter, <emphasis>architect</emphasis> a system).
</para>
</listitem><listitem><para>Do not use prefixes or suffixes. Translators may
not understand words beginning with <emphasis>re-</emphasis>, <emphasis>un-</emphasis>, <emphasis>in-</emphasis>, or <emphasis>non-</emphasis>, and
the translations of messages that use these prefixes or suffixes may not
have the meaning you intended. Exceptions to this rule occur when the prefix
is an integral part of a commonly used word. The words <emphasis>previous</emphasis> and <emphasis>premature</emphasis> are acceptable; the word <emphasis>nonexistent</emphasis> is not.</para>
</listitem><listitem><para>Do not use plurals. Do not use parentheses to show
singular or plural, as in <emphasis>error(s),</emphasis> which cannot be
translated. If you must show singular and plural, write <emphasis>error or
errors</emphasis>. A better way is to condition the code so that two different
messages are issued depending on whether the singular or plural of a word
is required.</para>
</listitem><listitem><para>Do not use contractions. Use the single word <emphasis>cannot</emphasis> to denote something the system is unable to do.</para>
</listitem><listitem><para>Do not use quotation marks. This includes both
single and double quotation marks. For example, do not use quotation marks
around variables such as <emphasis>%s</emphasis>, <emphasis>%c</emphasis>,
and <emphasis>%d</emphasis> or around commands. Users may take the quotation
marks literally.</para>
</listitem><listitem><para>Do not hyphenate words at the end of lines.</para>
</listitem><listitem><para>Do not use the standard highlighting guidelines
in messages, and do not substitute initial or all caps for other highlighting
practices.</para>
</listitem><listitem><para>Do not use <emphasis>and/or</emphasis>. This construction
does not exist in other languages. Usually it is better to say <emphasis>or</emphasis> to indicate that it is not necessary to do both.</para>
</listitem><listitem><para>Use the 24-hour clock. Do not use <emphasis>a.m.</emphasis> or <emphasis>p.m.</emphasis> to specify time. For example, write <emphasis>1:00 p.m.</emphasis> as <emphasis>1300</emphasis>.</para>
</listitem><listitem><para>Avoid acronyms. Only use acronyms that are better
known to your audience than their spelled-out versions. To make a plural
of an acronym, add a lowercase <symbol role="Variable">s</symbol>, without
an apostrophe. Verify that it is not a trademark before using it.</para>
</listitem><listitem><para>Avoid the &ldquo;no-no&rdquo; words. Examples are
<emphasis>abort</emphasis>, <emphasis>argument</emphasis>,
and <emphasis>execute</emphasis>. See the project glossary.</para>
</listitem><listitem><para>Retain meaningful terminology. Keep as much of
the original message text as possible while ensuring that the message is
meaningful and translatable.</para>
</listitem></itemizedlist>
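<para>The guideline on plurals suggests conditioning the code so that two different
messages are issued. A minimal C sketch of that idea follows. The function
name <computeroutput>format_error_count</computeroutput> and the message wording
are assumptions made for illustration, and a language with more than two
plural forms would need further message variants.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a complete singular or plural message into buf. The two format
 * strings are separate messages, so each one can be translated whole.
 * Returns the value of snprintf().
 */
int format_error_count(char *buf, size_t buflen, int count)
{
    assert(buf != NULL);

    const char *format = (count == 1)
        ? "foo: The file contains %d error.\n"
        : "foo: The file contains %d errors.\n";

    return snprintf(buf, buflen, format, count);
}
```
]]></programlisting>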
</sect1>
<sect1 id="IPG.msgs.div.7">
<title id="IPG.msgs.mkr.6">Usage Statements</title>
<para>The<indexterm><primary>messages</primary><secondary>usage statements
in</secondary></indexterm> usage statement is generated by commands when at
least one flag that is not valid has been included in the command line. The
usage statement must not be used if only the data associated with a flag
is missing or incorrect. If this occurs, an error message unique to the problem
is used.</para>
<itemizedlist remap="Bullet1"><listitem><para>Show the command syntax in the
usage statement. For example, a possible usage statement for the <computeroutput>del</computeroutput> command reads:</para>
<para remap="CodeIndent1"><computeroutput>Usage: del {File ...|-}</computeroutput></para>
</listitem><listitem><para>Clauses defining the purpose of a command are to
be removed.</para>
</listitem><listitem><para>Capitalize the first letter of such words (parameters)
as <emphasis>File, Directory, String, Number,</emphasis> and so on only when
used in a usage statement.</para>
</listitem><listitem><para>Do not abbreviate parameters on the command line.
It may be perfectly obvious to experienced users that <emphasis>Num</emphasis>
means <emphasis>Number</emphasis>, but spell it out to ensure correct translation.
</para>
</listitem><listitem><para><indexterm><primary>usage statements, delimiters</primary></indexterm>Use only the following delimiters in usage statements:
</para>
<informaltable>
<tgroup cols="2" colsep="0" rowsep="0">
<colspec align="left" colwidth="100*">
<colspec align="left" colwidth="356*">
<thead>
<row><entry><para>Delimiter</para></entry><entry><para>Description</para></entry>
</row></thead>
<tbody>
<row>
<entry><para>[]</para></entry>
<entry><para>Parameter is optional.</para></entry></row>
<row>
<entry><para>{ }</para></entry>
<entry><para>There is more than one parameter choice, but one of the parameters
is required. (See the following text.)</para></entry></row>
<row>
<entry><para>|</para></entry>
<entry><para>Choose one parameter only. [a|b] indicates that you can choose <emphasis>a</emphasis> or <emphasis>b</emphasis> or neither <emphasis>a</emphasis> nor <emphasis>b</emphasis>. {a|b} indicates that you must choose either <emphasis>a</emphasis>
or <emphasis>b</emphasis>.</para></entry></row>
<row>
<entry><para>...</para></entry>
<entry><para>Parameter can be repeated on the command line. (Note that there
is a space before the ellipsis.)</para></entry></row>
<row>
<entry><para>-</para></entry>
<entry><para>Standard input.</para></entry></row></tbody></tgroup></informaltable>
</listitem><listitem><para>A usage statement parameter does not require square
brackets or braces if it is required and is the only choice, as in the following:
</para>
<para remap="CodeIndent1"><computeroutput>banner String</computeroutput></para>
</listitem><listitem><para>In usage statements, put a space between flags
that must be separated on the command line. For example:</para>
<para remap="CodeIndent1"><computeroutput>unget [-n] [-rSID] [-s] {File|-}</computeroutput></para>
</listitem><listitem><para>If flags can be used together without a separating
space, do not separate them with a space on the command line. For example:
</para>
<para remap="CodeIndent1"><computeroutput>wc [-cwl] {File ...|-}</computeroutput></para>
</listitem><listitem><para>When the order of flags on the command line does
not make a difference, put them in alphabetical order. If the case is mixed,
put lowercase versions first:</para>
<para remap="CodeIndent1"><computeroutput>get -aAijlmM</computeroutput></para>
</listitem><listitem><para>Some usage statements can be long and involved.
Use your best judgment to determine where you should end lines in the usage
statement. The following example shows an old-style usage statement for the <computeroutput>get</computeroutput> command:</para>
<para remap="CodeIndent1"><computeroutput>Usage: get [-e|-k] [-cCutoff] [-iList]
[-rSID] [-wString] [-xList] [-b] [-gmnpst] [-l[p]] File ... Retrieves a specified
version of a Source Code Control System (SCCS) file.</computeroutput></para>
</listitem></itemizedlist>
</sect1>
<sect1 id="IPG.msgs.div.8">
<title>Standard Messages</title>
<para><indexterm><primary>messages</primary><secondary>punctuation and wording
guidelines</secondary></indexterm>Certain commands have standard errors defined
in POSIX.2 documentation. Follow the guidelines set up in POSIX.2, if applicable.
</para>
<itemizedlist remap="Bullet1"><listitem><para>Use the wording <computeroutput>Press the ------ key</computeroutput> to tell the user to select
a key on the keyboard, and name the specific key to press (for example, <computeroutput>Press Ctrl-D</computeroutput>).</para>
</listitem><listitem><para>Unless the system is overloaded, there is no need
to tell the user to <computeroutput>Try again later</computeroutput>. That should be obvious from the
message.</para>
</listitem><listitem><para>When writing message text, use the word <emphasis>parameter</emphasis> to describe text on the command line; use the word
<emphasis>value</emphasis> to indicate numeric data.</para>
</listitem><listitem><para>Use the word <emphasis>flag</emphasis>
rather than the words <emphasis>command option.</emphasis></para>
</listitem><listitem><para>Do not use commas to set off the thousands
place in values. For example, do not use 1,000; use 1000.</para>
</listitem><listitem><para>If a message must be set off with an asterisk,
use two asterisks at the beginning of the message and two asterisks at the
end of the message.</para>
<para remap="CodeIndent1"><computeroutput>** Total **</computeroutput></para>
</listitem><listitem><para>Use <emphasis>log in</emphasis> and <emphasis>log off</emphasis> as verbs.</para>
<para remap="CodeIndent1"><computeroutput>Log in to the system; enter the
data; then log off.</computeroutput></para>
</listitem><listitem><para>Use <emphasis>user name</emphasis>, <emphasis>group name</emphasis>, and <emphasis>login</emphasis> as nouns.</para>
<para remap="CodeIndent1"><computeroutput>The user name is sam. The group
name is staff. The login directory is /u/sam.</computeroutput></para>
</listitem><listitem><para>User number and group number refer to the number
associated with the user's name and group.</para>
</listitem><listitem><para>Do not use the term <emphasis>superuser</emphasis>.
The <emphasis>root user</emphasis> may not have all privileges.</para>
</listitem><listitem><para>Use the words <emphasis>command string</emphasis>
to describe the command with its parameters.</para>
</listitem><listitem><para>Many of the same messages occur frequently.<indexterm>
<primary>messages</primary><secondary>option</secondary></indexterm> Table
A-1 lists the new standard messages that replace the old messages.</para>
</listitem></itemizedlist>
<table id="IPG.msgs.tbl.1" frame="Topbot">
<title>New Standard Messages</title>
<tgroup cols="2" colsep="0" rowsep="0">
<colspec colwidth="3.85in">
<colspec colwidth="2.52in">
<thead>
<row><entry align="left" valign="bottom"><para><literal>Use the Following
Standard Messages</literal></para></entry><entry align="left" valign="bottom"><para><literal>Instead of These Messages</literal></para></entry></row></thead>
<tbody>
<row>
<entry align="left" valign="top"><para><computeroutput>Cannot find or open
the file</computeroutput>.</para></entry>
<entry align="left" valign="top"><para><computeroutput>Can't open filename</computeroutput>.</para></entry></row>
<row>
<entry align="left" valign="top"><para><computeroutput>Cannot find or access
the file</computeroutput>.</para></entry>
<entry align="left" valign="top"><para>Can't access</para></entry></row>
<row>
<entry align="left" valign="top"><para><computeroutput>The syntax of a parameter
is not valid</computeroutput>.</para></entry>
<entry align="left" valign="top"><para>syntax error</para></entry></row></tbody>
</tgroup></table>
</sect1>
<sect1 id="IPG.msgs.div.9">
<title id="IPG.msgs.mkr.7">Regular Expression Standard Messages</title>
<para>Table A-2 lists the standard regular expression error messages, including
the message number associated with each regular expression error:</para>
<table id="IPG.msgs.tbl.2" frame="Topbot">
<title>Regular Expression Standard Messages</title>
<tgroup cols="3">
<colspec colname="1" colwidth="0.7338 in">
<colspec colname="2" colwidth="2.36492 in">
<colspec colname="3" colwidth="1.89039 in">
<tbody>
<row>
<entry><para><literal>Number</literal></para></entry>
<entry><para><literal>Use These Standard Messages</literal></para></entry>
<entry><para><literal>Instead of These Messages</literal></para></entry></row>
<row>
<entry><para>11</para></entry>
<entry><para>Specify a range end point that is less than 256.</para></entry>
<entry><para>Range end point too large.</para></entry></row>
<row>
<entry><para>16</para></entry>
<entry><para>The character or characters between \{ and \} must be numeric.
</para></entry>
<entry><para>Bad number.</para></entry></row>
<row>
<entry><para>25</para></entry>
<entry><para><computeroutput>Specify a \digit between 1 and 9 that is not
greater than the number of subpatterns.</computeroutput></para></entry>
<entry><para><computeroutput>\digit out of range</computeroutput>.</para></entry>
</row>
<row>
<entry><para>36</para></entry>
<entry><para>A delimiter is not correct or is missing.</para></entry>
<entry><para>Illegal or missing delimiter.</para></entry></row>
<row>
<entry><para>41</para></entry>
<entry><para><computeroutput>There is no remembered search string</computeroutput>.
</para></entry>
<entry><para>No remembered search string.</para></entry></row>
<row>
<entry><para>42</para></entry>
<entry><para>There is a missing \( or \).</para></entry>
<entry><para>\(\) imbalance.</para></entry></row>
<row>
<entry><para>43</para></entry>
<entry><para>Do not use \( more than 9 times.</para></entry>
<entry><para><computeroutput>Too many \(</computeroutput>.</para></entry>
</row>
<row>
<entry><para>44</para></entry>
<entry><para>Do not specify more than 2 numbers between \{ and \}.</para></entry>
<entry><para>More than two numbers given in \{ and \}.</para></entry></row>
<row>
<entry><para>45</para></entry>
<entry><para>An opening \{ must have a closing \}.</para></entry>
<entry><para>} expected after \.</para></entry></row>
<row>
<entry><para>46</para></entry>
<entry><para>The first number cannot exceed the second number between \{
and \}.</para></entry>
<entry><para>First number exceeds second in \{ and \}.</para></entry></row>
<row>
<entry><para>48</para></entry>
<entry><para>Specify a valid end point to the range.</para></entry>
<entry><para>Invalid end point in range expression.</para></entry></row>
<row>
<entry><para>49</para></entry>
<entry><para>For each [ there must be a ].</para></entry>
<entry><para><computeroutput>[ ] imbalance</computeroutput>.</para></entry>
</row>
<row>
<entry><para>50</para></entry>
<entry><para>The regular expression is too large for internal memory storage.
Simplify the regular expression.</para></entry>
<entry><para><computeroutput>Regular expression overflow</computeroutput>.
</para></entry></row></tbody></tgroup></table>
</sect1>
<sect1 id="IPG.msgs.div.10">
<title id="IPG.msgs.mkr.8">Sample Messages</title>
<para>These are examples<indexterm><primary>messages</primary><secondary>samples</secondary></indexterm> of original messages and rewritten messages.
The rewritten message follows each original message.</para>
<programlisting>AFLGKEYLTRS &ldquo;Too Many -a Keyletters (Ad9)&rdquo;
AFLGKEYLTRS &ldquo;foo: 7777-007 Use the -a flag less than 11 times.\n&rdquo;
FLGTWICE &ldquo;Flag %c Twice (Ad4)&rdquo;
FLGTWICE &ldquo;foo: 7777-004 Use the %c header flag once.\n&rdquo;
ESTAT &ldquo;can't access %s.\n&rdquo;
ESTAT &ldquo;foo: 7777-031 Cannot find or access %s.\n&rdquo;
EMODE &ldquo;foo: invalid mode\n&rdquo;
EMODE &ldquo;foo: 7777-033 A mode flag or value is not correct.\n&rdquo;
DNORG &ldquo;-d has no argument (ad1)&rdquo;
DNORG &ldquo;foo: 7777-001 Specify a parameter after the -d flag.\n&rdquo;
FLOORRNG &ldquo;floor out of range (ad23)&rdquo;
FLOORRNG &ldquo;foo: 7777-021 Specify a floor value greater than 0\n\
\tand less than 10000.\n&rdquo;
AFLGARG &ldquo;bad -a argument (ad8)&rdquo;
AFLGARG &ldquo;foo: 7777-006 Specify a user name, group name, or\n\
\tgroup number after the -a flag.\n&rdquo;
BADLISTFMT &ldquo;bad list format (ad27)&rdquo;
BADLISTFMT &ldquo;foo: 7777-025 Use numeric version and release\n\
\tnumbers.\n&rdquo;</programlisting>
</sect1>
</appendix>
<!--fickle 1.14 mif-to-docbook 1.7 01/02/96 04:19:51-->


@@ -0,0 +1,61 @@
<!-- $XConsortium: book.sgm /main/8 1996/08/17 18:31:00 rws $ -->
<!DOCTYPE Book PUBLIC "-//HaL and O'Reilly//DTD DocBook//EN" [
<!ENTITY % ISOpublishing PUBLIC "ISO 8879-1986//ENTITIES Publishing//EN">
%ISOpublishing;
<!ENTITY % ISOnumeric PUBLIC "ISO 8879-1986//ENTITIES Numeric and Special Graphic//EN">
%ISOnumeric;
<!ENTITY % ISOdiacritical PUBLIC "ISO 8879-1986//ENTITIES Diacritical Marks//EN">
%ISOdiacritical;
<!ENTITY % ISOgeneraltech PUBLIC "ISO 8879-1986//ENTITIES General Technical//EN">
%ISOgeneraltech;
<!ENTITY % ISOalatin1 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN">
%ISOalatin1;
<!ENTITY % ISOalatin2 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 2//EN">
%ISOalatin2;
<!ENTITY % ISOgreek PUBLIC "ISO 8879-1986//ENTITIES Greek Symbols//EN">
%ISOgreek;
<!ENTITY % ISOboxandline PUBLIC "ISO 8879-1986//ENTITIES Box and Line Drawing//EN">
%ISOboxandline;
<!ENTITY % BEntities SYSTEM "./i18nGuide/BEntity.sgm">
%BEntities;
<!ENTITY MotifProgGd "<Emphasis>Motif Programmer's Guide</Emphasis>">
<!ENTITY Pref SYSTEM "./i18nGuide/preface.sgm">
<!ENTITY intro SYSTEM "./i18nGuide/ch01.sgm">
<!ENTITY deskt SYSTEM "./i18nGuide/ch02.sgm">
<!ENTITY distr SYSTEM "./i18nGuide/ch03.sgm">
<!ENTITY motif SYSTEM "./i18nGuide/ch04.sgm">
<!ENTITY msgs SYSTEM "./i18nGuide/appa.sgm">
]>
<!-- ____________________________________________________________________________ -->
<Book>
<Title>Common Desktop Environment: Internationalization Programmer's Guide</Title>
&Pref;
&intro;
&deskt;
&distr;
&motif;
&msgs;
</Book>

Binary file not shown.

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,716 @@
<!-- $XConsortium: ch01.sgm /main/10 1996/09/08 19:38:50 rws $ -->
<!-- (c) Copyright 1995 Digital Equipment Corporation. -->
<!-- (c) Copyright 1995 Hewlett-Packard Company. -->
<!-- (c) Copyright 1995 International Business Machines Corp. -->
<!-- (c) Copyright 1995 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1995 Novell, Inc. -->
<!-- (c) Copyright 1995 FUJITSU LIMITED. -->
<!-- (c) Copyright 1995 Hitachi. -->
<chapter id="IPG.intro.div.1">
<title id="IPG.intro.mkr.1">Introduction to Internationalization</title>
<para>Internationalization<indexterm><primary>internationalization</primary>
<secondary>definition</secondary></indexterm> is the designing of computer
systems and applications for users around the world. Such users have different
languages and may have different requirements for the functionality and user
interface of the systems they operate. In spite of these differences, users
want to be able to implement enterprise-wide applications that run at their
sites worldwide. These<indexterm><primary>application requirements</primary>
</indexterm> applications must be able to <emphasis>interoperate</emphasis>
across country boundaries, run on a variety of hardware configurations from
multiple vendors, and be localized to meet local users' needs. The need for this open,
distributed computing environment is the rationale for<indexterm><primary>Common Desktop Environment</primary><secondary>description</secondary></indexterm> common
open software environments. The internationalization technology identified
within this specification provides these benefits to a global market.</para>
<sect1 id="IPG.intro.div.2">
<title id="IPG.intro.mkr.2">Overview of Internationalization</title>
<para>Multiple environments may exist within a common open system for support
of different national languages. Each of these national<indexterm><primary>locales</primary><secondary>definition</secondary></indexterm> environments
is called a <emphasis>locale</emphasis>, which covers the language, its
characters, fonts, and the customs used to input and format data. The Common
Desktop Environment is fully internationalized such that any application
can run using any locale installed in the system.</para>
<para>A locale defines the behavior of a program at run time according to
the language and cultural conventions of a user's geographical area. Throughout
the system,<indexterm><primary>locales</primary><secondary>behavior</secondary>
</indexterm> locales affect the following:</para>
<itemizedlist remap="Bullet1"><listitem><para>Encoding and processing of text
data</para>
</listitem><listitem><para>Identifying the language and encoding of resource
files and their text values</para>
</listitem><listitem><para>Rendering and layout of text strings</para>
</listitem><listitem><para>Interchanging text that is used for interclient
text communication</para>
</listitem><listitem><para>Selecting the input method (which code set will
be generated) and the processing of text data</para>
</listitem><listitem><para>Encoding and decoding for interclient text communication
</para>
</listitem><listitem><para>Bitmap/icon files</para>
</listitem><listitem><para>Actions and file types</para>
</listitem><listitem><para>User Interface Definition (UID) files</para>
</listitem></itemizedlist>
<para>An internationalized application contains no code that is dependent
on the user's locale, the characters needed to represent that locale, or
any formats (such as date and currency) that the user expects to see and
interact with. The desktop accomplishes this by separating language- and
culture-dependent information from the application and saving it outside
the application.</para>
<para>Figure&numsp;1-1 shows the kinds of information that should be external
to an application to simplify internationalization.</para>
<figure>
<title id="IPG.intro.mkr.3">Information external to the application</title>
<graphic id="IPG.intro.grph.1" entityref="IPG.intro.fig.1"></graphic>
</figure>
<para>By keeping the language- and culture-dependent information separate
from the application source code, the application does not need to be rewritten
or recompiled to be marketed in different countries. Instead, the only requirement
is for the external information to be localized to accommodate local language
and customs.</para>
<para>An internationalized application is also adaptable to the requirements
of different native languages, local customs, and character-string encodings.
The process of adapting the operation to a particular native language, local
custom, or string encoding is called<indexterm><primary>localization</primary>
<secondary>definition</secondary></indexterm> <emphasis>localization</emphasis>.
A<indexterm><primary>internationalization</primary><secondary>goals of</secondary>
</indexterm> goal of internationalization is to permit localization without
program source modifications or recompilation.</para>
<para>For a quick overview of internationalization, refer to <emphasis>X/Open
CAE Specification System Interface Definition</emphasis>, Issue 4, X/Open
Company Ltd., 1992, ISBN 1-872630-46-4.</para>
<sect2 id="IPG.intro.div.3">
<title>Current State of Internationalization</title>
<para>Previously, the industry supplied many variants of internationalization
from proprietary functions to the new set of standard functions published
by X/Open. Also, there have been different levels of enabling, such as simple
ASCII support, Latin/European support, Asian multibyte support, and Arabic/Hebrew
bidirectional support.</para>
<para>The interfaces defined<indexterm><primary>internationalization</primary>
<secondary>supported languages</secondary></indexterm> within the X/Open specification
are capable of supporting a large set of languages and territories,<indexterm>
<primary>languages</primary></indexterm> including:</para>
<informaltable>
<tgroup cols="2" colsep="0" rowsep="0">
<?PubTbl tgroup dispwid="4.29in">
<colspec align="left" colwidth="123*">
<colspec align="left" colwidth="285*">
<thead>
<row><entry align="left" valign="bottom"><para><literal>Script</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>Description</literal></para></entry></row></thead>
<tbody>
<row>
<entry align="left" valign="top"><para>Latin Language</para></entry>
<entry align="left" valign="top"><para>Americas, Eastern/Western European
</para></entry></row>
<row>
<entry align="left" valign="top"><para>Greek</para></entry>
<entry align="left" valign="top"><para>Greece</para></entry></row>
<row>
<entry align="left" valign="top"><para>Turkish</para></entry>
<entry align="left" valign="top"><para>Turkey</para></entry></row>
<row>
<entry align="left" valign="top"><para>East Asia</para></entry>
<entry align="left" valign="top"><para>Japanese, Korean, and Chinese</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>Indic</para></entry>
<entry align="left" valign="top"><para>Thai</para></entry></row>
<row>
<entry align="left" valign="top"><para>Bidirectional</para></entry>
<entry align="left" valign="top"><para>Arabic and Hebrew</para></entry></row>
</tbody></tgroup></informaltable>
<para>Furthermore, the<indexterm><primary>Common Desktop Environment</primary>
<secondary>goal of</secondary></indexterm> goal of the Common Desktop Environment
is that localization of these technologies (translation of messages and documentation
and other adaptation for local needs) be done in a consistent way, so that
a supported user anywhere in the world will find the same <emphasis>common
localized environment</emphasis> from vendor to vendor.<indexterm><primary>localization</primary><secondary>results of</secondary></indexterm> End users
and administrators can expect a consistent set of localization features that
provide a complete application environment for support of global software.
</para>
</sect2>
<sect2 id="IPG.intro.div.4">
<title id="IPG.intro.mkr.4">Internationalization Standards</title>
<para><indexterm><primary>standards</primary></indexterm>Through the work
of many companies, the functionality of the internationalization application
program interface has been standardized over time to include additional requirements
and languages, particularly those of East Asia. This work has been centered
primarily in the Portable Operating System Interface for Computer Environments
(POSIX) and<indexterm><primary>X/Open specifications</primary></indexterm> X/Open
specifications. The original X/Open specification was published in the second
edition of the <emphasis>X/Open Portability Guide</emphasis> (XPG2) and was
based on the Native Language Support product released by Hewlett-Packard.
The latest published X/Open internationalization standard is referred to
as XPG4.</para>
<para>It is important that each layer within the desktop use the proper set
of<indexterm><primary>standards</primary></indexterm> standards interfaces
defined for internationalization to ensure end users get a consistent, localized
interface. The definition of a locale and the common open set of locale-dependent
functions are based on the following<indexterm><primary>internationalization</primary><secondary>specifications</secondary></indexterm> specifications:
</para>
<itemizedlist remap="Bullet1"><listitem><para><emphasis>X Window System, The
Complete Reference to Xlib, Xprotocol, ICCCM, XLFD - X Version, Release 5</emphasis>, Digital Press, 1992, ISBN 1-55558-088-2.</para>
</listitem><listitem><para><emphasis>ANSI/IEEE Standard Portable Operating
System Interface for Computer Environments</emphasis>, IEEE.</para>
</listitem><listitem><para><emphasis>OSF</emphasis>&trade; <emphasis>Motif
1.2 Programmer's Reference, Revision 1.2</emphasis>, Open Software Foundation,
Prentice Hall, 1992, ISBN 0-13-643115-1.</para>
</listitem><listitem><para><emphasis>X/Open CAE Specification Commands and
Utilities</emphasis>, Issue 4, X/Open Company Ltd., 1992, ISBN 1-872630-48-0.
</para>
</listitem></itemizedlist>
<para><indexterm><primary>standard interfaces, benefit of using</primary>
</indexterm>Within this environment, software developers can expect to develop <emphasis>worldwide applications</emphasis> that are portable, can interoperate across
distributed systems (even from different vendors), and can meet the diverse
language and cultural requirements of multinational users supported by the
desktop standard locales.</para>
</sect2>
<sect2 id="IPG.intro.div.5">
<title>Common Internationalization System</title>
<para><!--Original XRef content: 'Figure&numsp;1&hyphen;2 on page&numsp;6'--><xref
role="CodeOrFigOrTabAndPNum" linkend="IPG.intro.mkr.5"> shows a view of how<indexterm><primary>internationalization</primary><secondary>common system</secondary></indexterm>
internationalization is pervasive across a specific
single-host system. The goal is that the applications (<emphasis>clients</emphasis>)
are built to be shipped worldwide for the set of locales supported in the
underlying system. Using standard interfaces improves access to global markets
and minimizes the amount of localization work needed by application developers.
In addition, country representatives can be assured of consistent localization
within systems adhering to the principles of the desktop.</para>
<figure>
<title id="IPG.intro.mkr.5">Common internationalized system</title>
<graphic id="IPG.intro.grph.2" entityref="IPG.intro.fig.2"></graphic>
</figure>
</sect2>
</sect1>
<sect1 id="IPG.intro.div.6">
<title id="IPG.intro.mkr.6">Locales<indexterm><primary>Common Desktop Environment</primary><secondary>National Language Support</secondary><tertiary>using
locales</tertiary></indexterm><indexterm><primary>Common Desktop Environment</primary><secondary>National Language Support</secondary><tertiary>setlocale
function</tertiary></indexterm><indexterm><primary>setlocale function</primary>
<secondary>for internationalization</secondary></indexterm></title>
<para>Most single-display clients operate in a single locale that is determined
at run time from the setting of an environment variable, usually <computeroutput>$LANG</computeroutput>, or from the <computeroutput>xnlLanguage</computeroutput>
resource. Locale environment variables, such as <computeroutput>LC_ALL</computeroutput>,
<computeroutput>LC_CTYPE</computeroutput>, and <computeroutput>LANG</computeroutput>,
can be used to control the environment.
</para>
<para>The <computeroutput>LC_CTYPE</computeroutput> category of the locale
is used by the environment to identify the locale-specific features used
at run time. The fonts and input method loaded by the toolkit are determined
by the <computeroutput>LC_CTYPE</computeroutput> category.</para>
<para>Programs that are enabled for internationalization are expected to call
the <computeroutput>XtSetLanguageProc()</computeroutput> function (which
calls <computeroutput>setlocale()</computeroutput> by default) to set the
locale desired by the user. None of the libraries call the <computeroutput>setlocale()</computeroutput> function to set the locale, so it is the responsibility
of the application to call <computeroutput>XtSetLanguageProc()</computeroutput>
with either a specific locale or some value loaded at run time. An internationalized
application that does not use <computeroutput>XtSetLanguageProc()</computeroutput> should obtain the locale name from one of the following sources, listed in priority
order, and pass it to the <computeroutput>setlocale()</computeroutput> function:</para>
<itemizedlist remap="Bullet1"><listitem><para>A command-line option</para>
</listitem><listitem><para>A resource</para>
</listitem><listitem><para>The empty string (<computeroutput>""</computeroutput>)</para>
</listitem></itemizedlist>
<para>The empty string makes the <computeroutput>setlocale()</computeroutput>
function use the <computeroutput>$LC_*</computeroutput> and <computeroutput>$LANG</computeroutput>
environment variables to determine locale settings. Specifically, <computeroutput>setlocale(LC_ALL, "")</computeroutput> specifies that
the locale for each category is taken from the first environment variable
that is set, checked in the order shown in Table 1-1.</para>
<table id="IPG.intro.tbl.1" frame="Topbot">
<title>Locale Categories</title>
<tgroup cols="4" colsep="0" rowsep="0">
<colspec colwidth="1.68in">
<colspec colwidth="1.20in">
<colspec colwidth="1.46in">
<colspec colwidth="1.55in">
<thead>
<row><entry align="left" valign="bottom"><para><literal>Category</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>1st Env. Var.</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>2nd Env. Var.</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>3rd Env. Var.</literal></para></entry>
</row></thead>
<tbody>
<row>
<entry align="left" valign="top"><para>LC_CTYPE:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_CTYPE</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row>
<row>
<entry align="left" valign="top"><para>LC_COLLATE:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_COLLATE</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row>
<row>
<entry align="left" valign="top"><para>LC_TIME:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_TIME</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row>
<row>
<entry align="left" valign="top"><para>LC_NUMERIC:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_NUMERIC</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row>
<row>
<entry align="left" valign="top"><para>LC_MONETARY:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_MONETARY</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row>
<row>
<entry align="left" valign="top"><para>LC_MESSAGES:</para></entry>
<entry align="left" valign="top"><para>LC_ALL</para></entry>
<entry align="left" valign="top"><para>LC_MESSAGES</para></entry>
<entry align="left" valign="top"><para>LANG</para></entry></row></tbody></tgroup>
</table>
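<para>The lookup order in Table 1-1 can be sketched in C as follows. This
is only an illustration: the helper name <computeroutput>env_locale_for</computeroutput>
is hypothetical and is not part of any library, and the fallback to the
<computeroutput>"C"</computeroutput> locale when no variable is set is an assumption
about typical behavior. In a real program, <computeroutput>setlocale()</computeroutput>
performs this lookup itself.</para>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): return the locale name
 * that the environment selects for one category, following Table 1-1.
 * LC_ALL is checked first, then the category's own variable, then LANG.
 * Unset or empty variables are skipped. The "C" fallback is an assumption. */
const char *env_locale_for(const char *category_var)
{
    const char *order[3];
    size_t i;

    assert(category_var != NULL);

    order[0] = "LC_ALL";
    order[1] = category_var;   /* for example "LC_CTYPE" or "LC_TIME" */
    order[2] = "LANG";

    for (i = 0; i < 3; i++) {
        const char *value = getenv(order[i]);
        if (value != NULL && value[0] != '\0')
            return value;      /* first variable that is set wins */
    }
    return "C";
}
```

<para>For example, with <computeroutput>LANG</computeroutput> set to one locale
and <computeroutput>LC_CTYPE</computeroutput> set to another, a call such as
<computeroutput>env_locale_for("LC_CTYPE")</computeroutput> returns the
<computeroutput>LC_CTYPE</computeroutput> value, unless <computeroutput>LC_ALL</computeroutput>
is also set.</para>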
<para>The toolkit already defines a standard command-line option (<command>-lang</command>) and a resource (<systemitem>xnlLanguage</systemitem>). Also,
the resource value can be set in the server <filename>RESOURCE_MANAGER</filename> property,
which may affect all clients that connect to that server.</para>
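<para>For an application that does not use <computeroutput>XtSetLanguageProc()</computeroutput>,
the prioritized choice among a command-line option, a resource, and the empty
string might look like the following sketch. The function name
<computeroutput>choose_locale_name</computeroutput> is hypothetical, and the resource
value is assumed to have been fetched by the caller (it is passed in as a plain
string, or as <computeroutput>NULL</computeroutput> when the resource is not set).</para>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pick the string to pass to setlocale().
 * Priority: 1. the value following a "-lang" command-line option,
 *           2. a non-empty resource value supplied by the caller,
 *           3. the empty string, which defers to the environment. */
const char *choose_locale_name(int argc, char *const argv[],
                               const char *resource_value)
{
    int i;

    assert(argc >= 0);

    /* argv[0] is the program name; "-lang" needs a following value. */
    for (i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-lang") == 0)
            return argv[i + 1];
    }
    if (resource_value != NULL && resource_value[0] != '\0')
        return resource_value;
    return "";
}
```

<para>The result would then be passed to <computeroutput>setlocale(LC_ALL, ...)</computeroutput>.
Returning the empty string as the last resort keeps the environment variables
in control when the user gives no explicit choice.</para>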
</sect1>
<sect1 id="IPG.intro.div.7">
<title id="IPG.intro.mkr.7">Fonts, Font Sets, and Render Tables<indexterm>
<primary>National Language Support</primary><secondary>understanding</secondary>
<tertiary>fonts</tertiary></indexterm><indexterm><primary>National Language
Support</primary><secondary>understanding</secondary><tertiary>font sets</tertiary>
</indexterm><indexterm><primary>National Language Support</primary><secondary>font sets</secondary></indexterm><indexterm><primary>National Language Support</primary><secondary>fonts</secondary></indexterm><indexterm><primary>National
Language Support</primary><secondary>understanding</secondary><tertiary>render
tables</tertiary></indexterm><indexterm><primary>National Language Support</primary><secondary>render tables</secondary></indexterm></title>
<para>All X clients use fonts for drawing text. The basic object used in drawing
text is <computeroutput>XFontStruct</computeroutput>, which identifies the font that contains
the images to be drawn.</para>
<para>The desktop already supports fonts by way of the <computeroutput>XFontStruct</computeroutput> data structure defined by Xlib; yet, the encoding of the
characters within the font must be known to an internationalized application.
To communicate this information, the program expects that all fonts at the
server are identified by an X Logical Font Description (XLFD) name. The XLFD
name enables users to describe both the base characteristics and the charset
(encoding of font glyphs). The term <symbol role="Variable">charset</symbol>
is used to denote the encoding of glyphs within the font, while the term <emphasis>code set</emphasis> means the encoding of characters within the locale. The
charset for a given font is determined by the CharSetRegistry and CharSetEncoding
fields of the XLFD name. Text and symbols are drawn as defined by the codes
in the fonts.</para>
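<para>Because the charset occupies the last two hyphen-separated fields of an
XLFD name, it can be located with simple string handling. The following is a
minimal sketch; the function name <computeroutput>xlfd_charset</computeroutput>
is hypothetical, and the code does not verify that the name has the full set
of XLFD fields.</para>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the "registry-encoding" tail
 * of an XLFD name (the CharSetRegistry and CharSetEncoding fields), or
 * NULL when the name contains fewer than two hyphens. The pointer refers
 * to the caller's string; nothing is copied. */
const char *xlfd_charset(const char *name)
{
    size_t i;
    size_t hyphens = 0;

    assert(name != NULL);

    /* Scan backward; the text after the second hyphen from the end
     * is the registry, a hyphen, and the encoding. */
    for (i = strlen(name); i > 0; i--) {
        if (name[i - 1] == '-' && ++hyphens == 2)
            return name + i;
    }
    return NULL;
}
```

<para>Applied to a name that ends in <computeroutput>-iso8859-1</computeroutput>,
the sketch yields <computeroutput>iso8859-1</computeroutput>; a non-XLFD alias
without hyphens yields <computeroutput>NULL</computeroutput>.</para>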
<para>A <emphasis>font set</emphasis> (for example, an <computeroutput>XFontSet</computeroutput> data structure defined by Xlib) is a collection of one
or more fonts that enables all characters defined for a given locale to be
drawn. Internationalized applications may be required to draw text encoded
in the code sets of the locale where the value of an encoded character is
not identical to the glyph index. Additionally, rendering all characters of the
locale may require several fonts whose encodings differ from the code set of
the locale. Since both code sets and
charsets may vary from locale to locale, the concept of a font set is introduced
through <computeroutput>XFontSet</computeroutput>.</para>
<para>While fonts are identified by their XLFD name, font sets are identified
by a list of XLFD names. The list can consist of one or more XLFD names with
the exception that only the base characteristics are significant; the encoding
of the desired fonts is determined from the locale. Any charsets specified
in the XLFD base name list are ignored and users need only concentrate on
specifying the base characteristics, such as point size, style, and weight.
A font set is said to be <emphasis>locale-sensitive</emphasis> and is used
to draw text that is encoded in the code set of the locale. Internationalized
applications should use font sets instead of font structs to render text
data.</para>
<para>Render tables are collections of renditions that specify how text is
to be rendered. They are summarized elsewhere in this section.</para>
<sect2 id="IPG.intro.div.8">
<title>Font Specification</title>
<para>The <emphasis>font specification</emphasis> can be either an X Logical
Font Description (XLFD) name or an alias for the XLFD name. For example,
the following are valid font specifications for a 14-point font:</para>
<programlisting>-dt-application-medium-r-normal-serif-*-*-*-*-p-*-iso8859-1
</programlisting>
<para>OR</para>
<programlisting>-*-r-*-14-*iso8859-1</programlisting>
</sect2>
<sect2 id="IPG.intro.div.9">
<title>Font Set Specification<indexterm><primary>font sets</primary><secondary>internationalizing</secondary></indexterm></title>
<para>The <emphasis>font set specification</emphasis> is a list of names (XLFD
names or their aliases) and is sometimes called a <emphasis>base name list</emphasis>. All names are separated by commas, with any blank spaces before
or after the comma being ignored. Pattern-matching (wildcard) characters
can be specified to help shorten XLFD names.</para>
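<para>The comma-and-blank rule for a base name list can be illustrated with a
small C sketch. Xlib parses the list itself when a font set is created, so this
code is only a model of the rule; the function name
<computeroutput>base_name_at</computeroutput> is hypothetical.</para>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the index-th name (counting from 0) of a
 * comma-separated base name list into out, ignoring blanks on either
 * side of each comma. Returns 1 on success, or 0 if the list has fewer
 * names than requested or the buffer is too small. */
int base_name_at(const char *list, size_t index, char *out, size_t out_size)
{
    const char *start = list;
    const char *end;
    size_t len;

    assert(list != NULL && out != NULL && out_size > 0);

    /* Advance to the requested element. */
    while (index > 0) {
        start = strchr(start, ',');
        if (start == NULL)
            return 0;
        start++;               /* step past the comma */
        index--;
    }
    end = strchr(start, ',');
    if (end == NULL)
        end = start + strlen(start);

    /* Drop blanks before and after the name. */
    while (start < end && (*start == ' ' || *start == '\t'))
        start++;
    while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
        end--;

    len = (size_t)(end - start);
    if (len + 1 > out_size)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

<para>Calling the helper with increasing index values until it returns 0 walks
the whole list, one trimmed name at a time.</para>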
<para>Remember that a font set specification depends on the locale that
is in effect. For example, the ja_JP Japanese locale defines three fonts (character
sets) necessary to display all of its characters; the following identifies
the set of Mincho fonts needed.</para>
<itemizedlist remap="Bullet1"><listitem><para>Example of full XLFD name list:
</para>
<programlisting>-dt-mincho-medium-r-normal--14-*-*-m-*-jisx0201.1976-0,-dt-mincho-medium-r-normal--28-*-*-*-m-*-jisx0208.1983-0:</programlisting>
</listitem><listitem><para>Example of single XLFD pattern name:</para>
<programlisting>-dt-*-medium-*-24-*-m-*:</programlisting>
</listitem></itemizedlist>
<para>The preceding two cases can be used with a Japanese locale as long as
fonts exist that match the base name list.</para>
</sect2>
<sect2 id="IPG.intro.div.10">
<title id="IPG.intro.mkr.8">Base Font Name List Specification<indexterm>
<primary>font sets</primary><secondary>specifying base name list</secondary>
</indexterm><indexterm><primary>National Language Support</primary><secondary>specifying</secondary><tertiary>base name list</tertiary></indexterm><indexterm>
<primary>internationalization</primary><secondary>specifying base name lists</secondary></indexterm></title>
<para>The <emphasis>base font name list</emphasis> is a list of base font
names associated with a font set as defined by the locale. The base font
names are in a comma-separated list and are assumed to be characters from
the portable character set; otherwise, the result is undefined. Blank space
immediately on either side of a separating comma is ignored.<indexterm>
<primary>base font name list</primary></indexterm></para>
<para>Use of XLFD font names permits international applications to obtain
the fonts needed for a variety of locales from a single locale-independent
base font name. The single base font name specifies a family of fonts whose
members are encoded in the various charsets needed by the locales of interest.<indexterm>
<primary>X Logical Font Description (XLFD)</primary><secondary>font names
for international locale</secondary></indexterm></para>
<para>An XLFD base font name can explicitly name the font's charset needed
for the locale. This enables the user to specify an exact font for use with
a charset required by a locale, fully controlling the font selection.</para>
<para>If a base font name is not an XLFD name, an attempt is made to obtain
an XLFD name from the font properties for the font.</para>
<para>The following algorithm is used to select the fonts that are used to
display text with font sets.<indexterm><primary>font selection algorithm,
displaying text with font sets</primary></indexterm><indexterm><primary>base font name list</primary></indexterm></para>
<para>For each charset required by the locale, the base font name list is
searched for the first of the following cases that names a set of fonts that
exist at the server.</para>
<itemizedlist remap="Bullet1"><listitem><para>The first XLFD-conforming base
font name that specifies the required charset or a superset of the required
charset in its CharSetRegistry and CharSetEncoding fields.</para>
</listitem><listitem><para>The first set of one or more XLFD-conforming base
font names that specify one or more charsets that can be remapped to support
the required charset. The Xlib implementation can recognize various mappings
from a required charset to one or more other charsets and use the fonts for
those charsets. For example, JIS Roman is ASCII with the ~ (tilde) and \
(backslash) characters replaced by the yen and overbar characters; Xlib can
load an ISO8859-1 font to support this character set if a JIS Roman font
is not available.</para>
</listitem><listitem><para>The first XLFD-conforming font name, or the first
non-XLFD font name for which an XLFD font name can be obtained, combined
with the required charset (replacing the CharSetRegistry and CharSetEncoding
fields in the XLFD font name). In the first instance, the implementation
can use a charset that is a superset of the required charset.</para>
</listitem><listitem><para>The first font name that can be mapped in some
locale-dependent manner to one or more fonts that support imaging text in
the charset.</para>
</listitem></itemizedlist>
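<para>The first case of the algorithm can be modeled in a few lines of C. The
sketch below covers only an exact, case-insensitive match of the charset fields;
it does not model supersets, remapping, or the remaining cases, and it is not
the Xlib implementation. All function names are hypothetical.</para>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare two strings without regard to ASCII letter case. */
static int same_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Return the "registry-encoding" tail of a name, or NULL if the name
 * has fewer than two hyphens (for example, a non-XLFD alias). */
static const char *charset_of(const char *name)
{
    size_t i;
    size_t hyphens = 0;

    for (i = strlen(name); i > 0; i--) {
        if (name[i - 1] == '-' && ++hyphens == 2)
            return name + i;
    }
    return NULL;
}

/* Hypothetical model of case 1: return the index of the first base font
 * name whose charset fields equal the required charset, or -1 if no
 * name in the list specifies that charset. */
int first_name_for_charset(const char *const names[], size_t count,
                           const char *charset)
{
    size_t i;

    assert(names != NULL && charset != NULL);

    for (i = 0; i < count; i++) {
        const char *cs = charset_of(names[i]);
        if (cs != NULL && same_ignoring_case(cs, charset))
            return (int)i;
    }
    return -1;
}
```

<para>A result of -1 corresponds to falling through to the later cases, in which
the charset fields of a base font name are replaced or mapped.</para>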
<para>For example, assume a locale requires the following charsets:</para>
<itemizedlist remap="Bullet1"><listitem><para>ISO8859-1</para>
</listitem><listitem><para>JISX0208.1983</para>
</listitem><listitem><para>JISX0201.1976</para>
</listitem><listitem><para>GB2312-1980.0</para>
</listitem></itemizedlist>
<para>You can supply a base font name list that explicitly specifies the charsets,
ensuring that specific fonts are used if they exist, as shown in the following
example:</para>
<programlisting>&ldquo;-dt-mincho-Medium-R-Normal-*-*-*-*-*-M-*-JISX0208.1983-0,\
-dt-mincho-Medium-R-Normal-*-*-*-*-*-M-*-JISX0201.1976-1,\
-dt-song-Medium-R-Normal-*-*-*-*-*-M-*-GB2312-1980.0,\
-*-default-Bold-R-Normal-*-*-*-*-M-*-ISO8859-1&rdquo;</programlisting>
<para>You can supply a base font name list that omits the charsets, letting
Xlib select fonts for each required charset, as shown in the following example:
</para>
<programlisting>&ldquo;-dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
-dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
-dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
-*-Courier-Bold-R-Normal-*-*-*-*-M-*&rdquo;</programlisting>
<para>Alternatively, the user can supply a single base font name that selects
from all available fonts that meet certain minimum XLFD property requirements,
as shown in the following example:</para>
<programlisting>&ldquo;-*-*-*-R-Normal--*-*-*-*-*-M-*&rdquo;</programlisting>
</sect2>
<sect2 id="IPG.intro.div.12">
<title>Render Tables<indexterm><primary>render tables</primary><secondary>internationalizing</secondary></indexterm></title>
<para>A <emphasis>render table</emphasis> consists of one or more entries
called <emphasis>renditions</emphasis>.<indexterm><primary>rendition</primary>
</indexterm> Each rendition is tagged with a name that is used when drawing
a compound string. For internationalized applications, renditions and render
tables should be specified in resource files to ensure the independence of
application binaries from the different needs of various locales. For detailed
information about renditions and render tables, refer to &MotifProgGd;.</para>
</sect2>
</sect1>
<sect1 id="IPG.intro.div.13">
<title id="IPG.intro.mkr.9">Text Drawing</title>
<para>The desktop provides various functions for rendering localized text,
including simple text, compound strings, and some widgets. These include
functions within the Xlib and Motif libraries.</para>
</sect1>
<sect1 id="IPG.intro.div.14">
<title id="IPG.intro.mkr.10">Input Methods<indexterm><primary>National Language
Support</primary><secondary>entering input</secondary></indexterm><indexterm>
<primary>National Language Support</primary><secondary>using input methods</secondary></indexterm><indexterm><primary>Common Desktop Environment</primary>
<secondary>National Language Support</secondary><tertiary>input areas</tertiary>
</indexterm><indexterm><primary>National Language Support</primary><secondary>input areas</secondary></indexterm><indexterm><primary>Common Desktop Environment</primary><secondary>input area</secondary><tertiary>details of</tertiary>
</indexterm></title>
<para>The Common Desktop Environment provides the ability to enter localized
input for an internationalized application that is using the Motif Toolkit.
Specifically, the <computeroutput>XmText[Field]</computeroutput> widgets
are enabled to interface with input methods provided by each locale. In addition,
the <computeroutput>dtterm</computeroutput> client is enabled to use input
methods. For detailed information on input methods, refer to <citetitle>Motif
Programmer's Guide</citetitle>.</para>
<para>By default, each internationalized client that uses the Motif Toolkit
uses the input method associated with a locale specified by the user. The
<systemitem class="Resource">XmNinputMethod</systemitem> resource supplies
the input method portion of the locale modifiers, which lets a user specify
an alternative input method.</para>
<para>The user interface of the input method consists of several elements.
The need for these areas is dependent on the input method being used. They
are usually needed by input methods that require complex input processing
and dialogs. See Figure&numsp;1-3 for an illustration of these areas.</para>
<figure>
<title id="IPG.intro.mkr.11">Example of VendorShell widget with auxiliary
(Japanese)</title>
<graphic id="IPG.intro.grph.3" entityref="IPG.intro.fig.3"></graphic>
</figure>
<para>The <classname>VendorShell</classname> may contain the <systemitem class="Resource">XmNinputPolicy</systemitem> resource.<indexterm><primary>VendorShell</primary>
<secondary>input policy</secondary></indexterm> This resource dictates whether the
child widgets of the shell share input contexts. With a root window input
method style, for example, the input context would probably be shared by several
widgets. With an off-the-spot input method style, the input context may or may
not be shared by more than one widget. With an over-the-spot input method style,
an input context would almost certainly belong to a single widget. The possible
values of <systemitem class="Resource">XmNinputPolicy</systemitem> are <systemitem class="Constant">XmPER_WIDGET</systemitem>, which provides a new input context for each widget, and <systemitem class="Constant">XmPER_SHELL</systemitem>, which causes the child widgets
of a common shell to share a single input context.
</para>
<sect2 id="IPG.intro.div.15">
<title id="IPG.intro.mkr.12">Preedit Area<indexterm><primary>Common Desktop
Environment</primary><secondary>input area</secondary><tertiary>preedit area</tertiary></indexterm><indexterm><primary>preedit areas</primary><secondary>description</secondary></indexterm><indexterm><primary>VendorShell widget
class</primary><secondary>preedit area</secondary></indexterm><indexterm>
<primary>preedit areas</primary><secondary>VendorShell widget class</secondary>
</indexterm></title>
<para>A preedit area is used to display the string being preedited. An input
method can support the following modes of preediting: OffTheSpot, OnTheSpot (the default),
OverTheSpot, Root, and None.</para>
<note>
<para>A string that has been committed <emphasis>cannot</emphasis> be reconverted.
The status of the string is moved from the preedit area to the location where
the user is entering characters.<indexterm><primary>Japanese Input Method</primary><secondary>preediting, reconverted strings</secondary></indexterm></para>
</note>
<sect3 id="IPG.intro.div.16">
<title>OffTheSpot<indexterm><primary>preedit areas</primary><secondary>OffTheSpot mode</secondary></indexterm><indexterm><primary>OffTheSpot mode, preedit area</primary></indexterm><indexterm><primary>modes of preediting</primary><secondary>OffTheSpot</secondary></indexterm></title>
<para>In OffTheSpot mode, the preedit area is usually located inside the
application window, to the right
of the status area, as shown in <!--Original XRef content:
'Figure&numsp;1&hyphen;4'--><xref role="CodeOrFigureOrTable" linkend="IPG.intro.mkr.13">.
A Japanese input method is used for the example.</para>
<figure>
<title id="IPG.intro.mkr.13">Example of OffTheSpot preediting with the VendorShell
widget (Japanese)</title>
<graphic id="IPG.intro.grph.4" entityref="IPG.intro.fig.4"></graphic>
</figure>
<para>During preediting, the preedit string may be highlighted in some form,
depending on the input method.</para>
<para>To use OffTheSpot mode, set the <systemitem>XmNpreeditType</systemitem>
resource of the <computeroutput>VendorShell</computeroutput> widget either
with the <computeroutput>XtSetValues()</computeroutput> function or with a
resource file. The <systemitem>XmNpreeditType</systemitem> resource can also
be set as the resource of a <computeroutput>TopLevelShell</computeroutput>, <computeroutput>ApplicationShell</computeroutput>, or <computeroutput>DialogShell</computeroutput>
widget, all of which are subclasses of the <computeroutput>VendorShell</computeroutput>
widget class.</para>
</sect3>
<sect3 id="IPG.intro.div.17">
<title>OverTheSpot<indexterm><primary>preedit areas</primary><secondary>OverTheSpot mode</secondary></indexterm><indexterm><primary>preedit areas</primary><secondary>default mode</secondary></indexterm><indexterm><primary>OverTheSpot mode, preedit area</primary></indexterm><indexterm><primary>modes of preediting</primary><secondary>OverTheSpot</secondary></indexterm></title>
<para>In OverTheSpot mode, the location of the preedit
area is set to where the user is trying to enter characters (for example,
the insert cursor position of the <computeroutput>Text</computeroutput> widget
that has the current focus). The characters in a preedit area are displayed
at the cursor position as an overlay window, and they can be highlighted
depending on the input method.</para>
<para>Although a preedit area may consist of multiple lines in OverTheSpot mode, the preedit area is always within the MainWindow area and
cannot cross its edges in any direction.</para>
<para>Keep in mind that although the preedit string under construction may
be displayed as though it were part of the <computeroutput>Text</computeroutput>
widget's text, it is not passed to the client and displayed in the underlying
edit screen until preedit ends. See <!--Original XRef content: 'Figure&numsp;1&hyphen;5
on page&numsp;17'--><xref role="CodeOrFigOrTabAndPNum" linkend="IPG.intro.mkr.14">
for an illustration.</para>
<para>To use OverTheSpot mode explicitly, set the <systemitem>XmNpreeditType</systemitem> resource of the <computeroutput>VendorShell</computeroutput>
widget either with the <computeroutput>XtSetValues()</computeroutput> function
or with a resource file. The <systemitem>XmNpreeditType</systemitem> resource
can be set as the resource of a <computeroutput>TopLevelShell</computeroutput>, <computeroutput>ApplicationShell</computeroutput>, or <computeroutput>DialogShell</computeroutput>
widget because these are subclasses of the <computeroutput>VendorShell</computeroutput>
widget class.</para>
<figure>
<title id="IPG.intro.mkr.14">Example of OverTheSpot preediting with the VendorShell
widget (Japanese)</title>
<graphic id="IPG.intro.grph.5" entityref="IPG.intro.fig.5"></graphic>
</figure>
</sect3>
<sect3 id="IPG.intro.div.17a">
<title>OnTheSpot (Default)</title>
<indexterm><primary>preedit areas</primary><secondary>OnTheSpot</secondary>
</indexterm><indexterm><primary>OnTheSpot mode, preedit area</primary></indexterm><indexterm><primary>modes of preediting</primary><secondary>OnTheSpot</secondary>
</indexterm>
<para>In OnTheSpot mode, the preedit string is displayed in the text widget
window. The preedit string is considered part of the text widget value, and
its integrity can be ensured by the verify callbacks of the text widget (the
verify callbacks are controlled by the
<systemitem class="resource">verifyPreedit</systemitem> resource, which defaults
to <literal>False</literal>).
If the
verify callbacks of the text widget do not accept any part of the preedit
buffer, the preedit string is committed (for information on user actions that
cause the preedit string to be committed, refer to &MotifProgGd;).</para>
<para>During preediting, the preedit string may be highlighted in some form,
depending on the input method.</para>
<para>To use OnTheSpot mode, set the <systemitem>XmNpreeditType</systemitem>
resource of the <computeroutput>VendorShell</computeroutput> widget either
with the <computeroutput>XtSetValues()</computeroutput> function or with a
resource file. The <systemitem>XmNpreeditType</systemitem> resource can also
be set as the resource of a <computeroutput>TopLevelShell</computeroutput>, <computeroutput>ApplicationShell</computeroutput>, or <computeroutput>DialogShell</computeroutput>
widget, all of which are subclasses of the <computeroutput>VendorShell</computeroutput>
widget class.</para>
</sect3>
<sect3 id="IPG.intro.div.18">
<title>Root<indexterm><primary>preedit areas</primary><secondary>Root mode</secondary></indexterm><indexterm><primary>Root mode, preedit area</primary>
</indexterm><indexterm><primary>modes of preediting</primary><secondary>Root</secondary></indexterm></title>
<para>In Root mode, the preedit and status areas are located separately from
the client's window. The Root mode behavior is similar to OffTheSpot. See
<!--Original XRef content: 'Figure&numsp;1&hyphen;6'--><xref role="CodeOrFigureOrTable"
linkend="IPG.intro.mkr.15"> for an illustration.</para>
<figure>
<title id="IPG.intro.mkr.15">Example of Root preediting with the VendorShell
widget (Japanese)</title>
<graphic id="IPG.intro.grph.6" entityref="IPG.intro.fig.6"></graphic>
</figure>
</sect3>
</sect2>
<sect2 id="IPG.intro.div.19">
<title id="IPG.intro.mkr.16">Status Area<indexterm><primary>Common Desktop
Environment</primary><secondary>input area</secondary><tertiary>status area</tertiary></indexterm><indexterm><primary>status area</primary></indexterm><indexterm>
<primary>VendorShell widget class</primary><secondary>status area</secondary>
</indexterm></title>
<para>A status area reports the input or keyboard status
of the input method to the user. For OverTheSpot and OffTheSpot styles,
the status area is located at the lower left corner of the VendorShell window.
</para>
<itemizedlist remap="Bullet1"><listitem><para>If the preedit style is Root, the status area
is placed outside the client window.</para>
</listitem><listitem><para>If the preedit style is OffTheSpot, the preedit
area is displayed to the right of the status area.</para>
</listitem></itemizedlist>
<para>The <computeroutput>VendorShell</computeroutput> widget provides geometry
management so that a status area is rearranged at the bottom corner of the
VendorShell window.</para>
</sect2>
<sect2 id="IPG.intro.div.20">
<title id="IPG.intro.mkr.17">Auxiliary Area<indexterm><primary>Common Desktop
Environment</primary><secondary>input area</secondary><tertiary>auxiliary
area</tertiary></indexterm><indexterm><primary>auxiliary area</primary></indexterm><indexterm>
<primary>VendorShell widget class</primary><secondary>auxiliary area</secondary>
</indexterm></title>
<para>An auxiliary area helps the user with preediting. Depending on the particular
input method, an auxiliary area can be created. The Japanese input method
in <!--Original XRef content: 'Figure&numsp;1&hyphen;3 on page&numsp;14'--><xref
role="CodeOrFigOrTabAndPNum" linkend="IPG.intro.mkr.11"> creates the following
types of auxiliary areas:<indexterm><primary>auxiliary
area</primary></indexterm><indexterm><primary>Japanese Input Method</primary>
<secondary>auxiliary area</secondary></indexterm></para>
<itemizedlist remap="Bullet1"><listitem><para>ZENKOUHO</para>
</listitem><listitem><para>JIS NUMBER</para>
</listitem><listitem><para>Switching conversion method</para>
<itemizedlist remap="Bullet2"><listitem><para>SAKIYOMI-REN-BUNSETSU</para>
</listitem><listitem><para>IKKATSU-REN-BUNSETSU</para>
</listitem><listitem><para>TAN-BUNSETSU</para>
</listitem><listitem><para>FUKUGOU-GO</para>
</listitem></itemizedlist>
</listitem></itemizedlist>
</sect2>
<sect2 id="IPG.intro.div.22">
<title id="IPG.intro.mkr.19">Focus Area<indexterm><primary>Common Desktop
Environment</primary><secondary>input area</secondary><tertiary>focus area</tertiary></indexterm><indexterm><primary>focus area</primary></indexterm><indexterm>
<primary>VendorShell widget class</primary><secondary>focus area</secondary>
</indexterm><indexterm><primary>focus management</primary><secondary>focus
area</secondary></indexterm><indexterm><primary>focus area</primary></indexterm></title>
<para>A focus area is any descendant widget under the <computeroutput>VendorShell</computeroutput> widget subtree that currently has focus. The Motif application
programmer using existing widgets does not need to worry about the focus area. The important information to remember is that only one widget
can have input method processing at a time. The input method processing moves
to the window (widget) that currently has the focus.</para>
</sect2>
<sect2 id="IPG.intro.div.22a">
<title>Layout Direction</title>
<indexterm><primary>layout direction</primary></indexterm>
<para>Layout direction refers to the direction that is used to display visual
elements such as widget children, widget components, and text (controlled
by the <classname>VendorShell</classname> resource, <systemitem class="resource">XmNlayoutDirection</systemitem>). In general, this direction matches the direction
that people use when reading or writing in a particular language. Languages
such as English, French, German, and Swedish are read and written from left
to right. Therefore, when users working in those languages enter characters
from a computer keyboard, each new character is displayed to the right of
the preceding one. These same users would also expect the layout of other
visual elements to be displayed from left to right. For example, in a menu
bar, the cascade buttons would be laid out from left to right so that a simple
menu bar would position the "File" cascade button in the upper left corner,
and the "Help" cascade button would appear in the upper right corner of the
menu bar.</para>
<para>Languages such as Arabic and Hebrew are read and written from right
to left. To display text correctly in these languages on the screen, each
successive character that a user enters must appear to the left of the preceding
character. Using the example above for layout of other visual elements, these
users would expect a menu bar to lay out cascade buttons from right to left.
The result would typically position the "File" cascade button in the upper
right corner and the "Help" cascade button in the upper left corner of the
menu bar. For more information on layout direction, refer to &MotifProgGd;.
</para>
</sect2>
<sect2 id="IPG.intro.div.22b">
<title>Vertical Writing</title>
<indexterm><primary>vertical writing</primary></indexterm>
<para>In some Asian languages, text is written vertically. When the <classname>VendorShell</classname> resource <systemitem class="resource">XmNlayoutDirection</systemitem> is set to <systemitem class="constant">XmTOP_TO_BOTTOM</systemitem>,
the vertical writing feature is enabled. In addition to drawing text vertically,
this feature adapts the text widget in other ways appropriate for the user.
For example, when word wrapping is turned on, the text wraps from the bottom
of one column to the top of the next column. For more information on vertical
writing, refer to &MotifProgGd;.</para>
</sect2>
</sect1>
<sect1 id="IPG.intro.div.23">
<title id="IPG.intro.mkr.20">Interclient Communications Conventions (ICCC)<indexterm>
<primary>National Language Support</primary><secondary>internationalized ICCC</secondary></indexterm></title>
<para>The Interclient Communications Conventions (ICCC) defines the mechanism
used to pass text between clients. Because the system is capable of supporting
multiple code sets, it may be possible that two applications that are communicating
with each other are using different code sets. ICCC defines how these two
clients agree on how the data is passed between them. If two clients have
incompatible character sets (for example, Latin1 and Japanese (JIS)), some
data may be lost when characters are transported.<indexterm><primary>libXm
library</primary></indexterm> <computeroutput><indexterm><primary>dtterm command</primary><secondary>ICCC</secondary></indexterm></computeroutput></para>
<para>However, if two clients have different code sets but compatible character
sets, ICCC enables these clients to pass information with no data loss. If
the code sets of the two clients are not identical, CompoundText encoding is
used as the interchange format, identified by the <computeroutput>COMPOUND_TEXT</computeroutput>
atom. If the data being communicated involves only portable characters (7-bit
ASCII and others) or the ISO8859-1 code set, the data is communicated as
is, with no conversion, by way of the <computeroutput>XA_STRING</computeroutput>
atom.</para>
<para>Titles and icon names need to be communicated to the Window Manager
using the <computeroutput>COMPOUND_TEXT</computeroutput> atom if nonportable
characters are used; otherwise, the <computeroutput>XA_STRING</computeroutput>
atom can be used.
</para>
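<para>The choice between the two atoms can be sketched as follows. This is a
minimal illustration and not part of Xlib or Motif: the function name
<computeroutput>title_encoding_name()</computeroutput> and the
<computeroutput>text_is_latin1</computeroutput> flag are hypothetical, and
"portable characters" is approximated here as "every byte is 7-bit".</para>

```c
#include <stddef.h>

/* Return 1 if every byte has its high bit clear (an approximation of
 * "portable characters only"), otherwise 0. */
static int is_seven_bit(const unsigned char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (s[i] & 0x80)
            return 0;
    return 1;
}

/* Name of the atom to use when handing a title or icon name to the
 * Window Manager. text_is_latin1 is a hypothetical caller-supplied flag
 * saying that the text is encoded in ISO8859-1. */
const char *title_encoding_name(const unsigned char *s, size_t len,
                                int text_is_latin1)
{
    if (text_is_latin1 || is_seven_bit(s, len))
        return "STRING";          /* sent as is, no conversion */
    return "COMPOUND_TEXT";       /* must be converted to compound text */
}
```

<para>A real client seldom needs to hand-code this decision: the
<computeroutput>XStdICCTextStyle</computeroutput> encoding style of
<computeroutput>XmbTextListToTextProperty()</computeroutput> selects
<computeroutput>STRING</computeroutput> when the text is fully convertible
to it and <computeroutput>COMPOUND_TEXT</computeroutput> otherwise.</para>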
<para>Motif, for example, uses its own atoms to transfer
textual data:
</para>
<itemizedlist>
<listitem>
<para><literal>_MOTIF_COMPOUND_STRING</literal> transfers data in <Symbol>XmString</Symbol> format.
</para>
</listitem>
<listitem>
<para><literal>_MOTIF_RENDER_TABLE</literal> transfers the value of its render table
as type <Symbol>STRING</Symbol>.
</para>
</listitem>
</itemizedlist>
<para>For more information, refer to <citetitle>Motif Widget Writer's Guide</citetitle>.
</para>
<para>Support for any other encoding is limited
by the ability to convert it to the code set of the Window Manager's locale. The Window
Manager runs in a single locale and supports only titles and icon names that
are convertible to the code set of the locale under which it is running.<indexterm>
<primary>National Language Support</primary><secondary>Window Manager</secondary>
<tertiary>communicating titles</tertiary></indexterm><indexterm><primary>National Language Support</primary><secondary>Window Manager</secondary><tertiary>communicating icon names</tertiary></indexterm><indexterm><primary>Window
Manager</primary><secondary>communicating titles and icon names</secondary>
</indexterm></para>
<para>The Motif library and all desktop clients should follow these conventions.
</para>
</sect1>
</chapter>
<!--fickle 1.14 mif-to-docbook 1.7 01/02/96 04:19:51-->

@@ -0,0 +1,906 @@
<!-- $XConsortium: ch03.sgm /main/14 1996/10/30 14:31:59 rws $ -->
<!-- (c) Copyright 1995 Digital Equipment Corporation. -->
<!-- (c) Copyright 1995 Hewlett-Packard Company. -->
<!-- (c) Copyright 1995 International Business Machines Corp. -->
<!-- (c) Copyright 1995 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1995 Novell, Inc. -->
<!-- (c) Copyright 1995 FUJITSU LIMITED. -->
<!-- (c) Copyright 1995 Hitachi. -->
<chapter id="IPG.distr.div.1">
<title id="IPG.distr.mkr.1"><indexterm><primary>distributed internationalization
guidelines</primary></indexterm>Internationalization and Distributed Networks</title>
<para>This chapter discusses tasks related to internationalization and distributed
networks.</para>
<para id="IPG.distr.mkr.2"></para>
<sect1 id="IPG.distr.div.2">
<title id="IPG.distr.mkr.3">Interchange Concepts</title>
<para>This section describes how 8-bit<indexterm><primary>basic interchange
in a network</primary></indexterm> user names and 8-bit data can be<indexterm>
<primary>networks</primary></indexterm> communicated on a network by communications
utilities such as ftp and mail, or by interclient communication between desktop
clients.</para>
<para>There are three primary<indexterm><primary>networks</primary></indexterm> considerations
for communicating data:<literal><indexterm><primary>interfaces</primary><secondary>for network communications</secondary></indexterm></literal></para>
<itemizedlist remap="Bullet1"><listitem><para>Sender's code set and the receiver's
code set.</para>
</listitem><listitem><para>Whether the communications protocol allows 8-bit
data or is limited to 7-bit coded data (for example, the Japanese JUNET passes
Japanese Industrial Standard (JIS) coded data over 7-bit protocols).</para>
</listitem><listitem><para>Type of interchange encoding available, per protocol
rules. The actual conversion needed is dependent on the specific protocol
used.</para>
</listitem></itemizedlist>
<para>If the remote<indexterm><primary>code sets</primary><secondary>network
remote host</secondary></indexterm> host uses the same code set as the local
host, the following is true:</para>
<itemizedlist remap="Bullet1"><listitem><para>If the protocol allows 8-bit
data, no conversions are needed.</para>
</listitem><listitem><para>If the protocol allows only 7-bit data, a method
is needed to map the 8-bit code points to 7-bit ASCII values. This could
be accomplished using the <command>iconv</command> framework and one of the
following types of 7-bit encoding methods:</para>
<itemizedlist remap="Bullet2"><listitem><para>Map 8-bit data as specified
in the POSIX.2 specification for uuencode and uudecode algorithms.</para>
</listitem><listitem><para>Optionally, the 8-bit data may be mapped to a 7-bit
interchange encoding as defined by the protocol; for example, 7-bit ISO2022
in Xlib or base64 in Multipurpose Internet Message Extensions (MIME).</para>
</listitem></itemizedlist>
</listitem></itemizedlist>
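<para>As an illustration of mapping 8-bit data onto 7-bit characters, the
following sketch implements the base64 mapping mentioned above: every three
8-bit bytes become four characters drawn from a 64-character ASCII alphabet.
The function name <computeroutput>base64_encode()</computeroutput> is
hypothetical, and the sketch omits the line-length limits that a mail
protocol would add.</para>

```c
#include <stddef.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from in as base64 text in out.
 * out must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the trailing NUL. */
size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;

    while (i < len) {
        size_t remaining = len - i;
        /* Pack up to three 8-bit bytes into one 24-bit group. */
        unsigned long group = (unsigned long)in[i] << 16;

        if (remaining > 1)
            group |= (unsigned long)in[i + 1] << 8;
        if (remaining > 2)
            group |= (unsigned long)in[i + 2];

        /* Emit four 6-bit values, each mapped to a 7-bit ASCII character;
         * '=' pads a final group that holds fewer than three bytes. */
        out[o++] = b64_alphabet[(group >> 18) & 0x3F];
        out[o++] = b64_alphabet[(group >> 12) & 0x3F];
        out[o++] = remaining > 1 ? b64_alphabet[(group >> 6) & 0x3F] : '=';
        out[o++] = remaining > 2 ? b64_alphabet[group & 0x3F] : '=';

        i += remaining > 2 ? 3 : remaining;
    }
    out[o] = '\0';
    return o;
}
```

<para>The output contains only 7-bit characters regardless of the input,
so it can pass through a 7-bit protocol; the receiver reverses the mapping to
recover the original 8-bit data.</para>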
<para>If the remote<indexterm><primary>code sets</primary><secondary>network
local hosts</secondary></indexterm> host's code set is different from that
of the local host, the following two cases may apply. The conversion needed
is dependent on the specific protocol used.</para>
<itemizedlist remap="Bullet1"><listitem><para>If the protocol allows 8-bit
data, the protocol must specify which side does the <command>iconv</command> conversion and which encoding is used on the wire. In some protocols,
an 8-bit interchange encoding is recommended that is capable of encoding
all possible code sets and identifying the character repertoire.</para>
</listitem><listitem><para>If the protocol allows only 7-bit data, a 7-bit
interchange encoding is needed, as is identification of the character repertoire.
</para>
</listitem></itemizedlist>
<sect2 id="IPG.distr.div.3">
<title>iconv<indexterm><primary>iconv</primary><secondary>interface</secondary>
</indexterm> Interface</title>
<para>In a network environment, the code sets of the communicating systems
and the protocols of communication determine the transformation of user-specified
data so that it can be sent to the remote system in a meaningful way. The
user data (not user names) may need to be transformed from the sender's code
set to the receiver's code set, or 8-bit data may need to be transformed
into a 7-bit form to conform to protocols. A uniform interface is needed
to accomplish this.</para>
<para>The following examples illustrate the <command>iconv</command> interface
by showing how to use <filename>iconv_open()</filename>, <filename>iconv()</filename>, and <filename>iconv_close()</filename>. To do the conversion, <filename>iconv_open()</filename> must be followed by <filename>iconv()</filename>.
The terms <emphasis>7-bit interchange</emphasis> and <emphasis>8-bit interchange</emphasis> refer to any interchange encoding used for 7-bit
and 8-bit data, respectively.</para>
<sect3 id="IPG.distr.div.4">
<title>Sender and Receiver Use the Same Code Sets:</title>
<itemizedlist remap="Bullet1"><listitem><para>If the protocol allows 8-bit
data, use 8-bit data because the same code set is being used. No conversion
is needed.</para>
</listitem><listitem><para>If the protocol allows only 7-bit data, use <computeroutput>iconv</computeroutput>
(the first argument of <computeroutput>iconv_open()</computeroutput> names the target code set, the second the source):</para>
<itemizedlist remap="Bullet2"><listitem><para>Sender</para>
<programlisting>cd = iconv_open("uucode", locale_codeset);</programlisting>
</listitem><listitem><para>Receiver</para>
<programlisting>cd = iconv_open(locale_codeset, "uucode");</programlisting>
</listitem></itemizedlist>
</listitem></itemizedlist>
<sect4 id="ipg.distr.div.5">
<title>Sender and Receiver Use Different Code Sets:</title>
<itemizedlist remap="Bullet1"><listitem><para>If the protocol allows 8-bit
data:</para>
<itemizedlist remap="Bullet2"><listitem><para>Sender</para>
<programlisting>cd = iconv_open(<symbol role="Variable">8-bitinterchange</symbol>, locale_codeset);</programlisting>
</listitem><listitem><para>Receiver</para>
<programlisting>cd = iconv_open(locale_codeset, <symbol role="Variable">8-bitinterchange</symbol>);</programlisting>
</listitem></itemizedlist>
</listitem><listitem><para>If the protocol allows only 7-bit data, do the
following:</para>
<itemizedlist remap="Bullet2"><listitem><para>Sender</para>
<programlisting>cd = iconv_open(<symbol role="Variable">7-bitinterchange</symbol>, locale_codeset);</programlisting>
</listitem><listitem><para>Receiver</para>
<programlisting>cd = iconv_open(locale_codeset, <symbol role="Variable">7-bitinterchange</symbol>);</programlisting>
</listitem></itemizedlist>
</listitem></itemizedlist>
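<para>The four cases above can be summarized as a small decision function.
This is only a sketch: the enumeration and the function name
<computeroutput>choose_wire_encoding()</computeroutput> are hypothetical and
exist purely to restate the rules, and a specific protocol may dictate a
different choice.</para>

```c
/* Kinds of treatment the data needs before it goes on the wire. */
enum wire_encoding {
    WIRE_NO_CONVERSION,         /* same code set, 8-bit protocol */
    WIRE_SEVEN_BIT_ENCODING,    /* same code set, 7-bit protocol (for example, uucode) */
    WIRE_EIGHT_BIT_INTERCHANGE, /* different code sets, 8-bit protocol */
    WIRE_SEVEN_BIT_INTERCHANGE  /* different code sets, 7-bit protocol */
};

/* Pick the treatment from the two questions asked in the text:
 * do both sides use the same code set, and does the protocol pass 8-bit data? */
enum wire_encoding choose_wire_encoding(int same_code_set, int protocol_is_8bit)
{
    if (same_code_set)
        return protocol_is_8bit ? WIRE_NO_CONVERSION
                                : WIRE_SEVEN_BIT_ENCODING;
    return protocol_is_8bit ? WIRE_EIGHT_BIT_INTERCHANGE
                            : WIRE_SEVEN_BIT_INTERCHANGE;
}
```

<para>When the receiver's code set is unknown, treat it as different from
the sender's, that is, pass 0 for <computeroutput>same_code_set</computeroutput>.</para>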
<para>The <computeroutput>locale_codeset</computeroutput> refers to the code
set being used locally by the application. Note that while the <computeroutput>nl_langinfo(CODESET)</computeroutput> function may be used to obtain the
code set associated with the current locale, it is implementation-dependent
whether any conversion names match the return from the <computeroutput>nl_langinfo(CODESET)</computeroutput> function.</para>
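<para>Because a converter for the name returned by
<computeroutput>nl_langinfo(CODESET)</computeroutput> may not exist, it is
prudent to check the result of <computeroutput>iconv_open()</computeroutput>.
The following sketch shows one way a sender might do this; the function name
<computeroutput>open_sender_converter()</computeroutput> is hypothetical, and
the interchange name is whatever the caller's protocol requires.</para>

```c
#include <iconv.h>
#include <langinfo.h>
#include <stdio.h>

/* Open a converter from the code set of the current locale to the
 * interchange encoding named by the caller. The caller is expected to
 * have established the locale (for example, with setlocale()).
 * Returns (iconv_t)-1 if the system has no converter for that pair. */
iconv_t open_sender_converter(const char *interchange)
{
    const char *locale_codeset = nl_langinfo(CODESET);
    iconv_t cd = iconv_open(interchange, locale_codeset);  /* (to, from) */

    if (cd == (iconv_t)-1)
        fprintf(stderr, "no converter from %s to %s\n",
                locale_codeset, interchange);
    return cd;
}
```

<para>When the call fails, an application can fall back to a code set name
that it knows the local <command>iconv</command> implementation accepts, or
report that the data cannot be sent in the requested encoding.</para>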
<para>Table 3-1 outlines how <command>iconv</command> can be used to perform conversions for various conditions. Specific
protocols may dictate other conversions needed.</para>
<para><emphasis>Using iconv to Perform Conversion</emphasis></para>
<informaltable id="ipg.distr.itbl.2">
<tgroup cols="5" colsep="0" rowsep="1">
<colspec colname="col1" colwidth="0.93in">
<colspec colname="col2" colwidth="0.97in">
<colspec colname="col3" colwidth="0.97in">
<colspec colname="col4" colwidth="1.05in">
<colspec colname="col5" colwidth="1.10in">
<spanspec nameend="col3" namest="col2" spanname="2to3">
<spanspec nameend="col5" namest="col4" spanname="4to5">
<spanspec nameend="col5" namest="col1" spanname="1to5">
<tbody>
<row>
<entry align="left" valign="top"></entry>
<entry align="left" spanname="2to3" valign="top"><para><literal>Communication
with system using the same code set (for example, XYZ)</literal></para></entry>
<entry align="left" spanname="4to5" valign="top"><para><literal>Communication
with system using different code sets or receiver's code set is unknown</literal></para></entry></row>
<row>
<entry align="left" valign="top"><para><literal>Conversion to Use</literal></para></entry>
<entry align="left" valign="top"><para><literal>7-bit Protocol</literal></para></entry>
<entry align="left" valign="top"><para><literal>8-bit Protocol</literal></para></entry>
<entry align="left" valign="top"><para><literal>7-bit Protocol</literal></para></entry>
<entry align="left" valign="top"><para><literal>8-bit Protocol</literal></para></entry>
</row>
<row>
<entry align="left" valign="top"><para>code XYZ</para></entry>
<entry align="left" valign="top"><para>Invalid</para></entry>
<entry align="left" valign="top"><para>Best Choice</para></entry>
<entry align="left" valign="top"><para>Invalid</para></entry>
<entry align="left" valign="top"><para>Invalid if remote code set is unknown
</para></entry></row>
<row>
<entry align="left" valign="top"><para>7-bit Interchange ISO2022</para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>Best Choice</para></entry>
<entry align="left" valign="top"><para>OK</para></entry></row>
<row>
<entry align="left" valign="top"><para>8-bit Interchange ISO2022 ISO 10646
</para></entry>
<entry align="left" valign="top"><para>Invalid <superscript>1</superscript></para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>Invalid</para></entry>
<entry align="left" valign="top"><para>Best Choice</para></entry></row>
<row>
<entry align="left" valign="top"><para>7-bit Untagged quoted-printable
uucode</para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>Requires code set identification
</para></entry>
<entry align="left" valign="top"><para>Requires code set identification
</para></entry></row>
<row rowsep="0">
<entry align="left" valign="top"><para>8-bit Untagged base64</para></entry>
<entry align="left" valign="top"><para>Invalid</para></entry>
<entry align="left" valign="top"><para>OK</para></entry>
<entry align="left" valign="top"><para>Requires code set identification
</para></entry>
<entry align="left" valign="top"><para>Requires code set identification
</para></entry></row>
<row>
<entry align="left" spanname="1to5" valign="top"><para><footnoteref linkend="ipg.distr.fn.10"></footnoteref><footnote
id="ipg.distr.fn.10"><para><superscript>1</superscript>Invalid means the interchange
encoding should not be used for the choice of code set and type of protocol.
</para>
</footnote></para></entry></row></tbody></tgroup></informaltable>
</sect4>
</sect3>
</sect2>
<sect2 id="IPG.distr.div.6">
<title>Stateful and Stateless<indexterm><primary>code sets</primary><secondary>stateful encodings</secondary></indexterm> Conversions</title>
<para>Code<indexterm><primary>code sets</primary><secondary>stateless encodings</secondary></indexterm> sets can be classified into two categories: stateful
encodings and stateless encodings.</para>
<sect3 id="IPG.distr.div.7">
<title><indexterm><primary>stateful and stateless encodings, conversion of</primary></indexterm>Stateful Encodings</title>
<para>Stateful encoding uses sequences of control codes, such as shift-in/shift-out,
to change character sets associated with specific code values.</para>
<para>For instance, under compound text, the control sequence &ldquo;ESC$(B&rdquo;
can be used to indicate the start of Japanese 16-bit data in a data stream
of characters, and &ldquo;ESC(B&rdquo; can be used to indicate the end of
this double-byte character data and the start of single-byte ASCII data. Under
this stateful encoding, the byte value 0x43 cannot be interpreted without
knowing the shift state. The EBCDIC Asian code sets use shift-out/shift-in
controls to switch to double-byte and single-byte encodings, respectively.
</para>
<para>Converters that are written to do the conversion of stateful encodings
to other code sets tend to be a little complex due to the extra processing
needed.</para>
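<para>The extra processing amounts to carrying a shift state through the
data. The following sketch counts the characters in a stream that uses the
two escape sequences described above. The function name
<computeroutput>count_chars_stateful()</computeroutput> is hypothetical, and
the sketch assumes that every character in the double-byte state occupies
exactly two bytes and that no other escape sequences occur.</para>

```c
#include <stddef.h>
#include <string.h>

/* Count the characters in a stream that switches between single-byte
 * ASCII and a double-byte set by means of ESC $ ( B and ESC ( B.
 * A truncated final double-byte character is counted as one character. */
size_t count_chars_stateful(const unsigned char *s, size_t len)
{
    static const unsigned char to_double[] = { 0x1B, '$', '(', 'B' };
    static const unsigned char to_single[] = { 0x1B, '(', 'B' };
    int double_byte = 0;    /* shift state: 0 = single-byte, 1 = double-byte */
    size_t i = 0, count = 0;

    while (i < len) {
        if (len - i >= sizeof to_double &&
            memcmp(s + i, to_double, sizeof to_double) == 0) {
            double_byte = 1;                 /* escape sequence, not a character */
            i += sizeof to_double;
        } else if (len - i >= sizeof to_single &&
                   memcmp(s + i, to_single, sizeof to_single) == 0) {
            double_byte = 0;
            i += sizeof to_single;
        } else {
            /* The same byte value is half a character in one state
             * and a whole character in the other. */
            i += double_byte ? 2 : 1;
            count++;
        }
    }
    return count;
}
```

<para>The byte 0x43 counts as one character before the first escape
sequence and as half of a character after it, which is why a converter
cannot process such a stream from an arbitrary starting point.</para>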
</sect3>
<sect3 id="IPG.distr.div.8">
<title><indexterm><primary>conversions</primary><secondary>stateless encodings</secondary></indexterm>Stateless Encodings</title>
<para>Stateless code sets are those that can be classified as one of two types:
</para>
<itemizedlist remap="Bullet1"><listitem><para>Single-byte code sets, such
as the ISO8859 family</para>
</listitem><listitem><para>Multibyte code sets, such as PC codes for Japanese
and Shift-JIS (SJIS)</para>
</listitem></itemizedlist>
<para>The term <emphasis>multibyte code sets</emphasis> is also used to refer
to any code set that needs one or more bytes to encode a character; multibyte
code sets are considered stateless.</para>
<note>
<para>Conversions are meaningful only if the code sets represent the same
character set.</para>
</note>
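<para>By contrast, a stateless multibyte code set can be scanned without any
shift state, because the first byte of each character determines its width.
The following sketch counts characters in Shift-JIS data; the function names
are hypothetical, and the lead-byte ranges used (0x81 to 0x9F and 0xE0 to
0xFC) are the commonly documented ones, which individual vendor variants may
extend.</para>

```c
#include <stddef.h>

/* Return 1 if b is the first byte of a two-byte Shift-JIS character. */
static int sjis_is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count the characters in a Shift-JIS buffer. No shift state is kept:
 * the width of each character follows from its first byte alone. */
size_t sjis_count_chars(const unsigned char *s, size_t len)
{
    size_t i = 0, count = 0;

    while (i < len) {
        i += sjis_is_lead_byte(s[i]) ? 2 : 1;
        count++;
    }
    return count;
}
```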
</sect3>
</sect2>
</sect1>
<sect1 id="IPG.distr.div.9">
<title id="IPG.distr.mkr.4">Simple Text Basic Interchange</title>
<para>When a<indexterm><primary>conversions</primary><secondary>stateful
code sets</secondary></indexterm><indexterm><primary>conversions</primary>
<secondary>simple text</secondary></indexterm> program communicates data to
another program residing on a remote host, a need may arise for conversion
of data from the code set of the source machine to that of the receiver.
For example, this happens when a PC system using PC codes needs to communicate
with a workstation using an International Organization for Standardization/Extended
UNIX Code (ISO/EUC) encoding. Another example occurs when a program obtains
data in one code set but has to display this data in another code set. To
support these conversions, a standard program interface is provided based
on the XPG4 <filename>iconv()</filename> function definitions.</para>
<para>All components doing code set conversion should use the <command>iconv</command> functions as their interface to conversions. Systems are expected
to provide a wide variety of conversions, as well as a mechanism to customize
the default set of conversions.</para>
<sect2 id="IPG.distr.div.10">
<title>iconv Conversion Functions<indexterm><primary>iconv</primary><secondary>text conversion functions</secondary></indexterm></title>
<para>The<indexterm><primary>conversions</primary><secondary>iconv text</secondary></indexterm> common way to convert from one code set to
another is a table-driven method. In some cases, these tables may
be too large, so an algorithmic method may be more desirable. To accommodate
such diverse requirements, a framework is defined in XPG4 for code set conversions.
In this framework, to convert from one code set to another, open a converter,
perform the conversions, and close the converter. The <command>iconv</command> functions
are <filename>iconv_open()</filename>, <filename>iconv()</filename>, and <filename>iconv_close()</filename>.</para>
<para>Code set converters are brought under the framework of the <filename>iconv_open()</filename>, <filename>iconv()</filename>, and <filename>iconv_close()</filename> set of functions. With these functions, it is possible to provide
and to use several different types of converters. Applications can call these
functions to convert<indexterm><primary>simple text conversion functions</primary></indexterm> characters in one code set into characters in another
code set. With the advent of the <command>iconv</command> framework, converters
can be provided in a uniform manner. The access and use of these converters
is being standardized under X/Open XPG4.</para>
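<para>The open, convert, and close sequence can be sketched as a single
helper. The function name <computeroutput>convert_buffer()</computeroutput>
is hypothetical, the helper converts one complete buffer and treats any
conversion error as failure, and the code set names passed to it are
implementation-dependent.</para>

```c
#include <iconv.h>
#include <stddef.h>

/* Convert inlen bytes at in from code set `from` to code set `to`,
 * writing at most outcap bytes to out.
 * Returns the number of bytes written, or (size_t)-1 on any failure. */
size_t convert_buffer(const char *to, const char *from,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    char *inp = (char *)in;   /* iconv() takes char **; the input is not modified */
    char *outp = out;
    size_t inleft = inlen, outleft = outcap;
    size_t rc;
    iconv_t cd = iconv_open(to, from);                /* 1. open (target first) */

    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);   /* 2. convert */
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush any shift sequence */

    iconv_close(cd);                                  /* 3. close */
    return rc == (size_t)-1 ? (size_t)-1 : outcap - outleft;
}
```

<para>For example, on a system whose <command>iconv</command> accepts the
names "ISO-8859-1" and "UTF-8" (an assumption), calling
<computeroutput>convert_buffer("UTF-8", "ISO-8859-1", ...)</computeroutput>
converts Latin-1 text for display in a UTF-8 locale. A production converter
would also handle the case in which the output buffer fills before all input
is consumed.</para>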
</sect2>
<sect2 id="ipg.distr.div.11">
<title>X Interclient (ICCCM) Conversion<indexterm><primary>X interclient
(ICCCM) conversion functions</primary></indexterm> Functions</title>
<para>Xlib<indexterm><primary>conversions</primary><secondary>Xlib</secondary>
</indexterm> provides the following functions for doing conversions.</para>
<informaltable>
<tgroup cols="2" colsep="0" rowsep="0">
<colspec align="left" colwidth="214*">
<colspec align="left" colwidth="314*">
<thead>
<row><entry align="left" valign="bottom"><para>X ICCCM Multibyte Functions
</para></entry><entry align="left" valign="bottom"><para>ICCCM Wide Character
Functions</para></entry></row></thead>
<tbody>
<row>
<entry align="left" valign="top"><para>XmbTextPropertyToTextList()</para></entry>
<entry align="left" valign="top"><para>XwcTextPropertyToTextList()</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>XmbTextListToTextProperty()</para></entry>
<entry align="left" valign="top"><para>XwcTextListToTextProperty()</para></entry>
</row></tbody></tgroup></informaltable>
<note>
<para>The <computeroutput>Motif</computeroutput> library does provide the <filename>XmCvtXmStringToCT()</filename> and
<filename>XmCvtCTToXmString()</filename> functions; however,
these are not recommended because there are some hardcoded assumptions about
certain XmString tags. For example, if the tag is <computeroutput>bold</computeroutput>, the behavior of <filename>XmCvtXmStringToCT()</filename> is
implementation-dependent. Across various platforms, the behavior of this function
cannot be guaranteed in all international regions.</para></note>
</sect2>
<sect2 id="IPG.distr.div.12">
<title>Window Titles</title>
<para>The standard way for<indexterm><primary>titles for windows</primary>
</indexterm> setting titles is to use resources. But for applications that
set the titles of their windows directly, a localized title must be sent
to the Window Manager. Use the <command>XCompoundTextStyle</command> encoding
defined in <command>XICCEncodingStyle</command>, as well as the following
guidelines:</para>
<itemizedlist remap="Bullet1"><listitem><para>Compound<indexterm><primary>guidelines for window titles</primary></indexterm> text can be created either
by <computeroutput>XmbTextListToTextProperty()</computeroutput> or <computeroutput>XwcTextListToTextProperty()</computeroutput>.</para>
</listitem><listitem><para>Localized titles can be displayed using the <computeroutput>XmNtitle</computeroutput> and <computeroutput>XmNtitleEncoding</computeroutput>
resources of the <computeroutput>WMShell</computeroutput> widget. Localized
icon names can be displayed using the <computeroutput>XmNiconName</computeroutput>
and <computeroutput>XmNiconNameEncoding</computeroutput> resources of the <computeroutput>TopLevelShell</computeroutput> widget.</para>
</listitem><listitem><para>Localized titles of dialog boxes can also be displayed
using the <computeroutput>XmNdialogTitle</computeroutput> resource of the <computeroutput>XmBulletinBoard</computeroutput> widget.</para>
</listitem><listitem><para>The Window Manager should have an appropriate fontlist
for displaying localized strings.</para>
</listitem></itemizedlist>
<para>Following is an example<indexterm><primary>examples of displaying
localized title and icon name</primary></indexterm> of displaying a localized
title and icon name. Compound text is made from the compound string in this
example.</para>
<programlisting>include &lt;nl_types.h>
Widget toplevel;
Arg al[10];
int ac;
XTextProperty title;
char *localized_string;
nl_catd fd;
XtSetLanguageProc( NULL, NULL, NULL );
fd = catopen( "my_prog", 0 );
localized_string = catgets(fd, set_num, mes_num, "<symbol>defaulttitle</symbol>");
XmbTextListToTextProperty( XtDisplay(toplevel), &amp;localized_string,
1, XCompoundTextStyle, &amp;title);
ac = 0;
XtSetArg(al[ac], XmNtitle, title.value); ac++;
XtSetArg(al[ac], XmNtitleEncoding, title.encoding); ac++;
XtSetValues(toplevel, al, ac);</programlisting>
<para>If you are using a window rather than widgets, the <computeroutput>XmbSetWMProperties()</computeroutput> function automatically converts a localized
string into the proper <computeroutput>XICCEncodingStyle</computeroutput>.
</para>
</sect2>
</sect1>
<sect1 id="IPG.distr.div.13">
<title id="IPG.distr.mkr.5">Mail Basic Interchange</title>
<para>In general, electronic mail (email) strategy has been one of turning
email into a canonical, labeled format as opposed to optimizing a message
given knowledge of the receiver's locale. This means that in the email world,
you should always assume that the receiver <emphasis>may</emphasis> be in
a different locale. In the desktop world, the default email transport is
Simple Mail Transfer Protocol (SMTP), which only supports 7-bit transmission
channels.</para>
<para>With this understanding, the email strategy for the desktop is as follows:
</para>
<itemizedlist remap="Bullet1"><listitem><para>The sending agent, by default
(unless instructed otherwise by the user), converts a body part into a <emphasis>standard</emphasis> format for the sending transmission channel and labels
the body part with the character encoding used.</para>
</listitem><listitem><para>The receiving agent looks at the body part to see
if it can support the character encoding; if it can, it converts the body
part into the local character set.</para>
</listitem></itemizedlist>
<para>In addition, because the MIME format is used for messages, any 8-bit
to 7-bit transformations are done using the built-in MIME transport encodings
(base64 or quoted-printable). See the Request for Comments (RFC) 1521 MIME
standard specification.</para>
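<para>As a rough illustration of the 8-bit to 7-bit transformation, the
following sketch applies the central quoted-printable rule: each byte outside
printable ASCII (and the <computeroutput>=</computeroutput> character itself)
is written as <computeroutput>=XX</computeroutput> in uppercase hexadecimal.
The function name <computeroutput>qp_encode</computeroutput> is hypothetical,
and the sketch leaves out the line-length and line-break rules of RFC 1521, so
it is not a complete encoder.</para>

```c
#include <stddef.h>

/* Hypothetical helper: quoted-printable style encoding of src (len bytes)
 * into dst (capacity cap, including the terminating NUL).
 * Returns the encoded length, or -1 if dst is too small.
 * Simplification: soft line breaks are not produced, and line-break bytes
 * are hex-encoded like any other non-printable byte. */
long qp_encode(const unsigned char *src, size_t len, char *dst, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    if (cap == 0)
        return -1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        /* Printable ASCII other than '=' passes through unchanged. */
        int literal = (c >= 0x20 && c <= 0x7E && c != '=');
        size_t need = literal ? 1 : 3;

        if (out + need + 1 > cap)
            return -1;
        if (literal) {
            dst[out++] = (char)c;
        } else {
            dst[out++] = '=';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        }
    }
    dst[out] = '\0';
    return (long)out;
}
```

<para>For example, the ISO8859-1 byte sequence 0x63 0x61 0x66 0xE9 is encoded
as the 7-bit string <computeroutput>caf=E9</computeroutput>, which can pass
through a 7-bit SMTP channel unchanged.</para>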
</sect1>
<sect1 id="IPG.distr.div.14">
<title id="IPG.distr.mkr.6">Encodings and Code Sets</title>
<para>To<indexterm><primary>encodings</primary></indexterm> understand code
sets, it is necessary to first understand character sets. A <emphasis>character
set</emphasis> is a collection of predefined characters based on the specific
needs of one or more languages without regard to the encoding values used
to represent the characters. The choice of which code set to use depends
on the user's data processing requirements. A particular character set can
be encoded using different encoding schemes. For example, the ASCII character
set defines the set of characters found in the English language. The Japanese
Industrial Standard (JIS) character set defines the set of characters used
in the Japanese language. Both the English and Japanese character sets can
be encoded using different code sets.</para>
<para>The ISO2022 standard defines a coded character set as a group of precise
rules that defines a character set and the one-to-one relationship between
each character and its bit pattern. A code set defines the bit patterns that
the system uses to identify characters.</para>
<para>A<indexterm><primary>code page</primary></indexterm> code page is similar
to a code set with the limitation that a code-page specification is based
on a 16-column by 16-row matrix. The intersection of each column and row
defines a coded character.</para>
<sect2 id="IPG.distr.div.15">
<title><indexterm><primary>code sets</primary><secondary>strategy</secondary>
</indexterm>Code Set Strategy</title>
<para>Code set support in the common open software environment is based on
International Organization for Standardization (ISO) and industry-standard
code sets that satisfy the data processing needs of users.
</para>
<para>Each locale in the system defines which code set it uses and how the
characters within the code set are manipulated. Because multiple locales
can be installed on the system, multiple code sets can be used by different
users on the system. While the system can be configured with locales using
different code sets, all system utilities assume that the system is running
under a single code set.</para>
<para>Most commands have no knowledge of the underlying code set being used
by the locale. The knowledge of code sets is hidden by the code-set-independent
library subroutines (Internationalization libraries), which pass information
to the code-set-dependent subroutines.</para>
<para>Because many programs rely on ASCII, all code sets include the 7-bit
ASCII code set as a proper subset. Because the 7-bit ASCII code set is common
to all supported code sets, its characters are sometimes referred to as the <emphasis>portable</emphasis> character set.</para>
<para>The 7-bit ASCII code set is based on the ISO646 definition and contains
the control characters, punctuation characters, digits (0-9), and the English
alphabet in uppercase and lowercase.</para>
</sect2>
<sect2 id="IPG.distr.div.16">
<title><indexterm><primary>code sets</primary><secondary>structure</secondary>
</indexterm>Code Set Structure</title>
<para>Each code set is divided into two principal areas:</para>
<itemizedlist remap="Bullet1"><listitem><para>Graphic Left (GL) Columns 0-7
</para>
</listitem><listitem><para>Graphic Right (GR) Columns 8-F</para>
</listitem></itemizedlist>
<para>The first two columns of each area are reserved by ISO standards
for control characters. The terms C0 and C1 are used to denote the control
characters for the Graphic Left and Graphic Right areas, respectively.</para>
<note>
<para>The PC code sets use the C1 control area to encode graphic characters.
</para>
</note>
<para>The remaining six columns are used to encode graphic characters (see
<!--Original XRef content: 'Table&numsp;3&hyphen;2
on page&numsp;65'--><xref role="CodeOrFigOrTabAndPNum" linkend="IPG.distr.mkr.7">).
Graphic characters are considered to be printable characters, while the control
characters are used by devices and applications to indicate some special
function.</para>
<para><emphasis id="IPG.distr.mkr.7">Code Set Overview</emphasis></para>
<graphic id="IPG.distr.igrph.1" entityref="IPG.distr.fig.1"></graphic>
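<para>The column layout described above can be sketched in C as follows. The
high four bits of a coded value select the column and the low four bits select
the row. The names <computeroutput>code_column</computeroutput>, <computeroutput>code_row</computeroutput>, and <computeroutput>code_area</computeroutput>
are hypothetical, and the classification is made purely by column, so the
DELETE character (0x7F) is reported as part of the Graphic Left area even
though it is a control character.</para>

```c
/* Areas of an 8-bit code set, following the column layout in the text. */
enum area { AREA_C0, AREA_GL, AREA_C1, AREA_GR };

/* Column (0-15) and row (0-15) of a coded value in the 16 x 16 matrix. */
unsigned code_column(unsigned char c) { return (unsigned)(c >> 4); }
unsigned code_row(unsigned char c)    { return (unsigned)(c & 0x0F); }

/* Hypothetical helper: classify a coded value by its column only. */
enum area code_area(unsigned char c)
{
    unsigned col = code_column(c);

    if (col <= 1) return AREA_C0;   /* columns 0-1: C0 control characters */
    if (col <= 7) return AREA_GL;   /* columns 2-7: Graphic Left */
    if (col <= 9) return AREA_C1;   /* columns 8-9: C1 control characters */
    return AREA_GR;                 /* columns A-F: Graphic Right */
}
```

<para>With these helpers, the coded value 0x41 falls in column 4, row 1 of the
Graphic Left area, and 0x8E falls in the C1 control area.</para>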
<sect3 id="IPG.distr.div.17">
<title>Control Characters</title>
<para>Based on the ISO<indexterm><primary>code sets</primary><secondary>control characters</secondary></indexterm> definition, a control character
initiates, modifies, or stops a control operation. A control character is
not a graphic character, but can have graphic representation in some instances.
The control characters in the ISO646-IRV character set are present in all
supported code sets, and the encoded values of the C0 control characters
are consistent throughout the code sets.</para>
</sect3>
<sect3 id="IPG.distr.div.18">
<title>Graphic Characters</title>
<para>Each<indexterm><primary>code sets</primary><secondary>graphic characters</secondary></indexterm> code set can be considered to be divided into one
or more character sets, such that each character is given a unique coded
value. The ISO standard reserves six columns for encoding characters and
does not allow graphic characters to be encoded in the control character
columns.</para>
</sect3>
<sect3 id="IPG.distr.div.19">
<title>Single-Byte Code Sets</title>
<para>Code sets<indexterm><primary>code sets</primary><secondary>single-byte</secondary></indexterm> that use all 8 bits of a byte can support European,
Middle Eastern, and other alphabetic languages. Such code sets are called
single-byte code sets. A single-byte code set can encode at most 191 graphic
characters, not including control characters.</para>
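<para>One way to arrive at the figure of 191 is to count every 8-bit value
that is not a control character. The sketch below assumes that the control
characters are the C0 range (0x00 through 0x1F), DELETE (0x7F), and the C1
range (0x80 through 0x9F), and that the space character counts as a graphic
position. The function name is hypothetical.</para>

```c
/* Count the coded values of an 8-bit code set that are available for
 * graphic characters: everything except C0 (0x00-0x1F), DELETE (0x7F),
 * and C1 (0x80-0x9F). */
int count_graphic_positions(void)
{
    int count = 0;

    for (int c = 0; c <= 0xFF; c++) {
        int is_control = (c <= 0x1F) || (c == 0x7F) ||
                         (c >= 0x80 && c <= 0x9F);
        if (!is_control)
            count++;
    }
    return count;   /* 256 - 32 - 1 - 32 */
}
```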
</sect3>
<sect3 id="IPG.distr.div.20">
<title>Multibyte Code Sets<indexterm><primary>code sets</primary><secondary>multibyte</secondary></indexterm></title>
<para>The term <emphasis>multibyte code sets</emphasis> is used to refer to
all possible code sets regardless of the number of bytes needed to encode
any specific character. Because the operating system should be capable of
supporting any number of bits to encode a character, a multibyte code set
may contain characters that are encoded with 8, 16, 32, or more bits. Even
single-byte code sets are considered to be multibyte code sets.</para>
</sect3>
<sect3 id="IPG.distr.div.21">
<title>Extended UNIX Code (EUC)<indexterm><primary>code sets</primary><secondary>extended UNIX code (EUC)</secondary></indexterm> Code Set</title>
<para>The EUC code set uses control characters to identify, and thereby
separate, the characters in some of its character sets. The encoding rules
are based on the ISO2022 definition for the encoding of 7-bit and 8-bit
data.</para>
<para>The term EUC denotes these general encoding rules. A code set based
on EUC conforms to the EUC encoding rules but also identifies the specific
character sets associated with the specific instances. For example, eucJP
for Japanese refers to the encoding of the JIS characters according to the
EUC encoding rules.</para>
<para>The first set (CS0) always contains an ISO646 character set. All of
the other sets must have the most-significant bit (MSB) set to 1, and they
can use any number of bytes to encode the characters. In addition, all characters
within a set must have:</para>
<itemizedlist remap="Bullet1"><listitem><para>The same number of bytes to encode
all characters</para>
</listitem><listitem><para>The same column display width (number of columns on
a fixed-width terminal)</para>
</listitem></itemizedlist>
<para>Each character in the third set (CS2) is always preceded with the control
character SS2 (single-shift 2, 0x8e). Code sets that conform to EUC do not
use the SS2 control character other than to identify the third set.</para>
<para>Each character in the fourth set (CS3) is always preceded with the control
character SS3 (single-shift 3, 0x8f). Code sets that conform to EUC do not
use the SS3 control character other than to identify the fourth set.</para>
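<para>The rules above mean that the first byte of an EUC character is enough
to tell which of the four sets the character belongs to. The following is a
minimal sketch of that test. The names are hypothetical, and treating only
0xA1 through 0xFE as valid first bytes for the second set (CS1) is an
assumption based on 94-character graphic sets, not something stated in this
guide.</para>

```c
enum euc_set { EUC_CS0, EUC_CS1, EUC_CS2, EUC_CS3, EUC_INVALID };

#define EUC_SS2 0x8E    /* single-shift 2: introduces a CS2 character */
#define EUC_SS3 0x8F    /* single-shift 3: introduces a CS3 character */

/* Hypothetical helper: identify the EUC set from the first byte. */
enum euc_set euc_set_of(unsigned char lead)
{
    if (lead < 0x80)
        return EUC_CS0;             /* most-significant bit clear: ISO646 */
    if (lead == EUC_SS2)
        return EUC_CS2;
    if (lead == EUC_SS3)
        return EUC_CS3;
    if (lead >= 0xA1 && lead <= 0xFE)
        return EUC_CS1;             /* assumed 94-character set range */
    return EUC_INVALID;             /* other C1 controls, 0xA0, 0xFF */
}
```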
</sect3>
</sect2>
<sect2 id="IPG.distr.div.22">
<title>ISO EUC Code Sets</title>
<para>The following<indexterm><primary>code sets</primary><secondary>ISO
EUC</secondary></indexterm> code sets<indexterm><primary>ISO EUC code set</primary></indexterm> are based on definitions set by the International Organization
for Standardization (ISO).</para>
<itemizedlist remap="Bullet1"><listitem><para>ISO646-IRV</para>
</listitem><listitem><para>ISO8859-1</para>
</listitem><listitem><para>ISO8859-x</para>
</listitem><listitem><para>eucJP</para>
</listitem><listitem><para>eucTW</para>
</listitem><listitem><para>eucKR</para>
</listitem></itemizedlist>
<sect3 id="IPG.distr.div.23">
<title>ISO646-IRV</title>
<para>The<indexterm><primary>ISO646-IRV code set</primary></indexterm> ISO646-IRV
code set<indexterm><primary>code sets</primary><secondary>ISO646-IRV, description</secondary></indexterm> defines the code set used for information processing
based on a 7-bit encoding. The character set associated with this code set
is derived from the ASCII characters.</para>
</sect3>
<sect3 id="IPG.distr.div.24">
<title>ISO8859-1</title>
<para>ISO8859-1<indexterm><primary>ISO8859-1 code set</primary></indexterm><indexterm>
<primary>code sets</primary><secondary>ISO8859-1, description</secondary>
</indexterm> encoding is a single-byte encoding that is based on and is compatible
with other ISO, American National Standards Institute (ANSI), and European
Computer Manufacturers Association (ECMA) code extension techniques. The
ISO8859 encoding defines a family of code sets with each member containing
its own unique character sets. The 7-bit ASCII code set is a proper subset
of each of the code sets in the ISO8859 family.</para>
<para>The ISO8859-1 code set is called the ISO Latin-1 code set and consists
of two character sets:</para>
<itemizedlist remap="Bullet1"><listitem><para>ISO646-IRV Graphic Left, 7-bit
ASCII character set</para>
</listitem><listitem><para>ISO8859-1 Graphic Right (Latin) character set</para>
</listitem></itemizedlist>
<para>These character sets combined include the characters necessary for Western
European languages such as Danish, Dutch, English, Finnish, French, German,
Icelandic, Italian, Norwegian, Portuguese, Spanish, and Swedish.</para>
<para>While the ASCII code set defines an order for the English alphabet,
the Graphic Right (GR) characters are not ordered according to any specific
language. The language-specific ordering is defined by the locale.</para>
</sect3>
<sect3 id="IPG.distr.div.25">
<title>Other ISO8859<indexterm><primary>code sets</primary><secondary>ISO8859,
list of other</secondary></indexterm> Code Sets</title>
<para>This section lists the<indexterm><primary>ISO8859, other significant
code sets</primary></indexterm> other significant ISO8859 code sets. Each code
set includes the ASCII character set plus its own unique characters.</para>
<sect4 id="IPG.distr.div.26">
<title>ISO8859-2</title>
<para>Latin alphabet, No. 2, Eastern Europe</para>
<itemizedlist remap="Bullet1"><listitem><para>Albanian</para>
</listitem><listitem><para>Czech</para>
</listitem><listitem><para>English</para>
</listitem><listitem><para>German</para>
</listitem><listitem><para>Hungarian</para>
</listitem><listitem><para>Polish</para>
</listitem><listitem><para>Romanian</para>
</listitem><listitem><para>Serbo-Croatian</para>
</listitem><listitem><para>Slovak</para>
</listitem><listitem><para>Slovene</para>
</listitem></itemizedlist>
</sect4>
<sect4 id="IPG.distr.div.27">
<title>ISO8859-5</title>
<para>Latin/Cyrillic alphabet</para>
<itemizedlist remap="Bullet1"><listitem><para>Bulgarian</para>
</listitem><listitem><para>Byelorussian</para>
</listitem><listitem><para>English</para>
</listitem><listitem><para>Macedonian</para>
</listitem><listitem><para>Russian</para>
</listitem><listitem><para>Ukrainian</para>
</listitem></itemizedlist>
</sect4>
<sect4 id="IPG.distr.div.28">
<title>ISO8859-6</title>
<para>Latin/Arabic alphabet</para>
<itemizedlist remap="Bullet1"><listitem><para>English</para>
</listitem><listitem><para>Arabic</para>
</listitem></itemizedlist>
</sect4>
<sect4 id="IPG.distr.div.29">
<title>ISO8859-7</title>
<para>Latin/Greek alphabet</para>
<itemizedlist remap="Bullet1"><listitem><para>English</para>
</listitem><listitem><para>Greek</para>
</listitem></itemizedlist>
</sect4>
<sect4 id="IPG.distr.div.30">
<title>ISO8859-8</title>
<para>Latin/Hebrew alphabet</para>
<itemizedlist remap="Bullet1"><listitem><para>English</para>
</listitem><listitem><para>Hebrew</para>
</listitem></itemizedlist>
</sect4>
<sect4 id="IPG.distr.div.31">
<title>ISO8859-9</title>
<para>Latin/Turkish alphabet</para>
<itemizedlist remap="Bullet1"><listitem><para>Danish</para>
</listitem><listitem><para>Dutch</para>
</listitem><listitem><para>English</para>
</listitem><listitem><para>Finnish</para>
</listitem><listitem><para>French</para>
</listitem><listitem><para>German</para>
</listitem><listitem><para>Irish</para>
</listitem><listitem><para>Italian</para>
</listitem><listitem><para>Norwegian</para>
</listitem><listitem><para>Portuguese</para>
</listitem><listitem><para>Spanish</para>
</listitem><listitem><para>Swedish</para>
</listitem><listitem><para>Turkish</para>
</listitem></itemizedlist>
</sect4>
</sect3>
<sect3 id="IPG.distr.div.32">
<title>eucJP</title>
<para id="IPG.distr.mkr.8">The<indexterm><primary>eucJP code set</primary>
</indexterm> EUC<indexterm><primary>code sets</primary><secondary>eucJP,
description</secondary></indexterm> for Japanese consists of single-byte and
multibyte characters (2 and 3 bytes). The encoding conforms to ISO2022 and
is based on JIS and EUC definitions, see <!--Original XRef content: ''--><xref
role="CodeOrFigureOrTable" linkend="IPG.distr.mkr.8">.</para>
<table id="IPG.distr.tbl.2" frame="Topbot">
<title>Encoding for eucJP</title>
<tgroup cols="4" colsep="0" rowsep="0">
<colspec colwidth="1.01in">
<colspec colwidth="1.19in">
<colspec colwidth="1.50in">
<colspec colwidth="1.59in">
<tbody>
<row>
<entry align="left" valign="top"><para><Literal>CS</Literal></para></entry>
<entry align="left" valign="top"><para><literal>Encoding</literal></para></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para><literal>Character Set</literal></para></entry>
</row>
<row>
<entry align="left" valign="top"><para>cs0</para></entry>
<entry align="left" valign="top"><para>0xxxxxxx</para></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para>ASCII</para></entry></row>
<row>
<entry align="left" valign="top"><para>cs1</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>JIS X0208-1990</para></entry></row>
<row>
<entry align="left" valign="top"><para>cs2</para></entry>
<entry align="left" valign="top"><para>0x8E</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>JIS X0201-1976</para></entry></row>
<row>
<entry align="left" valign="top"><para>cs3</para></entry>
<entry align="left" valign="top"><para>0x8F</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx 1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>JIS X0212-1990</para></entry></row>
</tbody></tgroup></table>
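<para>The byte patterns in the table can be turned into a simple scanner. The
sketch below derives the length of each eucJP character from its first byte
and uses that to count the characters in a byte string. The function names
are hypothetical, and the accepted range for the first byte of a cs1
character (0xA1 through 0xFE) is an assumption that is not taken from the
table.</para>

```c
#include <stddef.h>

/* Length in bytes of the eucJP character that starts with lead,
 * or 0 if lead cannot start a character. */
size_t eucjp_char_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                  /* cs0: ASCII */
    if (lead == 0x8E) return 2;                 /* cs2: SS2 + 1 byte */
    if (lead == 0x8F) return 3;                 /* cs3: SS3 + 2 bytes */
    if (lead >= 0xA1 && lead <= 0xFE) return 2; /* cs1: 2 bytes (assumed range) */
    return 0;
}

/* Count the characters in s (len bytes).
 * Returns -1 if the data is malformed or truncated. */
long eucjp_count_chars(const unsigned char *s, size_t len)
{
    long n = 0;
    size_t i = 0;

    while (i < len) {
        size_t k = eucjp_char_len(s[i]);

        if (k == 0 || k > len - i)
            return -1;
        /* Every byte after the first must have its MSB set. */
        for (size_t j = 1; j < k; j++)
            if (s[i + j] < 0x80)
                return -1;
        i += k;
        n++;
    }
    return n;
}
```

<para>For instance, the eight bytes 0x41, 0xA4 0xA2, 0x8E 0xB1, and 0x8F 0xB0
0xA1 are counted as four characters, one from each of cs0, cs1, cs2, and cs3.
</para>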
<sect4 id="IPG.distr.div.33">
<title>JIS X0208-1990</title>
<para>A code of the Japanese graphic character set for information interchange
(1990 version) that contains 147 special characters, 10 numeric digits, 83
Hiragana characters, 86 Katakana characters, 52 Latin characters, 48 Greek
characters, 66 Cyrillic characters, 32 line-drawing elements, and 6355 Kanji
characters.</para>
</sect4>
<sect4 id="IPG.distr.div.34">
<title><emphasis role="Lead-in">JIS X0201</emphasis></title>
<para>A code for information interchange that contains 63 Katakana characters.
</para>
</sect4>
<sect4 id="IPG.distr.div.35">
<title><emphasis role="Lead-in">JIS X0212-1990</emphasis></title>
<para>A code of the supplementary Japanese graphic character set for information
interchange (1990 version) that contains 21 additional special characters,
21 additional Greek characters, 26 additional Cyrillic characters, 27 additional
Latin characters, 171 Latin characters with diacritical marks, and 5801
additional Kanji characters.</para>
</sect4>
</sect3>
<sect3 id="IPG.distr.div.36">
<title>eucTW</title>
<para id="IPG.distr.mkr.9">The EUC<indexterm><primary>code sets</primary>
<secondary>eucTW, description</secondary></indexterm> for<indexterm><primary>eucTW code set</primary></indexterm> Traditional Chinese is an encoding consisting
of single-byte and multibyte (2 and 4 bytes) characters.
The EUC encoding conforms to ISO2022 and is based on the Chinese National
Standard (CNS) as defined by the Republic of China and the EUC definition,
see <!--Original XRef content: 'Table&numsp;3&hyphen;4'--><xref role="CodeOrFigureOrTable"
linkend="IPG.distr.mkr.10">.</para>
<table id="IPG.distr.tbl.3" frame="Topbot">
<title id="IPG.distr.mkr.10">Encoding for eucTW</title>
<tgroup cols="5" colsep="0" rowsep="0">
<colspec colwidth="0.51in">
<colspec colwidth="1.05in">
<colspec colwidth="0.91in">
<colspec colwidth="1.04in">
<colspec colwidth="2.31in">
<tbody>
<row>
<entry align="left" valign="top"><para><Literal>CS</Literal></para></entry>
<entry align="left" valign="top"><para><literal>Encoding</literal></para></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para><literal>Character Set</literal></para></entry>
</row>
<row>
<entry align="left" valign="top"><para>cs0</para></entry>
<entry align="left" valign="top"><para>0xxxxxxx</para></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para>ASCII</para></entry></row>
<row>
<entry align="left" valign="top"><para>cs1</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para>CNS 11643-1992 - plane 1</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>cs2</para></entry>
<entry align="left" valign="top"><para>0x8EA2</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>CNS 11643-1992 - plane 2</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>cs3</para></entry>
<entry align="left" valign="top"><para>0x8EA3</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>CNS 11643-1992 - plane 3</para></entry>
</row>
<row>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><para>0x8EB0</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>1xxxxxxx</para></entry>
<entry align="left" valign="top"><para>CNS 11643-1992 - plane 16</para></entry>
</row></tbody></tgroup></table>
<para>CNS 11643-1992 defines 16 planes for the Chinese Standard Interchange
Code; each plane can support up to 8836 characters (94x94). Currently, only
planes 1 through 7 have characters assigned. <!--Original XRef content:
'Table&numsp;3&hyphen;5'--><xref role="CodeOrFigureOrTable" linkend="IPG.distr.mkr.11"><indexterm>
<primary>CNS character definitions</primary></indexterm> shows the 16 planes
of the CNS 11643-1992 standard.</para>
<table id="IPG.distr.tbl.4" frame="Topbot">
<title id="IPG.distr.mkr.11">16 Planes of the CNS 11643-1992 Standard</title>
<tgroup cols="4" colsep="0" rowsep="0">
<colspec colname="col1" colwidth="0.67in">
<colspec colwidth="1.83in">
<colspec colwidth="1.08in">
<colspec colname="col4" colwidth="2.02in">
<spanspec nameend="col4" namest="col1" spanname="1to4">
<thead>
<row><entry align="left" valign="bottom"><para><literal>Plane</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>Definition</literal></para></entry>
<entry align="left" valign="bottom"><para><literal># of Characters</literal></para></entry>
<entry align="left" valign="bottom"><para><literal>EUC Encoding</literal></para></entry>
</row></thead>
<tbody>
<row>
<entry align="left" valign="top"><para>1</para></entry>
<entry align="left" valign="top"><para>Most frequently used</para></entry>
<entry align="left" valign="top"><para>6085</para></entry>
<entry align="left" valign="top"><para>A1A1-FDCB</para></entry></row>
<row>
<entry align="left" valign="top"><para>2</para></entry>
<entry align="left" valign="top"><para>Secondary frequently used</para></entry>
<entry align="left" valign="top"><para>7650</para></entry>
<entry align="left" valign="top"><para>8EA2 A1A1 - 8EA2 F2C4</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>3</para></entry>
<entry align="left" valign="top"><para>Exec.Yuen EDP <superscript>1</superscript>
center</para></entry>
<entry align="left" valign="top"><para>6148</para></entry>
<entry align="left" valign="top"><para>8EA3 A1A1 - 8EA3 E2C6</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>4</para></entry>
<entry align="left" valign="top"><para>RIS<superscript>2</superscript>, Vendor
defined</para></entry>
<entry align="left" valign="top"><para>7298</para></entry>
<entry align="left" valign="top"><para>8EA4 A1A1 - 8EA4 EEDC</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>5</para></entry>
<entry align="left" valign="top"><para>Rarely used by MOE<superscript>3</superscript></para></entry>
<entry align="left" valign="top"><para>8603</para></entry>
<entry align="left" valign="top"><para>8EA5 A1A1 - 8EA5 FCD1</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>6</para></entry>
<entry align="left" valign="top"><para>Variation char set 1 by MOE</para></entry>
<entry align="left" valign="top"><para>6388</para></entry>
<entry align="left" valign="top"><para>8EA6 A1A1 - 8EA6 E4FA</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>7</para></entry>
<entry align="left" valign="top"><para>Variation char set 2 by MOE</para></entry>
<entry align="left" valign="top"><para>6539</para></entry>
<entry align="left" valign="top"><para>8EA7 A1A1 - 8EA7 E6D5</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>8</para></entry>
<entry align="left" valign="top"><para>Undefined</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EA8 A1A1 - 8EA8 FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>9</para></entry>
<entry align="left" valign="top"><para>Undefined</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EA9 A1A1 - 8EA9 FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>10</para></entry>
<entry align="left" valign="top"><para>Undefined</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAA A1A1 - 8EAA FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>11</para></entry>
<entry align="left" valign="top"><para>Undefined</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAB A1A1 - 8EAB FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>12</para></entry>
<entry align="left" valign="top"><para>User Defined Character (UDC)</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAC A1A1 - 8EAC FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>13</para></entry>
<entry align="left" valign="top"><para>UDC</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAD A1A1 - 8EAD FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>14</para></entry>
<entry align="left" valign="top"><para>UDC</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAE A1A1 - 8EAE FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>15</para></entry>
<entry align="left" valign="top"><para>UDC</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EAF A1A1 - 8EAF FEFE</para></entry>
</row>
<row>
<entry align="left" valign="top"><para>16</para></entry>
<entry align="left" valign="top"><para>UDC</para></entry>
<entry align="left" valign="top"><para>0</para></entry>
<entry align="left" valign="top"><para>8EB0 A1A1 - 8EB0 FEFE</para></entry>
</row>
<row>
<entry align="left" spanname="1to4" valign="top"><para><superscript>1</superscript>
EDP: Center of the Directorate-General of Budget, Accounting, and Statistics
</para></entry></row>
<row>
<entry align="left" spanname="1to4" valign="top"><para><superscript>2</superscript>
RIS: Residence Information System</para></entry></row>
<row>
<entry align="left" spanname="1to4" valign="top"><para><superscript>3</superscript>
MOE: Ministry of Education</para></entry></row></tbody></tgroup></table>
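<para>The EUC Encoding column suggests a direct way to recover the plane
number from an eucTW character: a 2-byte character belongs to plane 1, and in
a 4-byte character the byte after 0x8E gives the plane as an offset from
0xA0. The sketch below follows only the ranges shown in the tables of this
section. The names are hypothetical.</para>

```c
#include <stddef.h>

/* Capacity of one CNS 11643-1992 plane: 94 rows of 94 characters. */
enum { CNS_PLANE_CAPACITY = 94 * 94 };

/* Hypothetical helper: plane number (1-16) of the eucTW character that
 * starts at s, or 0 if the bytes do not match either form in the table. */
int euctw_plane(const unsigned char *s, size_t len)
{
    /* 2-byte form: both bytes in the Graphic Right area, plane 1. */
    if (len >= 2 && s[0] >= 0xA1 && s[0] <= 0xFE && s[1] >= 0xA1)
        return 1;
    /* 4-byte form: 0x8E, then 0xA2 (plane 2) through 0xB0 (plane 16). */
    if (len >= 4 && s[0] == 0x8E && s[1] >= 0xA2 && s[1] <= 0xB0)
        return s[1] - 0xA0;
    return 0;
}
```

<para>Under these assumptions, the sequence 0x8E 0xA2 0xA1 0xA1 is reported
as plane 2 and 0x8E 0xB0 0xA1 0xA1 as plane 16.</para>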
</sect3>
<sect3 id="IPG.distr.div.37">
<title>eucKR</title>
<para>The EUC<indexterm><primary>code sets</primary><secondary>eucKR, description</secondary></indexterm> for Korean is<indexterm><primary>eucKR code set</primary></indexterm> an encoding consisting of single-byte and multibyte
characters (shown in <!--Original XRef content: 'Table&numsp;3&hyphen;6'--><xref
role="CodeOrFigureOrTable" linkend="IPG.distr.mkr.12">). The encoding conforms
to ISO2022 and is based on the Korean Standard Code (KSC) set and EUC definitions.
</para>
<table id="IPG.distr.tbl.5" frame="Topbot">
<title id="IPG.distr.mkr.12">Encoding for eucKR</title>
<tgroup cols="4">
<colspec colname="1" colwidth="1.24132 in">
<colspec colname="2" colwidth="1.24132 in">
<colspec colname="3" colwidth="1.24132 in">
<colspec colname="4" colwidth="1.24132 in">
<thead>
<row><entry><para><Literal>CS</Literal></para></entry><entry><para><literal>Encoding</literal></para></entry><entry></entry><entry><para><literal>Character
Set</literal></para></entry></row></thead>
<tbody>
<row>
<entry><para>cs0</para></entry>
<entry><para>0xxxxxxx</para></entry>
<entry></entry>
<entry><para>ASCII</para></entry></row>
<row>
<entry><para>cs1</para></entry>
<entry><para>1xxxxxxx</para></entry>
<entry><para>1xxxxxxx</para></entry>
<entry><para>KS C 5601-1992</para></entry></row>
<row>
<entry><para>cs2</para></entry>
<entry></entry>
<entry></entry>
<entry><para>Not used</para></entry></row>
<row>
<entry><para>cs3</para></entry>
<entry></entry>
<entry></entry>
<entry><para>Not used</para></entry></row></tbody></tgroup></table>
<para>KSC 5601-1992 (code of the Korean character set for information interchange,
1992 version) contains 432 special characters, 30 Arabic and Roman numeral
characters, 94 Hangul alphabet characters, 52 Roman characters, 48 Greek
characters, 27 Latin characters, 169 Japanese characters, 66 Russian characters,
68 line-drawing elements, 2344 precomposed Hangul characters, and 4888 Hanja
characters.</para>
<para>The Hangul characters represent the sounds of Korean words. Each
Hangul character is composed of one to three of the Hangul elementary
phonetic signs: an initial consonant (if any), a vowel, and a final consonant
(if any). Many Korean words can also be written with Traditional Chinese
characters (called Hanja in Korean). Traditionally, Korean texts were
generally written in a mixture of Hangul and Hanja: Hanja for the main words
(nouns, verbs, modifiers) and Hangul for the particles and grammatical inflections.
Today, most Korean texts are written purely in Hangul, although
personal names may still be written in Hanja.</para>
</sect3>
</sect2>
</sect1>
</chapter>
<!--fickle 1.14 mif-to-docbook 1.7 01/02/96 04:19:51-->

View File

@@ -0,0 +1,633 @@
<!-- $XConsortium: ch04.sgm /main/8 1996/09/08 19:39:21 rws $ -->
<!-- (c) Copyright 1995 Digital Equipment Corporation. -->
<!-- (c) Copyright 1995 Hewlett-Packard Company. -->
<!-- (c) Copyright 1995 International Business Machines Corp. -->
<!-- (c) Copyright 1995 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1995 Novell, Inc. -->
<!-- (c) Copyright 1995 FUJITSU LIMITED. -->
<!-- (c) Copyright 1995 Hitachi. -->
<chapter id="IPG.motif.div.1">
<title id="IPG.motif.mkr.1">Xt, Xlib, and Motif Dependencies</title>
<para>For information on Xt and Xlib dependencies, refer to Chapter 13 of <citetitle>Xlib &mdash; C Language Interface</citetitle>.</para>
<para>The rest of this chapter discusses tasks related to internationalizing
with Motif.</para>
<sect1 id="IPG.motif.div.2">
<title id="IPG.motif.mkr.2">Locale Management<indexterm><primary>language
environment</primary><secondary>description</secondary></indexterm><indexterm>
<primary>environment, language</primary></indexterm></title>
<para>The term <emphasis>language environment</emphasis> refers to the set
of localized data that the application needs to run correctly in the user-specified
locale. A language environment supplies the rules associated with a specific
language. In addition, the language environment consists of any externally
stored data, such as localized strings or text used by the application. For
example, the menu items displayed by an application might be stored in separate
files for each language supported by the application. This type of data can
be stored in resource files, User Interface Definition (UID) files, or message
catalogs (on XPG3-compliant systems).</para>
<para>A single language environment is established when an application runs.
The language environment in which an application operates is specified by
the application user, often either by setting an environment variable (<systemitem>LANG</systemitem> or <systemitem>LC_*</systemitem> on POSIX-based systems)
or by setting the xnlLanguage resource. The application then sets the language
environment based on the user's specification. The application can do this
by using the <computeroutput>setlocale()</computeroutput> function in a language
procedure established by the <computeroutput>XtSetLanguageProc()</computeroutput>
function. This causes Xt to cache a per-display language string that is used
by the <computeroutput>XtResolvePathname()</computeroutput> function to find
resource, bitmap, and User Interface Language (UIL) files.</para>
<para>An application that supplies a language procedure can either provide
its own procedure or use an Xt<indexterm><primary>language procedure</primary>
</indexterm> default procedure. In either case, the application establishes
the language procedure by calling the <computeroutput>XtSetLanguageProc()</computeroutput> function before initializing the toolkit and before loading
the resource databases (such as by calling the <computeroutput>XtAppInitialize()</computeroutput> function). When a language procedure is installed, Xt calls
it in the process of constructing the initial resource database. Xt uses
the value returned by the language procedure as its per-display language
string.</para>
<para>The default language procedure performs the following tasks:</para>
<itemizedlist remap="Bullet1"><listitem><para>Sets the locale. This is done
by using:</para>
<programlisting>setlocale(LC_ALL, <symbol role="Variable">language</symbol>);</programlisting>
<para>where <symbol role="Variable">language</symbol> is the value of the <systemitem>xnlLanguage</systemitem> resource, or the empty string (&ldquo;&rdquo;) if
the <systemitem>xnlLanguage</systemitem> resource is not set. When the <systemitem>xnlLanguage</systemitem> resource is not set, the locale is generally derived
from an environment variable (<systemitem>LANG</systemitem> on POSIX-based
systems).</para>
</listitem><listitem><para>Calls the <computeroutput>XSupportsLocale()</computeroutput>
function to verify that the locale just set is supported. If not, a warning
message is issued and the locale is set to C.</para>
</listitem><listitem><para>Calls the <computeroutput>XSetLocaleModifiers()</computeroutput> function specifying the empty string.</para>
</listitem><listitem><para>Returns the value of the current locale. On ANSI
C-based systems, this is the result of calling:</para>
<programlisting>setlocale(LC_ALL, NULL);</programlisting>
</listitem></itemizedlist>
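<para>The locale-related steps above can be sketched in standard C. This is
only an illustration, not the Xt implementation: the function name <computeroutput>sketch_language_proc</computeroutput>
is hypothetical, and the <computeroutput>XSupportsLocale()</computeroutput> and <computeroutput>XSetLocaleModifiers()</computeroutput>
steps are shown as comments so that the sketch compiles without the X libraries.</para>

```c
#include <locale.h>
#include <stddef.h>

/*
 * Sketch of the locale-related steps of a default-style language procedure.
 * Hypothetical helper: it uses only standard C, so the XSupportsLocale() and
 * XSetLocaleModifiers() steps are marked as comments instead of being called.
 */
const char *sketch_language_proc(const char *xnl_language)
{
    /* Step 1: use the xnlLanguage value, or "" so the environment decides. */
    const char *language = (xnl_language != NULL) ? xnl_language : "";

    if (setlocale(LC_ALL, language) == NULL) {
        /* Step 2 (approximation): a locale that cannot be set falls back
         * to "C". An X client would also consult XSupportsLocale() here
         * and issue a warning. */
        setlocale(LC_ALL, "C");
    }

    /* Step 3: an X client would call XSetLocaleModifiers("") here. */

    /* Step 4: return the name of the locale now in effect. */
    return setlocale(LC_ALL, NULL);
}
```

<para>Calling <computeroutput>sketch_language_proc(NULL)</computeroutput> mirrors
the case in which the <systemitem>xnlLanguage</systemitem> resource is not set, so
the result depends on the environment of the calling process.</para>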
<para>The application can use the default language procedure by making the
call to the <computeroutput>XtSetLanguageProc()</computeroutput> function
in the following manner:</para>
<programlisting>XtSetLanguageProc(NULL, NULL, NULL);
...
toplevel = XtAppInitialize(...);</programlisting>
<para>By default, Xt does not install any language procedure. If the application
does not call the <computeroutput>XtSetLanguageProc()</computeroutput> function,
Xt uses as its per-display language string the value of the <systemitem>xnlLanguage</systemitem> resource if it is set. If the <systemitem>xnlLanguage</systemitem> resource is not set, Xt derives the language string from the <systemitem>LANG</systemitem> environment variable. <literal><indexterm><primary>XtSetLanguageProc</primary><secondary>default language</secondary></indexterm></literal></para>
<note>
<para>The per-display language string that results from this process is implementation-dependent,
and Xt provides no public means of examining the language string once it
is established.</para>
</note>
<para>By supplying its own language procedure, an application can use any
procedure it wants for setting the language string.</para>
</sect1>
<sect1 id="IPG.motif.div.3">
<title id="IPG.motif.mkr.3">Font Management</title>
<para>The desktop uses render tables to display text. A render table is a tagged
collection of renditions, each of which specifies the data used in rendering
compound strings. For information on renditions and render tables, refer
to &MotifProgGd;.</para>
</sect1>
<sect1 id="IPG.motif.div.11">
<title id="IPG.motif.mkr.5">Drawing Localized Text<indexterm><primary>compound
strings</primary><secondary>for international text display</secondary></indexterm><indexterm>
<primary>compound strings</primary><secondary>structures, interaction with
render tables</secondary></indexterm></title>
<para>A compound string (type <classname>XmString</classname>) is a means of
encoding text so that it can be displayed in many different fonts without
changing anything in the program. A rendition, which is identified by a rendition
tag, specifies the font (and other characteristics, such as color) with which
the compound string with that rendition tag is to be rendered.</para>
<para>Especially useful for internationalization purposes are render tables,
which are collections of renditions. Among the renditions in the table may
be one tagged <systemitem class="constant">_MOTIF_DEFAULT_LOCALE</systemitem>,
which is the rendition used for the current locale. For internationalized
applications, render tables should be specified in resource files.</para>
<para>The foregoing discussion provides only a brief overview of some subjects
related to drawing localized text; for complete information, refer to &MotifProgGd;.
</para>
</sect1>
<sect1 id="IPG.motif.div.18">
<title id="IPG.motif.mkr.11">Inputting Localized Text<indexterm><primary>VendorShell widget class</primary><secondary>as input manager</secondary></indexterm><indexterm>
<primary>input method</primary><secondary>VendorShell widget class</secondary>
</indexterm><indexterm><primary>VendorShell widget class</primary><secondary>geometry management</secondary></indexterm></title>
<para>In the system environment, the <computeroutput>VendorShell</computeroutput>
widget class is enhanced to provide the interface to the input method. While
the VendorShell class controls only one child widget in its geometry management,
an extension has been added to the VendorShell class to enhance it for managing
all components necessary in the interface to an input method. These components
include the status area, preedit area, and the MainWindow area.<indexterm>
<primary>VendorShell widget class</primary><secondary>managing components</secondary><tertiary>status area</tertiary></indexterm><indexterm><primary>VendorShell widget class</primary><secondary>managing components</secondary>
<tertiary>preedit area</tertiary></indexterm><indexterm><primary>VendorShell
widget class</primary><secondary>managing components</secondary><tertiary>MainWindow area</tertiary></indexterm></para>
<para>When the input method requires a status area or a preedit area or both,
the <computeroutput>VendorShell</computeroutput> widget automatically instantiates
the status and preedit areas and manages their geometry layout. Any status
area or preedit area is managed by the <computeroutput>VendorShell</computeroutput> widget internally and is not accessible by the client. The widget
instantiated as the child of the <computeroutput>VendorShell</computeroutput>
widget is called the MainWindow area.<indexterm><primary>input method</primary>
<secondary>requirements</secondary></indexterm><indexterm><primary>international
text input</primary><secondary>input methods</secondary></indexterm><indexterm>
<primary>input method</primary><secondary>international text input</secondary>
</indexterm></para>
<para>The input method to be used by the <computeroutput>VendorShell</computeroutput>
widget is determined by the <systemitem>XmNinputMethod</systemitem> resource;
for example, <computeroutput>@im=alt</computeroutput>. The default value,
NULL, selects the default input method associated with the locale in effect
when the VendorShell is created. As such, the user can affect which
input method is selected by either setting the locale, setting the <systemitem>XmNinputMethod</systemitem> resource, or setting both. The locale name is
concatenated with the <systemitem>XmNinputMethod</systemitem> resource to
determine the input method name. The locale name must not be specified in
this resource. The modifier name for the <systemitem>XmNinputMethod</systemitem>
resource needs to be in the form <computeroutput>@im=</computeroutput><symbol role="Variable">modifier</symbol>, where modifier is the string used to qualify
which input method is selected.<indexterm><primary>input method</primary>
<secondary>determining, XmNinputMethod resource</secondary></indexterm><indexterm>
<primary>XmNinputMethod resource, determining input method</primary></indexterm></para>
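<para>The concatenation rule can be illustrated with a small, self-contained
C sketch. The helper name <computeroutput>build_im_name</computeroutput> is hypothetical
and is not part of Motif; the sketch also assumes that the locale name comes
first and the modifier follows it unchanged.</para>

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: joins a locale name and an XmNinputMethod-style
 * modifier ("@im=<modifier>") into one input method name.
 * Returns 0 on success, -1 if the modifier is malformed or the buffer
 * is too small. A NULL or empty modifier yields the bare locale name.
 */
int build_im_name(const char *locale, const char *im_modifier,
                  char *out, size_t out_size)
{
    int written;

    if (locale == NULL || out == NULL || out_size == 0)
        return -1;

    if (im_modifier == NULL || im_modifier[0] == '\0') {
        written = snprintf(out, out_size, "%s", locale);
    } else {
        /* The value must start with "@im=" and must not repeat the locale. */
        if (strncmp(im_modifier, "@im=", 4) != 0 || im_modifier[4] == '\0')
            return -1;
        written = snprintf(out, out_size, "%s%s", locale, im_modifier);
    }

    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

<para>With a locale of <computeroutput>ja_JP</computeroutput> and a resource value
of <computeroutput>@im=alt</computeroutput>, the sketch produces <computeroutput>ja_JP@im=alt</computeroutput>;
a value that already contains the locale name is rejected.</para>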
<para>The <computeroutput>VendorShell</computeroutput> widget can support
multiple widgets that can share the input method. However, only one widget can
have the keyboard focus (that is, receive key press events and send them
to an input method) at any given time. To support multiple widgets (such
as <computeroutput>Text</computeroutput> widgets), the widgets need to be
descendants of the <computeroutput>VendorShell</computeroutput> widget.</para>
<note>
<para>The <computeroutput>VendorShell</computeroutput> widget class is a superclass
of the <computeroutput>TransientShell</computeroutput> and <computeroutput>TopLevelShell</computeroutput> widget classes. As such, an instantiation
of a <computeroutput>TopLevelShell</computeroutput> or a <computeroutput>DialogShell</computeroutput> is essentially an instantiation of a <computeroutput>VendorShell</computeroutput> widget class.</para>
</note>
<para>The <computeroutput>VendorShell</computeroutput> widget behaves as an
input manager only if one of its descendants is an <computeroutput>XmText[Field]</computeroutput> instance. As soon as an <computeroutput>XmText[Field]</computeroutput>
instance is created as a descendant of the <computeroutput>VendorShell</computeroutput>
widget, <computeroutput>VendorShell</computeroutput> creates the necessary
areas required by the particular input methods dictated by the current locale.
Even if an <computeroutput>XmText[Field]</computeroutput> instance is not
mapped but just created, VendorShell has the geometry management behavior
as described previously.</para>
<para>A <computeroutput>VendorShell</computeroutput> widget does the following:<indexterm>
<primary>international text input</primary><secondary>VendorShell widget operations</secondary></indexterm></para>
<itemizedlist remap="Bullet1"><listitem><para>Enables applications to process
multibyte character input and output that is supported by the locales installed
in the system.<indexterm><primary>VendorShell widget operations</primary>
<secondary>processing multibyte character I/O</secondary></indexterm></para>
</listitem><listitem><para>Manages an input method instance as defined in
the <computeroutput>XmIm</computeroutput> reference functions.</para>
</listitem><listitem><para>Supports preediting within a preedit area in
OnTheSpot, OffTheSpot, OverTheSpot, Root, or None mode. Localized text can be entered
into any <computeroutput>Text</computeroutput> widget in a widget tree that contains several <computeroutput>Text</computeroutput> widgets by moving the focus to that widget.</para>
</listitem><listitem><para>Provides geometry management for descendant child
widgets.</para>
</listitem></itemizedlist>
<note>
<para>Input method interactions may also be customized by users through a
dialog box that is invoked from the Style Manager application. Refer to the <emphasis>CDE User's Guide</emphasis> for more information.</para>
</note>
<sect2 id="IPG.motif.div.19">
<title>Geometry Management<indexterm><primary>international text input</primary>
<secondary>geometry management</secondary></indexterm><indexterm><primary>geometry management</primary><secondary>international text input</secondary>
</indexterm></title>
<para>The <computeroutput>VendorShell</computeroutput> widget provides geometry
management and focus management for the input method's user interface components,
as necessary. If the locale warrants it (for example, if the locale is a
Japanese Extended UNIX Code (EUC) locale), the <computeroutput>VendorShell</computeroutput> widget automatically allocates and manages the geometry
of any required preedit area or status area or both.</para>
<para>Depending on the current preediting being done, an auxiliary area may
be required. If so, the <computeroutput>VendorShell</computeroutput> widget
also instantiates and manages the auxiliary area. Typically, the child of
the <computeroutput>VendorShell</computeroutput> widget is a container widget
(such as the <computeroutput>XmBulletinBoard</computeroutput> or <computeroutput>XmRowColumn</computeroutput> widgets) that can manage
multiple <computeroutput>Text</computeroutput> and <computeroutput>TextField</computeroutput> widgets, which allow multibyte character
input from the user. In this scenario, all <computeroutput>Text</computeroutput> widgets share the same input method.</para>
<note>
<para><indexterm><primary>geometry management</primary><secondary>XmBulletinBoard
widget</secondary></indexterm><indexterm><primary>geometry management</primary>
<secondary>XmRowColumn widget</secondary></indexterm><indexterm><primary>geometry management</primary><secondary>Text widget</secondary></indexterm><indexterm>
<primary>geometry management</primary><secondary>TextField widget</secondary>
</indexterm><indexterm><primary>international text input</primary><secondary>multibyte characters</secondary></indexterm><indexterm><primary>input method</primary><secondary>Text widget</secondary></indexterm><indexterm><primary>Text widgets, input method</primary></indexterm>The status, preedit, and auxiliary
areas are not accessible to the application programmer. For example, it is
not intended for the application programmer to access the window ID of the
status area. The user does not need to worry about the instantiation or management
of these components as they are managed as required by the <computeroutput>VendorShell</computeroutput> widget class.</para>
</note>
<para>The application programmer has some control over the behavior of the
input method user interface components through the <systemitem>XmNpreeditType</systemitem> resource of the <computeroutput>VendorShell</computeroutput>
widget class.<indexterm><primary>geometry management</primary><secondary>application programmer controls</secondary></indexterm><indexterm><primary>application programmer, controlling input method components</primary></indexterm>
(The <literal>OffTheSpot</literal>, <literal>OnTheSpot</literal>, and <literal>OverTheSpot</literal> modes are described elsewhere in this manual.)</para>
<para>Geometry management extends to all input method user interface components.
When the application program window (a <computeroutput>TopLevelShell</computeroutput>
widget) is resized, the input method user interface components are resized
accordingly, and the preedited strings in them are rearranged as required.
Of course, this assumes that the shell window has a resize policy of True.
</para>
<para>When the <computeroutput>VendorShell</computeroutput> widget is created,
if a specific input method requires a status area, preedit area, or both,
the size of the VendorShell accounts for the areas required by these components.
The extra areas required by the preedit and status areas are part of the <computeroutput>VendorShell</computeroutput> widget's area. They are also
managed by the <computeroutput>VendorShell</computeroutput> widget, if resizing
is necessary.</para>
<para>Because of the potential instantiation of these areas (status and preedit),
depending on the input method currently being used, the size of the <computeroutput>VendorShell</computeroutput> widget area does not necessarily grow or shrink
to accommodate exactly the size of its child. The size of the <computeroutput>VendorShell</computeroutput> widget area grows or shrinks to accommodate
both its child's geometry <emphasis>and</emphasis> the geometry of these
input method user interface areas. There may be a difference (for example,
of 20 pixels) in height between the <computeroutput>VendorShell</computeroutput>
widget and its child widget (the MainWindow area). The width geometry is <emphasis>not</emphasis> affected by the input method user interface
components.<indexterm><primary>VendorShell widget class</primary><secondary>size</secondary></indexterm><indexterm><primary>VendorShell widget class</primary><secondary>child widget size</secondary></indexterm></para>
<para>In summary, the requested size of the child is honored if possible;
the actual size of the <computeroutput>VendorShell</computeroutput> may be
larger than its child.</para>
<para>Geometry requests for the <computeroutput>VendorShell</computeroutput> widget and its child are honored as long as they do not
conflict with each other and are within the <computeroutput>VendorShell</computeroutput> widget's ability to resize. When they do conflict,
the child widget's geometry request takes precedence. For example, if
the size of the child widget is specified as 100x100 and the size of VendorShell
is also specified as 100x100, the resulting VendorShell has a size of 100x120
(assuming the input method areas need 20 pixels of height),
while its child widget gets a size of 100x100. If the size of the child widget
is not specified, the VendorShell shrinks its child widget if necessary to
honor its own size specification. For example, if the size of VendorShell
is specified as 100x100 and no size is specified for its child, the child
widget has a size of 100x80. If the <computeroutput>VendorShell</computeroutput> widget is disabled from resizing, regardless of what the geometry
request of its child is, the <computeroutput>VendorShell</computeroutput>
widget honors only its own geometry specification.</para>
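<para>The precedence rules in this section can be modeled with a short C sketch.
The types and the function <computeroutput>resolve_geometry</computeroutput> are
hypothetical and are not Motif structures; the sketch treats a size of 0 as
not specified and assumes that a shell that cannot resize gives its child
whatever height remains after the input method areas.</para>

```c
#include <stdbool.h>

/* Hypothetical types for illustration; these are not Motif structures. */
typedef struct { int width; int height; } Size;
typedef struct { Size shell; Size child; } Layout;

/* A size counts as specified only when both dimensions are positive. */
static bool is_specified(Size s)
{
    return s.width > 0 && s.height > 0;
}

/*
 * Resolve shell and child sizes following the precedence described above.
 * im_height is the extra height taken by the status and preedit areas.
 */
Layout resolve_geometry(Size shell_req, Size child_req,
                        int im_height, bool shell_can_resize)
{
    Layout out;

    if (is_specified(child_req) &&
        (shell_can_resize || !is_specified(shell_req))) {
        /* The child request wins: the shell grows by the IM area height. */
        out.child = child_req;
        out.shell.width = child_req.width;
        out.shell.height = child_req.height + im_height;
    } else {
        /* The shell keeps its own size; the child gets what remains. */
        out.shell = shell_req;
        out.child.width = shell_req.width;
        out.child.height = shell_req.height > im_height
                               ? shell_req.height - im_height
                               : 0;
    }
    return out;
}
```

<para>With an input method area height of 20, a 100x100 request for both widgets
yields a 100x120 shell and a 100x100 child, and a 100x100 shell with no child
size yields a 100x80 child, matching the examples in the text.</para>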
</sect2>
<sect2 id="IPG.motif.div.20">
<title>Focus Management<indexterm><primary>international text input</primary>
<secondary>focus management</secondary></indexterm><indexterm><primary>VendorShell
widget class</primary><secondary>focus management</secondary></indexterm><indexterm>
<primary>focus management</primary><secondary>international text input</secondary>
</indexterm></title>
<para>Languages with large numbers of characters (such as Japanese and Chinese)
require an input method that allows the user to compose characters in that
language interactively.<indexterm><primary>input method</primary><secondary>multibyte characters</secondary></indexterm> This is because, for these languages,
there are many more characters than can be reasonably mapped to a terminal
keyboard.</para>
<para>The interactive process of composing characters in such languages is
called <emphasis>preediting</emphasis>. The preediting itself is handled
by the input method. However, the user interface of the preediting is determined
by the system environment. An interface needs to exist between the input
method and the system environment. This is done through the <computeroutput>VendorShell</computeroutput> widget of the system environment.<indexterm>
<primary>preediting</primary></indexterm><indexterm><primary>VendorShell
widget class</primary><secondary>as interface</secondary>
</indexterm><indexterm><primary>interfaces</primary><secondary>between input method and Common Desktop Environment</secondary>
</indexterm><indexterm><primary>input method</primary><secondary>Common Desktop
Environment interface</secondary></indexterm><indexterm><primary>Common Desktop
Environment</primary><secondary>input method interface</secondary></indexterm><indexterm>
<primary>preediting</primary></indexterm></para>
<para><!--Original XRef content: 'Figure&numsp;4&hyphen;3'--><xref role="CodeOrFigureOrTable"
linkend="IPG.motif.mkr.12"> illustrates a case with Japanese preediting. The
string shown in reverse video is the string in preediting. This string can
be moved across different windows by giving focus to the particular window.
However, only one preediting session can occur at one time.</para>
<figure>
<title id="IPG.motif.mkr.12">Japanese preediting example</title>
<graphic id="IPG.motif.grph.3" entityref="IPG.motif.fig.3"></graphic>
</figure>
<para>For an example of focus management, suppose a <computeroutput>TopLevelShell</computeroutput> widget (a subclass of the <computeroutput>VendorShell</computeroutput>
widget) has an <computeroutput>XmBulletinBoard</computeroutput> widget child
(MainWindow area), which has five <computeroutput>XmText</computeroutput>
widgets as children. Assume the locale requires the preedit area, and assume
the OverTheSpot mode is specified. Because the <computeroutput>VendorShell</computeroutput> widget manages only one instance of an input method, you
can run only one preedit area at a time inside the <computeroutput>TopLevelShell</computeroutput> widget. If the focus is moved from one <computeroutput>Text</computeroutput> widget to another, the current preedit string under
construction is also moved on top of the <computeroutput>Text</computeroutput>
widget that currently has focus. Key processing for the old <computeroutput>Text</computeroutput> widget is suspended temporarily. Subsequent interaction
with the input method, such as the delivery of the string at preedit completion,
is directed to the newly focused <computeroutput>Text</computeroutput> widget.<indexterm>
<primary>focus management</primary><secondary>example description</secondary>
</indexterm></para>
<para>The string being preedited can be moved to the location of the focus;
for example, by clicking the mouse.</para>
<para>A string that the end user is finished preediting and that is already
confirmed <emphasis>cannot</emphasis> be reconverted. Once the string is
composed, it is committed. Refer to Chapter 11 of &MotifProgGd; for information
on actions that cause a preedit string to be committed.</para>
</sect2>
</sect1>
<sect1 id="IPG.motif.div.21">
<title id="IPG.motif.mkr.13">Internationalized User Interface Language<indexterm>
<primary>National Language Support</primary><secondary>User Interface Language
(UIL)</secondary></indexterm><indexterm><primary>User Interface Language
(UIL), see UIL &lt;$nopage></primary>
</indexterm></title>
<para>The capability to parse a multibyte character string as a string literal
has been added to the User Interface Language (UIL).<indexterm><primary>programming for international use</primary><secondary>UIL</secondary><tertiary>parsing multibyte character string</tertiary></indexterm> A UIL
file is written according to the characteristics of the target language and
then compiled into a User Interface Definition (UID) file.</para>
<sect2 id="IPG.motif.div.22">
<title>Programming for Internationalized User Interface Language<indexterm>
<primary>locales</primary><secondary>UIL compiler</secondary></indexterm></title>
<para>The UIL compiler parses nonstandard charsets as locale text. This requires
the UIL compiler to be run in the same locale as any locale text.<indexterm>
<primary>programming for international use</primary><secondary>UIL</secondary>
<tertiary>parsing nonstandard charsets</tertiary></indexterm><indexterm>
<primary>programming for international use</primary><secondary>UIL</secondary>
<tertiary>locale text</tertiary></indexterm></para>
<para>If the locale text of a widget requires a font set (more than one font),
the fonts must be specified within the resource file as a render table.</para>
<para>To use a specific language with UIL, a UIL file is written according
to characteristics of the target language and compiled into a UID file. The
UIL file that contains localized text needs to be compiled in the locale
in which it is to run.</para>
<sect3 id="IPG.motif.div.23">
<title>String Literals<indexterm><primary>programming for international use</primary><secondary>UIL</secondary><tertiary>string literals</tertiary></indexterm><indexterm>
<primary>string literals</primary><secondary>programming for international
UIL</secondary></indexterm></title>
<para>The following shows examples of literal strings. The cur_charset value
is always set to the default_charset value, which allows the string literal
to contain locale text.</para>
<para>To set locale text in the string literal with the default_charset value,
enter the following:</para>
<programlisting>XmNlabelString = '<symbol>XXXXXX</symbol>';</programlisting>
<para>OR</para>
<programlisting>XmNlabelString = #default_charset"<symbol>XXXXXX</symbol>";
</programlisting>
<para>Compile the UIL file with the <systemitem>LANG</systemitem> environment
variable matching the encoding of the locale text. Otherwise, the string
literal is not compiled properly.</para>
</sect3>
<sect3 id="IPG.motif.div.24">
<title>Font<indexterm><primary>font sets</primary><secondary>programming
for international UIL</secondary></indexterm> Sets<indexterm><primary>programming
for international use</primary><secondary>UIL</secondary></indexterm><indexterm>
<primary>programming for international UIL</primary></indexterm><indexterm>
<primary>font sets</primary><secondary>programming for international UIL</secondary>
</indexterm></title>
<para>The font set cannot be set through UIL source programming. Whenever
the font set is required, you must set it in the resource file as a render
table resource. Refer to &MotifProgGd; for more information.</para>
</sect3>
<sect3 id="IPG.motif.div.25">
<title>Font Lists<indexterm><primary>font lists in UIL, specifying resources
for</primary></indexterm></title>
<para>Like font sets, font lists are specified in resource files as render
tables. Refer to &MotifProgGd; for detailed information.</para>
</sect3>
<sect3 id="IPG.motif.div.25a">
<title>Render Tables<indexterm><primary>render tables in UIL, specifying resources
for</primary></indexterm></title>
<para>Render tables, as well as renditions, tab lists,
and tab stops, are implemented as a special class of objects.
Refer to &MotifProgGd; for detailed information.</para>
</sect3>
<sect3 id="IPG.motif.div.26">
<title>Creating<indexterm><primary>resource files</primary><secondary>creating
for international UIL</secondary></indexterm> Resource Files<indexterm><primary>programming for international use</primary><secondary>UIL</secondary></indexterm><indexterm>
<primary>programming for international UIL</primary></indexterm><indexterm>
<primary>resource files, creating</primary></indexterm><indexterm><primary>resource files</primary><secondary>creating for international UIL</secondary>
</indexterm></title>
<para>If necessary, set the input method-related resources in the resource
file as shown in the following example:</para>
<programlisting>*preeditType: OverTheSpot, OnTheSpot, OffTheSpot, Root, or None
</programlisting>
</sect3>
<sect3 id="IPG.motif.div.27">
<title>Setting the<indexterm><primary>setting the environment</primary><secondary>for international UIL</secondary></indexterm> Environment<indexterm><primary>programming for international use</primary><secondary>UIL</secondary></indexterm><indexterm>
<primary>programming for international UIL</primary></indexterm><indexterm>
<primary>setting the environment</primary><secondary>for international UIL</secondary></indexterm></title>
<para>For a locale-sensitive application, place the UID file in the appropriate
directory. Set the <systemitem>UIDPATH</systemitem> or <systemitem>XAPPLRESDIR</systemitem> environment variable to the appropriate value.</para>
<para>For example, to run the <computeroutput>uil_sample</computeroutput>
program with an English environment (<systemitem>LANG</systemitem> environment
variable is <computeroutput>en_US</computeroutput>), place a <filename>uil_sample.uid</filename> file with Latin characters in the <filename>$HOME/en_US</filename>
directory, or place <filename>uil_sample.uid</filename> in another directory and
set the <systemitem>UIDPATH</systemitem> environment variable to the full
path name of the <filename>uil_sample.uid</filename> file.</para>
<para>To run the <computeroutput>uil_sample</computeroutput> program with
a Japanese environment (<systemitem>LANG</systemitem> environment variable
is <computeroutput>ja_JP</computeroutput>), create a <filename>uil_sample.uid</filename>
file with Japanese (multibyte) characters in the <filename>$HOME/ja_JP</filename>
directory, or place <filename>uil_sample.uid</filename> in a separate directory
and set the <systemitem>UIDPATH</systemitem> environment variable to the
full path name of the <filename>uil_sample.uid</filename> file. The following
list describes the substitution variables that can appear in the search paths:</para>
<informaltable>
<tgroup cols="2" colsep="0" rowsep="0">
<colspec align="left" colwidth="100*">
<colspec align="left" colwidth="356*">
<tbody>
<row>
<entry><para><emphasis>%U</emphasis></para></entry>
<entry><para>Specifies the UID file string.</para></entry></row>
<row>
<entry><para><emphasis>%N</emphasis></para></entry>
<entry><para>Specifies the class name of the application.</para></entry></row>
<row>
<entry><para><emphasis>%L</emphasis></para></entry>
<entry><para>Specifies the value of the <computeroutput>xnlLanguage</computeroutput>
resource or the <computeroutput>LC_CTYPE</computeroutput> category.</para></entry></row>
<row>
<entry><para><emphasis>%l</emphasis></para></entry>
<entry><para>Specifies the language component of the <computeroutput>xnlLanguage</computeroutput> resource or the <computeroutput>LC_CTYPE</computeroutput>
category.</para></entry></row></tbody></tgroup></informaltable>
<para>If the <systemitem>XAPPLRESDIR</systemitem> environment variable is
set, the <computeroutput>MrmOpenHierarchy()</computeroutput> function searches for
the UID file in the following order:<indexterm><primary>setting the environment</primary><secondary>searching the UID file</secondary></indexterm><indexterm>
<primary>UID file search</primary></indexterm><indexterm><primary>MrmOpenHierarchy
function, searching UID file</primary></indexterm><indexterm><primary>UID
file search</primary></indexterm></para>
<orderedlist><listitem><para>UID file path name</para>
</listitem><listitem><para><computeroutput>$UIDPATH</computeroutput></para>
</listitem><listitem><para>%U</para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/</computeroutput><emphasis>%L</emphasis><computeroutput>/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/</computeroutput><emphasis>%l</emphasis><computeroutput>/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/</computeroutput><emphasis>%L</emphasis><computeroutput>/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/</computeroutput><emphasis>%l</emphasis><computeroutput>/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$XAPPLRESDIR/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$HOME/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>$HOME/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/</computeroutput><emphasis>%L</emphasis><computeroutput>/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/</computeroutput><emphasis>%l</emphasis><computeroutput>/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/uid/</computeroutput><emphasis>%N</emphasis><computeroutput>/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/</computeroutput><emphasis>%L</emphasis><computeroutput>/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/</computeroutput><emphasis>%l</emphasis><computeroutput>/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/lib/X11/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem><listitem><para><computeroutput>/usr/include/X11/uid/</computeroutput><emphasis>%U</emphasis></para>
</listitem></orderedlist>
<para>If the <systemitem>XAPPLRESDIR</systemitem> environment variable is
not set, the <computeroutput>MrmOpenHierarchy()</computeroutput> function
uses <computeroutput>$HOME</computeroutput> in place of <computeroutput>$XAPPLRESDIR</computeroutput>.</para>
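<para>The substitutions used in the search order above can be illustrated with
a minimal C sketch. It is not the Mrm implementation. The function names <computeroutput>expand_uid_template()</computeroutput>
and <computeroutput>uid_resource_root()</computeroutput> are hypothetical, and the sketch assumes that %L stands
for the full locale name, %l for its language part, %N for the application class
name, and %U for the UID file name.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pick $XAPPLRESDIR, or fall back to $HOME when it is unset. */
const char *uid_resource_root(const char *xapplresdir, const char *home)
{
    return (xapplresdir != NULL) ? xapplresdir : home;
}

/*
 * Hypothetical helper: expand one search-path template into out.
 * Assumed substitutions: %L locale, %l language part of the locale,
 * %N application class name, %U UID file name.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int expand_uid_template(const char *tmpl, const char *locale,
                        const char *app_class, const char *uid_name,
                        char *out, size_t outsz)
{
    char lang[64];
    size_t used = 0;
    size_t n;

    assert(tmpl && locale && app_class && uid_name && out && outsz > 0);

    /* The language part is everything before '_', '.', or '@'. */
    n = strcspn(locale, "_.@");
    if (n >= sizeof lang)
        n = sizeof lang - 1;
    memcpy(lang, locale, n);
    lang[n] = '\0';

    while (*tmpl != '\0') {
        const char *piece = NULL;
        char one[2];
        size_t len;

        if (tmpl[0] == '%') {
            switch (tmpl[1]) {
            case 'L': piece = locale;    break;
            case 'l': piece = lang;      break;
            case 'N': piece = app_class; break;
            case 'U': piece = uid_name;  break;
            default:  break;             /* unknown: copy '%' literally */
            }
        }
        if (piece != NULL) {
            tmpl += 2;                   /* consume the %x sequence */
        } else {
            one[0] = *tmpl++;            /* ordinary character */
            one[1] = '\0';
            piece = one;
        }
        len = strlen(piece);
        if (used + len >= outsz)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```
]]></programlisting>
<para>For example, with a locale of ja_JP, a class name of Test (chosen here only
for illustration), and a UID file named uil_sample.uid, the template
/usr/lib/X11/%L/uid/%N/%U expands to /usr/lib/X11/ja_JP/uid/Test/uil_sample.uid.</para>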
</sect3>
</sect2>
<sect2 id="IPG.motif.div.28">
<title>default_charset Character Set in UIL<indexterm><primary>User Interface
Language (UIL), see UIL &lt;$nopage></primary></indexterm><indexterm><primary>default_charset string literal</primary></indexterm><indexterm><primary>string literals</primary><secondary>default_charset in UIL</secondary></indexterm><indexterm><primary>default_charset
string literal</primary></indexterm></title>
<para>With the default_charset character set, a string literal can contain
any character that is valid in the current locale. For example, if the <systemitem>LANG</systemitem>
environment variable is <computeroutput>el_GR</computeroutput>, a string
literal with default_charset can contain any Greek character. If the <systemitem>LANG</systemitem> environment variable is <computeroutput>ja_JP</computeroutput>,
a default_charset string literal can contain any Japanese character encoded
in Japanese EUC.</para>
<para>If no character set is specified for a string literal, the character set of
the string literal is cur_charset. In the system environment,
the value of cur_charset is always default_charset.</para>
<sect3 id="IPG.motif.div.29">
<title>Example<indexterm><primary>UIL (User Interface Language)</primary>
<secondary>sample Japanese and English program</secondary></indexterm>: uil_sample</title>
<para><!--Original XRef content: 'Figure&numsp;4&hyphen;4'--><xref role="CodeOrFigureOrTable"
linkend="IPG.motif.mkr.14"> shows a UIL sample program in English and Japanese
environments.</para>
<figure>
<title id="IPG.motif.mkr.14">Sample UIL program in English and Japanese environments</title>
<graphic id="IPG.motif.grph.4" entityref="IPG.motif.fig.4"></graphic>
</figure>
<para>In the following sample program, <emphasis>LLL</emphasis> indicates
locale-specific text, which can be Japanese, Korean, Traditional Chinese, Greek, French,
or another language.</para>
<programlisting>uil_sample.uil
!
! sample uil file - uil_sample.uil
!
! C source file - uil_sample.c
!
! Resource file - uil-sample.resource
!
module Test
version = 'v1.0'
names = case_sensitive
objects = {
XmPushButton = gadget;
}
!************************************
! declare callback procedure
!************************************
procedure
exit_CB;
!***************************************************************
! declare BulletinBoard as parent of PushButton and Text
!***************************************************************
object
bb: XmBulletinBoard {
arguments{
XmNwidth = 500;
XmNheight = 200;
};
controls{
XmPushButton pb1;
XmText text1;
};
};
!****************************
! declare PushButton
!****************************
object
pb1: XmPushButton {
arguments{
XmNlabelString = #Normal &ldquo;<emphasis>LLL</emphasis>exit button <emphasis>LLL</emphasis>&rdquo;;
XmNx = 50;
XmNy = 50;
};
callbacks{
XmNactivateCallback = procedure exit_CB;
};
};
!*********************
! declare Text
!*********************
text1: XmText {
arguments{
XmNx = 50;
XmNy = 150;
};
};
end module;
/*
* C source file - uil_sample.c
*
*/
#include &lt;Mrm/MrmAppl.h>
#include &lt;locale.h>
void exit_CB();
static MrmHierarchy hierarchy;
static MrmType *class;
/******************************************/
/* specify the UID hierarchy list */
/*****************************************/
static char *aray_file[]=
{&ldquo;uil_sample.uid&rdquo;
};
static int num_file = (sizeof aray_file / sizeof
aray_file[0]);
/******************************************************/
/* define the mapping between UIL procedure names */
/* and their addresses */
/******************************************************/
static MrmRegisterArg reglist[]={
{&ldquo;exit_CB&rdquo;,(caddr_t) exit_CB}
};</programlisting>
</sect3>
</sect2>
<sect2 id="IPG.motif.div.30">
<title>Compound Strings in UIL<indexterm><primary>compound strings</primary>
<secondary>in UIL</secondary></indexterm></title>
<para>Three mechanisms exist for specifying strings in UIL files:</para>
<itemizedlist remap="Bullet1"><listitem><para>As string literals, which may
be stored in UID files as either null-terminated strings or compound strings
</para>
</listitem><listitem><para>As compound strings</para>
</listitem><listitem><para>As wide character strings</para>
</listitem></itemizedlist>
<para>Both string literals and compound strings consist of text, a character
set, and a writing direction. For string literals and for compound strings
with no explicit direction, UIL infers the writing direction from the character
set. The UIL concatenation operator (&amp;) concatenates both string literals
and compound strings.</para>
<para>Regardless of whether UIL stores string literals in UID files as null-terminated
strings or as compound<indexterm><primary>string literals</primary><secondary>in UID files</secondary></indexterm> strings, it stores information about
each string's character set and writing direction along with the text. In
general, UIL stores string literals or string expressions as compound strings
in UID files under the following conditions:</para>
<itemizedlist remap="Bullet1"><listitem><para>When a string expression consists
of two or more literals with different character sets or writing directions
</para>
</listitem><listitem><para>When the literal or expression is used as a value
that has a compound string data type (such as the value of a resource whose
data type is compound string)</para>
</listitem></itemizedlist>
<para>UIL recognizes a number of keywords specifying character sets. With each
character set it recognizes, UIL associates
parsing<indexterm><primary>character set keywords</primary></indexterm><indexterm>
<primary>character sets, defining with UIL CHARACTER_SET function</primary>
</indexterm><indexterm><primary>string literals</primary><secondary>syntax</secondary></indexterm> rules, including the parsing direction and whether characters
have 8 or 16 bits. It is also possible
to define a character set using the UIL <command>CHARACTER_SET</command> function.
</para>
<para>The syntax of a string literal is one of the following:</para>
<itemizedlist remap="Bullet1"><listitem><para>'[<symbol role="Variable">character_string</symbol>]'</para>
</listitem><listitem><para>[#<symbol role="Variable">char_set</symbol>]&ldquo;[<symbol role="Variable">character_string</symbol>]&rdquo;</para>
</listitem></itemizedlist>
<para>For each syntax, the character set of the string is determined as follows:
</para>
<itemizedlist remap="Bullet1"><listitem><para>For a string declared as '<symbol role="Variable">character_string</symbol>', the character set is the code set component of
the <systemitem>LANG</systemitem> environment variable, if it is set in the
UIL compilation environment. If the <systemitem>LANG</systemitem> environment variable
is not set or has no code set component, the character set is the value of <computeroutput>XmFALLBACK_CHARSET</computeroutput>. By default, the value of <computeroutput>XmFALLBACK_CHARSET</computeroutput> is ISO8859-1, but vendors may supply
different values.</para>
</listitem><listitem><para>For a string declared as <literal>#</literal><symbol role="Variable">char_set</symbol><literal>&ldquo;</literal><symbol role="Variable">string</symbol><literal>&rdquo;</literal>, the character set is <symbol role="Variable">char_set</symbol>.</para>
</listitem><listitem><para>For a string declared as <literal>&ldquo;</literal><symbol role="Variable">character_string</symbol><literal>&rdquo;</literal>, the character set depends on whether
the module has a <filename>CHARACTER_SET</filename> clause and whether the
UIL compiler's <filename>use_setlocale_flag</filename> is set.</para>
<itemizedlist remap="Bullet2"><listitem><para>If the module has a <filename>CHARACTER_SET</filename> clause, the character set is the one specified in
that clause.</para>
</listitem><listitem><para>If the module has no <filename>CHARACTER_SET</filename>
clause but the <command>uil</command> command was started with the <computeroutput>-s</computeroutput> option, or the <filename>Uil()</filename> function
was called with <filename>use_setlocale_flag</filename> set, UIL calls the <filename>setlocale()</filename> function and parses the string
in the current locale. The character set of the resulting string is <computeroutput>XmFONTLIST_DEFAULT_TAG</computeroutput>.</para>
</listitem><listitem><para>If the module has no <filename>CHARACTER_SET</filename>
clause and the <computeroutput>uil</computeroutput> command was started without
the <computeroutput>-s</computeroutput> option, or the <filename>Uil()</filename> function was called without <filename>use_setlocale_flag</filename> set, the character set is the code set component of the <systemitem>LANG</systemitem> environment variable, if it is set in the UIL compilation
environment. If <systemitem>LANG</systemitem> is not set or has no code
set component, the character set is the value of <computeroutput>XmFALLBACK_CHARSET</computeroutput>.</para>
</listitem></itemizedlist>
</listitem></itemizedlist>
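<para>The rules above can be summarized as a decision procedure. The following
C sketch is an illustration only and is not taken from the UIL compiler. The names <computeroutput>literal_charset()</computeroutput>, <computeroutput>lang_codeset()</computeroutput>,
and <computeroutput>quote_kind</computeroutput> are hypothetical, and the sketch returns the symbolic name
XmFONTLIST_DEFAULT_TAG as a plain string rather than the value of the Motif macro.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum quote_kind { SINGLE_QUOTED, DOUBLE_QUOTED };

/*
 * Hypothetical helper: copy the code set component of a LANG value
 * (the text between '.' and an optional '@') into buf.
 * Returns 1 if a code set was found and fits in buf, otherwise 0.
 */
static int lang_codeset(const char *lang, char *buf, size_t bufsz)
{
    const char *dot;
    size_t n;

    if (lang == NULL || (dot = strchr(lang, '.')) == NULL)
        return 0;
    n = strcspn(dot + 1, "@");
    if (n == 0 || n >= bufsz)
        return 0;
    memcpy(buf, dot + 1, n);
    buf[n] = '\0';
    return 1;
}

/*
 * Hypothetical helper: name the character set of a string literal.
 *   explicit_charset  the #char_set prefix, or NULL if absent
 *   module_charset    the module's CHARACTER_SET clause, or NULL
 *   use_setlocale     nonzero for uil -s or use_setlocale_flag
 *   lang              the LANG value, or NULL if unset
 *   fallback          the value of XmFALLBACK_CHARSET
 */
const char *literal_charset(enum quote_kind kind,
                            const char *explicit_charset,
                            const char *module_charset,
                            int use_setlocale,
                            const char *lang,
                            const char *fallback,
                            char *buf, size_t bufsz)
{
    assert(fallback != NULL && buf != NULL && bufsz > 0);

    if (kind == DOUBLE_QUOTED) {
        if (explicit_charset != NULL)
            return explicit_charset;          /* #char_set "string" */
        if (module_charset != NULL)
            return module_charset;            /* CHARACTER_SET clause */
        if (use_setlocale)
            return "XmFONTLIST_DEFAULT_TAG";  /* parsed in current locale */
    }
    /* Single-quoted strings and the remaining double-quoted case. */
    return lang_codeset(lang, buf, bufsz) ? buf : fallback;
}
```
]]></programlisting>
<para>With <systemitem>LANG</systemitem> set to a value that carries a code set
component, a single-quoted literal takes that code set. With <systemitem>LANG</systemitem> set to <computeroutput>el_GR</computeroutput>, which has no code set component,
the sketch returns the fallback value.</para>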
<para>UIL always stores a string specified using the <computeroutput>COMPOUND_STRING</computeroutput> function as a compound string. This function takes as arguments
a string expression and optional specifications of a character set, direction,
and whether to append a separator to the string. If no character set or direction
is specified, UIL derives it from the string expression, as described in
the preceding section.</para>
<note>
<para>Certain predefined escape sequences, beginning with a \ (backslash),
may appear in string literals, with the following exceptions:</para>
<itemizedlist remap="Bullet1"><listitem><para>A string in single quotation marks can span multiple lines, with each
newline character escaped by a backslash. A string in double quotation
marks cannot span multiple lines.</para>
</listitem><listitem><para>Escape sequences are processed
literally inside a string that is parsed in the current locale (a localized
string).</para>
</listitem></itemizedlist>
</note>
</sect2>
</sect1>
</chapter>
<!--fickle 1.14 mif-to-docbook 1.7 01/02/96 04:19:51-->

Binary file not shown.

View File

@@ -0,0 +1,139 @@
<!-- $XConsortium: preface.sgm /main/10 1996/08/25 15:12:06 rws $ -->
<!-- (c) Copyright 1995 Digital Equipment Corporation. -->
<!-- (c) Copyright 1995 Hewlett-Packard Company. -->
<!-- (c) Copyright 1995 International Business Machines Corp. -->
<!-- (c) Copyright 1995 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1995 Novell, Inc. -->
<!-- (c) Copyright 1995 FUJITSU LIMITED. -->
<!-- (c) Copyright 1995 Hitachi. -->
<preface id="IPG.Pref.div.1">
<title>Preface</title>
<para>The <emphasis>Common Desktop Environment: Internationalization Programmer's
Guide</emphasis> provides information for internationalizing the desktop,
enabling applications to support various languages and cultural conventions
in a consistent user interface.</para>
<para>Specifically, this guide:</para>
<itemizedlist remap="Bullet1"><listitem><para>Provides guidelines and hints
for developers on how to write applications for worldwide distribution.</para>
</listitem><listitem><para>Provides an overall view of internationalization
topics that span different layers within the desktop.</para>
</listitem><listitem><para>Provides pointers to reference and more detailed
documentation. In some cases, standard documentation is referenced.</para>
</listitem></itemizedlist>
<para>This guide is not intended to duplicate the existing reference or conceptual
documentation but rather to provide guidelines and conventions on specific
internationalization topics. This document focuses on internationalization
topics and not on any specific component or layer in an open software environment.
</para>
<sect1 id="IPG.Pref.div.2">
<title>Who Should Use This Book</title>
<para>This book provides various levels of information for application
programmers, developers, and those in related fields.</para>
</sect1>
<sect1 id="ipg.pref.div.3">
<title>How This Book Is Organized</title>
<para>Explanations of the contents of this book follow:</para>
<para><emphasis role="Lead-in">Chapter 1, &ldquo;Introduction to Internationalization,&rdquo;</emphasis> provides an overview of internationalization and localization within
the desktop, including locales, fonts, drawing, input, interclient communication,
and extraction of user-visible text. Information on the significance of internationalization
standards is also provided.</para>
<para><emphasis role="Lead-in">Chapter 2, &ldquo;Internationalization and
the Common Desktop Environment,&rdquo;</emphasis> covers the set of topics
that developers commonly need to consider when internationalizing their applications,
including locale management, localized resources, font management, localized
text tasks, interclient communication for localized text, and internationalized
functions.</para>
<para><emphasis role="Lead-in">Chapter 3, &ldquo;Internationalization and
Distributed Networks,&rdquo;</emphasis> discusses topics related to handling
encoded characters in distributed networks. Basic principles and examples
for interclient interoperability are provided to guide developers in internationalized
distributed environments.</para>
<para><emphasis role="Lead-in">Chapter 4, &ldquo;Xt, Xlib, and Motif Dependencies,&rdquo;</emphasis> covers internationalized applications, locale management,
localized text, international User Interface Language (UIL), and localized
applications.</para>
<para><emphasis role="Lead-in">Appendix A, &ldquo;Message Guidelines,&rdquo;</emphasis> provides a set of guidelines for writing messages.</para>
</sect1>
<sect1 id="IPG.Pref.div.4">
<title>Related Publications</title>
<para>See the following documentation for additional information on topics
presented in this book:</para>
<itemizedlist remap="Bullet1"><listitem><para>ISO C: ISO/IEC 9899:1990, <emphasis>Programming Languages - C</emphasis> (technically identical to ANSI X3.159-1989,
Programming Language C).</para>
</listitem><listitem><para>ISO/IEC 9945-1: 1990, (IEEE Standard 1003.1) <emphasis>Information Technology - Portable Operating System Interface (POSIX) - Part
1: System Application Program Interface (API) [C Language]</emphasis>.</para>
</listitem><listitem><para>ISO/IEC DIS 9945-2: 1992, (IEEE Standard 1003.2-Draft) <emphasis>Information Technology - Portable Operating System Interface (POSIX) - Part
2: Shell and Utilities</emphasis>.</para>
</listitem><listitem><para>Motif: <emphasis>Motif Programmer's
Reference</emphasis>, <emphasis>Revision 1.2</emphasis>, Open Software Foundation,
Prentice Hall, 1992, ISBN: 0-13-643115-1.</para>
</listitem><listitem><para>Scheifler, R. W., <emphasis>X Window System, The
Complete Reference to Xlib, X Protocol, ICCCM, XLFD</emphasis> - X Version
11, Release 5, Digital Press, 1992, ISBN: 1-55558-088-2.</para>
</listitem><listitem><para>X/Open: <emphasis>X/Open CAE Specification System
Interface Definition</emphasis>, Issue 4, X/Open Company Ltd., 1992, ISBN:
1-872630-46-4.</para>
</listitem><listitem><para>X/Open: <emphasis>X/Open CAE Specification Commands
and Utilities</emphasis>, Issue 4, X/Open Company Ltd., 1992, ISBN: 1-872630-48-0.
</para>
</listitem><listitem><para>X/Open: <emphasis>X/Open CAE Specification System
Interface and Headers</emphasis>, Issue 4, X/Open Company Ltd., 1992, ISBN:
1-872630-47-2.</para>
</listitem><listitem><para>X/Open: <emphasis>X/Open Internationalization Guide</emphasis>, X/Open Company Ltd., 1992, ISBN: 1-872630-20-0.</para>
</listitem><listitem><para>ISO/IEC 10646-1:1993 (E): <emphasis>Information
Technology - Universal Multi-Octet Coded Character Set (UCS). Part 1: Architecture
and Basic Multilingual Plane</emphasis>.</para>
</listitem></itemizedlist>
</sect1>
<sect1 id="IPG.Pref.div.5">
<title>What DocBook SGML Markup Means</title>
<para>This book is written in the Standard Generalized Markup Language (SGML)
using the DocBook Document Type Definition (DTD). The following table
describes the DocBook markup used for various semantic elements.
</para>
<table id="ipg.pref.tbl.1" frame="topbot">
<title id="ipg.pref.mkr.1">DocBook SGML Markup</title>
<tgroup cols="3" colsep="0" rowsep="0">
<colspec colname="1" colwidth="1.2 in">
<colspec colname="2" colwidth="1.89 in">
<colspec colname="3" colwidth="2.23 in">
<thead>
<row>
<entry><para><literal>Markup Appearance</literal></para></entry>
<entry><para><literal>Semantic Element(s)</literal></para></entry>
<entry><para><literal>Example</literal></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><command>AaBbCc123</command></para></entry>
<entry><para>The names of commands.</para></entry>
<entry><para>Use the <command>ls</command> command to list files.</para></entry>
</row>
<row>
<entry><para><literal>AaBbCc123</literal></para></entry>
<entry><para>The names of command options.</para></entry>
<entry><para>Use <command>ls</command> <literal>&minus;a</literal> to list all files.</para></entry>
</row>
<row>
<entry><para><symbol role="Variable">AaBbCc123</symbol></para></entry>
<entry><para>Command-line placeholder: replace with a real name or value.</para></entry>
<entry><para>To delete a file, type <command>rm</command>
<symbol role="Variable">filename</symbol>.</para></entry>
</row>
<row>
<entry><para><filename>AaBbCc123</filename></para></entry>
<entry><para>The names of files and directories.</para></entry>
<entry><para>Edit your <filename>.login</filename> file.</para></entry>
</row>
<row>
<entry><para><emphasis>AaBbCc123</emphasis></para></entry>
<entry><para>Book titles, new words or terms, or words to be emphasized.</para></entry>
<entry><para>Read Chapter 6 in <emphasis>User's Guide</emphasis>. These are called <emphasis>class</emphasis> options. You <emphasis>must</emphasis> be root to do this.</para></entry>
</row>
</tbody>
</tgroup>
</table>
</sect1>
</preface>
<!--fickle 1.14 mif-to-docbook 1.7 01/02/96 04:19:51-->