dtsrload user cmd
dtsrload - Load document objects in a database
dtsrload
−ddbname
−c
−tetxstr
−h0
−hhashsz
−ehufname
−pdotcount
file
DESCRIPTION
dtsrload loads document header information and, in
AusText type databases, documents themselves into a DtSearch database.
The input is a file of one or more documents in a simple canonical
format (fzk file). An fzk file can be generated by
dtsrhan, manually with a text editor, or by a special
application program created for the purpose. Typically the same fzk file
is used for dtsrload and
dtsrindex, but it is not required and there are
situations where it may not be desirable. (See
&cdeman.dtsrfzkfiles; for information about DtSearch fzk files).
dtsrload also maintains the current total document
count in the database's configuration and status record.
If a document's unique key in the fzk file does not preexist in the
database, dtsrload considers the document to be new
and adds it as a new document. If the document's key already
exists in the database, dtsrload completely replaces its
record with the one in the fzk file. When duplicate record ids are
encountered in a single fzk file, only the first occurrence of the
document is loaded into the database; later occurrences are discarded.
Record ids already seen during execution are tracked with a hash table.
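The add, replace, and duplicate rules above can be modelled in a few lines.
The following Python sketch only illustrates the described behavior; it is
not dtsrload's implementation. The function name load_batch, the dict
standing in for the database, and the Python set standing in for the hash
table are all assumptions made for illustration.

```python
def load_batch(database, batch):
    """Apply (key, record) pairs to a dict standing in for the database.

    Returns the keys that were discarded as duplicates within the batch.
    """
    seen = set()       # hypothetical stand-in for the duplicate-id hash table
    discarded = []
    for key, record in batch:
        if key in seen:
            # A later occurrence of an id within the same batch is discarded.
            discarded.append(key)
            continue
        seen.add(key)
        # New keys are added; preexisting keys are completely replaced.
        database[key] = record
    return discarded


db = {"doc1": "old text"}
dups = load_batch(db, [("doc1", "new text"), ("doc2", "first"), ("doc2", "second")])
```

After this call, doc1 has been replaced, doc2 holds the first occurrence
from the batch, and the second doc2 entry is reported as discarded.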
dtsrload also performs a data compression function for
documents that are actually stored in a database repository (that is,
AusText type databases). In order to do this an encode
compression huf file must be available.
(See &cdeman.huffcode; for information about DtSearch document compression.)
dtsrload does not index the words used to access the
database. This is done by dtsrindex. To prevent
database link corruption, execute dtsrindex
immediately after dtsrload.
Execute dtsrload only after all users of a preexisting database have
exited their search programs; otherwise the database may be
corrupted. For a single fzk file,
dtsrload must be executed immediately before
dtsrindex so that dtsrindex can
map the words it indexes to the correct internal database addresses.
Only after both programs successfully complete execution may users again
be allowed to perform online searches of the database.
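The required ordering can be scripted. The Python sketch below is one
possible wrapper, not part of DtSearch: the function name load_then_index
and its run parameter are hypothetical, and it assumes that dtsrindex
accepts the same −d option and file operand as dtsrload (see
&cdeman.dtsrindex; to confirm). Continuing to index after a status of 1 is
a choice made here, not a documented requirement.

```python
import subprocess


def load_then_index(dbname, fzk_file, run=subprocess.call):
    """Run dtsrload, then dtsrindex, on the same fzk file.

    Returns (load_status, index_status). index_status is None when
    dtsrindex was skipped because dtsrload reported a fatal error.
    """
    load_status = run(["dtsrload", "-d" + dbname, fzk_file])
    if load_status > 1:
        # Fatal error from dtsrload: do not index a failed load.
        return load_status, None
    # ASSUMPTION: dtsrindex takes the same -d option and file operand.
    index_status = run(["dtsrindex", "-d" + dbname, fzk_file])
    return load_status, index_status
```

Passing a replacement for run makes it possible to log or dry-run the two
commands without touching a database.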
OPTIONS
The following options are available:
If an option takes a value, the value must be directly appended to
the option name without white space.
−ddbname
Specifies the 1 to 8 ASCII character name of the database to be
updated.
If an optional directory path is not prepended to the database
name, dtsrload will attempt to open the database from
the current working directory. File name extensions for database
files are automatically appended.
−c
Instructs dtsrload to initialize the database total
document count by counting existing records before loading the current
batch. This option is usually not required.
−tetxstr
Specifies the end of document text delimiter string. The default
document separator in an fzk file is an ASCII form feed character
followed by an ASCII line feed ('\f\n'). For certain multibyte languages
it may be more convenient to specify a non-ASCII string as the document
delimiter.
−h0
Instructs dtsrload to not check for duplicate
record ids. This option should not be specified unless it
is certain that there are no duplicate ids in the fzk file.
−hhashsz
Sets the duplicate record id hash table size to
hashsz. The default is 3000.
dtsrload will execute more efficiently if the
specified table size is larger than the number of documents in the fzk
file.
−ehufname
Sets the compression encode file name to
hufname. The default is
ophuf.huf. The file name can include a path prefix.
This option is ignored unless the database type is AusText.
−pdotcount
Instructs dtsrload to print a progress character to
stdout for every dotcount documents
processed. The default is 20.
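Because option values are appended directly to the option name, it is easy
to build an incorrect command line by hand. The Python sketch below
assembles an argument list from the options described above. The function
name build_dtsrload_argv and its parameter names are hypothetical; only the
option letters come from this page.

```python
import os.path


def build_dtsrload_argv(dbname, fzk_file, recount=False, etx=None,
                        hash_size=None, huf_file=None, dot_count=None):
    """Return an argument list for dtsrload with values appended to options."""
    name = os.path.basename(dbname)   # a directory path may be prepended
    if not (1 <= len(name) <= 8 and name.isascii()):
        raise ValueError("database name must be 1 to 8 ASCII characters")
    argv = ["dtsrload", "-d" + dbname]
    if recount:
        argv.append("-c")
    if etx is not None:
        argv.append("-t" + etx)
    if hash_size is not None:
        # hash_size=0 yields -h0, which disables duplicate checking.
        argv.append("-h" + str(hash_size))
    if huf_file is not None:
        argv.append("-e" + huf_file)
    if dot_count is not None:
        argv.append("-p" + str(dot_count))
    argv.append(fzk_file)
    return argv
```

For example, build_dtsrload_argv("mydb", "batch1", hash_size=5000) produces
the list for the command line dtsrload -dmydb -h5000 batch1.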
OPERANDS
The required input file name (file)
identifies the file to be processed by dtsrload. It
can optionally include a path prefix, either from root or relative to
the current working directory. If a file name extension is not
specified, dtsrload assumes a default extension of
.fzk.
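The default-extension rule can be sketched as follows. This is an
illustration only: the function name resolve_fzk_name is hypothetical, and
using os.path.splitext to decide whether an extension is present is an
assumption about how dtsrload makes that decision.

```python
import os.path


def resolve_fzk_name(file_name):
    """Append .fzk when the last path component has no extension."""
    _, ext = os.path.splitext(file_name)
    return file_name if ext else file_name + ".fzk"
```

Under this assumption, batch1 resolves to batch1.fzk, while
/u/dtsearch/jpndocs.1 is left unchanged because .1 counts as an extension.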
ENVIRONMENT VARIABLES
None.
RESOURCES
None.
ACTIONS/MESSAGES
None.
RETURN VALUES
The return values are as follows:
0
dtsrload completed successfully.
1
dtsrload successfully
recovered from an error. This occurs when one or more
documents were discarded because of a partially invalid
fzk file format, duplicate record ids, or empty record text.
>1
dtsrload encountered a fatal error.
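A calling script can map these return values to actions. The Python sketch
below is a minimal example; the function name describe_status and the
message strings are hypothetical. The negative-value branch reflects how
Python's subprocess module reports a child terminated by a signal on POSIX
systems, which is outside what this page documents.

```python
def describe_status(status):
    """Translate a dtsrload exit status into a short description."""
    if status == 0:
        return "success"
    if status == 1:
        return "recovered: one or more documents were discarded"
    if status > 1:
        return "fatal error"
    # Negative values: subprocess reports termination by a signal.
    return "terminated abnormally"
```

A status of 1 is worth logging even though the load completed, since it
means the database does not contain every document in the fzk file.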
FILES
dtsrload reads the specified fzk file and opens
all the database and related language files for the specified
database name.
For AusText type databases, it also reads the compression encode file
(ophuf.huf unless −e specifies another name).
dtsrload updates the following database files:
dbname.d00
dbname.d01
dbname.k00
dbname.k01
EXAMPLES
Load database mydb with the documents specified in
the fzk file named batch1.fzk in the current
working directory.
dtsrload -dmydb batch1
Load database mydb with the documents specified in
the fzk file /u/dtsearch/jpndocs.1. Three ASCII
plus signs at the bottom of each document signal the end of document
text and the beginning of the next fzk file record.
dtsrload -dmydb -t+++ /u/dtsearch/jpndocs.1
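To check a delimiter choice such as +++ before loading, the fzk data can be
split the same way in a script. The Python sketch below is a rough
approximation, not dtsrload's parser: the names DEFAULT_ETX and
split_documents are hypothetical, it splits on the raw delimiter bytes
wherever they occur, and it ignores the record header structure described
in &cdeman.dtsrfzkfiles;.

```python
DEFAULT_ETX = b"\f\n"   # ASCII form feed followed by ASCII line feed


def split_documents(data, etx=DEFAULT_ETX):
    """Split raw fzk bytes into per-document chunks at each delimiter."""
    chunks = data.split(etx)
    # A delimiter at the very end leaves an empty trailing chunk; drop it.
    if chunks and chunks[-1] == b"":
        chunks.pop()
    return chunks
```

The number of chunks returned gives a rough document count, which can help
in choosing a −h hash table size larger than the number of documents.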
SEE ALSO
&cdeman.dtsrhan;,
&cdeman.dtsrindex;,
&cdeman.huffcode;,
&cdeman.dtsrfzkfiles;,
&cdeman.DtSearch;