DtSrAPI
library call
DtSrAPI
Describes overview, constants, and structures
for DtSearch online API
DESCRIPTION
The DtSearch API provides programmatic access to the DtSearch search and
retrieval engine. The API functions are located in the library
libDtSr and are linked directly into user-written
search programs.
Search and retrieval of DtSearch databases is available through three
essential API functions:
DtSearchInit
Opens databases and other files, and generally initializes the search
engine for subsequent requests.
DtSearchQuery
Is passed a user query and some
search options, performs the requested search, and returns a linked list of
structures, called a results list, representing the objects satisfying the
search. The results list contains abstracted information about the documents
suitable for display to an end user, as well as private information used for
subsequent retrievals.
DtSearchRetrieve
Retrieves an object given data from a results list node. When a results
list contains all the information an application needs, retrieval by
DtSearch may not be required. For example, when the documents themselves
are not stored in DtSearch databases and the document references are
available from the results list, the calling program may access the
objects directly.
DtSearch MessageList
All functions can potentially return multiple messages on a global linked
list of messages called the MessageList. Most unsuccessful return codes append
at least one message to the MessageList, but even successful returns may append
messages, and multiple messages are always possible.
Messages are standard C text strings terminated by a zero byte, and
were designed to be displayed directly to users.
Several API utility functions are available for manipulating the MessageList.
Fatal API Errors
Certain fatal errors will require an immediate abort from the engine.
By default, fatal error messages are written to
stderr, but they can instead be written to a text file specified
in DtSearchInit.
All API aborts are implemented through a call to
DtSearchExit. DtSearchExit()
ensures cleanup of a number of system resources before the final call to
exit. Developers can add an additional user exit
to DtSearchExit to specify additional emergency
clean up before process exit.
CONSTANTS
Function Return Code Constants
Most API functions return one of a set of standard integer return codes.
The return code DtSrOK means complete
success; other return codes indicate various levels of negative results or
failure.
DtSrOK
Normal, affirmative, successful
response.
DtSrNOTAVAIL
Generic negative response. For
example, no hits on search, no such record, etc.
DtSrFAIL
Miscellaneous unsuccessful engine
returns.
DtSrREINIT
Engine reinitialized, request canceled.
Often returned when an invalid database name is detected. The caller should
clean up and call DtSearchReinit().
DtSrERROR
Fatal caller programming error.
DtSrABORT
Fatal engine failure, caller must
abort.
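As an illustration of how a caller might branch on these codes, here is a minimal sketch. The numeric values are not given in this page, so the DEMO_ constants below are placeholders, and demo_next_step is a hypothetical helper, not part of the API. A real program would use the DtSr constants from the DtSearch header.

```c
#include <assert.h>

/* ASSUMPTION: placeholder values for illustration only. The real
 * DtSrOK ... DtSrABORT constants come from the DtSearch header. */
enum {
    DEMO_DtSrOK, DEMO_DtSrNOTAVAIL, DEMO_DtSrFAIL,
    DEMO_DtSrREINIT, DEMO_DtSrERROR, DEMO_DtSrABORT
};

/* Hypothetical set of follow-up actions an application might take. */
enum demo_step {
    DEMO_USE_RESULTS,    /* request succeeded */
    DEMO_REPORT_NO_HITS, /* negative but harmless answer */
    DEMO_SHOW_MESSAGES,  /* request failed, engine still usable */
    DEMO_REINITIALIZE,   /* clean up, then call DtSearchReinit() */
    DEMO_STOP            /* fatal: caller bug or engine failure */
};

/* Hypothetical helper: maps a return code to the caller's next step,
 * following the descriptions in the list above. */
static enum demo_step demo_next_step(int rc)
{
    switch (rc) {
    case DEMO_DtSrOK:       return DEMO_USE_RESULTS;
    case DEMO_DtSrNOTAVAIL: return DEMO_REPORT_NO_HITS;
    case DEMO_DtSrFAIL:     return DEMO_SHOW_MESSAGES;
    case DEMO_DtSrREINIT:   return DEMO_REINITIALIZE;
    case DEMO_DtSrERROR:    return DEMO_STOP;
    case DEMO_DtSrABORT:    return DEMO_STOP;
    default:
        /* Not one of the six codes listed above. */
        assert(!"unexpected return code");
        return DEMO_STOP;
    }
}
```

Because even successful calls may append messages, an application would normally also check the MessageList after every branch, not only after failures.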
Language Numbers
Each DtSearch database is associated with an integer representing, among
other things, the natural language of its documents. These constants are used
throughout the API to identify the supported languages.
DtSrLaENG
0
English, ASCII char set (default)
DtSrLaENG2
1
English, ISO Latin-1 char set
DtSrLaESP
2
Spanish, ISO Latin-1 char set
DtSrLaFRA
3
French, ISO Latin-1 char set
DtSrLaITA
4
Italian, ISO Latin-1 char set
DtSrLaDEU
5
German, ISO Latin-1 char set
DtSrLaJPN
6
Japanese, EUC, auto kanji compounds
DtSrLaJPN2
7
Japanese, EUC, listed kanji compounds
DtSrLaLAST
7
Last supported DtSrLa constant
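The sketch below shows one way an application might validate a language number and turn it into a display label. The enum repeats the values listed above so that the example compiles on its own; a real program would take the constants from the DtSearch header. demo_language_label is a hypothetical helper, not an API function.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the list above, repeated here only so the
 * example is self-contained. */
enum {
    DtSrLaENG = 0, DtSrLaENG2 = 1, DtSrLaESP = 2, DtSrLaFRA = 3,
    DtSrLaITA = 4, DtSrLaDEU = 5, DtSrLaJPN = 6, DtSrLaJPN2 = 7,
    DtSrLaLAST = 7
};

/* Hypothetical helper: returns a short label for a language number,
 * or NULL when the number is outside DtSrLaENG..DtSrLaLAST. */
static const char *demo_language_label(int lang)
{
    static const char *const labels[] = {
        "English (ASCII)",
        "English (ISO Latin-1)",
        "Spanish (ISO Latin-1)",
        "French (ISO Latin-1)",
        "Italian (ISO Latin-1)",
        "German (ISO Latin-1)",
        "Japanese (EUC, auto kanji compounds)",
        "Japanese (EUC, listed kanji compounds)"
    };

    /* The table must have exactly one entry per supported language. */
    assert(sizeof labels / sizeof labels[0] == DtSrLaLAST + 1);

    if (lang < DtSrLaENG || lang > DtSrLaLAST)
        return NULL;
    return labels[lang];
}
```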
Other General Constants
DtSrVERSION
DtSearch version number string.
DtSrMAX_KTNAME
Maximum string length of a keytype
name.
DtSrMAX_DB_KEYSIZE
Maximum size of the unique document
key.
DtSrMAXWIDTH_HWORD
Largest possible word or stem size.
DtSrMAX_STEMCOUNT
Maximum number of boolean search
terms.
DtSrObjdate Type
DtSrObjdate is a typedef for
an unsigned integer used as a date/time stamp for documents.
DtSearch queries may be qualified by document date ranges. The data
type packs certain standard struct tm fields into
bitmap fields to minimize space.
DtSrObjdate values are based on the
western Gregorian calendar and are not guaranteed to map to other time locales.
DtSearch objdates have a range
from 1900 to 5995 inclusive and a resolution of 1 minute. From high-order bits
to low:
12 bits = tm_year
(0 - 4095, years since 1900 (1900
- 5995))
4 bits = tm_mon
(0 - 11, month name index)
5 bits = tm_mday
(1 - 31, day of month)
5 bits = tm_hour
(0 - 23, hours since midnight)
6 bits = tm_min
(0 - 59, minutes since top of hour)
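The bit layout above can be sketched as a pair of pack and unpack routines. This is an illustration of the layout only: it assumes the fields are packed contiguously into 32 bits in the order listed, uses uint32_t as a stand-in for DtSrObjdate, and the demo_ function names are hypothetical rather than part of the API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Shift positions implied by the field widths 12 + 4 + 5 + 5 + 6 = 32. */
#define DEMO_YEAR_SHIFT 20
#define DEMO_MON_SHIFT  16
#define DEMO_MDAY_SHIFT 11
#define DEMO_HOUR_SHIFT  6

/* Hypothetical helper: packs the relevant struct tm fields into one
 * 32-bit value, high-order bits first. */
static uint32_t demo_objdate_pack(const struct tm *t)
{
    assert(t->tm_year >= 0 && t->tm_year <= 4095);
    assert(t->tm_mon  >= 0 && t->tm_mon  <= 11);
    assert(t->tm_mday >= 1 && t->tm_mday <= 31);
    assert(t->tm_hour >= 0 && t->tm_hour <= 23);
    assert(t->tm_min  >= 0 && t->tm_min  <= 59);

    return ((uint32_t)t->tm_year << DEMO_YEAR_SHIFT)
         | ((uint32_t)t->tm_mon  << DEMO_MON_SHIFT)
         | ((uint32_t)t->tm_mday << DEMO_MDAY_SHIFT)
         | ((uint32_t)t->tm_hour << DEMO_HOUR_SHIFT)
         |  (uint32_t)t->tm_min;
}

/* Hypothetical helper: reverses demo_objdate_pack. Fields that the
 * packed form does not carry (seconds, weekday, ...) are set to zero. */
static void demo_objdate_unpack(uint32_t d, struct tm *t)
{
    memset(t, 0, sizeof *t);
    t->tm_year = (int)((d >> DEMO_YEAR_SHIFT) & 0xFFFu);
    t->tm_mon  = (int)((d >> DEMO_MON_SHIFT)  & 0xFu);
    t->tm_mday = (int)((d >> DEMO_MDAY_SHIFT) & 0x1Fu);
    t->tm_hour = (int)((d >> DEMO_HOUR_SHIFT) & 0x1Fu);
    t->tm_min  = (int)( d                     & 0x3Fu);
}
```

With the year in the high-order bits, later dates pack to larger integers, so under this layout a date range check reduces to two unsigned comparisons.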
STRUCTURES
DtSrKeytype Type
typedef struct {
char is_selected;
char ktchar;
char name [ DtSrMAX_KTNAME+1];
} DtSrKeytype;
A DtSearch keytype references a logical subset of the database.
The primary identifier for a keytype is the keytype character
ktchar. The ktchar
identifies the subset of the database that has that character as the first
character of its document keys.
The DtSrKeytype structure associates
the ktchar with a short name string for use in user GUI labels identifying the keytype, and
provides a boolean selection toggle for the keytype.
An array of DtSrKeytype structures
is maintained by the API for each database after API initialization. The API
function DtSearchGetKeytypes() is used to access the
array.
The is_selected boolean in each array
node indicates whether the user has selected that keytype to be returned in
the current search. The application must ensure that the boolean reflects
the current state of the user's desires prior to any search. Typically this
is done by having the keytypes array track user interface
toggle buttons for the database.
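A minimal sketch of keeping is_selected in step with the user interface follows. It does not call the API: DemoKeytype mirrors the DtSrKeytype layout shown above, DEMO_MAX_KTNAME is a placeholder for DtSrMAX_KTNAME (whose value is not stated here), and demo_select_keytypes is a hypothetical helper that takes the set of chosen keytype characters as a string.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_KTNAME 16   /* ASSUMPTION: placeholder for DtSrMAX_KTNAME */

/* Same fields as DtSrKeytype, redeclared so the example is self-contained. */
typedef struct {
    char is_selected;
    char ktchar;
    char name[DEMO_MAX_KTNAME + 1];
} DemoKeytype;

/* Hypothetical helper: sets is_selected on every entry whose ktchar
 * appears in `chosen` and clears it on all others. Returns the number
 * of selected keytypes. */
static int demo_select_keytypes(DemoKeytype *kt, int count, const char *chosen)
{
    int i, selected = 0;

    assert(kt != NULL && chosen != NULL);
    for (i = 0; i < count; i++) {
        /* strchr would match the terminator for ktchar == '\0',
         * so an empty ktchar is never treated as selected. */
        kt[i].is_selected = (char)(kt[i].ktchar != '\0'
                                   && strchr(chosen, kt[i].ktchar) != NULL);
        selected += kt[i].is_selected;
    }
    return selected;
}
```

In a real application the array would be the one obtained through DtSearchGetKeytypes, and the update would run before each search so that the toggles and the array never disagree.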
DtSrResult Structure
typedef struct _DtSrResult {
struct _DtSrResult *link;
long flags;
long objflags;
long objuflags;
long objsize;
DtSrObjdate objdate;
short objtype;
short objcost;
int dbn;
DB_ADDR dba;
short language;
char reckey [
DtSrMAX_DB_KEYSIZE];
int proximity;
char *abstractp;
} DtSrResult;
The API function DtSearchQuery returns a results
list upon successful completion of a search. A results list is a linked list
of DtSrResult structures, where each node represents
a database document that satisfied the query.
link
Pointer to the next results list node.
flags
(reserved)
objflags
The constant DtSrFlNOTAVAIL
means that the object is not retrievable from the search engine.
objuflags
User flags from database record. These are not used by DtSearch and
are available for application definition.
objsize
In uncompressed bytes.
objdate
Zero is the null date; document is 'undated'.
objtype
Document type from database header record. Objtype is typically used
by application code to identify and launch browsers.
Values above 0x1000 (4096) are set aside for application
definition. The following constants identify defined values:
DtSrObjUNKNOWN
Document type unknown or not applicable
DtSrObjTEXT
Generic, unformatted flat text
DtSrObjBINARY
Generic binary object
DtSrObjSGML
Generic SGML formatted document
DtSrObjHTML
HTML formatted document
DtSrObjPOSTSCR
Postscript document
DtSrObjINTERLF
Interleaf document
DtSrObjDTINFO
DtInfo document
objcost
(reserved)
dbn
Database number; index into dbnames array
from DtSearchInit and DtSearchReinit.
dba
Atomic document identifier within a database.
language
Language number of the database (DtSrLa... constant).
reckey
Document's unique database key. The first character of reckey is the
keytype character.
proximity
Sort field for ranking results lists. Derived from frequency of occurrence
statistics for the query words in the document. Often displayed to users
as the subjective 'distance' between the document and the query, in other
words a measure of the likelihood that the document will satisfy the user's
needs.
abstractp
Document's abstract string from the database.
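Walking a results list is an ordinary linked-list traversal over the link field. The sketch below uses a trimmed stand-in structure with only the fields it needs, so it compiles without the DtSearch header. It assumes DtSrFlNOTAVAIL is a bit tested in objflags, and the value of DEMO_FlNOTAVAIL, the key size, and the helper name are all placeholders.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DB_KEYSIZE 32   /* ASSUMPTION: placeholder for DtSrMAX_DB_KEYSIZE */
#define DEMO_FlNOTAVAIL     0x1L /* ASSUMPTION: placeholder for DtSrFlNOTAVAIL */

/* Trimmed stand-in for DtSrResult: only the fields used below. */
typedef struct DemoResult {
    struct DemoResult *link;
    long               objflags;
    char               reckey[DEMO_MAX_DB_KEYSIZE];
} DemoResult;

/* Hypothetical helper: follows link until NULL. Returns the number of
 * nodes the engine can retrieve and stores the number it cannot in
 * *unavailable. */
static int demo_tally_results(const DemoResult *list, int *unavailable)
{
    const DemoResult *r;
    int retrievable = 0;

    assert(unavailable != NULL);
    *unavailable = 0;
    for (r = list; r != NULL; r = r->link) {
        if (r->objflags & DEMO_FlNOTAVAIL)
            (*unavailable)++;   /* application must fetch this one itself */
        else
            retrievable++;      /* candidate for DtSearchRetrieve */
    }
    return retrievable;
}
```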
DtSrHitword Structure
typedef struct {
long offset; /* word location in cleartext */
long length; /* length of word */
} DtSrHitword;
Given a text string and the array of search terms returned from
DtSearchQuery,
DtSearchHighlight will generate a table of offsets
and lengths where the search terms are located in the text. The table is
typically used to highlight the search terms in the text in a manner
appropriate to the application's user interface.
The DtSrHitword structure is one element in the
table. For each search term to be highlighted,
offset specifies the beginning byte for the
term, and length specifies the extent
of the term in bytes.
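To show how such a table might be consumed, the sketch below copies a text into a buffer and wraps each hit in square brackets. It does not call DtSearchHighlight. The table is built by hand, DemoHitword mirrors the two DtSrHitword fields, and the helper assumes the entries are sorted by offset and do not overlap, which this page does not state.

```c
#include <assert.h>
#include <string.h>

/* Same fields as DtSrHitword, redeclared so the example is self-contained. */
typedef struct {
    long offset;   /* word location in cleartext */
    long length;   /* length of word */
} DemoHitword;

/* Hypothetical helper: writes `text` to `out` with every hit wrapped in
 * '[' and ']'. ASSUMPTION: hits are sorted by offset, non-overlapping,
 * and lie inside the text. Returns the number of bytes written, not
 * counting the terminator, or -1 if `out` is too small. */
static long demo_mark_hits(const char *text, const DemoHitword *hits,
                           long nhits, char *out, size_t outsize)
{
    size_t len = strlen(text);
    size_t tail;
    long pos = 0, i;
    char *w = out;

    /* Each hit adds two bracket bytes; one more for the terminator. */
    if (outsize < len + 2 * (size_t)nhits + 1)
        return -1;

    for (i = 0; i < nhits; i++) {
        assert(hits[i].length >= 0);
        assert(hits[i].offset >= pos);
        assert(hits[i].offset + hits[i].length <= (long)len);

        /* Copy the unhighlighted gap before this hit. */
        memcpy(w, text + pos, (size_t)(hits[i].offset - pos));
        w += hits[i].offset - pos;

        /* Copy the hit itself between markers. */
        *w++ = '[';
        memcpy(w, text + hits[i].offset, (size_t)hits[i].length);
        w += hits[i].length;
        *w++ = ']';

        pos = hits[i].offset + hits[i].length;
    }

    /* Copy whatever follows the last hit. */
    tail = len - (size_t)pos;
    memcpy(w, text + pos, tail);
    w += tail;
    *w = '\0';
    return (long)(w - out);
}
```

A graphical application would replace the bracket markers with whatever its text widget uses for emphasis, but the offset arithmetic stays the same.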
SEE ALSO
&cdeman.DtSrAPI;,
&cdeman.DtSearchInit;,
&cdeman.DtSearchReinit;,
&cdeman.DtSearchExit;,
&cdeman.DtSearchGetKeytypes;,
&cdeman.DtSearchSetMaxResults;,
&cdeman.DtSearchGetMaxResults;,
&cdeman.DtSearchQuery;,
&cdeman.DtSearchRetrieve;,
&cdeman.DtSearchHighlight;,
&cdeman.DtSearchValidDateString;,
&cdeman.DtSearchMergeResults;,
&cdeman.DtSearchSortResults;,
&cdeman.DtSearchFreeResults;,
&cdeman.DtSearchHasMessages;,
&cdeman.DtSearchAddMessages;,
&cdeman.DtSearchGetMessages;,
&cdeman.DtSearchFreeMessages;,
&cdeman.DtSearch;