Some checks failed
Docker. / Ubuntu (push) Has been cancelled
User-agent updater. / User-agent (push) Failing after 15s
Lock Threads / lock (push) Failing after 10s
Waiting for answer. / waiting-for-answer (push) Failing after 22s
Close stale issues and PRs / stale (push) Successful in 13s
Needs user action. / needs-user-action (push) Failing after 8s
Can't reproduce. / cant-reproduce (push) Failing after 8s
413 lines
12 KiB
Groff
413 lines
12 KiB
Groff
.TH hunspell 1 "2014-05-27"
|
|
.LO 1
|
|
.SH NAME
|
|
hunspell \- spell checker, stemmer and morphological analyzer
|
|
.SH SYNOPSIS
|
|
hunspell [\-1aDGHhLlmnOrstvwX] [\-\-check\-url] [\-\-check\-apostrophe] [\-d dict[,dict2,...]] [\-\-help] [\-i enc] [\-p dict] [\-vv] [\-\-version] [text/OpenDocument/TeX/LaTeX/HTML/SGML/XML/nroff/troff file(s)]
|
|
.SH DESCRIPTION
|
|
.I Hunspell
|
|
is fashioned after the
|
|
.I Ispell
|
|
program. The most common usage is "hunspell" or "hunspell filename".
|
|
Without filename parameter, hunspell checks the standard input.
|
|
Typing "cat" and "exsample" in two input lines, results an asterisk
|
|
(it means "cat" is a correct word) and a line with corrections:
|
|
.PP
|
|
.RS
|
|
.nf
|
|
$ hunspell -d en_US
|
|
Hunspell 1.2.3
|
|
*
|
|
& exsample 4 0: example, examples, ex sample, ex-sample
|
|
.fi
|
|
.RE
|
|
.PP
|
|
Correct words signed with an '*', '+' or '-', unrecognized
|
|
words signed with '#' or '&' in output lines (see later).
|
|
(Close the standard input with Ctrl-d on Unix/Linux and
|
|
Ctrl-Z Enter or Ctrl-C on Windows.)
|
|
.PP
|
|
With filename parameters,
|
|
.I hunspell
|
|
will display each word of the files which does not appear in the dictionary at the
|
|
top of the screen and allow you to change it. If there are "near
|
|
misses" in the dictionary, then they are
|
|
also displayed on following lines.
|
|
Finally, the line containing the
|
|
word and the previous line
|
|
are printed at the bottom of the screen. If your terminal can
|
|
display in reverse video, the word itself is highlighted. You have the
|
|
option of replacing the word completely, or choosing one of the
|
|
suggested words. Commands are single characters as follows
|
|
(case is ignored):
|
|
.PP
|
|
.RS
|
|
.IP R
|
|
Replace the misspelled word completely.
|
|
.IP Space
|
|
Accept the word this time only.
|
|
.IP A
|
|
Accept the word for the rest of this
|
|
.I hunspell
|
|
session.
|
|
.IP I
|
|
Accept the word, capitalized as it is in the
|
|
file, and update private dictionary.
|
|
.IP U
|
|
Accept the word, and add an uncapitalized (actually, all lower-case)
|
|
version to the private dictionary.
|
|
.IP S
|
|
Ask a stem and a model word and store them in the private dictionary.
|
|
The stem will be accepted also with the affixes of the model word.
|
|
.IP 0-\fIn\fR
|
|
Replace with one of the suggested words.
|
|
.IP X
|
|
Write the rest of this file, ignoring misspellings, and start next file.
|
|
.IP Q
|
|
Exit immediately and leave the file unchanged.
|
|
.IP ^Z
|
|
Suspend hunspell.
|
|
.IP ?
|
|
Give help screen.
|
|
.RE
|
|
.SH OPTIONS
|
|
.IP \fB\-1\fR
|
|
Check only first field in lines (delimiter = tabulator).
|
|
.IP \fB\-a\fR
|
|
The
|
|
.B \-a
|
|
option
|
|
is intended to be used from other programs through a pipe. In this
|
|
mode,
|
|
.I hunspell
|
|
prints a one-line version identification message, and then begins
|
|
reading lines of input. For each input line,
|
|
a single line is written to the standard output for each word
|
|
checked for spelling on the line. If the word
|
|
was found in the main dictionary, or your personal dictionary, then the
|
|
line contains only a '*'. If the word was found through affix removal,
|
|
then the line contains a '+', a space, and the root word.
|
|
If the word was found through compound formation (concatenation of two
|
|
words, then the line contains only a '\-'.
|
|
.IP ""
|
|
If the word
|
|
is not in the dictionary, but there are near misses, then the line
|
|
contains an '&', a space, the misspelled word, a space, the number of
|
|
near misses,
|
|
the number of
|
|
characters between the beginning of the line and the
|
|
beginning of the misspelled word, a colon, another space,
|
|
and a list of the near
|
|
misses separated by
|
|
commas and spaces.
|
|
.IP ""
|
|
Also, each near miss or guess is capitalized the same as the input
|
|
word unless such capitalization is illegal;
|
|
in the latter case each near miss is capitalized correctly
|
|
according to the dictionary.
|
|
.IP ""
|
|
Finally, if the word does not appear in the dictionary, and
|
|
there are no near misses, then the line contains a '#', a space,
|
|
the misspelled word, a space,
|
|
and the character offset from the beginning of the line.
|
|
Each sentence of text input is terminated
|
|
with an additional blank line, indicating that
|
|
.I hunspell
|
|
has completed processing the input line.
|
|
.IP ""
|
|
These output lines can be summarized as follows:
|
|
.RS
|
|
.IP OK:
|
|
*
|
|
.IP Root:
|
|
+ <root>
|
|
.IP Compound:
|
|
\-
|
|
.IP Miss:
|
|
& <original> <count> <offset>: <miss>, <miss>, ...
|
|
.IP None:
|
|
# <original> <offset>
|
|
.RE
|
|
.IP ""
|
|
For example, a dummy dictionary containing the words "fray", "Frey",
|
|
"fry", and "refried" might produce the following response to the
|
|
command "echo 'frqy refries | hunspell \-a":
|
|
.RS
|
|
.nf
|
|
(#) Hunspell 0.4.1 (beta), 2005-05-26
|
|
& frqy 3 0: fray, Frey, fry
|
|
& refries 1 5: refried
|
|
.fi
|
|
.RE
|
|
.IP ""
|
|
This mode
|
|
is also suitable for interactive use when you want to figure out the
|
|
spelling of a single word (but this is the default behavior of hunspell
|
|
without -a, too).
|
|
.IP ""
|
|
When in the
|
|
.B \-a
|
|
mode,
|
|
.I hunspell
|
|
will also accept lines of single words prefixed with any
|
|
of '*', '&', '@', '+', '\-', '~', '#', '!', '%', '`', or '^'.
|
|
A line starting with '*' tells
|
|
.I hunspell
|
|
to insert the word into the user's dictionary (similar to the I command).
|
|
A line starting with '&' tells
|
|
.I hunspell
|
|
to insert an all-lowercase version of the word into the user's
|
|
dictionary (similar to the U command).
|
|
A line starting with '@' causes
|
|
.I hunspell
|
|
to accept this word in the future (similar to the A command).
|
|
A line starting with '+', followed immediately by
|
|
.B tex
|
|
or
|
|
.B nroff
|
|
will cause
|
|
.I hunspell
|
|
to parse future input according the syntax of that formatter.
|
|
A line consisting solely of a '+' will place
|
|
.I hunspell
|
|
in TeX/LaTeX mode (similar to the
|
|
.B \-t
|
|
option) and '\-' returns
|
|
.I hunspell
|
|
to nroff/troff mode (but these commands are obsolete).
|
|
However, the string character type is
|
|
.I not
|
|
changed;
|
|
the '~' command must be used to do this.
|
|
A line starting with '~' causes
|
|
.I hunspell
|
|
to set internal parameters (in particular, the default string
|
|
character type) based on the filename given in the rest of the line.
|
|
(A file suffix is sufficient, but the period must be included.
|
|
Instead of a file name or suffix, a unique name, as listed in the language
|
|
affix file, may be specified.)
|
|
However, the formatter parsing is
|
|
.I not
|
|
changed; the '+' command must be used to change the formatter.
|
|
A line prefixed with '#' will cause the
|
|
personal dictionary to be saved.
|
|
A line prefixed with '!' will turn on
|
|
.I terse
|
|
mode (see below), and a line prefixed with '%' will return
|
|
.I hunspell
|
|
to normal (non-terse) mode.
|
|
A line prefixed with '`' will turn on verbose-correction mode (see below);
|
|
this mode can only be disabled by turning on terse mode with '%'.
|
|
.IP ""
|
|
Any input following the prefix
|
|
characters '+', '\-', '#', '!', '%', or '`' is ignored, as is any input
|
|
following the filename on a '~' line.
|
|
To allow spell-checking of lines beginning with these characters, a
|
|
line starting with '^' has that character removed before it is passed
|
|
to the spell-checking code.
|
|
It is recommended that programmatic interfaces prefix every data line
|
|
with an uparrow to protect themselves against future changes in
|
|
.IR hunspell .
|
|
.IP ""
|
|
To summarize these:
|
|
.IP ""
|
|
.RS
|
|
.IP *
|
|
Add to personal dictionary
|
|
.IP @
|
|
Accept word, but leave out of dictionary
|
|
.IP #
|
|
Save current personal dictionary
|
|
.IP ~
|
|
Set parameters based on filename
|
|
.IP +
|
|
Enter TeX mode
|
|
.IP \-
|
|
Exit TeX mode
|
|
.IP !
|
|
Enter terse mode
|
|
.IP %
|
|
Exit terse mode
|
|
.IP "`"
|
|
Enter verbose-correction mode
|
|
.IP ^
|
|
Spell-check rest of line
|
|
.fi
|
|
.RE
|
|
.IP ""
|
|
In
|
|
.I terse
|
|
mode,
|
|
.I hunspell
|
|
will not print lines beginning with '*', '+', or '\-', all of which
|
|
indicate correct words.
|
|
This significantly improves running speed when the driving program is
|
|
going to ignore correct words anyway.
|
|
.IP ""
|
|
In
|
|
.I verbose-correction
|
|
mode,
|
|
.I hunspell
|
|
includes the original word immediately after the indicator character
|
|
in output lines beginning with '*', '+', and '\-', which simplifies
|
|
interaction for some programs.
|
|
|
|
.IP \fB\-\-check\-apostrophe\fR
|
|
Check and force Unicode apostrophes (U+2019), if one of the ASCII or Unicode
|
|
apostrophes is specified by the spelling dictionary, as a word character
|
|
(see WORDCHARS, ICONV and OCONV in hunspell(5)).
|
|
.IP \fB\-\-check\-url\fR
|
|
Check URLs, e-mail addresses and directory paths.
|
|
|
|
.IP \fB\-D\fR
|
|
Show detected path of the loaded dictionary, and list of the
|
|
search path and the available dictionaries.
|
|
|
|
.IP \fB\-d\ dict,dict2,...\fR
|
|
Set dictionaries by their base names with or without paths.
|
|
Example of the syntax:
|
|
.PP
|
|
\-d en_US,en_geo,en_med,de_DE,de_med
|
|
.PP
|
|
en_US and de_DE are base dictionaries, they consist of
|
|
aff and dic file pairs: en_US.aff, en_US.dic and de_DE.aff, de_DE.dic.
|
|
En_geo, en_med, de_med are special dictionaries: dictionaries
|
|
without affix file. Special dictionaries are optional extension
|
|
of the base dictionaries usually with special (medical, law etc.)
|
|
terms. There is no naming convention for special dictionaries,
|
|
only the ".dic" extension: dictionaries without affix file will
|
|
be an extension of the preceding base dictionary (right
|
|
order of the parameter list needs for good suggestions). First
|
|
item of \-d parameter list must be a base dictionary.
|
|
|
|
.IP \fB\-G\fR
|
|
Print only correct words or lines.
|
|
|
|
.IP \fB\-H\fR
|
|
The input file is in SGML/HTML format.
|
|
|
|
.IP \fB\-h,\ \-\-help\fR
|
|
Short help.
|
|
|
|
.IP \fB\-i\ enc\fR
|
|
Set input encoding.
|
|
|
|
.IP \fB\-L\fR
|
|
Print lines with misspelled words.
|
|
|
|
.IP \fB\-l\fR
|
|
The "list" option
|
|
is used to produce a list of misspelled words from the standard input.
|
|
|
|
.IP \fB\-m\fR
|
|
Analyze the words of the input text (see also hunspell(5) about
|
|
morphological analysis). Without dictionary morphological data,
|
|
signs the flags of the affixes of the word forms for dictionary
|
|
developers.
|
|
|
|
.IP \fB\-n\fR
|
|
The input file is in nroff/troff format.
|
|
|
|
.IP \fB\-O\fR
|
|
The input file is in OpenDocument (ODF or Flat ODF) format.
|
|
If unzip program is not installed, install it before using this option.
|
|
|
|
.IP \fB\-P\ password\fR
|
|
Set password for encrypted dictionaries.
|
|
|
|
.IP \fB\-p\ dict\fR
|
|
Set path of personal dictionary.
|
|
The default dictionary depends on the locale settings. The
|
|
following environment variables are searched: LC_ALL,
|
|
LC_MESSAGES, and LANG. If none are set then the default personal
|
|
dictionary is $HOME/.hunspell_default.
|
|
|
|
Setting
|
|
.I \-d
|
|
or the
|
|
.I DICTIONARY
|
|
environmental variable, personal dictionary will be
|
|
.BR $HOME/.hunspell_dicname
|
|
|
|
.IP \fB\-r\fR
|
|
Warn of the rare words, which are also potential spelling mistakes.
|
|
|
|
.IP \fB\-s\fR
|
|
Stem the words of the input text (see also hunspell(5) about
|
|
stemming). It depends from the dictionary data.
|
|
|
|
.IP \fB\-t\fR
|
|
The input file is in TeX or LaTeX format.
|
|
|
|
.IP \fB\-v,\ \-\-version\fR
|
|
Print version number.
|
|
|
|
.IP \fB\-vv\fR
|
|
Print ispell(1) compatible version number.
|
|
|
|
.IP \fB\-w\fR
|
|
Print misspelled words (= lines) from one word/line input.
|
|
|
|
.IP \fB\-X\fR
|
|
The input file is in XML format.
|
|
|
|
.SH EXAMPLES
|
|
.TP
|
|
.B hunspell example.html
|
|
Interactive spell checking of an HTML file with the default dictionary.
|
|
.TP
|
|
.B hunspell \-d en_US example.html
|
|
Interactive spell checking of an HTML file with the en_US dictionary.
|
|
.TP
|
|
.B hunspell \-d en_US,en_US_med medical.txt
|
|
Interactive spell checking with multiple dictionaries.
|
|
.TP
|
|
.B hunspell *.odt
|
|
Interactive spell checking of ODF documents.
|
|
.TP
|
|
.B hunspell \-l *.odt
|
|
List bad words of ODF documents
|
|
.TP
|
|
.B hunspell \-l *.odt | sort | uniq >unrecognized
|
|
Saving unrecognized words of ODF documents (filtering duplications).
|
|
.TP
|
|
.B hunspell -p unrecognized_but_good *.odt
|
|
Interactive spell checking of ODF documents, using the previously
|
|
saved and reduced word list, as a personal dictionary, to speed up
|
|
spell checking.
|
|
.TP
|
|
.SH ENVIRONMENT
|
|
.TP
|
|
.B DICTIONARY
|
|
Similar to
|
|
.I \-d.
|
|
.TP
|
|
.B DICPATH
|
|
Dictionary path.
|
|
.TP
|
|
.B WORDLIST
|
|
Equivalent to
|
|
.I \-p.
|
|
.SH FILES
|
|
The default dictionary depends on the locale settings. The
|
|
following environment variables are searched: LC_ALL,
|
|
LC_MESSAGES, and LANG. If none are set then the following
|
|
fallbacks are used:
|
|
|
|
.BI /usr/share/myspell/default.aff
|
|
Path of default affix file. See hunspell(5).
|
|
.PP
|
|
.BI /usr/share/myspell/default.dic
|
|
Path of default dictionary file.
|
|
See hunspell(5).
|
|
.PP
|
|
.BI $HOME/.hunspell_default.
|
|
Default path to personal dictionary.
|
|
.SH SEE ALSO
|
|
.B hunspell (3), hunspell(5)
|
|
.SH AUTHOR
|
|
Author of Hunspell executable is László Németh. For Hunspell library,
|
|
see hunspell(3).
|
|
.PP
|
|
This manual based on Ispell's manual. See ispell(1).
|