27.13 Single-byte Character Set Support
The ISO 8859 Latin-n character sets define character codes in
the range 0240 to 0377 octal (160 to 255 decimal) to handle the
accented letters and punctuation needed by various European languages
(and some non-European ones). If you disable multibyte characters,
Emacs can still handle one of these character sets at a time.
To specify which one to use, invoke M-x set-language-environment
and specify a suitable language environment such as ‘Latin-n’.
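As a minimal sketch, the same selection can be made from an init file instead of interactively. Latin-1 is assumed here purely for illustration; substitute whichever Latin-n environment matches your text.

```elisp
;; Select a language environment at startup.
;; "Latin-1" is an assumption for this example, not a recommendation.
(set-language-environment "Latin-1")
```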
For more information about unibyte operation, see Enabling Multibyte. Note particularly that you probably want to ensure that
your initialization files are read as unibyte if they contain non-ASCII
characters.
Emacs can also display those characters, provided the terminal or font
in use supports them. This works automatically. Alternatively, if you
are using a window system, Emacs can also display single-byte characters
through fontsets, in effect by displaying the equivalent multibyte
characters according to the current language environment. To request
this, set the variable unibyte-display-via-language-environment to a
non-nil value.
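For example, an init file could request fontset-based display along these lines. This is only a sketch; any non-nil value should have the same effect as t.

```elisp
;; Display single-byte characters through fontsets, using the
;; equivalent multibyte characters of the current language environment.
(setq unibyte-display-via-language-environment t)
```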
If your terminal does not support display of the Latin-1 character
set, Emacs can display these characters as ASCII sequences which at
least give you a clear idea of what the characters are. To do this,
load the library iso-ascii. Similar libraries for other Latin-n
character sets could be implemented, but we don't have them yet.
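One way to load the library from an init file is sketched below; typing M-x load-library RET iso-ascii RET interactively should be equivalent.

```elisp
;; Show Latin-1 characters as ASCII sequences on terminals
;; that cannot display them directly.
(load-library "iso-ascii")
```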
Normally non-ISO-8859 characters (decimal codes between 128 and 159
inclusive) are displayed as octal escapes. You can change this for
non-standard “extended” versions of ISO-8859 character sets by using the
function standard-display-8bit in the disp-table library.
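A minimal sketch of such a call follows. The range 128 to 159 is taken from the paragraph above; whether the resulting glyphs are meaningful depends on the terminal or font in use.

```elisp
(require 'disp-table)

;; Display codes 128 through 159 (decimal) as characters
;; rather than as octal escapes.
(standard-display-8bit 128 159)
```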
There are three ways to input single-byte non-ASCII characters:
- You can use an input method for the selected language environment.
See Input Methods. When you use an input method in a unibyte buffer,
the non-ASCII character you specify with it is converted to unibyte.
- If your keyboard can generate character codes 128 (decimal) and up,
representing non-ASCII characters, you can type those character codes
directly.
On a window system, you should not need to do anything special to use
these keys; they should simply work. On a text-only terminal, you
should use the command M-x set-keyboard-coding-system or the variable
keyboard-coding-system to specify which coding system your keyboard
uses (see Specify Coding). Enabling this feature will probably require
you to use ESC to type Meta characters; however, on a console terminal
or in xterm, you can arrange for Meta to be converted to ESC and still
be able to type 8-bit characters present directly on the keyboard or
using Compose or AltGr keys. See User Input.
- For Latin-1 only, you can use the key C-x 8 as a “compose
character” prefix for entry of non-ASCII Latin-1 printing
characters. C-x 8 is good for insertion (in the minibuffer as
well as other buffers), for searching, and in any other context where
a key sequence is allowed.
C-x 8 works by loading the iso-transl library. Once that library is
loaded, the <ALT> modifier key, if the keyboard has one, serves the
same purpose as C-x 8: use <ALT> together with an accent character to
modify the following letter. In addition, if the keyboard has keys for
the Latin-1 “dead accent characters,” they too are defined to compose
with the following character, once iso-transl is loaded.
Use C-x 8 C-h to list all the available C-x 8 translations.
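The keyboard-related settings described in the list above could be placed in an init file roughly as follows. This is a sketch under stated assumptions: latin-1 is assumed to be the coding system your keyboard produces, and loading iso-transl in advance is optional, since C-x 8 loads it on first use.

```elisp
;; Text-only terminal: declare which coding system the keyboard sends.
;; latin-1 is an assumption; use the one your keyboard actually produces.
(set-keyboard-coding-system 'latin-1)

;; Load iso-transl ahead of time so that the <ALT> modifier and any
;; dead accent keys compose without first typing C-x 8.
(require 'iso-transl)
```

With iso-transl loaded, typing C-x 8 ' e should insert é, and C-x 8 " u should insert ü.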