F.2 International Character Set Support on Mac
Mac uses non-standard encodings for the upper 128 single-byte
characters. These encodings also deviate from the ISO 2022 standard by
using character codes in the range 128-159. The coding systems
mac-roman, mac-centraleurroman, and mac-cyrillic are used to represent
these Mac encodings.
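As a small illustration of how these coding systems treat bytes in the
range 128-159, the following sketch decodes a single byte with
mac-roman. It assumes an Emacs that provides unibyte-string; the byte
#x8E is chosen only as an example, since Mac Roman assigns it to the
letter "é".

```elisp
;; Decode one byte from the 128-159 range using the mac-roman coding system.
;; In Mac Roman, byte #x8E is LATIN SMALL LETTER E WITH ACUTE.
(decode-coding-string (unibyte-string #x8E) 'mac-roman)
```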
The fontset fontset-mac is created automatically when Emacs is run on
Mac, and is used by default. It displays as many kinds of characters as
possible using 12-point Monaco as a base font. If a character appears
as a hollow box with this fontset, then customizing font settings alone
is almost certainly not enough to display it (see Mac Font Specs).
You can use input methods provided either by LEIM (see Input Methods)
or by Mac OS to enter international characters. To use the former, see
the International Character Set Support section of the manual (see
International).
Emacs on Mac OS automatically changes the value of
keyboard-coding-system according to the current keyboard layout. So
you don't need to set it manually; even if you do, it is changed again
the next time a keyboard layout change is detected.
The Mac clipboard and the Emacs kill ring (see Killing) are
synchronized by default: you can kill or copy a piece of text in Emacs
and paste it into another Mac application, or cut or copy one in
another Mac application and yank it into an Emacs buffer. This feature
can be disabled by setting x-select-enable-clipboard to nil. Even then,
you can still copy and paste between Emacs and another application
using the Edit menu.
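A minimal sketch of disabling the synchronization described above,
assuming the setting is placed in your init file:

```elisp
;; Stop synchronizing the Emacs kill ring with the Mac clipboard.
(setq x-select-enable-clipboard nil)
```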
On Mac, the role of the coding system for selection that is set by
set-selection-coding-system (see Specify Coding) is two-fold. First, it
is used as the preferred coding system for the traditional text flavor,
which does not specify any particular encoding and is mainly used by
applications on Mac OS Classic. Second, it specifies the intermediate
encoding for the UTF-16 text flavor, which is mainly used by
applications on Mac OS X.
When UTF-16 text data is pasted from the clipboard, it is first
converted to the encoding specified by the selection coding system
using the converter in the Mac OS system, and then decoded into the
Emacs internal encoding using the converter in Emacs. If the first
conversion fails, the UTF-16 data is converted similarly but via
UTF-8. Copying UTF-16 text to the clipboard goes through the inverse
path. The reasons for this two-pass decoding are to avoid subtle
differences in Unicode mappings between the Mac OS system and Emacs
(such as various kinds of hyphens), to deal with UTF-16 data in native
byte order with no byte order mark, and to minimize users'
customization. For example, users who mainly use Latin characters
would prefer Greek characters to be decoded into the
mule-unicode-0100-24ff charset, but Japanese users would prefer them
to be decoded into the japanese-jisx0208 charset. Since the coding
system for selection is automatically set according to the system
locale setting, users usually don't have to set it manually.
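The two-pass paste path can be sketched roughly as follows. This is
only an illustration of the control flow, not the code Emacs actually
uses: my-decode-utf16-paste is a hypothetical name, and OS-CONVERT is a
hypothetical stand-in for the converter in the Mac OS system, which is
not directly callable from Lisp in this form.

```elisp
;; OS-CONVERT is a hypothetical stand-in for the Mac OS converter.  It is
;; called with the UTF-16 data and a target coding system, and should
;; return the converted bytes as a unibyte string, or nil on failure.
(defun my-decode-utf16-paste (utf16-data coding os-convert)
  "Decode UTF16-DATA via the intermediate encoding CODING.
Fall back to UTF-8 as the intermediate encoding if the first
conversion fails."
  (let ((bytes (funcall os-convert utf16-data coding)))
    (if bytes
        ;; Second pass: Emacs decodes the intermediate encoding.
        (decode-coding-string bytes coding)
      ;; Fallback: the same two passes, with UTF-8 as the intermediate.
      (let ((utf8 (funcall os-convert utf16-data 'utf-8)))
        (and utf8 (decode-coding-string utf8 'utf-8))))))
```

Here CODING plays the role of the selection coding system, so a
Japanese user and a Latin-script user would pass different values and
get Greek characters decoded into different charsets.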
The default language environment (see Language Environments) is
set according to the locale setting at startup. On Mac OS, the locale
setting is consulted in the following order:
- Environment variables LC_ALL, LC_CTYPE and LANG as in other systems.
- Preference AppleLocale that is set by default on Mac OS X 10.3 and
  later.
- Preference AppleLanguages that is set by default on Mac OS X 10.1
  and later.
- Variable mac-system-locale that is derived from the system language
  and region codes. This variable is available on all supported Mac OS
  versions including Mac OS Classic.
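The lookup order above can be sketched as a simple first-match search.
This is an illustrative sketch rather than the actual startup code:
my-mac-guess-locale is a hypothetical name, and its two arguments are
hypothetical stand-ins for the values of the AppleLocale and
AppleLanguages preferences, each assumed to be a string or nil.

```elisp
(defun my-mac-guess-locale (apple-locale apple-languages)
  "Return the first non-empty locale source, in the documented order.
APPLE-LOCALE and APPLE-LANGUAGES are hypothetical arguments standing
in for the corresponding preference values (a string or nil)."
  (let ((candidates
         (list (getenv "LC_ALL") (getenv "LC_CTYPE") (getenv "LANG")
               apple-locale
               apple-languages
               ;; Guarded, since this variable exists only in the Mac port.
               (and (boundp 'mac-system-locale)
                    (symbol-value 'mac-system-locale))))
        (result nil))
    ;; Keep the first candidate that is a non-empty string.
    (dolist (c candidates)
      (when (and (null result) (stringp c) (> (length c) 0))
        (setq result c)))
    result))
```

For example, (my-mac-guess-locale "ja_JP" nil) returns the value of one
of the environment variables if any is set and non-empty, and "ja_JP"
otherwise.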
The default values of almost all variables related to coding systems
are also set according to the language environment, so you usually
don't have to customize these variables manually.