The catgets functions can be used in two different ways: by strictly
following the X/Open specification without relying on any extension,
or by using the GNU extensions. We will look at the former method
first in order to understand the benefits of the extensions.
8.1.4.1 Not using symbolic names
Since the X/Open format of the message catalog files does not allow
symbolic names, we have to work with numbers all the time. When we start
writing a program we have to replace all appearances of translatable
strings with something like
catgets (catdesc, set, msg, "string")
catdesc is the descriptor returned by a call to catopen, which is
normally done once at program start. The "string" is the
string we want to translate; it is also what catgets returns when no
translation is found. The problems start with the set and
message numbers.
In a bigger program several programmers usually work on the code at the
same time, so coordinating the number allocation is crucial.
No two different strings may be indexed by the same tuple of
numbers, but it is highly desirable to reuse the numbers for equal strings
with equal translations (please note that there might be strings which
are equal in one language but have different translations due to
different contexts).
The allocation process can be relaxed a bit by using different set numbers
for different parts of the program, which reduces the number of developers
who have to coordinate the allocation. But lists must still be kept to
track the allocation, and errors can easily happen. These errors
cannot be discovered by the compiler or by the catgets functions.
Only the user of the program might see wrong messages printed. In the
worst case the messages are so misleading that they cannot be
recognized as wrong. Think about the translations for "true" and
"false" being exchanged. This could result in a disaster.
8.1.4.2 Using symbolic names
The problems mentioned in the last section derive from the facts that:
1. the numbers are allocated once and, due to their possibly frequent
   use, it is difficult to change a number later;
2. the numbers do not allow guessing anything about the string, and
   therefore collisions can easily happen.
By consistently using symbolic names and by providing a method which maps
the string content to a symbolic name (however this will happen) one can
prevent both problems above. The cost of this is that the programmer
has to write a complete message catalog file while s/he is writing the
program itself.
This is necessary since the symbolic names must be mapped to numbers
before the program sources can be compiled. The last section
described how to generate a header containing the mapping of the names.
E.g., for the example message file given in the last section we could
call the gencat program as follows (assume ex.msg contains
the sources).
gencat -H ex.h -o ex.cat ex.msg
This generates a header file with the following content:
As can be seen, the various symbols given in the source file are mangled
to generate unique identifiers, and these identifiers get numbers
assigned. Reading the source file and knowing the rules makes it
possible to predict the content of the header file (it is deterministic),
but this is not necessary. The gencat program can take care of
everything. All the programmer has to do is to put the generated header
file in the dependency list of the source files of her/his project and
to add a rule to regenerate the header if any of the input files
changes.
One word about the symbol mangling. Every symbol consists of two parts:
the name of the message set plus the name of the message or the special
string Set. So SetOnetwo means this macro can be used to
access the translation with identifier two in the message set
SetOne.
The other names denote the names of the message sets. The special
string Set is used in the place of the message identifier.
If the second string of the set SetOne is used in the code, the C
code should look like this:
catgets (catdesc, SetOneSet, SetOnetwo,
" Message with ID \"two\", which gets the value 2 assigned")
Writing the call this way makes it possible to change the message number
and even the set number without requiring any change in the C source
code. (The default string normally differs from the text in the
catalog; the two are the same here only for the sake of the example.)
8.1.4.3 How does this help during development
To illustrate the usual way to work with the symbolic names,
here is a little example. Assume we want to write the very complex and
famous greeting program. We start by writing the code as usual:
#include <stdio.h>
int
main (void)
{
  printf ("Hello, world!\n");
  return 0;
}
Now we want to internationalize the message and therefore replace the
fixed string with whatever translation the user's locale calls for.
We see how the catalog object is opened and the returned descriptor used
in the other function calls. It is not really necessary to check for
failure of any of the functions since even in these situations the
functions will behave reasonably. If no translation is available,
catgets simply returns the default string.
What remains unspecified here are the constants SetMainSet and
SetMainHello. These are the symbolic names describing the
message. To get the actual definitions which match the information in
the catalog file we have to create the message catalog source file and
process it using the gencat program.
$ Messages for the famous greeting program.
$quote "
$set Main
Hello "Hallo, Welt!\n"
Now we can start building the program (assume the message catalog source
file is named hello.msg and the program source file hello.c):
The call of the gencat program creates the missing header file
msgnrs.h as well as the message catalog binary. The former is
used in the compilation of hello.c while the latter is placed in a
directory in which the catopen function will try to locate it.
Please check the LC_ALL environment variable and the default path
for catopen presented in the description above.
Published under the terms of the GNU General Public License