The iconv
function converts the text in the input buffer
according to the rules associated with the descriptor cd and
stores the result in the output buffer. It is possible to call the
function for the same text several times in a row since for stateful
character sets the necessary state information is kept in the data
structures associated with the descriptor.
The input buffer is specified by *
inbuf and it contains
*
inbytesleft bytes. The extra indirection is necessary for
communicating the used input back to the caller (see below). It is
important to note that the buffer pointer is of type char
and the
length is measured in bytes even if the input text is encoded in wide
characters.
The output buffer is specified in a similar way. *
outbuf
points to the beginning of the buffer with at least
*
outbytesleft bytes room for the result. The buffer
pointer again is of type char
and the length is measured in
bytes. If outbuf or *
outbuf is a null pointer, the
conversion is performed but no output is available.
If inbuf is a null pointer, the iconv
function performs the
necessary action to put the state of the conversion into the initial
state. This is obviously a no-op for non-stateful encodings, but if the
encoding has a state, such a function call might put some byte sequences
in the output buffer, which perform the necessary state changes. The
next call with inbuf not being a null pointer then simply goes on
from the initial state. It is important that the programmer never makes
any assumption as to whether the conversion has to deal with states.
Even if the input and output character sets are not stateful, the
implementation might still have to keep states. This is due to the
implementation chosen for the GNU C library as it is described below.
Therefore an iconv
call to reset the state should always be
performed if some protocol requires this for the output text.
The conversion stops for one of three reasons. The first is that all
characters from the input buffer are converted. This actually can mean
two things: either all bytes from the input buffer are consumed or
there are some bytes at the end of the buffer that possibly can form a
complete character but the input is incomplete. The second reason for a
stop is that the output buffer is full. And the third reason is that
the input contains invalid characters.
In all of these cases the buffer pointers after the last successful
conversion, for input and output buffer, are stored in inbuf and
outbuf, and the available room in each buffer is stored in
inbytesleft and outbytesleft.
Since the character sets selected in the iconv_open
call can be
almost arbitrary, there can be situations where the input buffer contains
valid characters, which have no identical representation in the output
character set. The behavior in this situation is undefined. The
current behavior of the GNU C library in this situation is to
return with an error immediately. This certainly is not the most
desirable solution; therefore, future versions will provide better ones,
but they are not yet finished.
If all input from the input buffer is successfully converted and stored
in the output buffer, the function returns the number of non-reversible
conversions performed. In all other cases the return value is
(size_t) -1
and errno
is set appropriately. In such cases
the value pointed to by inbytesleft is nonzero.
EILSEQ
- The conversion stopped because of an invalid byte sequence in the input.
After the call,
*
inbuf points at the first byte of the
invalid byte sequence.
E2BIG
- The conversion stopped because it ran out of space in the output buffer.
EINVAL
- The conversion stopped because of an incomplete byte sequence at the end
of the input buffer.
EBADF
- The cd argument is invalid.
The iconv
function was introduced in the XPG2 standard and is
declared in the iconv.h header.