Samba HowTo Guide

Japanese Charsets

Setting up Japanese charsets is quite difficult. This is mainly because:

  • The Windows character set extends the original legacy Japanese standard (JIS X 0208), and the extensions are not standardized. As a result, an implementation that follows the standard strictly cannot support the full Windows character set.

  • Mainly for historical reasons, Japanese has several encoding methods that are not fully compatible with each other. The two major ones are the Shift_JIS series, used in Windows and some UNIXes, and the EUC-JP series, used in most UNIXes and Linux. Samba also used to offer its own encoding methods, named CAP and HEX, for interoperability with CAP/Netatalk and with UNIXes that cannot use Japanese filenames. Some implementations of the EUC-JP series cannot support the full Windows character set.

  • There are several code conversion tables between Unicode and the legacy Japanese character sets. One is compatible with Windows, another is based on the Unicode Consortium's reference, and others are mixed implementations. The Unicode Consortium does not officially define any conversion table between Unicode and legacy character sets, so there cannot be a single standard one.

  • The character sets and conversion tables available through iconv() depend on the iconv library that is installed. In addition, Japanese locale names may differ from system to system. This means that the values of the charset parameters depend on the iconv() implementation you are using.

    Although Windows uses the fixed-width 2-byte UCS-2 encoding internally, Shift_JIS series encodings are usually used in Japanese environments, just as ASCII is in English environments.
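
The first and third points can be made concrete with a small sketch. It uses Python's bundled codecs rather than iconv(), which is an assumption made purely for illustration: Python's "shift_jis" codec follows JIS X 0208 strictly, while "cp932" is the Windows variant. The codec names and behaviour of a given iconv library may well differ. The helper can_encode is a hypothetical name introduced for this example.

```python
# Illustration only: Python's built-in codecs stand in for an iconv library.
# "shift_jis" is the strict JIS X 0208 codec, "cp932" the Windows extension.


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# U+2460 CIRCLED DIGIT ONE is one of the characters Windows added
# on top of JIS X 0208.
circled_one = "\u2460"
windows_only = {
    codec: can_encode(circled_one, codec)
    for codec in ("cp932", "shift_jis", "euc_jp")
}

# The same Shift_JIS byte pair maps to different Unicode code points
# depending on which conversion table the codec uses.
raw = b"\x81\x60"
jis_table = raw.decode("shift_jis")  # U+301C WAVE DASH
win_table = raw.decode("cp932")      # U+FF5E FULLWIDTH TILDE

if __name__ == "__main__":
    print(windows_only)
    print(f"U+{ord(jis_table):04X} vs U+{ord(win_table):04X}")
```

A filename written through one table and read back through the other would no longer match byte for byte, which is why the client and server sides need to agree on the same conversion.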

  Published under the terms of the GNU General Public License