Mac OSX's UTF-8 Decomposed Normalization Form


The way the Mac OSX file system encodes file names causes interoperability problems with Linux, and probably with other operating systems too.

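The Mac OSX file system uses the UTF-8 character encoding for file names, as other modern operating systems such as Linux do. But unlike all other known systems, Mac OSX uses UTF-8 in NFD (Normalization Form D, the decomposed form), and cannot handle NFC (Normalization Form C, the precomposed form). In NFD an accented letter such as "ä" is stored as the base letter "a" followed by a separate combining mark, whereas in NFC it is stored as a single code point.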
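The difference between the two forms is easiest to see at the byte level. The following is a minimal sketch in Python using the standard unicodedata module; the file name "Bär.txt" and the helper name utf8_hex are made up for illustration.

```python
import unicodedata

# Hypothetical file name containing "ä" (U+00E4).
name = "B\u00e4r.txt"

# NFC keeps "ä" as one precomposed code point.
nfc_name = unicodedata.normalize("NFC", name)
# NFD splits it into "a" plus U+0308 (combining diaeresis).
nfd_name = unicodedata.normalize("NFD", name)


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


print(len(nfc_name), utf8_hex(nfc_name))  # 7 42 c3 a4 72 2e 74 78 74
print(len(nfd_name), utf8_hex(nfd_name))  # 8 42 61 cc 88 72 2e 74 78 74
print(nfc_name == nfd_name)               # False
```

Both strings look the same on screen, but they compare as different, which is why a file created under one form may not be found under the other.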

This can cause strange problems when moving files back and forth between Mac OSX and Linux, especially when tar is involved.

A well-suited tool for converting the encoding of file names, directories, and even whole file systems is convmv.
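Before converting anything, it can help to see which names are actually affected. The sketch below is independent of convmv and only reports; it renames nothing. It walks a directory tree and lists every entry whose name is not in NFC. The function names is_nfc and find_non_nfc_names are hypothetical, chosen for this example.

```python
import os
import unicodedata


def is_nfc(name):
    """Return True if name is already in Normalization Form C."""
    return unicodedata.normalize("NFC", name) == name


def find_non_nfc_names(root):
    """Return the paths below root whose last component is not in NFC."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                hits.append(os.path.join(dirpath, name))
    return hits


if __name__ == "__main__":
    # Report only; the current directory is used as an example root.
    for path in find_non_nfc_names("."):
        print(path)
```

Run on the Linux side after unpacking an archive from a Mac, an empty result suggests the names are already precomposed; any path printed is a candidate for conversion.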

A short description of the program can be found on the Project page of convmv at Freshmeat.

This site maintained by:
lukas.zimmermann@unibas.ch
last updated: 2006-12-08