- Filters (UNIX system/DOS formats)
The following filters let you convert text from DOS style to UNIX system style and vice versa, or convert a file to other formats. Note that many modern text editors can also do this for you.
- Why use filters?
UNIX systems and Microsoft systems use two different standards to represent the end of a line in an ASCII text file.
This can sometimes cause problems in editors or viewers that are not familiar with the other operating system's end-of-line style. The following tools let you get around this difference.
- What's the difference?
The difference is very simple. In a Windows text file, the end of a line is signalled by a carriage return followed by a newline ('\r\n').
In a UNIX system text file, the end of a line is simply a newline ('\n').
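
To see the difference for yourself, you can dump the bytes of a file. The following is a minimal sketch using only standard tools (printf, od, grep); the file names dos_sample.txt and unix_sample.txt are made up for illustration.

```shell
# Create two tiny sample files (hypothetical names), one per line-ending style.
printf 'hello\r\n' > dos_sample.txt
printf 'hello\n' > unix_sample.txt

# od -c prints every byte; the Windows-style file shows \r just before \n.
od -c dos_sample.txt
od -c unix_sample.txt

# Count the lines that end in a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" dos_sample.txt)
echo "lines ending in CR: $crlf_lines"
```

The Windows-style sample is one byte longer per line than the UNIX system sample, because of the extra carriage return.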
- dos2unix: Converts Microsoft-style end-of-line characters to UNIX system style end-of-line characters.
Simply type (where file.txt stands for the file to convert):
dos2unix file.txt
- fromdos: Does the same as dos2unix (above).
Simply type (where file.txt stands for the file to convert):
fromdos file.txt
fromdos can be obtained from the from/to dos website.
- unix2dos: Converts UNIX system style end-of-line characters to Microsoft-style end-of-line characters.
Simply type (where file.txt stands for the file to convert):
unix2dos file.txt
- todos: Does the same as unix2dos (above).
Simply type (where file.txt stands for the file to convert):
todos file.txt
todos can be obtained from the from/to dos website.
- antiword: This filter converts Microsoft Word documents into plain ASCII text documents.
Simply type (where file.doc stands for the Word document):
antiword file.doc
You can get antiword from the antiword homepage.
- recode: Converts text files between various formats, including HTML and dozens of different text encodings.
Use recode -l for a full listing. It can also be used to convert text between Windows and UNIX system formats (so you don't get the weird symbols).
Warning: By default recode overwrites the input file. Use '<' to run recode as a filter only, so that the file is not overwritten.
UNIX system text to Windows text:
recode ..pc file_name
Windows text to UNIX system text:
recode pc.. file_name
UNIX system text to Windows text without overwriting the original file (creating a new output file instead):
recode ..pc < file_name > recoded_file
- tr: (Windows to UNIX system style conversion only.) While tr is not specifically designed to convert files from Windows format to UNIX system format, it can do so by deleting the carriage returns:
tr -d '\r' < inputFile.txt > outputFile.txt
The -d switch tells tr to delete every occurrence of the listed characters. Since the only character listed is '\r' (the carriage return), tr removes every carriage return it finds, turning the file into a UNIX system text file.
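
The effect of the tr command above can be checked on a small sample. This is a sketch under assumptions: the sample contents and the names expected.txt and roundtrip.txt are invented, and the awk line for the opposite direction is an alternative technique that is not part of the tool list above. It is shown only because tr -d can remove characters but cannot add them.

```shell
# Build a two-line Windows-style sample (contents are made up).
printf 'one\r\ntwo\r\n' > inputFile.txt

# Delete every carriage return, as described above.
tr -d '\r' < inputFile.txt > outputFile.txt

# Build the expected UNIX-style file and compare byte for byte.
printf 'one\ntwo\n' > expected.txt
cmp outputFile.txt expected.txt && echo "conversion OK"

# Opposite direction (assumption: plain awk, not one of the tools listed above):
# print each line followed by an explicit carriage return and newline.
awk '{ printf "%s\r\n", $0 }' outputFile.txt > roundtrip.txt
cmp roundtrip.txt inputFile.txt && echo "round trip OK"
```

Note that tr -d '\r' removes every carriage return in the file, including any that appear in the middle of a line, so it is best suited to ordinary text files.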