11.5. Text Conversion/Filter Tools

Filters (UNIX System/dos formats)

The following filters allow you to change text from DOS-style to UNIX system style and vice versa, or to convert a file to other formats. Note that many modern text editors can also do this for you.

Why use filters?

Because UNIX systems and Microsoft Windows use two different standards to represent the end of a line in an ASCII text file.

This can sometimes cause problems in editors or viewers that aren't familiar with the other operating system's end-of-line style. The following tools allow you to get around this difference.

What's the difference?

The difference is very simple: in a Windows text file, the end of a line is signalled by a carriage return followed by a newline, '\r\n' in ASCII.

On a UNIX system, the end of a line is simply a newline, '\n' in ASCII.
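To see the difference for yourself, you can dump the bytes of a line written in each style. This is only a minimal sketch: it assumes a POSIX shell with printf and od available, and the helper name show_bytes is made up for illustration.

```shell
# show_bytes is a hypothetical helper: it prints each byte of its input
# as a character or a backslash escape, without od's offset column.
show_bytes() {
    od -An -c
}

# A Windows-style line ends in a carriage return plus a newline.
printf 'hello\r\n' | show_bytes

# A UNIX-style line ends in a newline only.
printf 'hello\n' | show_bytes
```

The first dump should end with the two escapes \r and \n, while the second should end with \n alone.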

dos2unix This converts Microsoft-style end-of-line characters to UNIX system style end of line characters.

Simply type:

dos2unix file.txt

fromdos This does the same as dos2unix (above).

Simply type:

fromdos file.txt

fromdos can be obtained from the from/to dos website.

unix2dos This converts UNIX system style end-of-line characters to Microsoft-style end-of-line characters.

Simply type:

unix2dos file.txt

todos This does the same as unix2dos (above).

Simply type:

todos file.txt

todos can be obtained from the from/to dos website.
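If neither unix2dos nor todos is installed, the same conversion can be approximated with standard tools. The following is a minimal sketch rather than a replacement for those programs: it assumes a POSIX awk, and the function name to_dos and the file names are made up for illustration.

```shell
# to_dos is a hypothetical helper: it reads text on standard input and
# writes it to standard output with every line ending in "\r\n".
to_dos() {
    awk '{
        sub(/\r$/, "")          # drop an existing carriage return, if any
        printf "%s\r\n", $0     # re-emit the line with a DOS line ending
    }'
}

# Example: convert a small UNIX-style file without overwriting it.
printf 'first line\nsecond line\n' > unix_file.txt
to_dos < unix_file.txt > dos_file.txt
```

Because any existing carriage return is stripped first, running the helper on a file that is already in Windows format should not double up the line endings.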

antiword This filter converts Microsoft Word documents into plain ASCII text documents.

Simply type:

antiword file.doc

You can get antiword from the antiword homepage.

recode Converts text files between various formats, including HTML and dozens of different text encodings.

Use recode -l for a full listing. It can also be used to convert text to and from Windows and UNIX system formats (so you don't get the weird symbols).

Warning

By default recode overwrites the input file. Use '<' to run recode as a filter only (so that the file is not overwritten).

Examples:

 

UNIX system text to Windows text:

recode ..pc file_name

Windows text to UNIX system text:

recode pc.. file_name

UNIX system Text to Windows Text without overwriting the original file (and creating a new output file):

recode ..pc < file_name > recoded_file

tr (Windows to UNIX system style conversion only). While tr is not specifically designed for this task, it can convert files from Windows format to UNIX system format by doing:

tr -d '\r' < inputFile.txt > outputFile.txt

The -d switch tells tr to delete every occurrence of the listed characters. Since the character listed here is '\r' (the carriage return), tr removes every carriage return it finds, making the file a UNIX system text file.
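Before and after such a conversion it can be useful to check whether a file still contains carriage returns. This is a minimal sketch, assuming a POSIX shell with grep and tr; the function name has_cr is made up for illustration, and the sample file is created only for the demonstration.

```shell
# has_cr is a hypothetical helper: it succeeds (exit status 0) if the
# named file contains at least one carriage return.
has_cr() {
    cr=$(printf '\r')            # a literal carriage return character
    grep -q "$cr" "$1"
}

# Build a Windows-style sample file, then strip the carriage returns.
printf 'one\r\ntwo\r\n' > inputFile.txt
tr -d '\r' < inputFile.txt > outputFile.txt

has_cr inputFile.txt  && echo "inputFile.txt: Windows-style line endings"
has_cr outputFile.txt || echo "outputFile.txt: no carriage returns left"
```

Note that this check only looks for the carriage return character anywhere in the file; it does not verify that each one sits directly before a newline.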

11.5.1. Conversion tools

enscript Converts text files to PostScript, RTF, or HTML (use ghostview to view the PostScript file). enscript has a large number of options which can be used to customise the output.

Examples:[1]

enscript --language=html input_file.txt -o output_file.html

This will take the input file and output it as an HTML file.

enscript --help-highlight

Displays help on using the highlighting feature (lists all the different types of highlighting available).

-E[lang]

Highlights the source using the language lang (pretty-printing). Example:

enscript -E --color --language=html --toc -pfoo.html *.h *.c 

Adds all the files ending in .h and .c (C header and source files) to a file called foo.html, using colour and adding a table of contents.

For further options, refer to the well-written manual page of enscript.

figlet Used to create ASCII "art". Figlet can create several different forms (fonts) of ASCII art; it's one of the more unusual programs around.

Notes

[1]

These examples are based on information from the enscript manual page. See [12] in the Bibliography for further information.

 
 
Published under the terms of the GNU General Public License