2008-11-12

UNIX Tip: the tr command, newlines and more

tr is used to translate characters. It can translate all uppercase letters to lowercase, delete every occurrence of a given character, and much more.

To turn all uppercase letters lowercase (a useful thing when parsing user input from a shell script), use this:

bash-3.2$ echo This Is CrAzY! | tr 'A-Z' 'a-z'
this is crazy!
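
A range like A-Z can behave differently depending on the locale, so here is a minimal sketch of the same conversion using the POSIX character classes [:upper:] and [:lower:] instead. The variable names input and lowered are only illustrative and are not part of the original tip.

```shell
# Sketch: lowercase a string with POSIX character classes rather than A-Z ranges.
# "input" and "lowered" are hypothetical names chosen for this example.
input='This Is CrAzY!'

# Quote the classes so the shell does not treat the brackets as a glob pattern.
lowered=$(printf '%s\n' "$input" | tr '[:upper:]' '[:lower:]')

printf '%s\n' "$lowered"
```

Capturing the result in a variable like this is handy in a script when you want to compare user input case-insensitively.
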
Using tr -d, you can remove any characters from a string as well. Here, I remove several punctuation marks:
bash-3.2$ echo "You're right. That *is* crazy." | tr -d \'\*\!\?\.
Youre right That is crazy
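
If you would rather not list each punctuation mark by hand, tr -d also accepts the POSIX class [:punct:]. The sketch below assumes you are happy to drop every punctuation character, which is broader than the five characters deleted above; sentence and stripped are illustrative names.

```shell
# Sketch: delete all punctuation using the POSIX [:punct:] class.
# Note: this removes every punctuation character, not just ' * ! ? and .
sentence="You're right. That *is* crazy."

stripped=$(printf '%s\n' "$sentence" | tr -d '[:punct:]')

printf '%s\n' "$stripped"
```
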

One of the best ways to use tr, though, is for converting newline characters to something useful.

When running shell commands or writing shell scripts, most people can quickly adapt to using commands such as cut or awk to manipulate tabular data. Newlines make things trickier. Let's say we have a text file with the following values, one per line, and we want this data represented in CSV format, all on one line:
0.00
128.00
49.00
7031
4218
tr can tackle it easily, using "\n" to represent the newline character. Just translate newlines to commas, and you're good to go:
bash-3.2$ cat file | tr "\n" ","
0.00,128.00,49.00,7031,4218,
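
Notice the trailing comma: the file's final newline gets translated too. One way to trim it, sketched here, is to capture the tr output in a variable and strip the last comma with the shell's ${var%pattern} expansion. The printf call simply stands in for the example file, and csv is an illustrative variable name.

```shell
# Sketch: join lines with commas, then drop the trailing comma.
# printf stands in for the example file; "csv" is a hypothetical name.
csv=$(printf '0.00\n128.00\n49.00\n7031\n4218\n' | tr '\n' ',')

# ${csv%,} removes a single trailing comma, if one is present.
csv=${csv%,}

printf '%s\n' "$csv"
```

This keeps tr doing the actual conversion and leaves the cleanup to the shell, so no extra tools are involved.
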
Similarly, you can take a large chunk of text and replace every newline with a space to undo old-school, hard-wrapped text-editor formatting before importing it into a word processor.
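
As a rough sketch of that idea, the snippet below joins three hard-wrapped lines into one. The sample text and the variable name unwrapped are made up for illustration. Keep in mind that every newline is replaced, so blank lines between paragraphs would be flattened as well.

```shell
# Sketch: turn hard-wrapped lines into a single line by mapping newline to space.
# The sample text and the name "unwrapped" are hypothetical.
unwrapped=$(printf 'These lines were\nhard-wrapped by\nan old editor.\n' | tr '\n' ' ')

# The final newline also becomes a space, so the result ends with one.
printf '[%s]\n' "$unwrapped"
```
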

tr is a somewhat obscure command, but it can save a lot of headaches, particularly when trying to format text output. The same result can be achieved with awk or sed, albeit with much more complex commands, so tr is a good one to know about. It's found on almost every POSIX-compliant platform (and in Cygwin).
