tr - translate or delete characters


SYNOPSIS

       tr   [-cst]  [--complement]  [--squeeze-repeats]  [--trun-
       cate-set1] string1 string2
       tr {-s,--squeeze-repeats} [-c] [--complement] string1
       tr {-d,--delete} [-c] string1
       tr {-d,--delete}  {-s,--squeeze-repeats}  [-c]  [--comple-
       ment] string1 string2

       GNU tr also accepts the --help and --version options.


DESCRIPTION

       This  manual  page  documents  the  GNU version of tr.  tr
       copies the standard input to the standard output, perform-
       ing one of the following operations:

              o  translate, and optionally squeeze repeated
                 characters in the result
              o  squeeze repeated characters
              o  delete characters
              o  delete characters, then squeeze repeated
                 characters from the result

       The  string1  and  (if  given)  string2  arguments  define
       ordered sets of characters, referred to below as set1  and
       set2.   These sets are the characters of the input that tr
       operates on.  The --complement (-c) option  replaces  set1
       with its complement (all of the characters that are not in
       set1).
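
       As a small illustration of --complement, the following sketch
       keeps only digits and newlines by deleting everything outside
       set1.  It assumes a GNU tr on the PATH; the variable name and
       the sample input are made up for the example.

```shell
# -c replaces set1 ('0-9' plus newline) with its complement,
# so -d deletes every character that is NOT a digit or a newline.
digits=$(printf 'abc123def\n' | tr -cd '0-9\n')
echo "$digits"    # prints: 123
```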

   SPECIFYING SETS OF CHARACTERS
       The format of the string1 and string2 arguments  resembles
       the  format  of regular expressions; however, they are not
       regular expressions, only lists of characters.  Most char-
       acters  simply  represent themselves in these strings, but
       the strings can contain the shorthands listed  below,  for
       convenience.   Some of them can be used only in string1 or
       string2, as noted below.

       Backslash escapes.  A backslash followed  by  a  character
       not listed below causes an error message.

       \a     Control-G.

       \b     Control-H.

       \f     Control-L.

       \n     Control-J.

       \r     Control-M.

       \v     Control-K.

       \ooo   The character with the value given by ooo, which is
              1 to 3 octal digits.

       \\     A backslash.
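
       The escapes above might be used as in the following sketch.
       The strings are single-quoted so that the shell passes the
       backslashes through and tr interprets them itself; the
       variable names and inputs are illustrative only.

```shell
# '\011' is the octal escape for a horizontal tab (Control-I).
spaced=$(printf 'one\ttwo\n' | tr '\011' ' ')
echo "$spaced"     # prints: one two

# '\\' stands for a single backslash in string1.
slashes=$(printf 'C:\\dir\\file\n' | tr '\\' '/')
echo "$slashes"    # prints: C:/dir/file
```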

       Ranges.  The notation `m-n' expands to all of the  charac-
       ters  from m through n, in ascending order.  m should col-
       late before n; if it doesn't, an  error  results.   As  an
       example,  `0-9' is the same as `0123456789'.  Although GNU
       tr does not support the System V syntax that  uses  square
       brackets to enclose ranges, translations specified in that
       format will still work as long as the brackets in  string1
       correspond to identical brackets in string2.
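
       A short sketch of ranges, including the System V bracket form
       described above.  The sample inputs and variable names are
       assumptions made for the example.

```shell
# 'a-c' expands to 'abc' and 'x-z' to 'xyz'.
shifted=$(echo 'abc' | tr 'a-c' 'x-z')
echo "$shifted"    # prints: xyz

# ROT13 built from four ranges per set.
rot13=$(echo 'Hello' | tr 'a-zA-Z' 'n-za-mN-ZA-M')
echo "$rot13"      # prints: Uryyb

# System V style: the brackets are ordinary characters here, and the
# command works because '[' maps to '[' and ']' maps to ']'.
upper=$(echo 'hello [world]' | tr '[a-z]' '[A-Z]')
echo "$upper"      # prints: HELLO [WORLD]
```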

       Repeated  characters.   The  notation  `[c*n]'  in string2
       expands to n copies of character c.  Thus, `[y*6]' is  the
       same  as `yyyyyy'.  The notation `[c*]' in string2 expands
       to as many copies of c as are needed to make set2 as  long
       as  set1.   If  n  begins  with  a 0, it is interpreted in
       octal, otherwise in decimal.
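
       The repeat notations can be tried as follows.  This is a
       sketch with made-up inputs; the variable names are not part
       of tr.

```shell
# '[x*3]' expands to 'xxx', one x for each character of set1.
masked=$(echo 'abc' | tr 'abc' '[x*3]')
echo "$masked"    # prints: xxx

# '[x*]' expands to as many copies of x as set1 has characters.
filled=$(echo 'hello world' | tr 'a-z' '[x*]')
echo "$filled"    # prints: xxxxx xxxxx

# A count with a leading 0 is octal: 010 is eight copies.
octal=$(echo 'abcdefgh' | tr 'a-h' '[x*010]')
echo "$octal"     # prints: xxxxxxxx
```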

       Character classes.  The notation `[:class-name:]'  expands
       to  all  of the characters in the (predefined) class named
       class-name.  The characters expand in no particular order,
       except  for  the `upper' and `lower' classes, which expand
       in  ascending  order.   When   the   --delete   (-d)   and
       --squeeze-repeats (-s) options are both given, any charac-
       ter class can be used in  string2.   Otherwise,  only  the
       character  classes  `lower'  and  `upper'  are accepted in
       string2, and then  only  if  the  corresponding  character
       class  (`upper' and `lower', respectively) is specified in
       the same relative position in string1.  Doing this  speci-
       fies case conversion.  The class names are given below; an
       error results when an invalid class name is given.

       alnum  Letters and digits.

       alpha  Letters.

       blank  Horizontal whitespace.

       cntrl  Control characters.

       digit  Digits.

       graph  Printable characters, not including space.

       lower  Lowercase letters.

       print  Printable characters, including space.

       punct  Punctuation characters.

       space  Horizontal or vertical whitespace.

       upper  Uppercase letters.

       xdigit Hexadecimal digits.
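
       A few character classes in use, as a sketch.  The classes are
       quoted so that the shell does not treat the brackets as a
       glob pattern; inputs and variable names are illustrative.

```shell
# Case conversion: 'lower' in string1 paired with 'upper' in string2.
shout=$(echo 'Hello, World 42' | tr '[:lower:]' '[:upper:]')
echo "$shout"        # prints: HELLO, WORLD 42

# Any class may appear in string1, for example when deleting.
no_digits=$(echo 'a1b2c3' | tr -d '[:digit:]')
echo "$no_digits"    # prints: abc

# Squeeze runs of whitespace down to a single character.
tidy=$(echo 'a   b     c' | tr -s '[:space:]')
echo "$tidy"         # prints: a b c
```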

       Equivalence classes.  The syntax `[=c=]' expands to all of
       the characters that are equivalent to c, in no particular
       order.  Equivalence classes are intended to support
       non-English alphabets, but there seems to be no standard
       way to define them or determine their contents.  They are
       therefore not fully implemented in GNU tr: each character's
       equivalence class consists only of that character, so the
       construction currently has no practical use.

   TRANSLATING
       tr performs translation when string1 and string2 are  both
       given  and  the  --delete  (-d)  option  is not given.  tr
       translates each character of its input that is in set1  to
       the  corresponding  character  in set2.  Characters not in
       set1 are  passed  through  unchanged.   When  a  character
       appears more than once in set1 and the corresponding char-
       acters in set2 are not all the same, only the final one is
       used.  For example, these two commands are equivalent:
              tr aaa xyz
              tr a z
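
       The equivalence of those two commands can be checked with a
       sketch like this one, assuming GNU tr; the input word and the
       variable names are arbitrary.

```shell
# 'a' appears three times in set1; only the last mapping (a -> z) is used.
first=$(echo 'banana' | tr aaa xyz)
second=$(echo 'banana' | tr a z)
echo "$first"     # prints: bznznz
echo "$second"    # prints: bznznz
```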

       A  common  use of tr is to convert lowercase characters to
       uppercase.  This can be done in many ways.  Here are three
       of them:
              tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
              tr a-z A-Z
              tr '[:lower:]' '[:upper:]'

       When  tr  is  performing translation, set1 and set2 should
       normally have the same length.  If set1  is  shorter  than
       set2, the extra characters at the end of set2 are ignored.

       On the other hand, making set1 longer  than  set2  is  not
       portable;  POSIX.2  says that the result is undefined.  In
       this situation, the BSD tr pads set2 to the length of set1
       by  repeating  the last character of set2 as many times as
       necessary.  The System V tr truncates set1 to  the  length
       of set2.

       By default, GNU tr handles this case like the BSD tr does.
       When the --truncate-set1 (-t) option is given, GNU tr han-
       dles  this case like the System V tr instead.  This option
       is ignored for operations other than translation.

       Acting like System V tr in this case breaks the relatively
       common BSD idiom:
              tr -cs A-Za-z0-9 '\012'
       because it converts only zero bytes (the first element in
       the complement of set1), rather than all non-alphanumerics,
       to newlines.
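
       The difference between the default (BSD-style) padding and
       --truncate-set1 can be seen in this sketch.  It assumes GNU
       tr; other implementations may behave differently, since the
       case is undefined by POSIX.2.  Variable names and inputs are
       illustrative.

```shell
# Default: set2 'xy' is padded with its last character to 'xyyy'.
padded=$(echo 'abcd' | tr abcd xy)
echo "$padded"       # prints: xyyy

# -t: set1 is truncated to 'ab', so c and d pass through unchanged.
truncated=$(echo 'abcd' | tr -t abcd xy)
echo "$truncated"    # prints: xycd

# The BSD idiom relies on padding: every non-alphanumeric becomes a newline.
words=$(echo 'one, two;three' | tr -cs A-Za-z0-9 '\012')
echo "$words"        # prints one, two and three on separate lines

# With -t only the zero byte is translated, so this input is unchanged.
unsplit=$(echo 'one, two;three' | tr -cst A-Za-z0-9 '\012')
echo "$unsplit"      # prints: one, two;three
```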

   SQUEEZING REPEATS AND DELETING
       When  given  just the --delete (-d) option, tr removes any
       input characters that are in set1.

       When given just  the  --squeeze-repeats  (-s)  option,  tr
       replaces  each input sequence of a repeated character that
       is in set1 with a single occurrence of that character.

       When given both the  --delete  and  the  --squeeze-repeats
       options,  tr first performs any deletions using set1, then
       squeezes repeats from any remaining characters using set2.

       The  --squeeze-repeats option may also be used when trans-
       lating, in which case tr first performs translation,  then
       squeezes repeats from any remaining characters using set2.

       Here are some examples to illustrate various  combinations
       of options:

       Remove all zero bytes:
              tr -d '\000'

       Put  all  words on lines by themselves.  This converts all
       non-alphanumeric characters  to  newlines,  then  squeezes
       each string of repeated newlines into a single newline:
              tr -cs 'a-zA-Z0-9' '[\n*]'

       Convert  each  sequence  of  repeated newlines to a single
       newline:
              tr -s '\n'
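
       The four modes described in this section can be compared
       side by side with a sketch such as the following.  The
       inputs and variable names are invented for the example.

```shell
# -d alone: remove every l and o.
deleted=$(echo 'hello world' | tr -d 'lo')
echo "$deleted"       # prints: he wrd

# -s alone: squeeze runs of a and b; c is not in set1 and is untouched.
squeezed=$(echo 'aaabbbccc' | tr -s 'ab')
echo "$squeezed"      # prints: abccc

# -d and -s: delete set1 (a), then squeeze repeats of set2 (c).
both=$(echo 'aaabbbccc' | tr -ds 'a' 'c')
echo "$both"          # prints: bbbc

# -s while translating: translate a and b to x, then squeeze set2 (x).
translated=$(echo 'aaabbb' | tr -s 'ab' 'xx')
echo "$translated"    # prints: x
```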

       GNU tr also accepts the following options in any  combina-
       tion with the others.

       --help Print  a  usage message and exit with a status code
              indicating success.

       --version
              Print version information on standard  output  then
              exit.

   WARNING MESSAGES
       Setting the environment variable POSIXLY_CORRECT turns off
       several warning and error messages, for strict  compliance
       with  POSIX.2.  The messages normally occur in the follow-
       ing circumstances:

       1.  When the --delete option is given but --squeeze-repeats
       is not, and string2 is given, GNU tr by default prints a
       usage message and exits, because string2 would not be used.
       The POSIX specification says that string2 must be ignored in
       this case.  Silently ignoring arguments is a bad idea.

       2.  When an ambiguous octal escape is given.  For example,
       \400 is actually \40 followed by the digit 0, because  the
       value 400 octal does not fit into a single byte.
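
       The ambiguous octal escape can be observed with a sketch
       like the one below, assuming GNU tr.  Standard error is
       discarded so that only the translated output is captured;
       the exact wording of the warning is not shown here because
       it may vary between versions.

```shell
# '\400' is read as '\040' (a space) followed by the digit 0,
# so both the space and the 0 are deleted.
out=$(echo 'a b0c' | tr -d '\400' 2>/dev/null)
echo "$out"          # prints: abc

# Same result with POSIXLY_CORRECT set, which suppresses the warning.
out_posix=$(echo 'a b0c' | POSIXLY_CORRECT=1 tr -d '\400' 2>/dev/null)
echo "$out_posix"    # prints: abc
```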

       Note that GNU tr does not provide complete BSD or System V
       compatibility.  For example, there is no option to disable
       interpretation  of  the POSIX constructs [:alpha:], [=c=],
       and [c*10].  Also, GNU tr does not delete zero bytes auto-
       matically, unlike traditional UNIX versions, which provide
       no way to preserve zero bytes.


Release 1.1d7 of the Be OS