Table of Contents

Name

tabmath - general table manipulator

Synopsis

tabmath in=table-file out=table-file [parameter=value] ...

Description

tabmath has a few simple functions to modify a file in tabular format, similar to a spreadsheet calculator. It can handle an arbitrary number of files, though the total number of columns (old and new) is limited (currently 256). Column operations are most common, although a limited number of row operations are also possible.

New columns are computed with the keyword newcol, and needs a functional description of the new columns, usually in terms of previous columns. Any previous columns (including the ones just created) must be referenced by number through an escape (%) command. e.g. newcol=sqrt(%1) creates a new column which is the sqrt of the first one. Valid column references are %1 and up. Column reference to 0 (i.e. %0) is interpreted as the line number (1 and up, excluding comment lines). Although the escape command can also be a dollar ($) sign, this creates some discomfort in a Unix command line environment.

The keyword delcol has to be used to delete computed columns, the word all can be used to delete all original input columns.

Rows can also be deleted/selected based on a function evaluation of the current column values. For this the selfie= keyword can be used. It uses the same nemoinp(3NEMO) syntax as those for the new column expressions. It must evaluate to exactly 0 in order for a row not to be written. For example "selfie=iflt(%1,3,1,0)" would only output rows in which the first column is less than 3.

Tablefiles are simple ascii files in a columnar format. Lines which start with a # symbol are interpreted as comment lines and are skipped.

Parameters

The following parameters are recognized in any order if the keyword is also given:
in=in-file
input (ascii) file, a table of values from which data is taken. Multiple files must be separated by comma’s. [no default].
out=out-file
Output filename where the output data are written [no default].
newcol=expression
Expression with which the newly added column(s) will be calculated. Parsing will be done by fie(3NEMO) , see the man pages for specific rules. [no default].
delcol=expression
expression is an integer range to denote the columns not to be copied to output, e.g. 3,5:10,14 would select columns 3, 5 through 10, and 14. The word all may also be used, in which case all columns from the input file(s) are deleted, and only [Default: not used, i.e. all columns, old and new, are output].
selfie=expression
Expression which must evaluate to 0 if a row is not to be output.
format=expression
A (printf(3) ) format specification with which the new columns are written to output [%g].
seed=seed
Integer initial seed in case random numbers have been used in the expression. If 0 is given, the time of the day will be used (see xrandom(3NEMO) for other special seed values) to initialize the random number generator. [Default: 0].
comments=f|t
Should comments be passed through to the output, or discarded. By default comments are ignored, but with comments=t they can be output unmodified also . Be careful with multiple input files that do not have the same amount of comment lines.

Example

To use tabmath as a function calculator, uses stdin (-) and stdout (-) as filenames for input and output, e.g.
tabmath - - "sqrt(%1)" allb
Then for every number you input, it returns the function value.

Here are two examples how to create a table from scratch:

    nemoinp 1:100:2 newline=t > odd.tab
    nemoinp 1::50   newline=t > one.tab
will contain a table with all odd numbers from 1 through 99, and
another table with all 1s. Both tables have 50 rows.

Most table programs in NEMO only operate on columns. 
No functions exist yet to operate
on rows, other than selecting rows for output (see selfie=).
As an example to look at a running derivative between two columns,
awk(1) can be used as follows:

% nemoinp 1:10 | tabmath - - ’%1*2+rang(0,0.1)’ seed=1 | diffs.awk
2 3.94938 1 2.08013 1.5 1.86925
3 6.07426 2 3.94938 2.5 2.12488
4 8.14975 3 6.07426 3.5 2.07549
5 9.96586 4 8.14975 4.5 1.81611
6 11.8817 5 9.96586 5.5 1.91584
7 13.9754 6 11.8817 6.5 2.0937
8 16.0503 7 13.9754 7.5 2.0749
9 17.9156 8 16.0503 8.5 1.8653
10 19.8898 9 17.9156 9.5 1.9742
% cat diffs.awk
#! /bin/awk -f
#
{
  if (NR > 1) {
    print $1, $2, xold, yold, ($1+xold)/2, ($2-yold)/($1-xold);
  }
  xold = $1;
  yold = $2;
}

Here is an example to select only rows from a table where column 4 is between values $a and $b:

   tabmath file - selfie="ifgt(%4,$a,iflt(%4,$b,1,0),0)"

Common Expressions

Although fie(3NEMO) expressions are quite rich, some common boolean constructions are hard to write down (ever written complex SQL expressions?). So, here are some common ones. We will use a notation where P1, P2, .. refer to the parameters 1,2,..., that match %1,%2,... in fie notation, and V1,V2,... to some arbitrary values (numbers), and R1 and R2 to the two return values depending if the boolean expression is true (R1) or false (R2):
(P1>V1) ? R1 : R2           ifgt(%1,V1,R1,R2)
(P1>V1 && P2>V2) ? R1 : R2    ifgt(%1,V1,ifgt(%2,V2,R1,R2),R2)
(V1<P1<V2) ? R1 : R2            ifgt(%1,V1,iflt(%1,V2,R1,R2),R2)

See Also

colrm(1) , expand(1) , awk(1) , cut(1) , paste(1) , tablines(1NEMO) , tabtranspose(1NEMO) , tabplot(1NEMO) , table(5NEMO)

jdb (http://www.isi.edu/~johnh/SOFTWARE/JDB/): a package of commands for manipulating flat-ASCII databases from shell scripts.

msort - http://billposer.org/Software/msort.html - sophisticated sorting program

starbas - http://cfa-www.harvard.edu/~john/starbase/starbase.html - an ASCII relational database for UNIX.

Author

Peter Teuben

Bugs

Parsing of newcol expression is done by nemoinp(3NEMO) , and is space sensitive. Parsing allows more than one value, e.g. "1 _+ 1" is interpreted as two values, instead of the number 2. "1+1" would be interpreted as 2. Hence multiple columns (with probably the wrong value) would be created in the first example.

If comments are passed on, processing multiple input files which do not have matching comment lines will cause the output to have odd lines.

Complex expressions where one component fails, fails the whole expression. An example is adding two gaussians with different widths. The one with the narrow gaussian component will cause an internal failure (exp() argument overflow). For example, the following expression to add two gaussians

fie="$a1+$b1*exp(-(%1-$c1)**2/(2*$d1**2)) + 
     $a2+$b2*exp(-(%1-$c2)**2/(2*$d2**2)) + 
     rang(0,$sig)"

can only be reliably be done as follows


nemoinp $x |
   tabmath - - "$b1*exp(-(%1-$c1)**2/(2*$d1**2))" |
   tabmath - - "$b2*exp(-(%1-$c2)**2/(2*$d2**2))" |
   tabmath - - "%1,%2+%3+$a1+$a2+rang(0,$sig)" all

Update History


18-May-88    V1.0 created    PJT
1-Jun-88    V1.1 name changed nemotable->tabmath    PJT
xx-jun-88    V1.2 added stride keyword    PJT
23-aug-88    V1.3 added in2 keyword, removed stride bug    PJT
27-oct-88    V1.4 multiple new columns and %0 reference allowed    PJT
10-nov-88    V1.5 allow tab;s also as column separators    PJT
18-feb-92    V2.0 turbospeed parsing now done by fie()    PJT
13-jun-98    V3.0 deleted stride/skip keywords, added selfie=    PJT
24-feb-00    document improved    PJT/VS
18-apr-01    V3.1 added comments=    PJT


Table of Contents