
Name

gaussfit - Least Squares and Robust Estimation

Synopsis

gaussfit [parameter=value]

Description

gaussfit fits an arbitrary parameterized model to observed data, using least squares and robust estimation. This program is a slight NEMO adaptation of the University of Texas program gaussfit.

This version of gaussfit can be run in two modes. In the first mode, an existing model file and environment file are given, which together contain all the initial conditions needed to run the program. In this mode the program runs exactly as the original gaussfit program does. In the second mode, the environment file is either created or modified from the remaining program keywords (in NEMO style). This mode is triggered by using any of those remaining parameters, provided the combination is a valid one. This approach will not work for all gaussfit applications, but it is fine for small problems with a simple set of parameters.

This manual page merely serves as a reminder for the NEMO mode, which is somewhat more efficient for running large numbers of fits. It is highly recommended to read the GaussFit User Guide (see below).

The datafiles used by gaussfit must be in a special (MIDAS-like) table format. Two useful tools that come with the gaussfit distribution are mkgf(1gaussfit) and cjoin(1gaussfit). mkgf interactively adds a gaussfit header to files that have data in columnar format. cjoin strips off leading lines, extracts columns from different datafiles, and puts them into a single file. See also tabs(1NEMO) for converting ASCII tables to MIDAS format within NEMO.

Parameters

The following parameters are recognized in any order if the keyword is also given:
model=model_file
The input model filename, which contains the mathematical description of the model (see EXAMPLE below) to be used to fit the data, in a C-like programming language. Within NEMO the model files can also be located anywhere in the GAUSSPATH (unix) environment variable. No default.
env=env_file
The input environment filename. Initial parameters and relevant datafiles are described in this file, which is a list of FITS-header formatted keyword=value pairs. This environment file can either be created from scratch (see the remaining parameters below), or an existing one can be used. If left blank, the name is derived from model_file by replacing its extension with ’env’.
data=data1,data2,...
Name of the datafile(s) in MIDAS format. Up to 99 datafiles can be used. See tabs(1NEMO) for how to format files this way. No default.
params=p1,v1,p2,v2,...
Pairs of parameter names and values, separated by commas. If this keyword is used to update an environment file, you only need to specify the pairs that differ; the first time around, you need to supply initial values for all of the parameters in the model file. This parameter list translates into a parameter file, fed to gaussfit, whose name is derived by replacing the model_file extension with
fair=
The asymptotic relative efficiency (ARE) used by this Huber-type robust estimation method. Select only one of the fair, huber, tukey and minsum keywords. For fair, huber and tukey the value should be in the range 0.8 - 0.95; the authors suggest 0.9 or 0.95. Note that fair or huber ~ 0.8 approaches the minsum method. If none of these four is selected, standard least squares is used.
huber=
see above.
tukey=
see above.
minsum=
Median-type estimator, using the Barrodale & Roberts algorithm. Probably not suitable when there is more than one observation per equation of condition. Set to 0 or 1. See also above.
orm=
Use the orthogonal regression M-estimate, which measures goodness-of-fit orthogonally to the fitted function. True by default; use this keyword to turn it off. It is recommended if fair or tukey is used, and it *must* be used for huber.
lambda=
Marquardt-Levenberg parameter. To activate, set to a non-zero value. Typically 0.0001.
factor=
Marquardt-Levenberg factor by which lambda is decreased when a new chi-squared is less than the old one. To activate, set to a non-zero value. Typically 0.1.
irls=
Two methods of iteration are provided: Newton’s method and the method of iteratively reweighted least squares (IRLS). Set this parameter to true if you want to use IRLS; the default is Newton. Note: variances cannot be computed in IRLS mode. See also double= below.
double=
Two styles of iteration are provided: single or double iteration.
iters=
Maximum number of iterations allowed. If not given, the default is 10.
tol=
Relative tolerance.
triang=
Set this to true if you want to attempt to triangularize the matrix of conditions (this keeps the size of the matrix down).
prmat=
Set this to true if you want to see intermediate matrices.
prvar=
Set this to true if you want to see the correlation matrix and standard deviations (sigmas) of the parameters. Files results_file.corr and results_file.cov will be created. Covariances of the parameters are listed in the parameter file.
results=results_file
Output logfile for results. If not given, the name is derived from the model file name by replacing its extension with log.
ftn=
Set this to turn on debugging output to a file FTN, which contains a dump of the function interpreter. By default it’s off.
scale=
Scale parameter that gaussfit computes if a robust estimation was employed. Exact meaning to be published. Written to the environment file on output.
sigma=
Variance of unit weight of the solution. Written to the environment file on output.
reset=t|f
If set, all input files will be reset, and overwriting will be allowed. This applies to the parameter and environment files. Only useful when the program is run in NEMO mode. Default: f.
report=t|f
If set, the parameter file is read at the end of the run, and the parameters and their errors (if available) are reported, one per line.
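
As an illustration of how the keywords above combine in NEMO mode, the following Python sketch assembles a command line and derives the default environment and results file names. It is not part of gaussfit. The file names linear.model and test.tab, the helper names, and the assumption that the derived names take the form stem.env and stem.log are hypothetical; only the keywords themselves come from this page.

```python
import os
import shlex


def derived_names(model_file):
    """Return (env_file, results_file) defaults for a model file.

    Assumption: the model extension is replaced by 'env' and 'log',
    giving names of the form stem.env and stem.log.
    """
    stem, _ext = os.path.splitext(model_file)
    return stem + ".env", stem + ".log"


def build_command(model, data, params=None, **keywords):
    """Assemble a NEMO-style 'gaussfit keyword=value ...' argument list.

    model    -- model file name (model= keyword)
    data     -- list of datafile names (data= keyword)
    params   -- dict of parameter name -> initial value (params= keyword)
    keywords -- any further keyword=value pairs, e.g. fair=0.9
    """
    args = ["gaussfit", "model=" + model, "data=" + ",".join(data)]
    if params:
        pairs = []
        for name, value in params.items():
            # params= expects name,value,name,value,...
            pairs += [name, str(value)]
        args.append("params=" + ",".join(pairs))
    for key, value in keywords.items():
        args.append(f"{key}={value}")
    return args


def command_line(args):
    """Quote the argument list for display in a shell."""
    return " ".join(shlex.quote(a) for a in args)
```

For example, build_command("linear.model", ["test.tab"], {"a": 1.0, "b": 2.0}, fair=0.9) yields an argument list that can be passed to subprocess.run, and command_line() renders it for inspection.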

Model File

Here is a sample model file for the linear least squares problem (y=a+bx, with no errors in x):
parameter a, b;
observation y;
data x;
main()
{
        while(import())
                export(y - (a + b*x));
}
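
To make the meaning of the exported equation of condition concrete, here is a small Python sketch that minimizes the sum of squares of y - (a + b*x) in closed form. It only illustrates the fitting problem the model file describes; it is not gaussfit's own algorithm, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y - (a + b*x))**2) over all rows."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and cross term with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(a, b, xs, ys):
    """Value of the equation of condition y - (a + b*x) for each row."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]
```

With data lying exactly on a line, the residuals returned for the fitted a and b are all zero, which is the condition the model file asks the fit to approach.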

Data

Input datafiles are in MIDAS format: columns separated by a TAB, the first row containing the names of the columns, the second row their types, and subsequent rows form the data.

The names of the columns must correspond to the observation and data variables in the model file. Variances and covariances can be given in columns named x_x and x_y (or y_x), respectively.

An example of how to create a MIDAS table from an ASCII table that contains 4 columns:

    % tabs in=test.table out=test.tab col=x,y,x_x,y_y
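
The layout described in this section (a row of column names, a row of types, then the data, all TAB-separated) can also be written directly. The Python sketch below is a hedged illustration only: the type keyword "double" is an assumption and should be checked against the GaussFit User Guide, and the function names are hypothetical.

```python
import io


def write_table(stream, columns, rows, col_type="double"):
    """Write a TAB-separated table: names row, types row, data rows.

    col_type is an assumed type keyword, applied to every column.
    """
    names = list(columns)
    for row in rows:
        if len(row) != len(names):
            raise ValueError("row length does not match number of columns")
    stream.write("\t".join(names) + "\n")
    stream.write("\t".join([col_type] * len(names)) + "\n")
    for row in rows:
        stream.write("\t".join(repr(float(v)) for v in row) + "\n")


def table_text(columns, rows):
    """Return the table as a string, convenient for inspection."""
    buf = io.StringIO()
    write_table(buf, columns, rows)
    return buf.getvalue()
```

Column names such as x, y, x_x and y_y would be passed as the columns argument, matching the variables declared in the model file.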

Deficiencies

The NEMO approach cannot easily handle cases where the parameters are indexed.

Fixing parameters can only be done by editing the model file, either by changing a parameter into a constant (this keeps the parameter in the parameter file, or in the params= keyword), or by adding constraints to the model file. Examples are in the manual.

See Also

linreg(1NEMO) , tablsqfit(1NEMO) , tabs(1NEMO) , mkgf(1gaussfit) , cjoin(1gaussfit)

GaussFit: A System of Least Squares and Robust Estimation, USERS MANUAL, by William H. Jefferys, Michael J. Fitzpatrick, Barbara E. McArthur, and James E. McCartney. University of Texas at Austin.

Files


$NEMO/usr/tools/gaussfit/     V3.04 release
$NEMODAT/gaussfit             repository of some example model files
ftp://clyde.as.utexas.edu/pub/gaussfit/      official (anonymous ftp) release

Environment

The following UNIX environment variables are used by gaussfit:
GAUSSFIT      colon separated list of directories searched for model files

Author


Barbara McArthur (mca@astro.as.utexas.edu)
        Source Code Copyright (C) 1987 by William H. Jefferys,
        Michael J. Fitzpatrick and Barbara E. McArthur
        All Rights Reserved.
Peter Teuben (this NEMO interface)

Update History


11-aug-92    first nemo version    pjt
xx-apr-94    v3.04 new improved compiler, wobble fix, more env,...     mca
14-Jul-94    (nemo) added second mode option to run gaussfit      pjt
26-may-96    installed the 3.53 version (sep 95)    pjt

