
Name

filestruct - binary structured file format

Synopsis


#include <stdinc.h>
#include "filesecret.h"         /* only for local code */
#include <filestruct.h>

Format


 struct _disk_item {        /* Note: DOES NOT EXIST!! */
    short int magic;
    char *tag;         /* zero terminated */
    char *type;        /* zero terminated */
    int  *dim;         /* 4 byte, zero terminated, optional */
    char *data;        /* optional */
 };
 struct item {
    string itemtyp;
    int    itemlen;
    string itemtag;
    int   *itemdim;
    byte  *itemdat;
    long   itempos;
 };

 struct strstk {
    stream  ss_str;
    item   *ss_stk[SetStkLen];
    int     ss_stp;
    bool    ss_seek;
    long    ss_pos;            /* only if RANDOM access allowed */
    itemptr ss_ran;            /* only if RANDOM access allowed */
 };

Description

filestruct is a method for storing data files that consist largely of character strings, booleans, and possibly multi-dimensional arrays of data. The data may be structured hierarchically, similar to the Unix directory structure, although the essentially linear nature of the storage media imposes additional constraints.

In modern parlance, the format resembles XML, except that the data are stored in binary form. Libraries such as netcdf(3) and HDF5 share ideas with this NEMO library. When NEMO was developed, in 1986, these libraries were not yet available.

The external data format (visualized by the otherwise non-existent struct _disk_item) is a tagged list of data items. Each item contains a magic number (which also encodes whether the item is a single number or an array), a tag name for identification and search purposes, a data type, and optionally a dimension array followed by a contiguous data stream. Certain data types (such as sets) have no dimension or data associated with them, but merely aid in structuring the data in hierarchical sets.

The optional integer array *dim is zero terminated; its presence depends on the value of the magic header number. As noted above, if the data type is a set or tes (see below), no data is present either.
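
The on-disk layout described above can be illustrated with a minimal sketch that serializes one item into a memory buffer. This is not the NEMO implementation: the function encode_item, the helper put_bytes, the two SKETCH_MAGIC_* values, and the type string "f" used in the usage note are all assumptions made for illustration. The real magic numbers are defined in filesecret.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic values; the real ones are defined in filesecret.h. */
#define SKETCH_MAGIC_SINGLE 0x1101  /* item has no dim array */
#define SKETCH_MAGIC_PLURAL 0x1102  /* item carries a zero-terminated dim array */

/* Append n bytes to buf at offset *off; returns 0 if they do not fit. */
static int put_bytes(unsigned char *buf, size_t cap, size_t *off,
                     const void *src, size_t n)
{
    if (n > cap - *off)
        return 0;
    if (n > 0)
        memcpy(buf + *off, src, n);
    *off += n;
    return 1;
}

/*
 * Serialize one item as: magic, tag, type, [dim], [data].
 * dim is NULL or a zero-terminated array of 4-byte integers;
 * data may be NULL (e.g. for a set).  All values are written in
 * host byte order.  Returns the number of bytes written, or 0 if
 * the buffer is too small.
 */
size_t encode_item(unsigned char *buf, size_t cap,
                   const char *tag, const char *type,
                   const int32_t *dim,
                   const void *data, size_t nbytes)
{
    size_t off = 0;
    int16_t magic = dim ? SKETCH_MAGIC_PLURAL : SKETCH_MAGIC_SINGLE;

    assert(buf != NULL && tag != NULL && type != NULL);

    if (!put_bytes(buf, cap, &off, &magic, sizeof magic))
        return 0;
    if (!put_bytes(buf, cap, &off, tag, strlen(tag) + 1))    /* keep the '\0' */
        return 0;
    if (!put_bytes(buf, cap, &off, type, strlen(type) + 1))  /* keep the '\0' */
        return 0;
    if (dim != NULL) {
        size_t n = 0;
        while (dim[n] != 0)                                  /* count dimensions */
            n++;
        if (!put_bytes(buf, cap, &off, dim, (n + 1) * sizeof *dim))
            return 0;
    }
    if (data != NULL && nbytes > 0 &&
        !put_bytes(buf, cap, &off, data, nbytes))
        return 0;
    return off;
}
```

For example, encoding three floats with tag "Pos", the placeholder type "f", and dim = {3, 0} would produce the magic number, the two zero-terminated strings, two 4-byte dimension entries, and then the 12 bytes of data, in that order.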

The intermediary internal format is also called an item, and is defined by the struct item. In addition to the data type, the basic length of a data chunk within this item, itemlen, is obtained from a look-up table. Deferred input is achieved by storing the data in a block pointed to by itemdat if it is small enough, and otherwise keeping track of the file position where the data started. This is essentially the reason why pipes cannot be used in filestruct: a pipe cannot be repositioned to read the data later.
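
The deferred-input idea can be sketched in a few lines of standard C. This is only an illustration under stated assumptions, not the code in filesecret.c: the names sketch_item, read_or_defer and fetch_data, and the SKETCH_MAX_INLINE threshold, are hypothetical. Small data is copied into memory; for large data only the file offset is remembered and the stream skips ahead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_INLINE 1024      /* hypothetical "small enough" threshold */

typedef struct {
    size_t         len;   /* number of data bytes */
    unsigned char *dat;   /* in-memory copy, or NULL when input is deferred */
    long           pos;   /* file offset at which the data starts */
} sketch_item;

/*
 * Read len bytes of item data from fp, or defer the read if the data
 * is large.  Returns 1 on success, 0 on failure.  The ftell() call
 * fails on a stream that cannot be positioned, such as a pipe.
 */
int read_or_defer(FILE *fp, size_t len, sketch_item *it)
{
    assert(fp != NULL && it != NULL);
    it->len = len;
    it->dat = NULL;
    it->pos = ftell(fp);
    if (it->pos < 0)
        return 0;                               /* stream is not seekable */
    if (len == 0)
        return 1;                               /* nothing to read */
    if (len <= SKETCH_MAX_INLINE) {
        it->dat = malloc(len);
        if (it->dat == NULL)
            return 0;
        if (fread(it->dat, 1, len, fp) != len) {
            free(it->dat);
            it->dat = NULL;
            return 0;
        }
        return 1;
    }
    return fseek(fp, (long) len, SEEK_CUR) == 0; /* skip the data for now */
}

/*
 * Copy the item's data into out (at least it->len bytes).  Deferred
 * data is read from the remembered offset, after which the stream
 * position is restored.  Returns 1 on success, 0 on failure.
 */
int fetch_data(FILE *fp, const sketch_item *it, unsigned char *out)
{
    long here;
    int ok;

    if (it->dat != NULL) {
        memcpy(out, it->dat, it->len);
        return 1;
    }
    here = ftell(fp);
    if (here < 0 || fseek(fp, it->pos, SEEK_SET) != 0)
        return 0;
    ok = fread(out, 1, it->len, fp) == it->len;
    return fseek(fp, here, SEEK_SET) == 0 && ok;
}
```

The caller owns the dat buffer and should free() it when the item is no longer needed. Because both functions rely on ftell() and fseek(), the sketch shares the restriction noted above: it works on regular files but not on pipes.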

The actual internal format is governed by how the application programmer uses the get_XXX and put_XXX routines (see filestruct(3NEMO)).

Experimental Features

If compiled with -DRANDOM some limited random access to data within a data-item is possible.

If compiled with -DCHKSWAP, the disk format is checked for byte order (little versus big endian). Otherwise, data on disk exist in the host format, and no effort has been made to make it machine independent (e.g. IEEE floating point and two's-complement integers). This is, however, expected in some future release.
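
One common way to implement such a byte-order check is to compare the magic number read from disk against both its expected value and its byte-swapped form. The sketch below shows that technique only; whether -DCHKSWAP works exactly this way is an assumption, and SKETCH_MAGIC, check_swap and swap_elements are hypothetical names, not part of the NEMO API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAGIC 0x1102         /* hypothetical; see filesecret.h for real values */

/* Exchange the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t) ((v << 8) | (v >> 8));
}

/*
 * Classify a magic number as read from disk:
 *   0  file is in host byte order
 *   1  file is in the opposite byte order (data must be swapped)
 *  -1  not a recognized magic number
 */
int check_swap(uint16_t magic_from_disk)
{
    if (magic_from_disk == SKETCH_MAGIC)
        return 0;
    if (swap16(magic_from_disk) == SKETCH_MAGIC)
        return 1;
    return -1;
}

/* Reverse, in place, the bytes of each of n elements of elsize bytes. */
void swap_elements(void *data, size_t elsize, size_t n)
{
    unsigned char *p = data;
    size_t i, k;

    assert(data != NULL || n == 0);
    for (i = 0; i < n; i++, p += elsize) {
        for (k = 0; k < elsize / 2; k++) {
            unsigned char tmp = p[k];
            p[k] = p[elsize - 1 - k];
            p[elsize - 1 - k] = tmp;
        }
    }
}
```

A reader would call check_swap() on the magic number of each item and, when it returns 1, pass the dimension array and the data through swap_elements() with the element size of the item's type. Note that byte swapping alone does not address differences in floating-point representation.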

Zeno Format

The zeno(1NEMO) package also used this format, but there are some subtle differences that have yet to be described.

Files


~/src/kernel/io       filesecret.[ch] filestruct.h
~/inc                  filestruct.h

See Also

tsf(1NEMO), rsf(1NEMO), csf(1NEMO), filestruct(3NEMO), zeno(1NEMO)

Author

Joshua Barnes, Lyman Hurd, Peter Teuben

Update History


dark-ages      V0.0 precursor (filestr)                             JEB
xx-apr-87      V1.0 basic operators                                 JEB
xx-jul-87      V1.x added f-d coercion, deferred output             Lyman Hurd
xx-xxx-87      V2.0 new types, operators, external fmt              JEB
9-oct-90       V2.2 created man page, adding random access          PJT
20-nov-90      V2.2a merged old man page from Lyman into this one   PJT
16-may-92      V3.0 finalized random access                         PJT
6-jul-01       documented the new uNEMO                             PJT
27-dec-2019    documented ZENO                                      PJT

