This section covers recommended formatting conventions for data input and output files. The OMS API I/O library does not depend on these conventions, but it works well with them.
Design motivations for the conventions include:
- Support for typical scientific data I/O such as tables and properties
- Low verbosity to enhance human readability
- Support for meta data
- Low barrier to data consumption by tools outside of OMS
- A simple API for reading and writing data programmatically
There are two categories of data covered by the recommended conventions:
- Tables, containing tabular information
- Properties, referring to key/value property data
Tables and properties share a common data I/O format. The format is based on the CSV structure, extended with meta tags. A file may contain any number of tables and properties.
Both types of information can be mixed in the same file and may occur multiple times. The definitions for tables and properties are similar, both supporting meta data.
- The data file complies fully with the CSV standard.
- The file name extension is csv, standing for "comma separated values". A zipped file has the extension csz.
- A csv file might contain a table, a property section, multiples of each, or a mixture of both.
- A # symbol at the beginning of the line indicates a comment line.
- Empty lines are ignored.
Keywords
Keywords are used to indicate properties and tables in the file.
CSV Tags
Keyword | Name | Description |
@T | Table | Defines a new table |
@H | Header | Starts a header in a table |
@S | Section | Starts a new property section |
@P | Property | Starts a new property |
Note
- All of those keywords can be followed by optional meta data.
- Keywords are case insensitive (@T is equal to @t).
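The line-level rules above (comments, empty lines, case-insensitive keywords) can be sketched with Python's standard csv module. This is only an illustration of the conventions, not the OMS I/O library; the names classify and KEYWORDS are hypothetical.

```python
import csv
import io

# Keyword-to-kind mapping taken from the CSV Tags table; keys are upper case
# because keywords are case insensitive.
KEYWORDS = {"@T": "table", "@H": "header", "@S": "section", "@P": "property"}

def classify(line):
    """Return (kind, fields) for one line of a data file (hypothetical helper)."""
    if not line.strip():
        return ("blank", [])                      # empty lines are ignored
    if line.startswith("#"):
        return ("comment", [line[1:].strip()])    # '#' at the start of the line
    # Let the csv module handle quoting; skipinitialspace drops blanks after commas.
    fields = next(csv.reader(io.StringIO(line), skipinitialspace=True))
    fields = [f.strip() for f in fields]
    kind = KEYWORDS.get(fields[0].upper())
    if kind:
        return (kind, fields[1:])
    return ("other", fields)                      # meta data entry or data row
```

A caller would typically skip "blank" and "comment" lines and dispatch on the remaining kinds.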
Meta data may always follow the property and table markups. There is one meta data entry per line. An entry is either a key/value pair (separated by a comma) or a single key with no value, whose presence alone carries the information.
The property section example below shows section level meta data that applies to the whole "Parameter" set, such as CreatedAt or CreatedBy, as well as property level meta data, either key/value pairs such as description or single keys such as public. Quoting meta data values is good practice because it protects embedded commas, but it is not required.
Meta Data Examples (@S and @T)
Name | Description | Example |
CreatedBy | user who created the data set | CreatedBy, "JCarlson" |
CreatedAt | data set creation date | CreatedAt, "May 1st, 2008" |
Description | brief data set description | Description, "EFC climate file" |
VersionInfo | Version information (use in conjunction with version control system) | VersionInfo, "$Id:" |
SourceInfo | Source information (use in conjunction with version control system) | SourceInfo, "$HeadURL:" |
Properties
Properties are key/value pairs (KVP) that are aggregated in a section. There could be meta data for the whole section @S and also for each property @P. The example below shows a property section.
Property Example
@S, "Parameter"
CreatedAt, "Jan 02, 1980"
CreatedBy, Joe
# Single Properties
@P, coeff, 1.0
description, "A coefficient"
public
@P, start, "02-10-1977"
description, "start of simulation"
A property section starts with the @S keyword, followed by the name of the property section. It is followed by optional meta data. Meta data keys/values can be arbitrary, and may occur in any number. A single property starts with the property keyword @P, followed by the property name and the property value. Optional meta data may also follow a single property. The property section ends at the beginning of the next section or table or the end of the file.
Property Key/Value Substitution
Properties support internal key/value substitution. This feature helps organize property sets more efficiently. An example is shown below: a directory property idir is defined once and used internally by multiple file name properties.
...
# Input file folder (variable)
@P, idir, "ccreek"
Description, "Data directory"
@P, ahumFileName, "${idir}/ahum.dat"
Description, "Absolute Humidity Data"
@P, gwFileName, "${idir}/hgeo.par"
Description, "Hydrogeology Data"
...
The expression ${<prop_key>} is replaced with <prop_value> if the same property set contains a property defined as @P, prop_key, prop_value.
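The substitution rule can be illustrated with a small Python sketch that works on a plain dictionary of property names and string values. The function name resolve is hypothetical. Two behaviors are assumptions not stated by the conventions: references to unknown keys are left unchanged, and nested references are followed up to a fixed depth.

```python
import re

# Matches ${key} and captures the key.
_REF = re.compile(r"\$\{([^}]+)\}")

def resolve(props, max_depth=10):
    """Return a copy of props with ${key} references replaced by property values.
    Hypothetical helper; all values are expected to be strings."""
    resolved = dict(props)
    for _ in range(max_depth):                     # bound guards against cyclic references
        changed = False
        for name, value in list(resolved.items()):
            # Assumption: unknown keys are left untouched rather than raising.
            new = _REF.sub(lambda m: resolved.get(m.group(1), m.group(0)), value)
            if new != value:
                resolved[name] = new
                changed = True
        if not changed:
            break
    return resolved

props = {
    "idir": "ccreek",
    "ahumFileName": "${idir}/ahum.dat",
    "gwFileName": "${idir}/hgeo.par",
}
print(resolve(props)["ahumFileName"])   # prints ccreek/ahum.dat
```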
Tables
Tables consist of columns and rows, and optional table meta data. Columns may have a type and optional meta data. Meta data is organized as key/value pairs. A table requires two keywords, @T (Table) and @H (Table header). The @T keyword marks the start of a table definition; the @H keyword starts the column definition.
Tables can be created with any text editor. Spreadsheet tools usually support export to a CSV file.
Table Example
# table example
@T, "Example DataSet"
CreatedAt, 5/11/06
CreatedBy, JackC
# Now, there is header information
@H, time,b,c
Type, Date,Real,Real
Format, yyyy-MM-dd,#0000.00,#000.0000
,2006-05-12,0000.00,001.1000
,2006-05-13,0001.00,002.1000
,2006-05-14,0002.00,003.1000
,2006-05-15,0003.00,004.1000
,2006-05-16,0004.00,005.1000
,2006-05-17,0005.00,006.1000
,2006-05-18,0006.00,007.1000
A Table consists of three main sections:
- The table header, indicated by @T, followed by the name of the table. The next lines may have table level meta data, one meta data entry per line. Meta data is optional.
- The table header is followed by the column header, indicated by the @H keyword and followed on the same line by the column names. The next lines may contain column meta data, each starting with the key, followed by one value per column (the example above shows Type and Format for the columns).
- Data rows start with a comma ',' as the first character; values are comma separated.
A minimal table with no optional meta data looks like this:
@T, example data table
@H, a, b, c
, 1,2,3
, 4,5,6
... more data
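Because the format is plain CSV, a table in this layout can also be produced by a generic CSV writer. The sketch below uses Python's csv module; write_table is a hypothetical helper, not an OMS function, and it omits column meta data for brevity.

```python
import csv
import io

def write_table(name, columns, rows, meta=None):
    """Return a table in the @T/@H layout as a string (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")  # quotes values only when needed
    writer.writerow(["@T", name])
    for key, value in (meta or {}).items():
        # A value of None is written as a single-key meta data entry.
        writer.writerow([key] if value is None else [key, value])
    writer.writerow(["@H", *columns])
    for row in rows:
        writer.writerow(["", *row])                # empty first field gives the leading comma
    return out.getvalue()

print(write_table("example data table", ["a", "b", "c"], [[1, 2, 3], [4, 5, 6]]))
```

The csv writer quotes a value automatically when it contains a comma, which matches the recommendation to quote meta data values.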