Model Developer and User Handbook

The purpose of this document is to provide guidance for component-oriented model development and application using the Object Modeling System (OMS), Version 3.0.

Table of Contents

Introduction

Concepts and Definitions

Model base

Components and Models

A component that represents a certain conceptual function in a simulation is the foundation of each model. A hydrological model, for example, usually needs components for handling input/output, has components representing processes of a hydrological system such as precipitation, interception, and runoff, and has to realize general data processing functions such as reading climate data or parameter sets. The key to a proper component-based model design is a clean "separation of concerns" in a model. Components for a model hosted in OMS each have to provide a certain aspect of the model. What qualifies a part of a model to become a component?
  1. A component has a certain, and mostly just one, conceptual function in a model. It represents a physical process, a management action, a data gathering part, or the presentation of results to the user interface. Such functions need to be identified and separated from each other. Each of these aspects will result in a component.
  2. An identified component can be fully described with regard to its function, data requirements, and data offerings. Therefore, the specification and later implementation of a component are done with respect to its anticipated simulation context, but a tight dependency on this context is avoided. Later in the development process, the component is tested standalone in a test bed environment to verify that it works correctly.
  3. The component is general enough to be used in other models and applications. Designing and implementing it properly right from the beginning requires more work up front, but will definitely pay off when it gets reused and repurposed later.

By analyzing simulation models, a classification of potential components can be made.

  • Scientific components implement methods and equations to estimate some physical phenomenon. Examples would be a component estimating the amount of water evaporated from a certain land cover, a component predicting the soil loss due to wind erosion, etc. Such components usually apply some mathematical function.
  • Scientific utility components support the analysis of models by providing statistical analysis methods such as descriptive statistics, frequency analysis, etc. Distribution generator components are used to provide data to scientific components.
  • Control components are responsible for managing the execution of a model. A Runge-Kutta integration component, a time management component, or a convergence criteria component are examples of this.
  • Data Input/Output components provide data to other components in a simulation model. Such components could handle data transfer from databases or files to the model. Visualization components like graphs or spreadsheets also fall under this category.

A component is written in the Java programming language or in the FORTRAN programming language. Java is the preferred implementation language for components since you get all the benefits and advantages of this platform. FORTRAN components enable the integration of legacy FORTRAN code into the system.

Developing Model Components

Component Structure

Parameter handling

A parameter is a more or less constant value for a model run. It should stay constant during model execution; any attempt to change it within the running code should lead to an execution error. However, a model parameter might be adjusted in a graphical user interface or in a parameter input file. To declare a parameter, use the @Role annotation on a component field. The following input parameter is defined as a double named coeff. Its default value is 0.0.
@Role(Role.Parameter)
@In
public double coeff;

Initialize on declaration to give this parameter a different default value.

@Role(Role.Parameter)
@In
public double coeff = 0.34;

If not overwritten via a setting from an external parameter file, this value would be used on execution.

Altering Parameter from external Files

A parameter can, of course, be set from an entry in a csd file. Let's suppose the package contains a parameter file with the following entry:

...
@P, coeff, "0.41"
...

Such an entry overwrites the default value 0.34 above before model execution. If this entry resides in a packaged parameter set, you can control whether an external file may overwrite it. Set the visibility like this:

...
@P, coeff, "0.41"
public
...

An external parameter file might now overwrite coeff again with a different value. There are three variants of such modifiers:

Modifier    Description
public      allows optional modification of the parameter in an external parameter file
protected   allows read-only access to the parameter
provided    requires the parameter to be provided in an external parameter file
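
The effect of the three modifiers can be sketched in plain Java. This is only an illustration, not OMS code: the class ParameterOverride, the enum Modifier, and the method resolve are hypothetical names, and the sketch assumes that protected means read-only and that a rejected overwrite is reported as an error.

```java
/** Hypothetical sketch of the modifier rules; not part of the OMS API. */
class ParameterOverride {

    /** The three modifiers listed in the table above. */
    enum Modifier { PUBLIC, PROTECTED, PROVIDED }

    /**
     * Picks the value used at execution time.
     *
     * @param packaged value from the packaged parameter set
     * @param modifier modifier attached to the packaged entry
     * @param external value from an external parameter file, or null if it has no entry
     */
    static String resolve(String packaged, Modifier modifier, String external) {
        switch (modifier) {
            case PUBLIC:
                // An external file may overwrite the value, but does not have to.
                return external != null ? external : packaged;
            case PROTECTED:
                // Assumption: an external attempt to overwrite is reported as an error.
                if (external != null) {
                    throw new IllegalStateException("parameter is protected");
                }
                return packaged;
            case PROVIDED:
                // The external file has to supply the value.
                if (external == null) {
                    throw new IllegalStateException("parameter must be provided");
                }
                return external;
            default:
                throw new AssertionError(modifier);
        }
    }

    public static void main(String[] args) {
        // The external file overwrites the packaged value of a public parameter.
        System.out.println(resolve("0.41", Modifier.PUBLIC, "0.5"));
    }
}
```

With a public entry the external value wins when present; with a provided entry a missing external value stops the run before execution.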

Creating a New Component

Integrating existing (native) code as a Model Component

Testing Model Components

Integrating Model Components into a Model

Provisioning Data for a Model Simulation

Calibrating and Validating the Model

Analyzing Parameter Sensitivity in a Model

Deploying Models

Single Jar Deployment

This document describes the process of deploying a simulation developed with the Object Modeling System for the purpose of using it outside of the OMS IDE.

Use cases:

  • hand over a simulation to a user who just wants to apply the model
  • a model will be used in a deployment environment such as a web server, etc.
  • a simulation should be certified for production by an authorized person or institution. The simulation can be explored since it is self-documenting with respect to its components, model, and parameter files.

A simulation is deployed as a Jar file. This is called a Simulation Jar. A simulation jar has the following characteristics:

  • It contains all the resources that are required by that simulation such as the simulation file, the model, the components, default parameter sets, and libraries.
  • It also contains all the OMS runtime classes needed to execute the simulation.
  • It contains a description of the origin and version of those resources.

The simulation jar is self-contained: no external classes are required to run the simulation; everything needed is packaged together. The simulation jar is also 'sealed'. Only classes from within the simulation jar are used for execution; no external code can be injected into the simulation. This is an important security feature.

Executing a Model as a Simulation Jar

Running a simulation jar is easy. The minimal requirement is an installed JRE, version 1.6 or later. At the command prompt, execute the jar file directly by passing the -jar option and the name of the simulation jar to the Java launcher.

$ java -jar EFcarson.jar 

The above command verifies the digital signatures (if present) and executes the simulation with its default settings.

Exploring a deployed simulation

The content of a Simulation Jar can be explored. Passing ? at the end of the execution command line prints all relevant meta data.
$ java -jar EFcarson.jar ?

Object Modeling System, CLI Simulation & Runtime
  Usage: java -jar null.jar [options]
    options: -P<param file>  Parameter file(*.csp)
             -O<output dir>  Output directory
             -?              this output

  Built From: C:\od\oms\work21\prms\scenarios\EFCarson.jsa
  Built At  : Wed Oct 22 15:30:03 MDT 2008
  Built By  : od
  Built On  : Windows XP - x86 - 5.1

  Simulation: EFCarson
    Model:            \META-INF\oms\models\prms_radpl.jma
    Parameter Files:  [\META-INF\oms\data\efcarson.csp, 
                       \META-INF\oms\data\efcarson_dates.csp, 
                       \META-INF\oms\data\efcarson_files.csp, 
                       \META-INF\oms\data\efcarson_luca.csp]

Version Information of Simulation Resources

The Simulation Jar's manifest file contains all the resources that belong to this simulation. If there is version information available within the components, the model and the parameter files at the time of Simulation Jar's creation, it will be carried over into the Simulation Jar's manifest file.

Component Versioning

To add version information to a component, add the @VersionInfo(...) annotation to the component. This annotation takes a string argument that describes the version of this component.

Example using custom version information:

import org.oms.model.data.annotation.*;
...

@VersionInfo("1.0")

public class NewOutput implements Stateful, AttributeLookup {
...

A developer is responsible for providing meaningful version info. This might be tedious to manage manually for large projects. A version control system can automate this. The example below shows version info using VCS keyword substitution.

import org.oms.model.data.annotation.*;
...

@VersionInfo("$Id: NewOutput.java 58 2008-10-16 17:36:05Z david $ "
   + "$HeadURL: https://colab.sc.egov.usda.gov/svn/PRMS/trunk/prms/src/gov/usgs/prms/NewOutput.java $")

public class NewOutput implements Stateful, AttributeLookup {
...

This example uses Subversion keyword substitution to provide more comprehensive version information such as the source file name, the revision number of the latest modification, the time of the commit and the user. It also shows here the head URL of the file, the repository and its location. This example will lead to a manifest entry within the EFCarson.jar file:

Name: gov/usgs/prms/NewOutput.class
OMS-Class: Component
Version: $Id: NewOutput.java 58 2008-10-16 17:36:05Z david $ $HeadURL:
  https://colab.sc.egov.usda.gov/svn/PRMS/trunk/prms/src/gov/usgs/prms
 /NewOutput.java $

The manifest section entry lists the compiled OMS component and the version attribute as defined on the component (second example above). With such information, someone should be able to recover and trace back the component source at a later time.
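
Such manifest sections can be read back with the standard java.util.jar API. The following is a minimal sketch, not an OMS tool: the class ManifestVersions and its method are hypothetical, and the manifest text in main is a shortened stand-in for a real Simulation Jar manifest.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper that lists the Version attribute of every manifest section. */
class ManifestVersions {

    /** Parses manifest text and maps each section name to its Version attribute. */
    static Map<String, String> versions(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new TreeMap<>();
            for (Map.Entry<String, Attributes> section : manifest.getEntries().entrySet()) {
                String version = section.getValue().getValue("Version");
                if (version != null) {          // sections without version info are skipped
                    result.put(section.getKey(), version);
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Manifest-Version: 1.0\n"
                + "\n"
                + "Name: gov/usgs/prms/NewOutput.class\n"
                + "OMS-Class: Component\n"
                + "Version: 1.0\n"
                + "\n";
        System.out.println(versions(text));
    }
}
```

For a jar on disk, the Manifest object can be obtained from java.util.jar.JarFile.getManifest() instead of being parsed from text.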

Parameter File Versioning

Parameter file versioning is done using the supported meta data.

Model Versioning

Within the graphical OMS ModelEditor, each model has a property Version that can be set in the Property window. This information will be put into the Simulation Jar file's manifest.mf file as a section attribute.

Name: META-INF/oms/models/prms_radpl.jma
OMS-Class: Model
Version: 1.0

Any string information can be used here. Note that you cannot put VCS keywords in here, since the model is stored in a binary format.

Digitally signing a Simulation

Once a simulation is deployed as a single jar file, it can be digitally signed with an electronic "signature". A digital signature ensures the integrity of the developed simulation. Once signed, a simulation jar cannot be altered: components, default parameter settings, and the model cannot be switched or patched. The signature protects the investment made in developing a complex simulation setup. However, you can still overwrite public parameter values and control output generation.

Creating a self signed certificate

You have to install the Java SDK from http://java.sun.com/j2se/downloads.html; its tools have to be on your path.

Step 1 - Create a key:

keytool -genkey -alias keyname

This will create a new keystore (usually $HOME/.keystore) if not present or add to it.

Step 2 - Sign the jar file using the key (here, mindterm.jar is signed):

jarsigner mindterm.jar keyname

Importing an issued certificate

You need to import ...
$ keytool -import -alias od  -file OD.cer

Validating the Integrity of a Simulation

Once a Simulation jar is signed it gets verified on execution:

$ java -jar EFCarson.jar

If verification fails, execution will not happen.

You can also verify the Simulation Jar with the jarsigner tool

$ jarsigner -verify  EFCarson.jar
jar verified.

Warning: 
This jar contains entries whose signer certificate will expire within six months. 

Re-run with the -verbose and -certs options for more details.

To get more details on certain component signatures:

$ jarsigner -verbose -certs -verify  EFCarson.jar

...

smk     1536 Thu May 24 10:17:22 MDT 2007  gov/usgs/prms/PrecipKrig.class

      X.509, CN=Olaf David, OU=CSU, O=CSU, L=FC, ST=CO, C=US (od)
      [certificate will expire on 1/14/09 10:44 AM]

smk     1593 Thu May 24 10:17:22 MDT 2007 gov/usgs/prms/Obs.class

      X.509, CN=Olaf David, OU=CSU, O=CSU, L=FC, ST=CO, C=US (od)
      [certificate will expire on 1/14/09 10:44 AM]
...

  s = signature was verified 
  m = entry is listed in manifest
  k = at least one certificate was found in keystore
  i = at least one certificate was found in identity scope

These examples show different levels of protecting a Simulation jar using digital signatures
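
The same kind of check can be done from Java with the standard java.util.jar API. This is a minimal sketch under stated assumptions: SignatureCheck and unsignedEntries are made-up names, and treating only the files directly inside META-INF as signature bookkeeping is a simplification.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper that lists jar entries carrying no signature. */
class SignatureCheck {

    static List<String> unsignedEntries(File jarPath) {
        List<String> unsigned = new ArrayList<>();
        byte[] buffer = new byte[8192];
        // The second constructor argument switches signature verification on.
        try (JarFile jar = new JarFile(jarPath, true)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                // Skip directories and files directly in META-INF (manifest, signature files).
                if (entry.isDirectory() || name.matches("META-INF/[^/]+")) {
                    continue;
                }
                // An entry is verified only once it has been read to the end;
                // a tampered entry raises a SecurityException while reading.
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buffer) != -1) {
                        // drain the stream
                    }
                }
                if (entry.getCodeSigners() == null) {
                    unsigned.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return unsigned;
    }
}
```

An empty result for a simulation jar means every payload entry was signed and passed verification; a jar that was never signed lists all of its entries.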

Conclusion

The CLI Deployment of OMS Simulations is easy to perform and has the following features and benefits.

  • Simulations can be packaged into an executable simulation jar.
  • The simulation jar contains all code, data, and resources as defined within the OMS IDE to run the simulation.
  • To run a simulation, only a Java Runtime Environment and the Simulation Jar are needed; no OMS installation is required.
  • Simulation jars are sealed and can be digitally signed. Therefore a deployed simulation jar is protected against tampering.
  • Simulation jars carry all information about the origin of resources that make up this simulation such as the version of components, data sets and the model. All sources can be traced back, if version info is present.

Webservice Container

Data Input Output (References)

OMS can use data in a CSV format for data input and output. Two types of information adhere to this format:

  • Tables, containing tabular information
  • Properties, referring to key/value property data

Both types of information can be mixed in the same file and may occur multiple times. The definitions for tables and properties are similar, both support meta data.

  • The data file complies fully with the CSV standard.
  • The file name extension is csd, standing for "comma separated data". The file might be zipped, and would then have the extension csz.
  • A csd file might contain a table or a property section, several of those, or a mixture of both.
  • A # symbol at the beginning of a line indicates a comment line.
  • Empty lines are ignored.

Keywords

Keywords are used to indicate properties and tables in the file.

Keyword  Name      Description
@T       Table     Starts a new table
@H       Header    Indicates the table header start
@S       Section   Starts a new property section
@P       Property  Single property definition

All of those keywords can be followed by optional meta data. Keywords are case insensitive (@T is equal to @t).

Properties

Properties are key/value pairs (KVP) that are aggregated in a section. There could be metadata for the whole section @S and also for each property @P. The example below shows a property section.

Example

@S, "Parameter"
date, "Jan 02, 1980"
createdBy, Joe

# Single Properties 
@P, coeff, 1.0
description, "A coefficient"
public

@P, start, "02-10-1977"
description, "start of simulation"

A property section starts with the @S keyword, followed by the name of the property section. It is followed by optional meta data. Meta data keys and values can be arbitrary, and any number of them may occur. A single property starts with the property keyword @P, followed by the property name and the property value. Optional meta data may also follow a single property. The property section ends at the beginning of the next section or table, or at the end of the file.
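
Reading the @P entries of such a section takes only a few lines of Java. The sketch below is not the OMS parser: CsdProperties and its methods are hypothetical names, meta data lines are simply skipped, and escaped quotes inside quoted values are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reader for the @P entries of a csd file; not the OMS parser. */
class CsdProperties {

    /** Splits one line on commas outside double quotes and trims each cell. */
    static List<String> split(String line) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean quoted = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                quoted = !quoted;                 // quotes delimit a cell and are dropped
            } else if (c == ',' && !quoted) {
                cells.add(cell.toString().trim());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        cells.add(cell.toString().trim());
        return cells;
    }

    /** Returns property name -> value for all @P entries, in file order. */
    static Map<String, String> properties(List<String> lines) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;                         // empty lines and comments are ignored
            }
            List<String> cells = split(line);
            // Keywords are case insensitive; every other line is a section or meta data line.
            if (cells.get(0).equalsIgnoreCase("@P") && cells.size() >= 3) {
                props.put(cells.get(1), cells.get(2));
            }
        }
        return props;
    }
}
```

Applied to the example section above, properties returns coeff with the value 1.0 and start with the value 02-10-1977.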

Tables

Tables consist of columns and rows, and optional table meta data. Columns may have a type and optional meta data. Meta data is organized as key/value pairs.

A table requires two keywords, @T (Table) and @H (Table header). The @T keyword tags the start of a table definition, the @H tag starts a column definition.

Tables can be created with any text editor. Spreadsheet tools usually allow export to a CSV file.

Example

# table example
@T, "Example DataSet"
Created, 5/11/06
Author,  OlafDavidA
# Now, there is header information
@H,     time,b,c
Type,   Date,Real,Real
Format, yyyy-MM-dd,#0000.00,#000.0000
,2006-05-12,0000.00,001.1000
,2006-05-13,0001.00,002.1000
,2006-05-14,0002.00,003.1000
,2006-05-15,0003.00,004.1000
,2006-05-16,0004.00,005.1000
,2006-05-17,0005.00,006.1000
,2006-05-18,0006.00,007.1000

A Table consists of three main sections:

  • The table header, indicated by @T, followed by the name of the table. The next lines may have table level meta data, one meta data entry per line. Meta data is optional.
  • The table header is followed by the column header, indicated by the @H keyword. Next to this, all the column names are listed. The next lines may contain column meta data, starting with the key, followed by the values for each column (the example above shows Type and Format for the columns).
  • Data rows start with a ',' as the first character; values are comma separated.
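
The three sections map directly onto a small reader. The following sketch is not the OMS implementation: CsdTable is a hypothetical class, it reads a single table, ignores all meta data lines, and assumes that no value contains a quoted comma.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for a single @T table; not the OMS parser. */
class CsdTable {
    String name;
    List<String> columns = new ArrayList<>();
    List<List<String>> rows = new ArrayList<>();

    static CsdTable parse(List<String> lines) {
        CsdTable table = new CsdTable();
        boolean headerSeen = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;                              // empty lines and comments are ignored
            }
            List<String> cells = new ArrayList<>();
            for (String cell : line.split(",", -1)) {  // -1 keeps empty cells
                cells.add(cell.trim().replace("\"", ""));
            }
            String key = cells.get(0);
            if (key.equalsIgnoreCase("@T")) {
                table.name = cells.get(1);
            } else if (key.equalsIgnoreCase("@H")) {
                table.columns = cells.subList(1, cells.size());
                headerSeen = true;
            } else if (headerSeen && key.isEmpty()) {
                // A data row starts with ',' so its first cell is empty.
                table.rows.add(cells.subList(1, cells.size()));
            }
            // Any other line is table or column meta data and is ignored here.
        }
        return table;
    }
}
```

For the "Example DataSet" table above, parse yields the columns time, b, and c and one row per date.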

A minimal table with no optional meta data looks like this:

@T, example data table
@H, a, b, c
, 1,2,3
, 4,5,6
... more data

Such a table even looks O.K. when opened in Excel:

@T       Example DataSet
Created  5/11/06
Author   OlafDavidA
@H       time         b          c
Type     Date         Real       Real
Format   yyyy-MM-dd   #0000.00   #000.0000
         2006-05-12   0000.00    001.1000
         2006-05-13   0001.00    002.1000
         2006-05-14   0002.00    003.1000
         2006-05-15   0003.00    004.1000
         2006-05-16   0004.00    005.1000
         2006-05-17   0005.00    006.1000
         2006-05-18   0006.00    007.1000

Meta Data

Meta data may always follow the property and table markups. There is one meta data entry per line. Such an entry may be a key/value pair (separated by a comma), or a single key with no value, where the presence of the key alone carries the information.

The property section example above shows section level meta data that applies to the whole "Parameter" section, such as date or createdBy, as well as key/value property meta data such as description and single-key meta data such as public.

It might be good practice to quote meta data values in general to account for potential commas; however, it is not required.

Data Types

The following types are available:
  • Date
  • Real
  • Boolean
  • Integer
  • String

If no Type information is given, the assumed type is String.

Data Formatting

Data in each column has a Type and may have a Format. If a column has the type Date, a Format meta data record is required. OMS uses the Format information either to parse the file when it is used as input or to write it out. For numerical data the format is optional if the data values can be parsed without problems (that is, they use no localized formatting).

Data Access Permissions

Format Patterns according to Sun's specification must be used:

If a pattern contains a comma, the format string and all data values for this column need to be quoted.
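
The patterns used in the table example (yyyy-MM-dd, #0000.00) match the pattern syntax of java.text.SimpleDateFormat and java.text.DecimalFormat. Assuming this is the specification referred to above, column values can be parsed and written as sketched below. ColumnFormats is a hypothetical helper, not an OMS class; it pins the locale to avoid localized decimal separators.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical helper that applies a column's Format meta data to its values. */
class ColumnFormats {

    /** Parses a Date column value with the column's Format pattern. */
    static Date parseDate(String pattern, String value) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setLenient(false);                 // reject impossible dates such as 2006-13-45
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a date: " + value, e);
        }
    }

    /** Parses a Real column value with the column's Format pattern. */
    static double parseReal(String pattern, String value) {
        DecimalFormat format =
                new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.ROOT));
        try {
            return format.parse(value).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a number: " + value, e);
        }
    }

    /** Writes a Real value the way the column's Format pattern prescribes. */
    static String formatReal(String pattern, double value) {
        return new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.ROOT))
                .format(value);
    }

    public static void main(String[] args) {
        System.out.println(parseDate("yyyy-MM-dd", "2006-05-12"));
        System.out.println(parseReal("#000.0000", "001.1000"));
        System.out.println(formatReal("#0000.00", 1.0));
    }
}
```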

Annotations (Reference)

This document describes a core concept and a reference implementation of a "Next Generation Modeling Framework" (NGMF). This is a working title! The result of this research will be merged into the next generation of the Object Modeling System to serve as the core runtime and execution environment.

Keywords Modeling Framework, Non-invasive Framework, OMS

Motivation

Using frameworks for the development of scientific environmental simulation models became an important ???

  • Model/Component developer burden, easy scalability,

Concepts

  1. NGMF is component-based. We aim for only minimal requirements to call a plain Java object an NGMF component. Existing legacy classes are allowed to keep their identity, which means that once a component has been introduced into NGMF it is still usable outside of NGMF.
  2. NGMF is non-invasive. It minimizes the burden on a component/model developer of getting code into the framework by not imposing an API. There is almost no learning curve; existing Java code does not have to be changed. There are no framework data types to learn and use, and there are no communication patterns to comprehend in order to parallelize the model. All those features resulted from experiences developing modeling frameworks in the past and looking at the
  3. NGMF is multithreaded. The default execution is multithreaded. Sequential execution is just a specific case of multithreaded execution where the dataflow requires the sequential execution of components. If data flow allows it, components are being executed in parallel. No explicit thread coding is needed to make this happen.
  4. NGMF is dataflow driven.
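
The idea behind points 3 and 4 can be illustrated with java.util.concurrent.CompletableFuture. This is only a sketch of the principle, not the NGMF runtime: the three stand-in "components" and their formulas are invented for the example.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of dataflow-driven execution; not the NGMF runtime. */
class DataflowSketch {

    // Three stand-in components with made-up formulas.
    static double precipitation(double gauge) { return gauge * 1.1; }
    static double evaporation(double temp)    { return temp * 0.2; }
    static double balance(double p, double e) { return p - e; }

    static double run(double gauge, double temp) {
        // No data flows between these two components, so they may run in parallel.
        CompletableFuture<Double> p = CompletableFuture.supplyAsync(() -> precipitation(gauge));
        CompletableFuture<Double> e = CompletableFuture.supplyAsync(() -> evaporation(temp));
        // balance consumes both outputs, so it starts only after both have completed.
        return p.thenCombine(e, DataflowSketch::balance).join();
    }

    public static void main(String[] args) {
        System.out.println(run(10.0, 5.0));
    }
}
```

The execution order follows from the data dependencies alone; the calling code contains no explicit thread handling.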

Components

The concept of a component refers

What is expected from a component modeling framework?

  • Component coupling
  • Component execution
  • Component unit testing

All of those functions have to be as easy as possible for a model developer.

A first component example

import ngmf.ann.*;

@Description("Daylength computation.")                               // (1)
@Author(name="Joe Scientist")
public class Daylen  {                                               // (2)
    
    private static final int[] DAYS = {
        15, 45, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349
    };
    
    @In public java.util.Calendar currentTime;                       // (3)
    
    @Role("Parameter")                                               // (4)
    @Range(min=-90, max=90)                                          // (5)
    @In public double latitude;              
    
    @Range(min=9, max=15)
    @Out public double daylen;                                       // (6)
    
    @Execute                                                         // (7)
    public void execute() {                                          // (8)
        int month  = currentTime.get(java.util.Calendar.MONTH);
        
        double dayl = DAYS[month] - 80.;
        if (dayl < 0.0)
            dayl = 285. + DAYS[month];
        
        double decr = 23.45 * Math.sin(dayl / 365. * 6.2832) * 0.017453;
        double alat = latitude * 0.017453;
        double csh = (-0.02908 - Math.sin(decr) * Math.sin(alat)) 
                /(Math.cos(decr) * Math.cos(alat));
        
        daylen = 24.0 * (1.570796 - Math.asin(csh)) / Math.PI;
    }
}

Component Metadata

Annotations are used to specify resources in a class that relate to its use as a component in NGMF. Such annotations may differ in importance and relevance for different aspects of the use of the component within the framework. The same annotation can also play different roles depending on the context of use.

  • Documentation Annotations: These annotations are used for documentation, presentation layers, databases, and other content management systems. This is required meta data for component publication, but optional for execution.
  • Execution Annotations: Such meta data is essential information for component execution (in addition to the documentation purpose). They describe method invocation points and data flow between components. This is required meta data.
  • Supporting Execution Annotations: Such meta data supports the execution by providing additional information about the kind of data flow, physical units, and range constraints that might be used during execution. This is optional meta data.

Why Annotations? Annotations have been a Java feature since version 1.5. They are an add-on to the Java language that allows custom and domain-specific markup of language elements. "They do not affect directly the class semantics, but they do affect the way classes are treated by tools" (??). Annotations allow Java programs to be extended with meta information that can be picked up from sources, classes, or at runtime. They also respect scopes and are supported by Java IDEs with code completion and syntax highlighting.
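
How a framework can pick up such meta information at runtime is sketched below with plain Java reflection. The annotations In, Out, and Execute are declared locally as stand-ins for the ones in ngmf.ann, and MiniRunner is a hypothetical class; the real NGMF runtime is not shown here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical runner that drives a component purely through its annotations. */
class MiniRunner {

    // Local stand-ins for the ngmf.ann annotations; RUNTIME retention makes them visible to reflection.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)  @interface In {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)  @interface Out {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Execute {}

    /** A plain Java object used as a component; it implements no framework interface. */
    public static class CircleArea {
        @In  public double r;
        @Out public double area;
        @Execute public void compute() { area = Math.PI * r * r; }
    }

    /** Sets the @In fields, calls the @Execute method, and collects the @Out fields. */
    static Map<String, Object> execute(Object component, Map<String, Object> inputs) {
        try {
            Class<?> type = component.getClass();
            for (Field f : type.getFields()) {
                if (f.isAnnotationPresent(In.class) && inputs.containsKey(f.getName())) {
                    f.set(component, inputs.get(f.getName()));
                }
            }
            for (Method m : type.getMethods()) {
                if (m.isAnnotationPresent(Execute.class)) {
                    m.invoke(component);
                }
            }
            Map<String, Object> outputs = new HashMap<>();
            for (Field f : type.getFields()) {
                if (f.isAnnotationPresent(Out.class)) {
                    outputs.put(f.getName(), f.get(component));
                }
            }
            return outputs;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The component class stays usable on its own; only the runner depends on the annotation types.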

Meta Data Overview

Class           Field           Method
@Description    @Description    @Execute
@Author         @In             @Initialize
@Bibliography   @Out            @Finalize
@Status         @Unit
@VersionInfo    @Range
@Keywords       @Role
@Label          @Bound
@SourceInfo     @Label

@Description

The Description annotation takes a String argument that provides component description information, such as a brief abstract about its purpose, scientific background, etc.

@Description("Circle Area Calculation.")
public class CircleArea {

    @Description("Radius")
    @In public double r;
    ...

}

The @Description annotation is used to automatically capture the purpose of a component for archiving, online presentation, database integration, and component selection during the process of model building.

This is optional meta data.

@Author

The optional Author annotation provides information about the authorship of the component. The annotation fields name, org, and contact will give more details.

Author elements

  • name : the name of the author(s)
  • org : organizational information
  • contact : some contact information such as phone number or email address.
@Author( 
    name="Joe Scientist", 
    org="Research Org", 
    contact="joe.scientist@research-org.edu")
public class HamonET {
     ...
}

@Bibliography

Attach a @Bibliography annotation to a component to refer to background literature, Web sites that contain detailed documentation, etc. This annotation serves the same purpose as a bibliography list in a scientific paper.

Each reference is a separate string; multiple references are comma separated. This is optional metadata and can be used to document components.

@Description("Circle Area Calculation.")
@Bibliography("Journal of Geometry, Vol.1, p..")
public class CircleArea {  

   ...

}	

@Status

A component can carry status information. A status is a component quality indicator.

@Description("Circle Area Calculation.")
@Status(Status.TESTED)
public class CircleArea {
    ...
}

Components have an optional status. A developer can specify the level of completeness or maturity of a component with this tag. The predefined values are DRAFT, SHARED, TESTED, VALIDATED, and CERTIFIED. This is optional metadata and can be used to classify and verify components stored in a repository.

@VersionInfo

The VersionInfo annotation takes one string argument that represents the version of this component. A developer might use keyword substitution supported by a version control system for this. The example below shows the use of the Subversion keyword $Id to provide revision number, modification time, and committer name as version information.

@VersionInfo("$Id: ET.java 20 2008-07-25 22:31:07Z od $")
public class ET {
}

@VersionInfo might contain more than just a number. Version control systems such as CVS, Subversion, or Mercurial provide keyword substitution that presents revision number, last modification time, or developer id. @VersionInfo is optional but is good practice. Component repositories can use and present this information.

@SourceInfo

The SourceInfo annotation captures information about the source. This should be some hint about source availability, for example the source location or some contact information. The example below shows the use of Subversion's keyword substitution for the head URL of a source file. This can also point to a specific tagged version within a repository.
@SourceInfo("$HeadURL: http://www.test.org/repo/ET.java $")
public class ET {
}

@SourceInfo provides some link to the source. Version control systems such as CVS, Subversion, or Mercurial provide keyword substitution that fills in the repository URL. @SourceInfo is optional. Component repositories can use and present this information.

@Keywords

Tag the component with the @Keywords annotation to characterize it. This annotation serves the same purpose as a keyword list in a scientific paper. This is optional metadata and can be used to index, search, and retrieve archived and stored components.
@Description("Circle Area Calculation.")
@Keywords("Geometry, 2D")
public class CircleArea {  ...
}

@Label

A simple example component with label information follows. Labels relate to ontologies (label is an OWL annotation).
public class Calc {
    @Label("latitude")
    @In public double lat;
    ...
}	

Labeling a field or component offers alternative names. Labels might be used to relate components or fields to ontologies. Labels are optional.

@In

The In annotation on a field specifies it as input to the component. The field has to be public, as noted earlier. This means that there is read access to the field within the Execute method.
    ...
    @In public double latitude;
    ...

@Out

The Out annotation on a field specifies it as output of the component. The field has to be public, and the Execute method will write to it. NGMF uses this field annotation to connect the field to an In field of another component.
    @Out public double daylen;

@Range

The Range annotation is supporting meta data for an In or an Out field. If present, it defines a min/max range in which the value of the field is valid. It is up to the execution runtime to handle the range information. Violating a range might abort the execution or just produce a warning message. Another use of the range information would be in component testing.
    @Range(min=-90, max=90)
    @In public double latitude;

In the example above the latitude value can only be in the range of -90 to +90.
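
One way a runtime or a test harness could evaluate range information is sketched below. The Range annotation is declared locally as a stand-in for the NGMF one, and RangeCheck and Site are hypothetical names; whether a violation aborts the run or only warns is left to the caller.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checker that compares field values with their declared range. */
class RangeCheck {

    // Local stand-in for the NGMF Range annotation.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface Range { double min(); double max(); }

    /** Example component with a constrained field. */
    public static class Site {
        @Range(min = -90, max = 90)
        public double latitude;
    }

    /** Returns the names of all public double fields whose value lies outside their range. */
    static List<String> violations(Object component) {
        List<String> bad = new ArrayList<>();
        for (Field f : component.getClass().getFields()) {
            Range range = f.getAnnotation(Range.class);
            if (range == null || f.getType() != double.class) {
                continue;                      // only annotated double fields are checked
            }
            try {
                double value = f.getDouble(component);
                if (value < range.min() || value > range.max()) {
                    bad.add(f.getName());
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return bad;
    }
}
```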

@Role

The Role annotation gives an In or Out tagged field a certain meaning within the modeling domain. This helps someone who reads the component source code, and it allows a builder tool that respects this annotation to present categorized views on field data. Such categories might be "Parameter", "Variable", "Output", "Input", and others. The Role annotation takes the category as a String parameter. Categories can be freely defined.
    @Role("Parameter")
    @Range(min=0, max=90)
    @In public double latitude;

Now the latitude field is "tagged" as Parameter.

@Unit

A @Unit annotation attaches information about a physical unit to a component field that is tagged as In or Out.
public class Calc {
    
    @Unit("degree")
    @In public double latitude; 
    ...
}

Unit information applies to In and Out fields and is usually used for scalars and arrays. It allows frameworks to support unit checking, validation, and conversion. This is optional meta data.

@Bound

A bound defines a binding to another field. This could be a dimension for an array.

public class ET {
   
    @Bound("nsim")
    @In public double[] jh_coeff;
    @In public int nsim;    

    ...
}	

A Bound annotation takes a string argument. It allows a GUI to present dependencies between fields. This is optional meta data.

In the example above, jh_coeff has the named dimension nsim.

@Execute

This method provides the implementation logic of the component where the input is being transformed to output.
public class Component {         
    @Execute                                 
    public void executemethod() {                
        // execute code here    
    }
}	 

Give the execution method any name you want, but annotate it with @Execute. The execute method has to be non-static, public, and void, and must take no arguments. This is required meta data!

@Initialize

In this method the internal state of a component should be initialized. (e.g. opening a file for reading)

public class Component {         
    @Initialize				         
    public void start() {                
       // initialization code     
    }
}	 

Give the initialization method any name you want, but annotate it with @Initialize. The init method has to be non-static, public, and void, and must take no arguments. This method gets called once after component instantiation and before the first execution. This is optional meta data.

@Finalize

This method provides the notion of a final cleanup after model execution (e.g. closing a DB connection)

public class Component {         
    @Finalize				         
    public void cleanup() {                
        // cleanup code here    
    }
}	 

You can give the finalization method any name you want, but it must be annotated with @Finalize. The method has to be non-static, public, void, and take no arguments. @Finalize overlaps in name with Java's finalize() method, which gets called by the garbage collector. This is optional meta data.
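The calling order described for @Initialize, @Execute and @Finalize can be sketched with plain reflection. The three annotations below are hypothetical stand-ins for the NGMF ones, and the run method is only an illustration of the order (initialize once, execute repeatedly, finalize once), not the framework's actual driver.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class LifecycleSketch {

    // Hypothetical stand-ins for the NGMF annotations (assumption).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Initialize {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Execute {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Finalize {}

    // Example component that records the order in which it is called.
    static class Recorder {
        public final StringBuilder log = new StringBuilder();
        @Initialize public void start()   { log.append('I'); }
        @Execute    public void step()    { log.append('E'); }
        @Finalize   public void cleanup() { log.append('F'); }
    }

    // Invokes every public no-argument method carrying the given annotation.
    static void invokeAnnotated(Object component, Class<? extends Annotation> tag) {
        try {
            for (Method m : component.getClass().getMethods()) {
                if (m.isAnnotationPresent(tag)) {
                    m.invoke(component);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Initialize once, execute 'steps' times, finalize once.
    static void run(Object component, int steps) {
        invokeAnnotated(component, Initialize.class);
        for (int i = 0; i < steps; i++) {
            invokeAnnotated(component, Execute.class);
        }
        invokeAnnotated(component, Finalize.class);
    }
}
```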

Metadata Representation

The following sections introduce the ways in which component metadata can be represented.

Embedded Metadata using Annotations

 import java.util.Calendar;
 import ngmf.ann.*;

 public class Daylen {

    static final int[] DAYS = {
        15, 45, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349
    };
     
    @Range(min=6, max=18)
    @Out public double daylen;
 
    @In public Calendar currentTime;
 
    @Role("Parameter")
    @Range(min=-90, max=90)
    @In public double latitude;
 
    @Execute public void execute() {
        int month  = currentTime.get(Calendar.MONTH);
        double dayl = DAYS[month] - 80.;
        if (dayl < 0.0)
            dayl = 285. + DAYS[month];
        
        double decr = 23.45 * Math.sin(dayl/365.*6.2832)*0.017453;
        double alat = latitude*0.017453;
        double csh = (-0.02908 - Math.sin(decr) * Math.sin(alat))  
                         /(Math.cos(decr) * Math.cos(alat));
        daylen = 24.0 * (1.570796 - Math.asin(csh)) / Math.PI;
    }
 }
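The @Range annotations in the listing above carry enough information for a simple validity check. The following is a minimal sketch under the assumption that the range is available at runtime as two numeric values; the Range annotation declared here is a hypothetical stand-in, not the NGMF definition.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class RangeSketch {

    // Hypothetical stand-in for the NGMF Range annotation (assumption).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Range {
        double min();
        double max();
    }

    // Only the annotated fields of Daylen are repeated here.
    static class Daylen {
        @Range(min = 6, max = 18)
        public double daylen;
        @Range(min = -90, max = 90)
        public double latitude;
    }

    // Returns the names of all double fields whose value is outside its range.
    static List<String> outOfRange(Object component) {
        List<String> bad = new ArrayList<>();
        try {
            for (Field f : component.getClass().getFields()) {
                Range r = f.getAnnotation(Range.class);
                if (r == null) {
                    continue;
                }
                double v = f.getDouble(component);
                if (v < r.min() || v > r.max()) {
                    bad.add(f.getName());
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return bad;
    }
}
```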

Attached MetaData using Annotations

The following listing shows an alternative implementation of the Daylen component. It is split into two parts: (i) a pure computational component class Daylen.java and (ii) the component metadata class DaylenCompInfo.java. Only the latter now has meta data dependencies on NGMF.

DaylenCompInfo.java

 
 import java.util.Calendar;
 import ngmf.ann.*;

 public abstract class DaylenCompInfo {

    @Range(min=6, max=18)
    @Out public double daylen;

    @In public Calendar currentTime;
    
    @Role("Parameter")
    @Range(min=-90, max=90)
    @In public double latitude;
  
    @Execute 
    public abstract void execute();

 }

As a rule, an attached component metadata class has the same name as the component, with the suffix CompInfo. This class has to be public and abstract. It duplicates all the relevant fields and methods that should be annotated for NGMF. The methods should all be abstract. It is important to use the same spelling for fields and methods.

Daylen.java


 import java.util.Calendar;

 public class Daylen {

    static final int[] DAYS = {
        15, 45, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349
    };

    public double daylen;
    public Calendar currentTime;
    public double latitude;

    public void execute() {
        int month  = currentTime.get(Calendar.MONTH);
        double dayl = DAYS[month] - 80.;
        if (dayl < 0.0)
            dayl = 285. + DAYS[month];
        
        double decr = 23.45 * Math.sin(dayl/365.*6.2832)*0.017453;
        double alat = latitude*0.017453;
        double csh = (-0.02908 - Math.sin(decr) * Math.sin(alat))  
                         /(Math.cos(decr) * Math.cos(alat));
        daylen = 24.0 * (1.570796 - Math.asin(csh)) / Math.PI;
    }
 }

There are pros and cons to both embedded and attached component metadata. Attached meta data enables clean and neutral computational components with no framework dependency. However, two separate files have to be managed and kept in sync during component development.
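Keeping the two files in sync can be checked mechanically. The sketch below assumes that the metadata class is found purely by the CompInfo naming rule and compares public field names; the In and Out annotations are hypothetical stand-ins, and the Broken classes exist only to show a mismatch. How NGMF itself resolves attached metadata is not described here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class CompInfoSketch {

    // Hypothetical stand-ins for the NGMF annotations (assumption).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD) @interface In {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD) @interface Out {}

    static class Daylen {
        public double daylen;
        public double latitude;
    }

    abstract static class DaylenCompInfo {
        @Out public double daylen;
        @In  public double latitude;
    }

    // Deliberately inconsistent pair: "lat" is misspelled in the metadata.
    static class Broken {
        public double latitude;
    }

    abstract static class BrokenCompInfo {
        @In public double lat;
    }

    // Names of fields declared in <Component>CompInfo that the component lacks.
    static List<String> missingFields(Class<?> component) {
        Class<?> info;
        try {
            info = Class.forName(component.getName() + "CompInfo");
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("no CompInfo class for " + component.getName(), e);
        }
        List<String> missing = new ArrayList<>();
        for (Field f : info.getFields()) {
            try {
                component.getField(f.getName());
            } catch (NoSuchFieldException e) {
                missing.add(f.getName());
            }
        }
        return missing;
    }
}
```

A check like this could run as part of a component test, so that a renamed field is noticed before the model is executed.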

Attached meta data using XML

@Range(min=10, max=20) <xmlns:temp="http://oms/tmp"> <xmlns:latitude="http://oms/tmp">

<temp:Range min='10' max='20'/> <temp:Description> Temperature </temp:Description>

Execution Concepts and Listening

Unit Conversion

Execution Logging

Integrations

Running the model as a Webservice (JAX)

Modeling Framework interoperability (OpenMI, OMS)

Cluster Execution (Terracotta)

Running the model in a Compute Cloud (EC2)

Component Testing (JUnit)

Accessing native code in Fortran/C (JNA)

This section demonstrates how libraries written in languages other than Java can be integrated into and accessed from the system. In the scientific community the common languages are C, C++ and Fortran; Python is emerging. The approach used here relies on the JNA library (Java Native Access). JNA is not part of NGMF, but it offers an appealing simplicity and transparency for use in a framework.

The example below shows a Fortran implementation of HamonET. It uses the Fortran 2003 BIND and VALUE keywords.

!     
! File:   ftest.f90
! Author: od
!
FUNCTION potET(daylen, temp, days) BIND(C, name='hamon')
   REAL*8,VALUE :: daylen,temp
   INTEGER*4,VALUE :: days
   REAL*8 :: potET
   REAL*8 :: Wt,D2
   
   Wt = 4.95 * exp(0.062 * temp) / 100.0
   D2 = (daylen / 12.0) * (daylen / 12.0)
   potET = 0.55 * days * D2 * Wt
   print *, Wt
   if (potET <= 0.0) then
       potET = 0.0
   endif
   if (temp <= -1.0) then 
       potET = 0.0
   endif
   potET = potET * 25.4
END

A Java component is still required, but it is now more lightweight and only proxies to the Fortran function. As defined by JNA, the Java code binds directly to a dynamic link library (or dynamic shared object) on any major OS through a simple Java interface. No JNI source generation or other source bridge building is required. You need to create the DLL with your favorite IDE or just a makefile.

import com.sun.jna.Library;
import com.sun.jna.Native;
import ngmf.ann.Execute;
import ngmf.ann.In;
import ngmf.ann.Out;
import java.util.Calendar;

public class HamonET {
    // the number of days per months
    final static int[] DAYS = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    
    @In public double temp;
    @In public double daylen;
    @In public Calendar currentTime;
    @Out public double potET;

    // Interface binding
    interface ETLib extends Library {                                    // (1)
        double hamon(double daylen, double temp, int days);
    }

    // mapping the ETLib interface to "libF_ETLib.so" or "F_ETLib.dll"
    ETLib etlib = (ETLib) Native.loadLibrary("F_ETLib", ETLib.class);    // (2)

    @Execute
    public void executeNative() {
        int month = currentTime.get(Calendar.MONTH);
        potET = etlib.hamon(daylen, temp, DAYS[month]);                  // (3)
    }
}

  • (1): Definition of a Java interface that lists the 'hamon' function with its Java signature. It needs to extend the JNA Library interface. In this example there is only one function to expose.
  • (2): Binding of "libF_ETLib.so" (UNIX) or "F_ETLib.dll" (WIN) to the ETLib interface as an instance variable.
  • (3): Calling the native hamon function with the native data types of this component.

The C version of the very same HamonET is shown below. Within the HamonET Java component, only the name of the library needs to be changed; the binding interface and its use in Java remain the same.

/* 
 * File:   testc.c
 */
#include <math.h>   /* exp() */

double hamon(double daylen, double temp, int days) {
    double Wt = 4.95 * exp(0.062 * temp) / 100.;
    double D2 = (daylen / 12.0) * (daylen / 12.0);
    double potET = 0.55 * days * D2 * Wt;
    if (potET <= 0.0) {
        potET = 0.0;
    }
    if (temp <= -1.0) {
        potET = 0.0;
    }
    potET *= 25.4;
    return potET;
}

For further details on how to manage Java and DLL/Shared Objects see this manual.
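When testing such a native binding it can help to have a pure Java version of the same formula to compare against. The following sketch transcribes the Hamon computation from the Fortran and C listings above; the class and method names are chosen for illustration only. Note that the unsuffixed literals in the Fortran version are single precision, so a comparison against its result should allow for a small tolerance.

```java
class HamonReference {

    // Same computation as the native 'hamon' function shown above.
    static double hamon(double daylen, double temp, int days) {
        double wt = 4.95 * Math.exp(0.062 * temp) / 100.0;
        double d2 = (daylen / 12.0) * (daylen / 12.0);
        double potET = 0.55 * days * d2 * wt;
        if (potET <= 0.0 || temp <= -1.0) {
            potET = 0.0;
        }
        return potET * 25.4;
    }

    // True if a value obtained from the native library matches the Java
    // reference within the given absolute tolerance.
    static boolean agrees(double nativeValue, double daylen, double temp,
                          int days, double tolerance) {
        return Math.abs(nativeValue - hamon(daylen, temp, days)) <= tolerance;
    }
}
```

In a JUnit test, the value returned through the JNA interface would be passed as nativeValue.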

Conclusions

  • No Overlapping feature set, NGMF is orthogonal to JNA, JUnit, ....
  • Java is an excellent language HUB

Native Interoperability

FORTRAN 90/95 Interoperability

This section will introduce the use of JNA (Java Native Access) for direct Java/FORTRAN interoperability.

JNA was originally developed to allow for easy communication between Java and C/C++. It does not burden the developer with JNI management and other intermediate files/APIs. In contrast to JNI, which supports static interoperability, JNA uses dynamic dispatching at runtime to connect to a native DLL directly from Java. JNA's design aims to provide native access in a natural way with a minimum of effort. No boilerplate or generated code is required. While some attention is paid to performance, correctness and ease of use take priority.

Examples for C/C++ are available; however, the use of FORTRAN within the scientific community is just as important. It can be achieved with the 'out-of-the-box' JNA library. The objective of this section is to show how to craft, compile, and link FORTRAN code so that it is accessible directly from Java using JNA.

The following sections show different examples of JNA/FORTRAN interoperability. For general information about JNA and C/C++ examples please look at the JNA website

Calling a FORTRAN Function/Subroutine with scalar arguments by value from Java.

The following example function takes two arguments and returns their product.

  • It uses the BIND keyword to provide a C name binding. In Java/JNA this function can be called under that name.
  • The function parameters are declared as value parameters. If VALUE were omitted, a and b would be passed by reference.

FUNCTION mult(a, b) BIND(C, name='foomult')
    INTEGER,VALUE :: a,b
    INTEGER :: mult

    mult = a * b
END FUNCTION

The FORTRAN function above can be referenced and fetched using JNA:

interface F95Test extends com.sun.jna.Library {
    F95Test lib = (F95Test) Native.loadLibrary("F90Dyn", F95Test.class);
        
    int foomult(int a, int b);
}

  • The FORTRAN function resides in the file libF90Dyn.dll, which is accessible via jna.library.path.
  • The static call Native.loadLibrary belongs to the JNA API and binds all interface methods as specified in F95Test to their counterparts in libF90Dyn.dll.
  • The Java function uses the BIND name. This solves naming problems that result from the different handling of names in object files/DLLs with respect to underscoring. Using BIND is highly recommended, since it ensures a consistent external name for the function/subroutine regardless of the compiler being used and its location within a module.
  • Since function arguments are passed in by value, regular native int types can be used within the Java method prototype. However, assigning new values to a and b won't be propagated to the caller.

The method can now be called like this:
...
int result = F95Test.lib.foomult(20, 20);
assert result ==  400;
...
For more details on compilation/linking see further below.

Calling a Function/Subroutine with scalar arguments by reference.

To call a subroutine with arguments by reference, do not use the VALUE keyword in the FORTRAN argument declaration. You can then assign new values to the arguments, which will be visible to Java afterwards.

SUBROUTINE ffunc(a, b) BIND(C, name="reffunc")
    INTEGER :: a,b
    a = 3
    b = 5
END SUBROUTINE
The Java interface method needs to be modified to support call by reference via the JNA API ByReference classes.
...
void reffunc(ByReference a, ByReference b);
...

The reffunc subroutine will be called as follows:

...
IntByReference a = new IntByReference(0);
IntByReference b = new IntByReference(0);
F95Test.lib.reffunc(a, b);
assertEquals(3, a.getValue());
assertEquals(5, b.getValue());
...

Now you create the int reference objects, pass them into reffunc and retrieve the values with .getValue().

Array Arguments

Single and multidimensional arrays can be handled in JNA/Java and FORTRAN. As with strings, the length of the array has to be passed in as an additional argument.

foo.f95:

SUBROUTINE inc(arr, len) BIND(C, name='fooinc')
    INTEGER,DIMENSION(len) :: arr
    INTEGER,VALUE :: len
    INTEGER :: i

    DO i = 1, len
        arr(i) = arr(i) + 30
    END DO
END SUBROUTINE

SUBROUTINE arr2d(arr, m, n) BIND(C, name='arr2d')
    INTEGER,DIMENSION(m,n) :: arr
    INTEGER,VALUE :: m
    INTEGER,VALUE :: n
    INTEGER :: i,j

    DO i = 1, m
        DO j = 1, n
            arr(i,j) = arr(i,j) + 1
        END DO
    END DO
END SUBROUTINE

The examples above show the declaration and the use of a one- and a two-dimensional array as subroutine arguments. The arrays are dimensioned by the extra parameters, which are passed in as value arguments.

The JNA/Java declaration part is shown below. Note that the multidimensional array has to be one-dimensional in Java. FORTRAN will lay it out correctly by using the dimension lengths that are passed in.

interface F95Test extends Library {
  ...
  void fooinc(int[] arr, int len);
  void arr2d(int[] arr, int m, int n);
  ...
}

Using the one-dimensional array is pretty simple. The two-dimensional example requires a bit of management on the Java side, which is not shown here.

//1D
int[] a = {1, 2, 3, 4, 5};
lib.fooinc(a, a.length);
assertArrayEquals(new int[]{31, 32, 33, 34, 35}, a);

//2D
int[] b = {1, 2, 3, 4, 5, 6};
lib.arr2d(b, 3, 2);
assertArrayEquals(new int[]{2, 3, 4, 5, 6, 7}, b);

If a real Java multidimensional array needs to be used in FORTRAN, it has to be flattened into 1D, or you use an access method in Java to treat a 1D array in a 2D way. Not pretty, but it works!
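The flattening and the 2D-style access can be sketched as follows. The helper names are made up for this example. The layout is column-major, which is how FORTRAN stores arr(m,n), so the zero-based Java element (i, j) lands at index j * m + i.

```java
class ColumnMajor {

    // Flattens a rectangular m x n Java array into the column-major 1D
    // layout that a FORTRAN array declared as arr(m,n) expects.
    static int[] flatten(int[][] a) {
        int m = a.length;
        int n = (m == 0) ? 0 : a[0].length;
        int[] flat = new int[m * n];
        for (int i = 0; i < m; i++) {
            if (a[i].length != n) {
                throw new IllegalArgumentException("array is not rectangular");
            }
            for (int j = 0; j < n; j++) {
                flat[j * m + i] = a[i][j];
            }
        }
        return flat;
    }

    // Reads element (i, j), zero-based, from a flattened array with m rows.
    static int get(int[] flat, int m, int i, int j) {
        return flat[j * m + i];
    }
}
```

The flattened array is what would be handed to a method such as arr2d together with the dimensions m and n.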

String Arguments

String arguments are always special, since Strings are represented differently in almost all languages. In FORTRAN, you declare a string argument as follows, note that the size of the string has to be passed in as an additional argument.

The following function takes a string argument and verifies the content and length. The argument line is defined as a CHARACTER array, its length is passed as a second argument by value, and it is being used to dimension the length of the string.

FUNCTION strpass(line, b) BIND(C, name='foostr')
    INTEGER, VALUE :: b
    CHARACTER(len=b) :: line
    LOGICAL :: strpass

    strpass = (line == 'str_test') .AND. (b == 8)
END FUNCTION

The Java/JNA prototype looks like this:

 ...
 boolean foostr(String s, int len);
 ...

The application will need to pass in the string and obtain the actual string length.

...
String test = "str_test";
boolean result = lib.foostr(test, test.length());
assertTrue(result);
...
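One detail worth guarding against: String.length() counts Java characters, while the FORTRAN side counts bytes of the default character kind. For plain ASCII text the two are identical, as in the example above. The helper below is a hypothetical sketch that rejects anything else, assuming ASCII is what the native routine expects.

```java
import java.nio.charset.StandardCharsets;

class FortranStringLength {

    // Returns the number of bytes the string occupies as ASCII text.
    // Throws if the string contains characters outside the ASCII range,
    // since the character count and the byte count could then differ.
    static int asciiLength(String s) {
        if (!StandardCharsets.US_ASCII.newEncoder().canEncode(s)) {
            throw new IllegalArgumentException("non-ASCII characters in: " + s);
        }
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }
}
```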

Modules

It is good practice to place all subroutines/functions that should be used via JNA in a module. A module allows for global data and a module-level IMPLICIT NONE. Again, it is recommended to use the BIND keyword, since the compiler might otherwise alter the subroutine name in the DLL because it lives in a different scope.

MODULE test

 IMPLICIT NONE

 CONTAINS
    
 SUBROUTINE ffunc(a, b) BIND(C, name="reffunc")
    INTEGER :: a,b
    a = 3 
    b = 5
 END SUBROUTINE
 
END MODULE test

In the example above, the subroutine ffunc can still be called as reffunc from JNA/Java.

TYPE Arguments

Type arguments for functions can be handled too. This allows the passing of complex objects directly from Java to FORTRAN. Let's suppose you have the following FORTRAN code, which defines a TYPE for a City and a subroutine typepass that takes such an argument.

MODULE test

 IMPLICIT NONE

 TYPE, BIND(C) :: City
    INTEGER  :: Population
    REAL(8)  :: Latitude, Longitude
    INTEGER  :: Elevation              
 END TYPE

 CONTAINS

 SUBROUTINE typepass(c) BIND(C, name='footype')
    TYPE(CITY) :: c

    c%Population = c%Population + 1000
    c%Latitude = c%Latitude + 5
    c%Longitude = c%Longitude + 5
    c%Elevation = c%Elevation + 9
 END SUBROUTINE

END MODULE test

Both the TYPE and the subroutine are placed in a module.

Now let's look at the JNA/Java counterpart that defines the interface for typepass:

import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Structure;

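public static class City extends Structure {

   public int Population;
   public double Latitude,  Longitude;
   public int Elevation;

   // Convenience constructor, in the field order of the FORTRAN TYPE.
   // Newer JNA releases may also require the field order to be declared.
   public City(int population, double latitude, double longitude, int elevation) {
      Population = population;
      Latitude = latitude;
      Longitude = longitude;
      Elevation = elevation;
   }
}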

interface F95Test extends Library {

   void footype(City c);
}

There is a Java class called City that must have the identical internal layout to its FORTRAN TYPE. The names, however, do not matter. It also has to be a subclass of Structure, which is defined in the JNA API.

Note that all fields of City have to be public to allow JNA to compute its size. The F95Test method again uses the BIND name and the City argument.

An application will instantiate the City object and pass it in as usual.

   ...
   City city = new City(3000, 0.222, 0.333, 1001);
   F95Test.lib.footype(city);

   assertEquals(4000, city.Population);
   assertEquals(5.222, city.Latitude, 0.0001);
   assertEquals(5.333, city.Longitude, 0.0001);
   assertEquals(1010, city.Elevation);
   ...

Pitfalls and Obstacles

  • Always be aware that FORTRAN subroutine/function arguments are passed by reference, unless the VALUE modifier is used. Otherwise you might end up accessing memory in a way that causes a segfault. Therefore, always use Native.setProtected(true) to get more memory protection on the JNA side, if it is supported on your architecture.
  • If JNA cannot find your function in a DLL although both names match in the source, do not panic. You should explore the DLL to find out the real name in your DLL, since this is what JNA is looking at, not the source. Do something like nm libF90Test.dll | grep reffunc if reffunc is the function you'd like to call. You may see a different name (more underscores in the name, or a module name prefix) depending on the compiler and compiler flags. This is the name you should use in your Java interface. To make this more transparent, use the BIND keyword in your source to ensure the proper name in the DLL.
  • If you pass Java objects to FORTRAN as TYPE, all Java fields have to be public. Otherwise JNA will complain at runtime that it is not able to determine the size of the Java object.
  • Be aware of the array ordering in FORTRAN, which always stores a two-dimensional array in column-major order. Also, you cannot pass a real multidimensional Java array to FORTRAN, since it does not have a contiguous memory layout. On the Java side you always have to manage a one-dimensional array that you reshape for FORTRAN by passing in its dimensions.
  • If a DLL cannot be found at runtime, you need to set the search path. You can set the system property jna.library.path to point to paths on your filesystem. You can also use the NativeLibrary.addSearchPath method to map a directory to a specific DLL name.
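For the last point, the system property can also be set programmatically, as long as this happens before the library is loaded. The sketch below uses only standard Java calls; the helper name is made up, and it assumes that jna.library.path accepts several directories separated by the platform path separator, in the same way as java.library.path.

```java
import java.io.File;

class NativePathSketch {

    // Puts 'dir' in front of the current jna.library.path value and
    // returns the new value. Must be called before the DLL is loaded.
    static String prependLibraryPath(String dir) {
        String current = System.getProperty("jna.library.path");
        String updated = (current == null || current.isEmpty())
                ? dir
                : dir + File.pathSeparator + current;
        System.setProperty("jna.library.path", updated);
        return updated;
    }
}
```

The same effect can be achieved without code by passing -Djna.library.path=... on the java command line.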

Data type mapping

The following table shows equivalent data types between FORTRAN and Java when passed by value. The integer and real rows assume the common compiler convention that a KIND value equals the size in bytes.

Fortran            JNA/Java
INTEGER(KIND=8)    long
INTEGER(KIND=4)    int
REAL(KIND=4)       float
REAL(KIND=8)       double
LOGICAL            boolean
CHARACTER          byte
CHARACTER(len= )   String

C/C++ Interoperability

Array Arguments

Scalar Arguments

Type Mapping

Pitfalls

Setting up a Java project for development.

You can use any IDE to develop your JNA supported Java code, as long as you make the file jna.jar a part of your classpath. This is the only library you need! See the References section for download.

Deploying a model with native DLL

Using the NGMF JNA support Library

Dynamic Library Generation

The following sections will provide some help for managing the build process using different compilers. GNU's compiler collection and the G95 spin-off, as well as the Intel Compiler suite seem to be the most important tools for the general developer.

G95

G95 allows compiling and linking into a DLL. Note that G95 is not a part of the GNU compiler collection. To compile and link a FORTRAN source into a DLL use the following flags for GCC tools:

Compile a FORTRAN source into an object file:

g95 -fno-underscoring  -c -g -o build/ftest.o ftest.f90

Link the DLL:

g95 -Wl,--add-stdcall-alias -shared -o dist/libF90Dyn.dll build/ftest.o  

Note that you have to use G95 for linking too. This ensures that the right FORTRAN runtime libraries are linked into your DLL.

GNU GFortran

TBD

Intel ifc

TBD

Best Practices

Application Lifecycle Management

Component Documentation and Repository Publication

Regression Testing

FORTRAN Coding Conventions

This document addresses coding conventions for OMS components and scientific code written in Java and the FORTRAN programming language.

The purpose of this document is to ensure that new Fortran code will be as portable and robust as possible, as well as consistent throughout the system. It builds upon commonly shared experience to avoid error-prone practices and gathers guidelines that are known to make codes more robust.

This document covers items in order of decreasing importance (see below), deemed to be important for any code. It is recognized in the spirit of this standard that certain suggestions which make code easier to read for some people (e.g. lining up attributes, or using all lower case or mixed case) are subjective and therefore should not have the same weight as techniques and practices that are known to improve code quality. For this reason, the standards within this document are divided into three categories, Required, Recommended and Encouraged:

  • Required: Aimed at ensuring portability, readability and robustness. Compliance with this category is mandatory.
  • Recommended: Good practices. Compliance with this category is strongly encouraged. The case for deviations will need to be argued by the programmer.
  • Encouraged: Compliance with this category is optional, but is encouraged for consistency purposes.

Depending on the project, programmers may opt to adhere to all three levels or just the first two. All projects must adhere at least to the mandatory standards.

Good Practices

These usually help the robustness of the code (by checking interface compatibility, for example) as well as its readability, maintainability and portability. As a reminder, they are:

  • Encapsulation: Use of modules for procedures, functions, data.
  • Use Dynamic Memory allocation for optimal memory usage.
  • Derived types or structures which generally lead to stable interfaces, optimal memory usage, compactness, etc.
  • Optional and keyword arguments in using routines.
  • Functions/subroutines/operators overloading capability.
  • Intrinsic functions: bits, arrays manipulations, kinds definitions, etc.

Interoperability and Portability

Required

  • Source code must conform to the ISO Fortran95 standard.
  • No compiler- or platform-dependent extensions shall be used.
  • No use shall be made of compiler-dependent error specifier values (e.g. IOSTAT or STAT values).
  • Source code must compile and run under gfortran, which is part of the GNU Compiler Collection.

Recommended

  • Note that STOP is a F90/95 standard. EXIT(N) is an extension and should be avoided. It is recognized that STOP does not necessarily return an error code. If an error code must be passed to a script, for instance, then the extension EXIT could be used, but only in one central place, so as to limit its occurrences within the code.
  • Precision: Parametrizations should not rely on vendor-supplied flags to supply a default floating point precision or integer size. The F90/95 KIND feature should be used instead.
  • Do not use tab characters in the code to ensure it will look as intended when ported. They are not part of the FORTRAN characters set.

Encouraged

  • For applications requiring interaction with independently-developed frameworks, the use of KIND type for all variables declaration is encouraged to facilitate the integration.

Readability

Required

  • Use free format syntax.
  • Use consistent indentation across the code. Each level of indentation should use at least two spaces.
  • Use modules to organize source code.
  • FORTRAN keywords (e.g., DATA) shall not be used as variable names.
  • Use meaningful, understandable names for variables and parameters. Recognized abbreviations are acceptable as a means of preventing variable names getting too long.
  • Each externally-called function, subroutine, should contain a header. The content and style of the header should be consistent across the system, and should include the functionality of the function, as well as the description of the arguments, the author(s) names. A header could be replaced by a limited number of descriptive comments for small subroutines.
  • Magic numbers should be avoided; physical constants (e.g., pi, gas constants) should never be hardwired into the executable portion of a code; use PARAMETER statements instead.
  • Hard-coded numbers should be avoided when passed through argument lists since a compiler flag, which defines a default precision for constants, cannot be guaranteed.

Recommended

  • Use construct names to name loops, to increase readability, especially in nested loops.
  • Similarly, use construct names in subroutines, functions, main programs, modules, operator, interface, etc.
  • Include comments to describe the input, output and local variables of all procedures. Grouping comments for similar variables is acceptable when their names are explicit enough.
  • Use comments as required to delineate significant functional sections of code.
  • Do not use FORTRAN statements and intrinsic function names as symbolic names.
  • Use named parameters instead of “magic numbers”; REAL, PARAMETER :: PI=3.14159, ONE=1.0
  • Do not use GOTO statements. These are hard to maintain and complicate understanding the code. If absolutely necessary to use GOTO (if using other constructs complicates the code structure), thoroughly document the use of the GOTO.

Encouraged

  • When writing new code, adhere to the style standards within your own coding style. When modifying an old code, adhere to the style of the existing code to keep consistency.
  • Use the same indentation for comments as for the rest of the code.
  • Functions, procedures, data that are naturally linked should be grouped in modules.
  • Limit to 80 the number of characters per line (maximum allowed under ISO is 132)
  • Use of operators <, >, <=, >=, ==, /= is encouraged (for readability) instead of .lt., .gt., .le., .ge., .eq., .ne.
  • Modules should be named the same name as the files they reside in: To simplify the makefiles that compile them. Consequently, multiple modules in a single file are to be avoided where possible.
  • Use blanks to improve the appearance of the code, to separate syntactic elements (on either side of equal signs, etc) in type declaration statements
  • Always use the :: notation, even if there are no attributes.
  • Line up vertically: attributes, variables, comments within the variables declaration section.
  • Remove unused variables
  • Remove code that was used for debugging once this is complete.

Robustness

Required

  • Use IMPLICIT NONE in all codes (main programs, modules, etc.) to ensure correct size and type declarations of variables/arrays.
  • Use PRIVATE in modules before explicitly listing data, functions, procedures to be PUBLIC. This ensures encapsulation of modules and avoids potential naming conflicts. Exception to previous statement is when a module is entirely dedicated to PUBLIC data/functions (e.g. a module dedicated to constants).
  • Initialize all variables. Do not assume machine default value assignments.
  • Do not initialize variables of one type with values of another.

Recommended

  • Do not use the operators == and /= with floating-point expressions as operands. Check instead the departure of the difference from a pre-defined numerical accuracy threshold (e.g. epsilon comparison).
  • In mixed mode expressions and assignments (where variables of different types are mixed), the type conversions should be written explicitly (not assumed). Do not compare expressions of different types for instance. Explicitly perform the type conversion first.
  • No include files should be used. Use modules instead, with USE statements in calling programs.
  • Structures (derived types) should be defined within their own module. Procedures, Functions to manipulate these structures should also be defined within this module, to form an object-like entity.
  • Procedures should be logically flat (should focus on a particular functionality, not several ones)
  • Module PUBLIC variables (global variables) should be used with care and mostly for static or infrequently varying data.

Encouraged

  • Use parentheses at all times to control evaluation order in expressions.
  • Use of structures is encouraged for a more stable interface and a more compact design. Refer to structure contents with the % sign (e.g. Absorbents%WaterVapor).

Arrays

Required

  • Subscript expressions should be of type integer only.
  • When arrays are passed as arguments, code should not assume any particular passing mechanism.

Recommended

  • Use of arrays is encouraged, as well as intrinsic functions to manipulate them.
  • Use of assumed shapes is fine in passing vectors/arrays to functions/subroutines.

Encouraged

  • Declare DIMENSION for all non-scalars

Dynamic Memory Allocation / Pointers

Required

  • Use of allocatable arrays is preferred to using pointers, when possible, to minimize risks of memory leaks and heap fragmentation.
  • Use of pointers is allowed when declaring an array in a subroutine and making it available to a calling program.
  • Always initialize pointer variables in their declaration statement using the NULL() intrinsic. INTEGER, POINTER :: x=> NULL()
  • The preferable mechanism for dynamic memory allocation is automatic arrays, as opposed to ALLOCATABLE or POINTER arrays for which memory must be explicitly allocated and deallocated; space allocated using ALLOCATABLE or POINTER must be explicitly freed using the DEALLOCATE statement.

Recommended

  • Always deallocate allocated pointers and arrays. This is especially important inside subroutines and inside loops.
  • Always test the success of a dynamic memory allocation and deallocation - the ALLOCATE and DEALLOCATE statements have an optional argument to allow this.
  • In a given program unit do not repeatedly ALLOCATE space, DEALLOCATE it and then ALLOCATE a larger block of space - this will almost certainly generate large amounts of unusable memory.

Encouraged

  • Use of dynamic memory allocation is encouraged. It makes code generic and avoids declaring with maximum dimensions.
  • For simplicity, use Automatic arrays in subroutines whenever possible, instead of allocatable arrays.

Looping

Required
  • Do not use GOTO to exit/cycle loops, use instead EXIT or CYCLE statements.

Recommended

  • No numbered DO loops such as (DO 10 ...10 CONTINUE).

Functions/Procedures

Required

  • The SAVE statement is discouraged; use module variables for state saving.
  • Do not use an entry in a function subprogram.
  • Functions must not have pointer results.
  • The names of intrinsic functions (e.g., SUM) shall not be used for user-defined functions.
  • Procedures that return a single value should be functions; note that single values could also be user-defined types.
  • All communication with the module should be through the argument list or it should access its module variables.

Recommended

  • All dummy arguments, except pointers, should include the INTENT clause in their declaration
  • Limit use of type specific intrinsic functions (e.g., AMAX, DMAX - use MAX in all cases).
  • Avoid statically dimensioned array arguments in a function/subroutine.
  • Check for invalid argument values.

Encouraged

  • Error conditions. When an error condition occurs inside a function/procedure, a message describing what went wrong should be printed. The name of the routine in which the error occurred must be included. It is acceptable to terminate execution within a package, but the developer may instead wish to return an error flag through the argument list.
  • Functions/procedures that perform the same function but for different types/sizes of arguments, should be overloaded, to minimize duplication and ease the maintainability.
  • When explicit interfaces are needed, use modules, or contain the subroutines in the calling programs (through CONTAINS statement), for simplicity.
  • Do not use external routines as these need interface blocks that would need to be updated each time the interface of the external routine is changed.

I/O

Required
  • I/O statements on external files should contain the status specifier parameters err=, end=, iostat=, as appropriate.
  • All global variables, if present, should be set at the initialization stage.

Recommended

  • Avoid using NAMELIST I/O if possible.
  • Use write rather than print statements for non-terminal I/O.
  • Use character parameters or explicit format specifiers inside the READ or WRITE statement. Do not use labeled FORMAT statements (outdated).

Fortran Features that are obsolescent and/or discouraged:

Required

  • No Common blocks. Modules are a better way to declare/store static data, with the added ability to mix data of various types, and to limit access to contained variables through use of the ONLY and PRIVATE clauses.
  • No assigned and computed GO TOs; use the CASE construct instead.
  • No arithmetic IF statements; use the block IF construct instead.
  • Use REAL instead of DOUBLE PRECISION.
  • Avoid the following features: DATA, ASSIGN, labeled DO, BACKSPACE, blank COMMON, and BLOCK DATA.
  • No branching to END IF from outside the block IF.
  • No DO loops with non-integer control variables.
  • No Hollerith constants.
  • No PAUSE statements.
  • No multiple RETURN statements.
  • No alternate RETURN.

Recommended

  • Do not use the EQUIVALENCE statement, especially for variables of different types. Use pointers or derived types instead.

Encouraged

  • Do not implicitly change the shape of an array when passing it into a subroutine. Although actually forbidden by the standard, it was very common practice in FORTRAN 77 to pass n-dimensional arrays into a subroutine where they would, say, be treated as a one-dimensional array. This practice, though banned in Fortran 90, is still possible with external routines for which no interface block has been supplied. It only works because of assumptions made about how the data is stored.

Source Files

Required

  • Document the function interface: argument name, type, unit, description, constraints, defaults.
  • The INCLUDE statement shall not be used; use the USE statement instead.

Recommended

  • Try to limit source column length, including comments, to 80 columns (or follow language specific limits).
  • A component should not exceed 300-500 effective lines of code; be efficient with your coding.
  • Use blank lines (or lines with a standard character in column 1) to separate statement blocks to improve code readability.
  • Apply a consistent indentation method to code.
  • Module/subprogram names shall be lower case; the name of a file containing a module/subprogram shall be the module/subprogram name with the suffix *.f90.

Encouraged

  • Clearly separate declaration of argument variables from declaration of local variables.
  • Use descriptive and unique names for variables and subprograms (so as to improve code readability and facilitate global string search); try to limit name lengths to 12-15 characters.
  • Indent continuation lines to ensure that, for example, parts of a multi-line equation line up in a readable manner.
  • Start comment text with a standard character (e.g. !, C, etc.); for a stand-alone comment line, place the comment character in the first column.

General Coding Guidelines

  • Reduce or eliminate global variable usage.
  • Attempt to limit the number of arguments in argument list - long lists make it hard to reuse.
  • Limit each component to one return point.
  • Use exceptions as error indicators if supported.
  • Components should be specific to one and only one purpose.
  • Components with side effects are not allowed (e.g. don't mix I/O code with computational code).
  • Program against a standard (e.g., ANSI C, C++, Java, FORTRAN 77/90/95).
  • Make sure your code compiles under different compilers and platforms.
  • Use preprocessor directives for adaptation to different architectures/compilers/OS.
  • Keep I/O-specific components separate from computational components.
  • Avoid static allocation of data (compile time allocation).
  • Be as specific as possible with your data types.
  • Avoid using custom data types for argument types.
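
A few of these guidelines can be illustrated with a minimal Java sketch. The class and method names used here (MeanTemperature, MeanTemperatureWriter, compute, write) are hypothetical and are not part of OMS; the example only assumes that a computation and its output are meant to live in separate units. It keeps the computation free of I/O and global state, reports invalid input with an exception, uses a single return point, and takes only standard types as arguments.

```java
import java.io.PrintStream;
import java.util.Locale;

/**
 * Hypothetical computational unit (not an OMS class): one purpose only,
 * no I/O, no global variables, standard argument types.
 */
final class MeanTemperature {
    private MeanTemperature() {
    }

    /**
     * Returns the arithmetic mean of daily temperatures in degrees Celsius.
     * Invalid input is signalled with an exception that names the routine.
     */
    static double compute(double[] dailyTempC) {
        if (dailyTempC == null || dailyTempC.length == 0) {
            throw new IllegalArgumentException(
                    "MeanTemperature.compute: at least one value is required");
        }
        double sum = 0.0;
        for (double t : dailyTempC) {
            sum += t;
        }
        return sum / dailyTempC.length; // single return point
    }
}

/**
 * Hypothetical output unit (not an OMS class): all I/O lives here, so the
 * computation above can be reused and tested without any output target.
 */
final class MeanTemperatureWriter {
    private MeanTemperatureWriter() {
    }

    /** Writes the mean to the given stream with two decimal places. */
    static void write(double meanC, PrintStream out) {
        out.printf(Locale.ROOT, "mean temperature: %.2f C%n", meanC);
    }
}
```

A caller would combine the two units, for example with MeanTemperatureWriter.write(MeanTemperature.compute(new double[] {10.0, 20.0, 30.0}), System.out). Because the computation never touches a stream, it can be exercised in a test bed without any files or console.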

Setting up a Build and Runtime Environment

Before you start developing and using models, you need to set up your environment. Download the file ngmf.jar from this site.

If you use an IDE such as Netbeans, Eclipse, or IntelliJ, create a Java project and add this file to your classpath. If you need to handle native code, the file jna.jar should be in the classpath too.

Command Line Interface

The simplest way to add a file to your classpath at compile time is the -classpath option. The following command line compiles all Java files in the current directory with ngmf.jar on the classpath.

javac -classpath "/tmp/ngmf.jar" *.java

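At run time the command line is similar. model.Main is the class to execute using ngmf.jar. If model.Main is not packaged inside ngmf.jar, the directory that holds your compiled classes must be added to the classpath as well.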

java -classpath "/tmp/ngmf.jar" model.Main

Ant

Netbeans

Eclipse

IntelliJ

References

Languages

Modeling Frameworks

Jar

CSV

Native

Bibliography

  • TBD