Creating OMS FORTRAN Science Components
Introduction
The OMS API and system components are written in Java. OMS currently supports science components written in Java, FORTRAN, C, C++, and NetLogo. FORTRAN modelers using OMS therefore produce models that combine Java and FORTRAN components, which requires native language interoperability. This is accomplished using Java Native Access (JNA), an open source library that makes it easy to call dynamic-link libraries (DLLs) and other shared libraries from Java. JNA was developed for Java to C/C++ communication and is designed to provide native access in a natural way with minimal effort, with no boilerplate or generated code required.
Modelers can create OMS science components directly in FORTRAN. The OMS build system automatically integrates the components into the model. Details are summarized in Native Fortran Components in OMS3.
Requirements for OMS FORTRAN Components
A FORTRAN science component can be made OMS compatible if the following requirements are satisfied:
- Component source code conforms to F90+ syntax and follows the required conventions described later in this reference.
- The component is written as a module, subroutine, or function.
- OMS annotations are added as comment lines to the component source code, as prescribed.
- The component source code contains the ISO_C_BINDING statement and the FORTRAN to C data type mappings, as prescribed.
- The source code is compiled with gfortran from GCC 4.4 or later.
Design Consideration
The modeling team must decide the granularity of the OMS FORTRAN components they develop, usually based on the scientific concepts involved. The component may be a module containing subroutines and functions. It may be a subroutine containing subroutines and functions, or simply a single subroutine or function.
Example
The following component source code example, in this case a single subroutine, complies with the requirements:
! Add following annotation comment lines, primarily for auto-documentation
! @Description ("WEPP hillslope erosion component")
! @Author (name="j.c. ascough ii and d.c. flanagan")
! @Keywords ("Erosion")
! @VersionInfo ("$Id: BasinSum.java 367 2009-08-28 22:21:52Z odavid $")
! @SourceInfo ("$HeadURL: http://svn.javaforge.com/svn/oms/branches/oms3.prj.prms2008/src/prms2008/BasinSum.java $")
! @License ("http://www.gnu.org/licenses/gpl-2.0.html")
! @Documentation ("BasinSm.xml")
! Add following annotation comment line to enable OMS execution
! of the FORTRAN component
! @Execute
SUBROUTINE wecall(eroout, eroout_len, eroplot, eroplot_len, runoff, peakro, effdrn, npart, spg, outd)
! Add following line to enable JNA binding to FORTRAN component
USE, INTRINSIC :: ISO_C_BINDING
IMPLICIT NONE
! Add annotation comment lines for following five input variables
! Also add FORTRAN to C data type mapping declarations
! @Description("Erosion output name")
! @In
CHARACTER(kind = C_CHAR, len = eroout_len) :: eroout
INTEGER(C_INT), VALUE :: eroout_len
! @Description("Erosion plot name")
! @In
CHARACTER(kind = C_CHAR, len = eroplot_len) :: eroplot
INTEGER(C_INT), VALUE :: eroplot_len
! @In
REAL(C_FLOAT) :: runoff, peakro, effdrn
! @In
INTEGER(C_INT) :: npart
! @In
REAL(C_FLOAT), DIMENSION(npart) :: spg
! Add annotation comment line for following output variable
! @Out
REAL(C_FLOAT) :: outd
print *, runoff, peakro, effdrn
print *, npart
print *, spg
print *, eroout
print *, eroplot
print *, 'done...... '
outd = outd + 4.36
END SUBROUTINE
Adding Annotations and Source Code Modifications for OMS Compatibility (EXERCISE 3)
In the component example above, the following steps were taken to add annotations and modify the source code of the FORTRAN subroutine. (Note: this assumes that the FORTRAN code has been made F95 compliant).
- At the top add annotations as comment lines for description, author, keywords, version information, source information, license, and documentation.
- Add the @Execute annotation comment line before the subroutine statement.
- Add the "USE, INTRINSIC :: ISO_C_BINDING" statement.
- Add a description annotation comment line and an @In comment line before declaring the eroout and eroout_len variables.
- Add the FORTRAN to C data type mapping for the eroout and eroout_len variables.
- Repeat for the remaining input and output variables. (Note: previous corresponding input and output variable declarations would need to be commented out).
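As a compact illustration, the same steps can be applied to a second, much smaller routine. The sketch below is hypothetical: the subroutine name, its arguments, and the description texts are invented for illustration, and only the annotation names (@Description, @Author, @Execute, @In, @Out) and the ISO_C_BINDING usage are taken from the example above. Whether the OMS build system accepts this exact routine has not been verified here.

```fortran
! Hypothetical component, not part of OMS: it only mirrors the steps above.
! @Description ("Sums rainfall depths and converts the total to inches")
! @Author (name="example author")
! @Execute
SUBROUTINE rain_total_in(ndays, depth_mm, total_in)
  ! Provides the C-interoperable kinds used in the declarations below
  USE, INTRINSIC :: ISO_C_BINDING
  IMPLICIT NONE
  ! @Description("Number of daily values")
  ! @In
  INTEGER(C_INT) :: ndays
  ! @Description("Daily rainfall depth in millimetres")
  ! @In
  REAL(C_FLOAT), DIMENSION(ndays) :: depth_mm
  ! @Description("Total rainfall depth in inches")
  ! @Out
  REAL(C_FLOAT) :: total_in
  ! 25.4 mm per inch; the kind suffix keeps the literal at C_FLOAT precision
  total_in = SUM(depth_mm) / 25.4_C_FLOAT
END SUBROUTINE rain_total_in
```

The array length is declared before the array that depends on it, so the explicit-shape declaration refers only to names that are already typed.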
FORTRAN Coding Conventions for OMS
This section provides guidance for coding FORTRAN science components for model development using OMS. The conventions are listed in decreasing importance. Many are good practices with any language. They are divided into three categories: required, recommended, and encouraged:
Required | Aimed at ensuring portability, readability and robustness. |
Recommended | Good practices; there should be strong reasons for not adopting the convention. |
Encouraged | Compliance with this category is optional, but is encouraged for consistency purposes. |
General Good Practices
These practices generally improve the robustness of the code (for example, by checking interface compatibility) as well as its readability, maintainability, and portability.
- Encapsulation: use modules for procedures, functions, and data.
- Use dynamic memory allocation for optimal memory usage.
- Use derived types (structures), which generally lead to stable interfaces, optimal memory usage, and compactness.
- Use optional and keyword arguments when calling routines.
- Use the overloading capability for functions, subroutines, and operators.
- Use intrinsic functions for bit manipulation, array manipulation, kind definitions, etc.
Interoperability and Portability
Required
- Source code must conform to the ISO FORTRAN 95 standard.
- No use shall be made of compiler-dependent error specifier values (e.g. IOSTAT or STAT values).
- No compiler- or platform-dependent extensions shall be used.
- Source code must compile and run under gfortran, which is part of the GNU Compiler Collection.
Readability
Required
- Use free format syntax
- Use consistent indentation across the code. Each level of indentation should use at least two spaces.
- Use modules to organize source code.
- FORTRAN keywords (e.g., DATA) shall not be used as variable names.
- Use meaningful, understandable names for variables and parameters. Recognized abbreviations are acceptable as a means of preventing variable names getting too long.
- Each externally called function or subroutine should contain a header. The content and style of the header should be consistent across the system and should describe what the routine does, its arguments, and its author(s). For small subroutines, a few descriptive comments may replace the header.
- Magic numbers should be avoided; physical constants (e.g., pi, gas constants) should never be hardwired into the executable portion of a code; use PARAMETER statements instead.
- Hard-coded numbers should be avoided when passed through argument lists since a compiler flag, which defines a default precision for constants, cannot be guaranteed.
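The two conventions on constants can be sketched as follows. This is a minimal example under stated assumptions: the module name phys_constants, the kind name dp, and the function circle_area are hypothetical and do not come from OMS.

```fortran
! Hypothetical module: named constants replace magic numbers.
module phys_constants
  implicit none
  private
  public :: dp, pi, gas_constant, circle_area

  ! An explicit kind, so the precision of literals does not depend on
  ! a compiler flag that sets the default precision.
  integer, parameter :: dp = selected_real_kind(15, 307)

  real(dp), parameter :: pi = 3.141592653589793_dp
  real(dp), parameter :: gas_constant = 8.314462618_dp  ! J/(mol K)

contains

  ! The executable code refers to the named constant, never to 3.14...
  pure function circle_area(radius) result(area)
    real(dp), intent(in) :: radius
    real(dp) :: area
    area = pi * radius**2
  end function circle_area

end module phys_constants
```

A caller would pass literals with the same kind suffix, for example circle_area(2.0_dp), so that no hard-coded number of default precision crosses the argument list.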
Robustness
Required
- Use IMPLICIT NONE in all code (main programs, modules, etc.) to ensure correct size and type declarations of variables and arrays.
- Use PRIVATE in modules before explicitly listing the data, functions, and procedures that are to be PUBLIC. This ensures encapsulation of modules and avoids potential naming conflicts. The exception is a module entirely dedicated to PUBLIC data or functions (e.g., a module dedicated to constants).
- Initialize all variables. Do not assume machine default value assignments.
- Do not initialize variables of one type with values of another.
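A minimal sketch of these robustness rules, assuming a hypothetical module step_counter (the module and its procedure names are illustrative only): IMPLICIT NONE is set once for the module, everything is PRIVATE by default, and the module state is initialized explicitly with a value of its own type.

```fortran
! Hypothetical module illustrating IMPLICIT NONE, PRIVATE-by-default,
! and explicit initialization.
module step_counter
  implicit none
  private                          ! hidden unless listed as PUBLIC below
  public :: reset_steps, advance, steps_taken

  integer :: n_steps = 0           ! integer state, integer initial value

contains

  subroutine reset_steps()
    n_steps = 0
  end subroutine reset_steps

  subroutine advance()
    n_steps = n_steps + 1
  end subroutine advance

  function steps_taken() result(n)
    integer :: n
    n = n_steps
  end function steps_taken

end module step_counter
```

Code that uses the module can read the count only through steps_taken(); the variable n_steps itself is not visible outside the module.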
Recommended
- Do not use the operators == and /= with floating-point expressions as operands. Check instead the departure of the difference from a pre-defined numerical accuracy threshold (e.g. epsilon comparison).
- In mixed mode expressions and assignments (where variables of different types are mixed), the type conversions should be written explicitly (not assumed). Do not compare expressions of different types for instance. Explicitly perform the type conversion first.
- No include files should be used. Use modules instead, with USE statements in calling programs.
- Structures (derived types) should be defined within their own module. Procedures and functions that manipulate these structures should also be defined within this module, to form an object-like entity.
- Procedures should be logically flat: each should focus on one particular functionality, not several.
- Module PUBLIC variables (global variables) should be used with care and mostly for static or infrequently varying data.
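The first recommendation, comparing floating-point values against a tolerance instead of using == or /=, might look like the sketch below. The module name float_compare, the function nearly_equal, and the default tolerance (ten times machine epsilon, scaled by the larger operand) are assumptions chosen for illustration; a real model should pick a threshold that suits its data.

```fortran
! Hypothetical helper: tolerance-based comparison of two reals.
module float_compare
  implicit none
  private
  public :: nearly_equal

contains

  pure function nearly_equal(a, b, tol) result(same)
    real, intent(in) :: a, b
    real, intent(in), optional :: tol
    logical :: same
    real :: eps

    if (present(tol)) then
      eps = tol
    else
      ! Assumed default: a few units of machine epsilon, scaled by the
      ! magnitude of the operands (never below the epsilon of 1.0).
      eps = 10.0 * epsilon(a) * max(abs(a), abs(b), 1.0)
    end if
    same = abs(a - b) <= eps
  end function nearly_equal

end module float_compare
```

A test such as nearly_equal(total, 1.0) then replaces total == 1.0, and a caller that knows its accuracy requirements can pass tol explicitly.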
Encouraged
- Use parentheses at all times to control evaluation order in expressions.
- Use of structures is encouraged for a more stable interface and a more compact design. Refer to structure contents with the % sign (e.g. Absorbents%WaterVapor).
Arrays
Required
- Subscript expressions should be of type integer only.
- When arrays are passed as arguments, code should not assume any particular passing mechanism.
Recommended
- Use of arrays is encouraged as well as intrinsic functions to manipulate them.
- Assumed-shape arrays are fine for passing vectors/arrays to functions and subroutines.
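A short sketch of an assumed-shape argument combined with array intrinsics is given below. The module array_stats and the function mean_positive are hypothetical. Note that this style suits routines called from other FORTRAN code; the OMS-facing example earlier in this reference passes the array length explicitly (npart) instead.

```fortran
! Hypothetical helper: mean of the positive entries of a vector.
module array_stats
  implicit none
  private
  public :: mean_positive

contains

  ! values is assumed-shape: its extent travels with the array,
  ! so no separate size argument is needed.
  pure function mean_positive(values) result(mean)
    real, dimension(:), intent(in) :: values
    real :: mean
    integer :: n

    n = count(values > 0.0)
    if (n > 0) then
      ! Masked SUM replaces an explicit loop with an IF inside;
      ! the integer count is converted to REAL explicitly.
      mean = sum(values, mask = values > 0.0) / real(n)
    else
      mean = 0.0
    end if
  end function mean_positive

end module array_stats
```

Because the function lives in a module, callers get an explicit interface automatically, which assumed-shape dummy arguments require.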
Encouraged
- Declare DIMENSION for all non-scalars
Dynamic Memory Allocation / Pointers
Required
- Prefer allocatable arrays to pointers when possible, to minimize the risk of memory leaks and heap fragmentation.
- Use of pointers is allowed when declaring an array in a subroutine and making it available to a calling program.
- Always initialize pointer variables in their declaration statement using the NULL() intrinsic, for example: INTEGER, POINTER :: x => NULL()
- The preferable mechanism for dynamic memory allocation is automatic arrays, as opposed to ALLOCATABLE or POINTER arrays for which memory must be explicitly allocated and deallocated; space allocated using ALLOCATABLE or POINTER must be explicitly freed using the DEALLOCATE statement.
Recommended
- Always deallocate allocated pointers and arrays. This is especially important inside subroutines and inside loops.
- Always test the success of a dynamic memory allocation and deallocation - the ALLOCATE and DEALLOCATE statements have an optional argument to allow this.
- In a given program unit do not repeatedly ALLOCATE space, DEALLOCATE it and then ALLOCATE a larger block of space - this will almost certainly generate large amounts of unusable memory.
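The recommendations on testing and releasing allocations can be sketched as follows. The module work_buffer and the subroutine sum_sequence are hypothetical. The sketch checks only whether the STAT value is zero or non-zero, since the specific non-zero values are compiler dependent.

```fortran
! Hypothetical routine: allocate, use, and release a work array.
module work_buffer
  implicit none
  private
  public :: sum_sequence

contains

  subroutine sum_sequence(n, total, ierr)
    integer, intent(in)  :: n
    real,    intent(out) :: total
    integer, intent(out) :: ierr      ! 0 on success, non-zero on failure
    real, dimension(:), allocatable :: work
    integer :: i

    total = 0.0
    allocate(work(n), stat = ierr)
    if (ierr == 0) then
      do i = 1, n
        work(i) = real(i)
      end do
      total = sum(work)
      ! Release the space explicitly and report any failure to the caller.
      deallocate(work, stat = ierr)
    end if
  end subroutine sum_sequence

end module work_buffer
```

The block IF keeps a single return point, and the caller decides how to react to a non-zero ierr.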
Encouraged
- Use of dynamic memory allocation is encouraged. It makes code generic and avoids declaring with maximum dimensions.
- For simplicity, use Automatic arrays in subroutines whenever possible, instead of allocatable arrays.
Looping
Required
- Do not use GOTO to exit or cycle loops; use EXIT or CYCLE statements instead.
Recommended
- Do not use numbered DO loops (e.g., DO 10 ... 10 CONTINUE).
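A minimal sketch of a loop written with DO ... END DO, EXIT, and CYCLE in place of statement labels and GOTO. The module loop_control, the function name, and the construct name walk are hypothetical.

```fortran
! Hypothetical example: sum the even entries up to the first negative one.
module loop_control
  implicit none
  private
  public :: sum_even_until_negative

contains

  pure function sum_even_until_negative(values) result(total)
    integer, dimension(:), intent(in) :: values
    integer :: total
    integer :: i

    total = 0
    walk: do i = 1, size(values)
      if (values(i) < 0) exit walk             ! leave the loop, no GOTO
      if (mod(values(i), 2) /= 0) cycle walk   ! skip odd entries
      total = total + values(i)
    end do walk
  end function sum_even_until_negative

end module loop_control
```

Naming the construct makes it explicit which loop an EXIT or CYCLE refers to, which helps when loops are nested.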
Functions/Procedures
Required
- The SAVE statement is discouraged; use module variables for state saving.
- Do not use the ENTRY statement in a function subprogram.
- Functions must not have pointer results.
- The names of intrinsic functions (e.g., SUM) shall not be used for user-defined functions.
- Procedures that return a single value should be functions; note that single values could also be user-defined types.
- All communication with a module should go through the argument lists of its procedures or through its module variables.
Recommended
- All dummy arguments, except pointers, should include the INTENT clause in their declaration.
- Limit use of type-specific intrinsic functions (e.g., AMAX1, DMAX1); use the generic MAX in all cases.
- Avoid statically dimensioned array arguments in a function/subroutine.
- Check for invalid argument values.
Encouraged
- Error conditions. When an error condition occurs inside a function/procedure, a message describing what went wrong should be printed. The name of the routine in which the error occurred must be included. It is acceptable to terminate execution within a package, but the developer may instead wish to return an error flag through the argument list.
- Functions/procedures that perform the same operation on arguments of different types or sizes should be overloaded, to minimize duplication and ease maintenance.
- When explicit interfaces are needed, use modules, or contain the subroutines in the calling programs (through CONTAINS statement), for simplicity.
- Do not use external routines as these need interface blocks that would need to be updated each time the interface of the external routine is changed.
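Overloading through a generic interface, together with INTENT declarations and the generic MAX/MIN intrinsics, can be sketched as follows. The module clamp_mod and the generic name clamp are hypothetical.

```fortran
! Hypothetical example: one generic name, two type-specific procedures.
module clamp_mod
  implicit none
  private
  public :: clamp                  ! only the generic name is exported

  interface clamp
    module procedure clamp_int, clamp_real
  end interface

contains

  pure function clamp_int(x, lo, hi) result(y)
    integer, intent(in) :: x, lo, hi
    integer :: y
    y = max(lo, min(x, hi))        ! generic MAX/MIN, no type-specific names
  end function clamp_int

  pure function clamp_real(x, lo, hi) result(y)
    real, intent(in) :: x, lo, hi
    real :: y
    y = max(lo, min(x, hi))
  end function clamp_real

end module clamp_mod
```

Callers write clamp(...) for either type and the compiler selects the matching procedure. Because the procedures are in a module, their interfaces are explicit without any hand-written interface blocks for external routines.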
I/O
Required
- I/O statements on external files should contain the status specifier parameters err=, end=, iostat=, as appropriate.
- All global variables, if present, should be set at the initialization stage.
Recommended
- Avoid using NAMELIST I/O if possible.
- Use write rather than print statements for non-terminal I/O.
- Use character parameters or explicit format specifiers inside the READ or WRITE statement. Do not use labeled FORMAT statements (outdated).
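The I/O conventions above might be combined as in the sketch below. The module report_io, the subroutine write_report, the format string, and the unit number 27 are all hypothetical choices. Each statement on the external file carries an iostat= specifier, only zero versus non-zero is tested, and the format is a character parameter instead of a labeled FORMAT statement.

```fortran
! Hypothetical routine: write one labelled value to a text file.
module report_io
  implicit none
  private
  public :: write_report

  ! Format held in a character parameter, not a labeled FORMAT statement.
  character(len=*), parameter :: row_fmt = '(a, 1x, f10.3)'

contains

  subroutine write_report(filename, label, value, ok)
    character(len=*), intent(in)  :: filename, label
    real,             intent(in)  :: value
    logical,          intent(out) :: ok
    integer, parameter :: unit_no = 27   ! assumed to be a free unit
    integer :: ios

    ok = .false.
    open(unit = unit_no, file = filename, status = 'replace', &
         action = 'write', iostat = ios)
    if (ios == 0) then
      write(unit_no, row_fmt, iostat = ios) label, value
      ok = (ios == 0)
      close(unit_no, iostat = ios)
      ok = ok .and. (ios == 0)
    end if
  end subroutine write_report

end module report_io
```

Keeping file output in its own routine also follows the guideline of separating I/O components from computational ones.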
FORTRAN Features that are obsolescent and/or discouraged
Required
- No Common blocks. Modules are a better way to declare/store static data, with the added ability to mix data of various types, and to limit access to contained variables through use of the ONLY and PRIVATE clauses.
- No assigned and computed GO TOs - use the CASE construct instead
- No arithmetic IF statements - use the block IF construct instead
- Avoid DATA, ASSIGN, labeled DO, BACKSPACE, blank COMMON, and BLOCK DATA.
- Use REAL instead of DOUBLE PRECISION.
- No branching to an END IF from outside its block IF.
- No DO loops with non-integer control variables.
- No Hollerith constants.
- No PAUSE statements.
- No multiple RETURN statements.
- No alternate RETURN.
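As an illustration of the suggested replacements, the sketch below uses a CASE construct where older code might use a computed GO TO, and a block IF where it might use an arithmetic IF. The module soil_class, its functions, and the code-to-name mapping are invented for this example.

```fortran
! Hypothetical example: structured replacements for obsolescent branching.
module soil_class
  implicit none
  private
  public :: class_name, sign_label

contains

  ! CASE construct in place of a computed GO TO on an integer code.
  pure function class_name(code) result(name)
    integer, intent(in) :: code
    character(len=7) :: name
    select case (code)
    case (1)
      name = 'sand'
    case (2)
      name = 'silt'
    case (3)
      name = 'clay'
    case default
      name = 'unknown'
    end select
  end function class_name

  ! Block IF in place of an arithmetic IF with three label targets.
  pure function sign_label(x) result(label)
    integer, intent(in) :: x
    character(len=8) :: label
    if (x < 0) then
      label = 'negative'
    else if (x == 0) then
      label = 'zero'
    else
      label = 'positive'
    end if
  end function sign_label

end module soil_class
```

Both functions have a single exit point and no statement labels, and the CASE DEFAULT branch handles unexpected codes explicitly.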
Recommended
- Do not use the EQUIVALENCE statement, especially for variables of different types. Use pointers or derived types instead.
Encouraged
- Do not implicitly change the shape of an array when passing it into a subroutine. Although forbidden by the standard, it was very common practice in FORTRAN 77 to pass n-dimensional arrays into a subroutine where they would, say, be treated as a one-dimensional array. This practice, though banned in FORTRAN 90, is still possible with external routines for which no interface block has been supplied. It only works because of assumptions made about how the data is stored.
Source Files
Required
- Document the function interface: argument name, type, unit, description, constraints, and defaults.
- The INCLUDE statement shall not be used; use the USE statement instead.
Recommended
- Try to limit source column length, including comments, to 80 columns (or follow language specific limits).
- A component should not exceed 300-500 effective lines of code; be efficient with your coding.
- Use blank lines (or lines with a standard character in column 1) to separate statement blocks to improve code readability.
- Apply consistent indentation method for code.
- Module/subprogram names shall be lower case; the name of a file containing a module/subprogram shall be the module/subprogram name with the suffix *.f90.
Encouraged
- Clearly separate declaration of argument variables from declaration of local variables.
- Use descriptive and unique names for variables and subprograms (so as to improve the code readability and facilitate global string search)
- Try to limit name lengths to 12-15 characters.
- Indent continuation lines to ensure that, for example, parts of a multi-line equation line up in a readable manner.
- Start comment text with a standard character (e.g. !, C, etc.); if a stand-alone line then start comment character in the first column.
General Coding Guidelines
- Reduce or eliminate global variable usage.
- Attempt to limit the number of arguments in argument list - long lists make it hard to reuse.
- Limit of only one return point per component.
- Use exceptions as error indicators if supported.
- Components should be specific to one and only one purpose.
- Components with side effects are not allowed (e.g., don't mix I/O code with computational code).
- Program against a standard (e.g., ANSI C, C++, Java, FORTRAN 77/90/95).
- Make sure your code compiles under different compilers and platforms.
- Use preprocessor directives for adaptation to different architectures/compilers/OS.
- Make I/O specific components separate from computational components.
- Avoid static allocation of data (compile time allocation).
- Be as specific as possible with your data types.
- Avoid using custom data types for argument types.