ICON Community Interface 0.4.0
The Community Interface (ComIn) organizes the data exchange and simulation events between the ICON model and "3rd party modules". The concept can be logically divided into an Adapter Library and a Callback Register.
Code contributions from different researchers and institutions ("third-party code") are usually not included in the main ICON code, but remain confined to project branches. In any case, they add specific switches and calls to ICON's main loop, making the model code less readable. Additional maintenance is required to keep the third-party code compatible with new versions of ICON. These problems are solved by providing a unified plugin interface. While the core model remains unchanged, third-party code can be run alongside ICON, even if it is implemented in a programming language other than Fortran.
Clarification of terms I: In this document, we use the phrase "3rd party module" and the term "plugin" interchangeably.
Clarification of terms II: There is a fundamental difference between this community interface and a coupling software, e.g. YAC: A coupler technically moves the data between interacting component models. However, this does not solve the question of how to add this interaction in a non-intrusive way to ICON. This is the purpose of the community interface, which exposes ICON's data structures in an organized way and controls what, how and when foreign functions are called and data is exchanged. The concept of a community interface and the coupling software complement each other: One may think of the coupling software as the technical sub-layer when, for example, interpolation or parallel communication is required.
The adapter library allows the 3rd party module(s) to be built separately from the ICON model.
The adapter library is implemented in Fortran 2003, but interfaces are provided for plugins that are written in C/C++ and Python.
In order to support this language interoperability, the BIND(C) attribute is required for some publicly accessible Fortran data structures. This also implies that all public procedures of ComIn have to be non-type-bound, because calling type-bound procedures via ISO C bindings is not supported by the Fortran standard. Some internal types containing POINTERs, ALLOCATABLEs or CHARACTERs are not defined with the BIND(C) attribute. Instead, access functions to the components of these data types are provided.
Note that combining ICON ComIn with a coupler software already offers another technical solution for language interoperability: Through an adapter, the ComIn mechanism can be used to feed an externally running receiver process with ICON data. This software may be written in C.
- REAL(dp), REAL(sp) or INTEGER arrays only.
- Fields defined on native cells, vertices or edges can be accessed, but arrays related to interpolated latitude-longitude grids are not exposed to the adapter library.
Not yet implemented in the current version of ICON ComIn:
The ComIn library uses semantic versioning (https://semver.org), which encodes a version by a three-part version number (Major.Minor.Patch). As a convention, the major version has to match between ComIn, the ICON model, and the 3rd party modules for correct interaction. The minor version should be backward compatible.
Example: A 3rd party module using ComIn v1.1 capabilities should still work with ComIn v1.2.
As many components of the development are still in the testing phase, the initial public release is set to version number 0.1.0.
Both the 3rd party modules and the host model (ICON) are built independently and may be built against different versions of ICON ComIn. ComIn uses the SONAME to ensure that the library that is loaded at runtime is compatible with the version that was used at compile time. For example, libcomin.so.1 is the library name of ComIn version 1.x.y. When loading a 3rd party module, it is explicitly checked that the major version of the ComIn library used by the host model matches that of the library used by the 3rd party library.
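The major-version rule can be illustrated with a minimal Python sketch. It is not part of ComIn: the names Version, parse_version, soname and is_compatible are hypothetical, and the version strings are made-up examples. The sketch only assumes what is stated above, namely that the SONAME carries the major version and that host and plugin must agree on it.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Semantic version number (Major.Minor.Patch)."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a string such as '1.2.0' into its three components."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def soname(version: Version) -> str:
    """Library name that encodes only the major version."""
    return f"libcomin.so.{version.major}"


def is_compatible(host: Version, plugin: Version) -> bool:
    """Host and plugin interact correctly only if the major versions match."""
    return host.major == plugin.major


host = parse_version("1.2.0")        # version loaded by the host (example value)
plugin_ok = parse_version("1.1.3")   # plugin built against an older minor version
plugin_bad = parse_version("2.0.0")  # plugin built against another major version

print(soname(host), is_compatible(host, plugin_ok), is_compatible(host, plugin_bad))
```

Minor and patch numbers are deliberately ignored in is_compatible, mirroring the explicit check on the major version described above.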
The user can obtain the version information for the ComIn library that is used at runtime (loaded by the host model at startup) by calling the function comin_setup_get_version(), which returns an object of type t_comin_setup_version_info:
Data structures for the transfer of data between the host and the plugin are allocated once by the host. A pointer to these data structures is propagated to each of the plugins. This approach fails if the plugin assumes a different structure for the data due to a different ComIn version at build time. This means: Changes in the ComIn data state result in ABI incompatibility, and therefore this requires a change of the major version number.
Of course, there is also the issue of API incompatibility, where functions and/or interfaces change. These changes also result in a change of the major version number.
Note to ComIn developers: Introducing new global module variables with the intention of transferring data between the host and the plugin outside of the ComIn state module may corrupt the above mechanism! Therefore global module variables should be generally avoided.
As a replacement for a namespace functionality, which is not available in the Fortran programming language, ComIn uses the prefix comin_* for all modules and public entities (internally and externally public). The other part of the name follows this naming convention:
- comin_<scope>_<method>
- comin_<scope>_<description>
- t_comin_<scope>_<description>.
The naming element <scope> classifies the general context the object is used for. List of scopes (non-exhaustive): setup, current, callback, parallel, descrdata, var, errhandler. The meaning of these different scopes will become clear from the descriptions below.
Although ComIn is designed specifically for ICON, the code should remain agnostic of the host model and the attached third party plugins. Thus, the driving host model (ICON) is simply referred to by host in the code and the third party plugins are referred to by plugin.
- USE and PUBLIC statements. The convention is that, on the ICON side, no module other than comin_host_interface may be used.
- USE and PUBLIC statements. The convention is that, on the third party plugin side, no module other than comin_plugin_interface may be used.
By default ComIn calls comin_errhandler::comin_plugin_finish on error, which is an exposed ICON subroutine (reverse callback): a function pointer to ICON's finish routine. It is initialized by the host model through the subroutine comin_setup_errhandler(). This setting is mandatory and must be done before calling comin_setup_check().
C and Fortran plugins have the option to manage errors independently by setting comin_errhandler::comin_error_set_errors_return to .TRUE.. If this is the case, ComIn will not automatically call comin_errhandler::comin_plugin_finish. Instead, the API call will return. The plugin then needs to verify the success of every call by obtaining the error code with comin_errhandler::comin_error_get and comparing it to COMIN_SUCCESS. The relevant error message can be queried with comin_errhandler::comin_error_get_message. After checking the error, the error state must be reset with comin_errhandler::comin_error_reset.
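The check-and-reset sequence can be emulated in a few lines of Python. This is only a language-neutral sketch of the protocol, not the ComIn API: the class ErrorState, its method report, the error code ERROR_VAR_NOT_FOUND and the numeric value of COMIN_SUCCESS are all assumptions made for illustration. In a real plugin the corresponding Fortran or C routines named above would be called instead.

```python
COMIN_SUCCESS = 0        # hypothetical numeric value
ERROR_VAR_NOT_FOUND = 1  # hypothetical error code, for illustration only


class ErrorState:
    """Toy model of an error state with an optional 'errors return' mode."""

    def __init__(self, finish):
        self._finish = finish       # stand-in for the host's finish routine
        self.errors_return = False  # default: abort through finish on error
        self._code = COMIN_SUCCESS
        self._message = ""

    def report(self, code, message):
        """Record an error; abort unless the plugin handles errors itself."""
        self._code, self._message = code, message
        if not self.errors_return:
            self._finish(message)

    def get(self):
        return self._code

    def get_message(self):
        return self._message

    def reset(self):
        self._code, self._message = COMIN_SUCCESS, ""


finish_calls = []
state = ErrorState(finish=finish_calls.append)
state.errors_return = True  # the plugin takes over error handling

state.report(ERROR_VAR_NOT_FOUND, "variable not found")  # a failing API call

handled = None
if state.get() != COMIN_SUCCESS:   # 1. compare the error code
    handled = state.get_message()  # 2. query the message
    state.reset()                  # 3. reset the error state
```

Forgetting the reset in step 3 would leave the stale error code in place, so that the next successful call would look like a failure.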
In Python, errors are translated automatically into Python exceptions of type comin.ComInError, which can be handled with the try statement. If an exception is not caught, the plugin_adapter calls comin_errhandler::comin_plugin_finish and prints the traceback of the exception.
Access period: Access to model variables is possible in the 3rd party module's secondary constructor (see below) and all subsequent subroutine callbacks. The contents of the model variables might change between callbacks.
The following information is required to describe (and uniquely identify) a model variable in ICON:
Encapsulation of this information into a (constant) data structure of the data type t_comin_var_descriptor is necessary for two reasons: a) iterating over the list of available variables is simplified, and b) future extensions, e.g. to lat-lon variables, are possible without changing 3rd party code. Descriptors must be created by calling the constructor t_comin_var_descriptor(name, id) to allow for these future extensions and to bring internal fields into a consistent state.
Remarks:
- add_ref in ICON. Unique names are required here, while information about a variable's nature can be retrieved from its metadata (see section Metadata).
- t_comin_var_descriptor denotes an ICON variable and does not contain information about a specific 3rd party module. As a consequence, new variables that are added to ICON have to be unique, and this also applies in the case of multiple active plugins. Conflicting variables between different modules can result in a runtime abort (more details are described below in the section Creating additional model variables).
The variable descriptor is stored alongside the C pointer, the device_ptr, the metadata required to convert the C pointer into a Fortran pointer, and additional metadata in a data structure of the internal type t_comin_var_item:
The list of available (model) variables is managed in an internal data structure of the adapter library (variable list comin_var_list in t_comin_state). Note that the derived data type t_comin_var_item is not exposed to the host model or the plugins. ComIn plugins, for example, can access the data members via the subroutines comin_var_get_descr_list_head(), comin_var_get(), and comin_metadata_get(), see Iteration and Metadata.
The ICON model (host code) accesses the list of exposed variables with procedures for adding variables, and for removing the entire variable list, freeing the memory.
Access to ICON data fields happens via an accessor function comin_var_get. This subroutine is intended to be called in the secondary constructor of the 3rd party module (see Secondary constructor). It may not be called at an earlier or later time, and it serves the purpose of associating internal variable pointers of the 3rd party module to the ICON internal memory.
In the Fortran interface, variables of the datatypes REAL(dp), REAL(sp) or INTEGER can be accessed. For every datatype a corresponding variable type identifier exists: COMIN_VAR_DATATYPE_DOUBLE, COMIN_VAR_DATATYPE_FLOAT and COMIN_VAR_DATATYPE_INT. Internally, ComIn stores the pointer as a C pointer. Upon request, when calling handle%get_ptr, the C pointer is converted to a Fortran pointer of the respective type. This indirection makes it possible to switch between "old" and "new" time levels: The distinction between "old" and "new" (nnow, nnew) states, which is available in ICON for some data fields, is not exposed to the adapter library. Instead, for these fields the exposed pointers are always associated with the latest modified state. To access an "old" time level, 3rd party modules should allocate local buffers. Basically, comin_var_get(context, var_descriptor, flag, var_pointer) returns a handle of type t_comin_var_handle (e.g., handle) to access a 5-dimensional pointer via a type-bound procedure handle%get_ptr (more on the dimensions at the end of this subsection). The user needs to pass the correct var_pointer type to handle%var_get(var_pointer). To determine the variable's datatype, the metadata "datatype" can be examined, which takes one of the values COMIN_VAR_DATATYPE_DOUBLE, COMIN_VAR_DATATYPE_FLOAT or COMIN_VAR_DATATYPE_INT. The metadata is accessed by the type-bound procedure (function)
In the C interface, variables of type double, float and integer are shared, with the corresponding ComIn data types COMIN_VAR_DATATYPE_DOUBLE, COMIN_VAR_DATATYPE_FLOAT or COMIN_VAR_DATATYPE_INT. Variable data is obtained in a two-step approach. First, an opaque handle (t_comin_var_handle*) is returned from the function comin_var_get. Second, the required information can be accessed via accessor functions, e.g. for the data (comin_var_get_ptr_double and variants), the datatype (comin_var_get_type) and the dimensions (comin_var_get_shape).
The Python function comin.var_get also returns a handle to the variable data. To access this data, the handle object exposes the Buffer Protocol, which makes it possible to access the underlying data, for example, with the numpy.asarray function.
Remark: To ensure that you operate on the correct time slice of the variable, the function comin_var_get_ptr_double (and variants, C interface) should be called in the entrypoint callback where it is used. Similarly, in Python, the handle needs to be converted to a numpy array in every callback. Do not store the data pointer or numpy array in global memory and access it from multiple entrypoint callbacks!
Remark: Using to_xarray() converts ComIn variables to dimension-labeled xarray DataArrays with preserved metadata, so that analysis and visualization do not require tracking the dimension semantics manually. This simplifies workflows for diagnostics and post-processing while maintaining compatibility with the broader scientific Python ecosystem.
Remark (array blocking). In ICON, for reasons of cache efficiency nearly all DO loops over grid cells, edges, and vertices are organized in two nested loops: "jb loops" and "jc loops". Often, the outer loop jb is parallelized with OpenMP. With respect to the data layout, this means that arrays are split into several chunks of a much smaller length nproma. This array blocking is exposed via ComIn.
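The blocked layout can be sketched in plain Python. The function names block_layout and blocked_index and the variable last_len are hypothetical, and 0-based indices are used here although ICON itself counts from 1. The sketch only shows how a flat cell index relates to the pair (jc, jb) for a given nproma and how the nested jb/jc loops visit every cell exactly once.

```python
from typing import Tuple


def block_layout(ncells: int, nproma: int) -> Tuple[int, int]:
    """Return the number of blocks and the length of the last block (ncells > 0)."""
    nblks = (ncells - 1) // nproma + 1
    last_len = ncells - (nblks - 1) * nproma
    return nblks, last_len


def blocked_index(i: int, nproma: int) -> Tuple[int, int]:
    """Map a 0-based flat cell index to the 0-based pair (jc, jb)."""
    jb, jc = divmod(i, nproma)
    return jc, jb


ncells, nproma = 10, 4  # example sizes, chosen for illustration
nblks, last_len = block_layout(ncells, nproma)

visited = []
for jb in range(nblks):                                # outer "jb loop"
    length = nproma if jb < nblks - 1 else last_len    # last block may be shorter
    for jc in range(length):                           # inner "jc loop"
        visited.append(jb * nproma + jc)
```

The last block is usually only partly filled, which is why loop bounds for the inner loop have to be computed per block rather than assumed to be nproma.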
The index ordering of the dimensions of the variable is defined within the ICON model and may change between different versions of the community interface. The array dimensions can be interpreted using a special array dim_semantics(5), which is accessible via a procedure of the type t_comin_var_handle. It specifies the ordering of the dimensional indices with the help of the following integer constants: COMIN_DIM_SEMANTICS_NPROMA, COMIN_DIM_SEMANTICS_BLOCK, COMIN_DIM_SEMANTICS_UNBLOCK, COMIN_DIM_SEMANTICS_LEVEL, COMIN_DIM_SEMANTICS_CONTAINER, COMIN_DIM_SEMANTICS_OTHER, COMIN_DIM_SEMANTICS_UNUSED. For tracer fields the constant COMIN_DIM_SEMANTICS_CONTAINER indicates the position of the tracer slice dimension (see below). Note that the index positions are translated to 0-based indexing for the C/C++ and the Python interfaces of ComIn.
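How such a dim_semantics array might be evaluated is sketched below in Python. The enumeration DimSemantics merely stands in for the COMIN_DIM_SEMANTICS_* constants; its numeric values, the helper position_of, the example ordering and the example shape are assumptions and are not taken from ComIn.

```python
from enum import IntEnum


class DimSemantics(IntEnum):
    """Stand-ins for the COMIN_DIM_SEMANTICS_* constants (values are hypothetical)."""
    NPROMA = 1
    BLOCK = 2
    UNBLOCK = 3
    LEVEL = 4
    CONTAINER = 5
    OTHER = 6
    UNUSED = 7


def position_of(dim_semantics, wanted):
    """Return the 0-based position of a dimension, or None if it is absent."""
    try:
        return list(dim_semantics).index(wanted)
    except ValueError:
        return None


# Example ordering (jc, jk, jb) padded to five dimensions; an assumption for illustration.
dim_semantics = [DimSemantics.NPROMA, DimSemantics.LEVEL, DimSemantics.BLOCK,
                 DimSemantics.UNUSED, DimSemantics.UNUSED]
shape = (8, 60, 3, 1, 1)  # made-up array shape matching the ordering above

level_pos = position_of(dim_semantics, DimSemantics.LEVEL)
nlev = shape[level_pos] if level_pos is not None else 1
```

Looking up the position instead of hard-coding it keeps plugin code valid if the index ordering changes in a later version of the interface.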
Convenience function handle%to_3d (a type-bound procedure) for accessing 2D/3D fields: In practice, access to fields can be simplified, provided that the sequence of dimensions is (jc,jk,jb). This exact dimension sequence is (currently) fulfilled by the ICON model. In this case, a 3D pointer variable REAL(wp) :: slice(:,:,:) can be generated directly via a type-bound procedure of the type TYPE(t_comin_var_handle)
where * stands for the respective type REAL(dp), REAL(sp) or INTEGER.
The Python interface implements this as a field property:
A similar function is available for C/C++:
Here, the additional restriction holds that array slices for (jc,jk,jb) have to be stored contiguously in memory, because only in this case can these variables be expressed by a simple base pointer.
An ICON model variable is always requested within a context, i.e. an entry point where the model variable is accessed (named integer constant, see the section "Entry points" below). The accessor function comin_var_get accepts a list of (possibly) multiple entry points: INTEGER, INTENT(IN) :: context(:)
Code example:
Important note: The subroutine comin_var_get is called in the secondary constructor, but cannot use EP_SECONDARY_CONSTRUCTOR in the context argument. The access for the context EP_SECONDARY_CONSTRUCTOR is excluded for this subroutine, since the variables of the host model do not have to be formally assigned with meaningful values at the time of execution of the secondary constructor.
The optional argument flag provides information w.r.t. the data flow. Flags may be combined like flag = IOR(COMIN_FLAG_READ, COMIN_FLAG_WRITE). Technically, this can be realized as follows:
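A minimal Python sketch of this kind of bitwise flag combination is given here. The class Flag and its bit values are hypothetical stand-ins for the COMIN_FLAG_* constants; only the use of a bitwise OR to combine flags and a bitwise AND to test them is meant to carry over.

```python
from enum import IntFlag


class Flag(IntFlag):
    """Stand-ins for COMIN_FLAG_* constants; the bit values are assumptions."""
    NONE = 0
    READ = 1
    WRITE = 2
    SYNC_HALO = 4
    DEVICE = 8


def has_flag(flag: int, wanted: int) -> bool:
    """Counterpart of IAND(flag, wanted) == wanted in Fortran."""
    return (flag & wanted) == wanted


# Counterpart of IOR(COMIN_FLAG_READ, COMIN_FLAG_WRITE)
flag = Flag.READ | Flag.WRITE

print(int(flag), has_flag(flag, Flag.WRITE), has_flag(flag, Flag.SYNC_HALO))
```

Because every flag occupies its own bit, any subset of flags fits into a single integer argument.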
- COMIN_FLAG_SYNC_HALO can be used in combination with COMIN_FLAG_READ and COMIN_FLAG_WRITE to halo-synchronize the corresponding ComIn pointer. Please note that synchronization is currently supported only for cell-based variables. This flag triggers the generic interface sync_patch_array in ICON. For development purposes, it is also possible to set this flag in your plugin and run the plugin using the replay tool as a host model. In this scenario, we provide the exchange map for the corresponding domain using the YAXT library and perform the halo exchange based on this redistribution pattern. (Note that you need to set the CMake option COMIN_ENABLE_YAXT for that feature of the replay tool.)
- COMIN_FLAG_DEVICE indicates that the plugin will access the device pointer. If this flag is passed, ComIn will not synchronize the data from the device to the host before the callbacks are called.
In some modules of the ICON code, e.g. the tracer module, there is a need for handling multiple variables at once, located in contiguous storage. These are called model variable containers. For tracer fields (or possibly other container variables) the procedure of the handle of type t_comin_var_handle returns an array pointer to the slice of the container in which the tracer lives.
Alternatively, in order to access the container array itself, the subroutine comin_var_get may be called directly for the container variable "tracer". More precisely, the Fortran language API returns a handle of type TYPE(t_comin_var_handle). This provides access by type-bound procedures for individual tracer fields: e.g. for qv, the slice index ncontained, which corresponds to the tracer's position in the container array, is accessed via handle%ncontained(). For the container array, indicated by the logical flag handle%lcontainer()=.TRUE., the tracer's slice equals tracer%ptr(:,:,:,qv%ncontained,:). The position of the slice index dimension is indicated by the presence of the COMIN_DIM_SEMANTICS_CONTAINER constant in the tracer%dim_semantics array. Note that tracer variables in ICON have multiple time levels.
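The selection of a tracer slice from a container can be sketched in Python as the construction of an index tuple. The string labels, the function container_slice and the example ordering are hypothetical; the resulting tuple uses 0-based positions and could be applied to any array type that accepts tuple indexing.

```python
CONTAINER = "container"  # stand-in for COMIN_DIM_SEMANTICS_CONTAINER


def container_slice(dim_semantics, ncontained):
    """Build an index tuple that picks one tracer out of a container array.

    Every dimension is taken in full (':' in Fortran notation) except the
    container dimension, which is fixed to the tracer's slice index.
    """
    if CONTAINER not in dim_semantics:
        raise ValueError("variable is not stored in a container")
    return tuple(ncontained if sem == CONTAINER else slice(None)
                 for sem in dim_semantics)


# Example ordering with the container dimension in fourth place (an assumption).
dim_semantics = ["nproma", "level", "block", CONTAINER, "unused"]

# 0-based counterpart of ptr(:,:,:,ncontained,:) for a tracer at slice index 2
index = container_slice(dim_semantics, 2)
```

Deriving the index tuple from dim_semantics avoids hard-coding the position of the container dimension.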
By setting the logical metadata switches tracer_turb and tracer_conv (see section on metadata for more information), tracers requested by a plugin can be added to the calculation of turbulent or convective transport tendencies. Please note the following remarks:
- inwp_turb=1/inwp_convection=1. There are no checks done for this.
- comin_request_add_var with tracer_turb=.TRUE. and/or tracer_conv=.TRUE. requests an add_var of a variable for the respective tendency in addition. Pointers to these additional tendency variables can be accessed by plugins like any other variable. The naming conventions are ddt_<tracername>_turb and ddt_<tracername>_conv. Please note that in ICON these tendency variables are stored in containers. As a tracer is not necessarily subject to convective or turbulent transport, the indexing of the different containers might differ.
- inwp_turb=1 is chosen as zero surface value. This results in a maximum flux towards the surface. A flux limiter might be appropriate.
It can be dangerous for a third party module to request an ICON field which is only diagnosed at output time steps. A known example is the mean sea level pressure pres_msl. If such a field is used as input for additional computations, results will depend on the output frequency specified in the ICON namelist. Registering such a field for restart in order to save its state is of course possible, but will not solve the problem. Currently, the only solution would be to manually set the field's update frequency in the ICON code. Even worse, there is no metadata flag by which the third party module could check whether the requested field is such a problematic 'output-only' field.
Variable descriptors can be explicitly specified, but there is also the possibility to iterate over a linked list of exposed ICON variables. Note that this mechanism is not related to the structure of ICON's internal variable lists. The linked list is implemented in the module comin_variables as an opaque list type with elements of TYPE(t_comin_var_descriptor). It is encapsulated by API functions comin_var_get_descr_list_head(), comin_var_get_descr_list_next(), etc. The iteration can be done by starting at the list's head, iterating to the linked list items, and accessing the variable through the t_comin_var_descriptor as defined above.
Code example:
Remark: The descriptor list iterator is implicitly deallocated by the comin_var_get_descr_list_next() function (when reaching the end of the list). However, for the sake of completeness, an explicit destructor function comin_var_descr_list_iterator_delete() is also provided. This happens for the Fortran and C API only, because the Python API hides the list iterator behind a "Pythonic" list interface.
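The head/next iteration pattern can be illustrated with a small Python model of a linked list. None of the names below belong to ComIn: VarDescriptor, ListItem, build_list and iterate are hypothetical, and the variable names in the example are made up. In the actual Python API the list iterator is hidden, as noted above.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class VarDescriptor:
    """Toy descriptor: a variable name plus a logical domain ID."""
    name: str
    id: int


@dataclass
class ListItem:
    """One element of a singly linked list of descriptors."""
    descriptor: VarDescriptor
    next: Optional["ListItem"] = None


def build_list(descriptors: List[VarDescriptor]) -> Optional[ListItem]:
    """Return the head of a linked list holding the given descriptors."""
    head = None
    for descriptor in reversed(descriptors):
        head = ListItem(descriptor, head)
    return head


def iterate(head: Optional[ListItem]) -> Iterator[VarDescriptor]:
    """Start at the head and follow the 'next' links until the list ends."""
    item = head
    while item is not None:
        yield item.descriptor
        item = item.next


head = build_list([VarDescriptor("temp", 1), VarDescriptor("pres", 1),
                   VarDescriptor("u", 2)])
names = [descriptor.name for descriptor in iterate(head)]
```

The loop terminates when the link to the next item is empty, which corresponds to reaching the end of the list in the Fortran and C APIs.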
ICON's physical parameterizations require a multitude of input parameters and variables. The latter are often derived from a different parameterization or from an external data set. For some applications, a ComIn plugin could alternatively provide these input variables. This is currently implemented in ICON for:
Aerosol input to ecRad radiation: Choosing the option irad_aero=3 (externally specified aerosol) in ICON's &radiation_nml adds four new 4D fields to ICON's global memory: optical depth long wave (od_lw), optical depth short wave (od_sw), single scattering albedo short wave (ssa_sw) and asymmetry parameter short wave (g_sw), which can be requested and specified by a ComIn plugin. Note that the fourth dimension nbands of these variables depends on the choice for the ecRad gas optics (ecrad_igas_model in ICON's &radiation_nml).
Gas input to ecRad radiation: The input source of gaseous concentrations can be chosen with the ICON namelist parameters irad_h2o for water vapor, irad_o3 for ozone, irad_co2 for carbon dioxide, irad_n2o for nitrous oxide, irad_ch4 for methane, irad_o2 for oxygen, irad_cfc11 for trichlorofluoromethane and irad_cfc12 for dichlorodifluoromethane in the radiation_nml. For all of these options, choosing -1 (e.g., irad_h2o=-1) allows for an external specification of the gaseous concentrations. This option has been implemented with ComIn applications in mind. Choosing this option, a variable <gas>rad_ext (e.g., h2orad_ext) is created by ICON which can be accessed and filled with values by a ComIn plugin. There is no cross-check that the arrays contain meaningful values.
A list of to-be-created variables is built by the primary constructor of the 3rd party module (see below) and made known to the ICON model via the adapter library function comin_var_request_add(). The add_var and add_ref functions from the ICON model are not directly exposed.
Remarks:
- nlev or nlev+1 levels, and be cell, edge or vertex-based. The index ordering may change between different versions of the community interface.
- comin_var_get (a return value var_pointer /= NULL means "success").
- comin_var_get as described above. In other words: on the plugin side, note that calling the procedure comin_var_request_add does not immediately create a variable that the plugin can use directly afterwards. Instead, this step only registers a new variable, which must be queried, like all other variables, with the function comin_var_get.
- -1 (meaning all domains) as part of the var_descriptor for variables with tracer = .true..
The syntax for requesting a new variable is
When the requests for add_var/add_ref are processed by the ICON host code, a consistency check is performed which handles conflicts with existing model variables.
- tracer flag (part of the metadata). Apart from that aspect it is not possible to create additional variable containers via the adapter library. It cannot be assumed (if only because of the "sharing" of variables between multiple ComIn plugins) that the tracers generated by a module are stored consecutively.
- lmodexclusive: the model aborts if the variable exists and is either requested exclusively in this call or was requested exclusively before. Otherwise a new variable, with the properties provided, is added to the list of requested variables.
The restriction of the restart registration to newly created variables is a deliberate design decision which greatly simplifies the interplay between ComIn and the ICON code. If ComIn allowed changing the restart flag of existing variables in ICON, this would require additional code in ICON which performs this flag overriding at an appropriate place in ICON's initialization procedure. Besides, overriding the restart flag could be confusing for ICON developers due to its "magic behind the scenes" controlled by ComIn. On the other hand, a workaround for adding existing variables to the restart could be implemented entirely on the 3rd party side by adding a custom restart-capable variable and attaching two additional routines after the restart read-in and before the restart write-out which handle the copy in/out.
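The lmodexclusive rule described above can be sketched as a small request registry in Python. The classes RequestList and RequestConflict are hypothetical, an exception stands in for the model abort, and the treatment of a repeated non-exclusive request (the existing entry is shared) is an assumption of this sketch rather than documented behavior.

```python
class RequestConflict(Exception):
    """Stand-in for the model abort on conflicting variable requests."""


class RequestList:
    """Toy list of requested variables, keyed by (name, domain id)."""

    def __init__(self):
        self._exclusive = {}  # (name, domain_id) -> was it requested exclusively?

    def request_add(self, name, domain_id, lmodexclusive):
        """Register a request; return True if a new entry was created."""
        key = (name, domain_id)
        if key in self._exclusive:
            # Abort if either this request or an earlier one is exclusive.
            if lmodexclusive or self._exclusive[key]:
                raise RequestConflict(f"conflicting request for {name!r}")
            return False  # assumption: non-exclusive requests share the entry
        self._exclusive[key] = lmodexclusive
        return True


requests = RequestList()
created = requests.request_add("my_tracer", 1, lmodexclusive=False)  # plugin A
shared = requests.request_add("my_tracer", 1, lmodexclusive=False)   # plugin B

try:
    requests.request_add("my_tracer", 1, lmodexclusive=True)         # plugin C
    conflict = False
except RequestConflict:
    conflict = True
```

The order of the requests does not matter for the outcome: an exclusive request conflicts with any other request for the same variable, whichever comes first.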
Loops over cells in the 3rd party module can be organized using an auxiliary function comin_descrdata_get_cell_indices() which replicates the behavior of its ICON model counterpart.
Code example:
where jg denotes the logical domain ID and the loop covers the range from the start to the end of a block, i.e. from i_startblk to i_endblk in a jb loop in ICON. The indices is and ie in turn are associated with the block index (jc loop) and are the return values of this routine. The other two parameters, grf_bdywidth_c and min_rlcell_int, further specify the refin_ctrl level where the DO loop starts and ends. They take into account the local indexing after blocking and domain decomposition, and their range is visualized in Fig. 9.2 of the 2024 ICON Model Tutorial. Section 9.1 of this tutorial provides an overview of the parameters and ideas presented here and also introduces the get_indices_c routine, after which comin_descrdata_get_cell_indices is modelled.
is and ie are return values of the routine denoting the start index and end index, respectively. The input variables i_startblk, i_endblk, grf_bdywidth_c and min_rlcell_int are member variables of the data type t_comin_descrdata_global (see also the section on global data).
An example application of the ICON routine get_indices_c can be found in the ICON routine nwp_nh_interface.
Similarly, for edges and vertices the auxiliary functions comin_descrdata_get_edge_indices() (modelled after get_indices_e) and comin_descrdata_get_vert_indices() (modelled after get_indices_v) are available.
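The idea behind these index routines can be sketched in Python under simplifying assumptions. The function get_block_indices is hypothetical and takes the start and end index of the loop range directly, whereas the ComIn and ICON routines derive them from the refin_ctrl levels and the domain ID. Indices are 1-based here to match the Fortran convention.

```python
def get_block_indices(jb, i_startblk, i_endblk, i_startidx, i_endidx, nproma):
    """Return the 1-based inner-loop bounds (is, ie) for block jb.

    Only the first block starts at i_startidx and only the last block ends
    at i_endidx; all blocks in between are traversed completely.
    """
    i_s = i_startidx if jb == i_startblk else 1
    i_e = i_endidx if jb == i_endblk else nproma
    return i_s, i_e


nproma = 4                      # example block length
i_startblk, i_endblk = 1, 3     # example block range
i_startidx, i_endidx = 3, 2     # example start/end index inside these blocks

bounds = []
ncells_visited = 0
for jb in range(i_startblk, i_endblk + 1):      # "jb loop"
    i_s, i_e = get_block_indices(jb, i_startblk, i_endblk,
                                 i_startidx, i_endidx, nproma)
    bounds.append((i_s, i_e))
    for jc in range(i_s, i_e + 1):              # "jc loop"
        ncells_visited += 1
```

With these example values the first block contributes two cells, the middle block four and the last block two.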
Metadata information can be set when requesting additional variables and retrieved for existing and newly created model variables. The instructions start with introducing which metadata is set by default and how to retrieve it before providing some details on how to set new metadata when requesting additional variables. Finally, we introduce a method to iterate through all metadata which were set for a certain variable.
Metadata are provided read-only to the 3rd party plugins. They are available from the secondary constructor onwards and do not change during runtime. Examples of information provided as variable metadata are whether the variable is a tracer or whether it is a restart variable. Note that some metadata is tracer-specific and therefore prefixed with tracer_. Note that for tendency variables (like the tendency due to turbulence), the metadata tracer_turb and tracer_conv are not set.
Currently the metadata information for zaxis_id is incomplete. The interpretation of fields with the property COMIN_ZAXIS_3D and COMIN_ZAXIS_3D_HALF is already possible (includes all fields described by ZA_REFERENCE, ZA_REFERENCE_HALF and ZA_REFERENCE_HALF_HHL in ICON), and also ICON's ZA_SURFACE fields (surface or other 2D fields like 10 m wind) are described by the property COMIN_ZAXIS_2D. All other vertical axis types are grouped under COMIN_ZAXIS_UNDEF. This includes information about soil layers. In a future release, the list of zaxis_id options will be expanded to more accurately describe the underlying data. For now, COMIN_DIM_SEMANTICS_LEVEL which is part of dim_semantics will indicate the position of the vertical dimension in the dimension array and can be used to determine the vertical axis and its size.
| metadata | data type | description | default |
|---|---|---|---|
| zaxis_id | INTEGER | gives an interpretation of the vertical axis (2D = COMIN_ZAXIS_2D, atmospheric levels = COMIN_ZAXIS_3D, ...) | COMIN_ZAXIS_3D |
| hgrid_id | INTEGER | gives an interpretation of the horizontal axis (options are COMIN_HGRID_UNSTRUCTURED_CELL, COMIN_HGRID_UNSTRUCTURED_EDGE and COMIN_HGRID_UNSTRUCTURED_VERTEX corresponding to unstructured cell, edges and vertices) | GRID_UNSTRUCTURED_CELL |
| restart | LOGICAL | Flag. TRUE, if this is a restart variable | .FALSE. |
| datatype | INTEGER | describes the datatype of the variable, either COMIN_VAR_DATATYPE_DOUBLE, COMIN_VAR_DATATYPE_FLOAT or COMIN_VAR_DATATYPE_INT | The default depends on wp (usually DOUBLE) |
| multi_timelevel | LOGICAL | Flag. TRUE, if this variable corresponds to an ICON variable with multiple time levels. | .FALSE. |
| tracer | LOGICAL | Flag. TRUE, if this is a tracer variable | .FALSE. |
| tracer_turb | LOGICAL | Flag. TRUE, if this tracer shall take part in turbulent transport | .FALSE. |
| tracer_conv | LOGICAL | Flag. TRUE, if this tracer shall take part in convective transport | .FALSE. |
| tracer_hlimit | INTEGER | horizontal limiter | positive definite flux limiter |
| tracer_vlimit | INTEGER | vertical limiter | semi-monotonous slope limiter |
| tracer_hadv | INTEGER | method for horizontal tracer transport | Miura horizontal advection scheme |
| tracer_vadv | INTEGER | method for vertical tracer transport | PPM vertical advection scheme |
| units | CHARACTER | units (as part of CF metadata convention) | empty string |
| standard_name | CHARACTER | standard_name (as part of CF metadata convention) | empty string |
| long_name | CHARACTER | long_name (as part of CF metadata convention) | empty string |
| short_name | CHARACTER | short_name (as part of CF metadata convention) | empty string |
| grib_discipline | INTEGER | discipline parameter (as part of grib2 metadata convention) | 255 |
| grib_category | INTEGER | category parameter (as part of grib2 metadata convention) | 255 |
| grib_number | INTEGER | number parameter (as part of grib2 metadata convention) | 255 |
In the above table the default value refers to the value ICON receives from ComIn when an additional variable is requested. Please be aware that setting ihadv_tracer, ivadv_tracer, itype_hlimit or itype_vlimit in ICON's &transport_nml overwrites settings coming from ComIn (for the ComIn metadata tracer_hadv, tracer_vadv, tracer_hlimit and tracer_vlimit, respectively).
Please also note that new tracers can only be added on cells (not on edges and vertices). The implementation behind the ComIn metadata is a generic key-value storage. As such, any metadata can be added to a variable, from the host model as well as from the plugin. These can, for example, be useful to transfer metadata between different plugins or to set properties which can be used by the same plugin in the following. However, only the above list of metadata is currently evaluated by the host model.
To write output in GRIB2 format, three integer parameters (discipline, category, number) must be set as metadata for the variable; this metadata can also be retrieved when accessing a variable in plugins.
The derived data type t_comin_var_metadata storing the metadata internally is not exposed to the host model or the plugins. ComIn plugins, for example, can access the data members via the subroutine comin_metadata_get.
While the Fortran and Python API of the ComIn can handle generic arguments of type INTEGER, LOGICAL, the C implementation of the interface does not support generic argument data types. Therefore, special variants of this subroutine exist:
Similar to comin_metadata_get, there is also the option to retrieve a user-defined default value in case the metadata is not available using the comin_metadata_get_or method:
Metadata items are identified by a character string key. The data type of a particular metadata item can be retrieved by calling comin_metadata_get_typeid.
This auxiliary function yields one of the IDs COMIN_METADATA_TYPEID_UNDEFINED, COMIN_METADATA_TYPEID_INTEGER, COMIN_METADATA_TYPEID_REAL, COMIN_METADATA_TYPEID_CHARACTER or COMIN_METADATA_TYPEID_LOGICAL. Note that COMIN_METADATA_TYPEID_UNDEFINED means that the metadata key has not been set in the container.
On the host model side, the comin_var_request_add operation expects information on the properties of the variable to be registered. These are provided using the function
If a metadata value cannot be added to a newly requested field, a warning message is issued (similarly also from the host model for its variables). In addition, the error code can be evaluated, and the plugin can decide to abort the simulation.
For the C implementation, in analogy to the read accessor functions comin_metadata_get_<data type>, there exist special, type-specific write accessor functions comin_metadata_set_<data type>.
If the same variable is added multiple times by different plugins, the previously set metadata is not overwritten by default values. However, when invoking comin_metadata_set_<data type> explicitly, the existing metadata entry is overwritten without warning. Please note that when explicitly overwriting a metadata entry, there is by design no check for consistency between the previously set type and the newly set type. If needed, this check can be done on the plugin side by invoking comin_metadata_get_typeid.
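The semantics described above (typed key-value storage, defaults that never replace existing entries, explicit sets that overwrite silently) can be illustrated with a minimal Python sketch. The class MetadataStore, its method names and the numeric values of the type IDs are hypothetical stand-ins chosen for illustration; they are not the ComIn API.

```python
# Hypothetical stand-in for a typed key-value metadata container (not the ComIn API).
# The numeric values of the type IDs are an assumption of this sketch.
TYPEID_UNDEFINED, TYPEID_INTEGER, TYPEID_REAL, TYPEID_CHARACTER, TYPEID_LOGICAL = range(5)


def typeid_of(value):
    """Map a Python value to one of the illustrative type IDs."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return TYPEID_LOGICAL
    if isinstance(value, int):
        return TYPEID_INTEGER
    if isinstance(value, float):
        return TYPEID_REAL
    if isinstance(value, str):
        return TYPEID_CHARACTER
    raise TypeError(f"unsupported metadata type: {type(value).__name__}")


class MetadataStore:
    def __init__(self):
        self._items = {}

    def set(self, key, value):
        # Explicit set: overwrites silently, no check against the previous type.
        typeid_of(value)  # only rejects unsupported types
        self._items[key] = value

    def set_default(self, key, value):
        # Default values never replace an entry that was set before.
        typeid_of(value)
        self._items.setdefault(key, value)

    def get_typeid(self, key):
        if key not in self._items:
            return TYPEID_UNDEFINED
        return typeid_of(self._items[key])

    def get_or(self, key, default):
        # Return the stored value, or the user-defined default if the key is missing.
        return self._items.get(key, default)


meta = MetadataStore()
meta.set("short_name", "qx")            # illustrative value
meta.set_default("short_name", "")      # ignored, an entry already exists
meta.set("grib_number", 255)
old_type = meta.get_typeid("grib_number")
meta.set("grib_number", "n/a")          # type changes without any warning
```

In this sketch a plugin that cares about type consistency would compare the result of get_typeid before and after an explicit set, in the spirit of the check suggested above.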
Some applications may need to inspect all metadata entries that were set for a certain variable, for example when writing the state of a variable including all metadata to a file. For this purpose, ComIn provides an iterator object <iterator_object> for the metadata. The iterator is obtained by invoking the subroutine comin_metadata_get_iterator, which returns an object of type t_comin_var_metadata_iterator. With the function comin_metadata_get_typeid described above and the integer constants COMIN_METADATA_TYPEID_INTEGER, COMIN_METADATA_TYPEID_REAL, COMIN_METADATA_TYPEID_CHARACTER and COMIN_METADATA_TYPEID_LOGICAL, the value associated with a key can be retrieved. The <iterator_object> provides the following methods:
- Access the <key> of the current entry (<iterator_object>%key(<key>))
- Advance to the next entry (<iterator_object>%next())
- Check whether the end of the container has been reached (<iterator_object>%is_end())

An example implementation to write out all integer metadata entries of a variable could be the following:
Please note that it is highly recommended to derive and delete the iterator object within the same scoping unit since changes in the metadata container after deriving the iterator can lead to unexpected behavior when using the iterator. Furthermore, since the metadata is stored unordered, the sequence of metadata is not known a priori and potentially subject to change if there is a change in a key.
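The iteration pattern described above (query the key, advance, test for the end) can be sketched in Python as follows. The class MetadataIterator and the helper integer_entries are hypothetical and only mimic the three iterator methods; the sample metadata values are made up.

```python
class MetadataIterator:
    """Hypothetical iterator mimicking key(), next() and is_end()."""

    def __init__(self, items):
        # Snapshot of the keys; the container must not change afterwards.
        self._keys = list(items)
        self._pos = 0

    def key(self):
        return self._keys[self._pos]

    def next(self):
        self._pos += 1

    def is_end(self):
        return self._pos >= len(self._keys)


def integer_entries(items):
    """Collect all integer-valued entries of a metadata dictionary."""
    found = {}
    it = MetadataIterator(items)
    while not it.is_end():
        value = items[it.key()]
        # Exclude bool, which is a subclass of int in Python.
        if isinstance(value, int) and not isinstance(value, bool):
            found[it.key()] = value
        it.next()
    return found


# Illustrative metadata of one variable (values are made up).
metadata = {
    "standard_name": "air_temperature",
    "grib_discipline": 255,
    "grib_number": 255,
    "restart": True,
}
result = integer_entries(metadata)
```

Creating and discarding the iterator inside one function, as done in integer_entries, follows the recommendation to keep the iterator within a single scoping unit.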
The descriptive data structures contain information on the ICON setup (e.g. Fortran KIND values), the computational grid(s), and the simulation status.
All descriptive data structures are treated as read-only (seen from the perspective of the 3rd party plugins). However, this read-only nature is (currently) not enforced. For efficiency reasons, the adapter library directly uses pointers to ICON data structures where possible. This holds mostly for components of p_patch, while non p_patch descriptive data are copied from the host model.
Date and time information (simulation status) is provided as character strings according to ISO 8601. Note that there are exceptions to this rule where in ICON time information is stored internally in seconds (timesteplength per domain from comin_descrdata_get_timesteplength() and dom_start/dom_end from t_comin_descrdata_domain).
None of the getter functions for descriptive data returns an error code. Instead, they abort the simulation (call comin_plugin_finish), since missing descriptive data points to a larger problem. In general, ComIn functions do not return error codes.
The majority of the examples provided cover Fortran. The C interfaces often differ; some notes on this are provided in a section on the C/C++ and Python interfaces.
Access period: The global data is set by the host as the first descriptive data structure (since it is required for grid information). Global data is available for the 3rd party module's primary constructor and all subsequent subroutine callbacks. Global data is never changed or updated. Global data is invariant w.r.t. the computational grid (logical domain ID).
Global data is encapsulated in a data type t_comin_descrdata_global and can be requested with comin_descrdata_get_global() (returning a POINTER and aborting the simulation if unsuccessful). It is set up by a call to icon_build_global in ICON (calling comin_descrdata_set_global) before the primary constructor. Its internal structure may change between different versions of the adapter library.
List of global data:
| name | data type | description |
|---|---|---|
n_dom | INTEGER | number of logical domains |
max_dom | INTEGER | maximum number of logical domains |
nproma | INTEGER | block size |
wp | INTEGER | KIND value (REAL) |
min_rlcell_int | INTEGER | block index |
min_rlcell | INTEGER | block index |
max_rlcell | INTEGER | block index |
min_rlvert_int | INTEGER | block index |
min_rlvert | INTEGER | block index |
max_rlvert | INTEGER | block index |
min_rledge_int | INTEGER | block index |
min_rledge | INTEGER | block index |
max_rledge | INTEGER | block index |
grf_bdywidth_c | INTEGER | block index |
grf_bdywidth_e | INTEGER | block index |
lrestartrun | LOGICAL | if this simulation is a restart |
vct_a | 1D REAL(dp) array (1:(nlev+1)) | param. A of the vertical coordinate (without topography) |
host_git_remote_url | CHARACTER(LEN=:) | git remote url of the origin repository of the host model |
host_git_branch | CHARACTER(LEN=:) | git branch name of the host model |
host_git_tag | CHARACTER(LEN=:) | git tag of the host model |
host_revision | CHARACTER(LEN=:) | revision of the host model |
Some global data, e.g. the Fortran KIND value information wp, are required by the 3rd party module at compile time. However, due to the loose connection between the 3rd party module and the ICON model via the adapter library, the following implementation procedure is proposed:
- At compile time, the 3rd party module uses its own KIND value.
- At runtime, the KIND value is retrieved from the global data of the adapter library. A consistency check may throw a runtime exception.

The host model can assert the compatibility of its wp value through the subroutine comin_setup_check().
Access period: Grid information is available for the 3rd party module's primary constructor and all subsequent subroutine callbacks. Grid information is never changed or updated. The data structures in this section are replicated for each computational domain (logical domain ID).
Topological data is encapsulated in a data type t_comin_descrdata_domain and can be requested for domain jg with comin_descrdata_get_domain(jg) (returning a POINTER and aborting the simulation if unsuccessful). It is set up by a call to comin_descrdata_set_domain() in the host model (ICON) before the primary constructor. The internal structure may change between different versions of the adapter library.
The parent and child relationship of nested domains is reflected by the parent_id and the 1D array child_id. The INTEGER parent_id directly relates to the ID of the parent domain. child_id(1:n_childdom) is a list of all child domains directly nested in the domain. To address a parent or child domain, the value jg of the respective domain can be used directly to request the domain with comin_descrdata_get_domain(jg).
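How parent_id and child_id allow a plugin to walk through the nesting hierarchy can be sketched as follows. The Domain class, the get_domain function and the four-domain setup are hypothetical; in particular, the value 0 used as parent ID of the global domain is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Reduced, hypothetical mirror of the nesting-related components."""
    id: int
    parent_id: int
    child_id: list = field(default_factory=list)


# Illustrative setup: domain 1 contains 2 and 3, domain 2 contains 4.
# Parent ID 0 for the global domain is an assumption of this sketch.
domains = {
    1: Domain(1, 0, [2, 3]),
    2: Domain(2, 1, [4]),
    3: Domain(3, 1),
    4: Domain(4, 2),
}


def get_domain(jg):
    """Stand-in for requesting the descriptive data of domain jg."""
    return domains[jg]


def ancestors(jg):
    """IDs of all parent domains, from the direct parent up to the global domain."""
    chain = []
    parent = get_domain(jg).parent_id
    while parent in domains:
        chain.append(parent)
        parent = get_domain(parent).parent_id
    return chain


def descendants(jg):
    """IDs of all domains nested directly or indirectly in domain jg."""
    found = []
    for child in get_domain(jg).child_id:
        found.append(child)
        found.extend(descendants(child))
    return found
```

The IDs stored in parent_id and child_id are passed unchanged to get_domain, which corresponds to using jg directly in the request for the domain data.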
Structure of type t_comin_descrdata_domain
| name | data type | description |
|---|---|---|
grid_filename | CHARACTER | horizontal grid file name |
grid_uuid | CHARACTER | alphanumerical binary hash, note that this UUID field is not the (slightly longer) hexadecimal UUID string suitable for print-out |
number_of_grid_used | INTEGER | number of grid used (GRIB2 key) |
id | 1D INTEGER array (1:max_dom) | ID of current domain |
parent_id | INTEGER | ID of parent domain |
child_id | 1D INTEGER array (1:n_childdom) | IDs of child domains |
n_childdom | INTEGER | number of child domains |
dom_start | REAL(wp) | model domain start time in elapsed seconds |
dom_end | REAL(wp) | model domain end time in elapsed seconds |
nlev | INTEGER | no. of vertical model levels |
nshift | INTEGER | half level of parent domain that coincides with upper margin of current domain |
nshift_total | INTEGER | total shift of model top w.r.t. global domain |
cells | TYPE(t_comin_descrdata_domain_cells), see below | properties for cells |
verts | TYPE(t_comin_descrdata_domain_verts), see below | properties for vertices |
edges | TYPE(t_comin_descrdata_domain_edges), see below | properties for edges |
Structure of type t_comin_descrdata_domain_cells
| name | data type | description |
|---|---|---|
ncells | INTEGER | no. of local cells |
ncells_global | INTEGER | no. of global cells |
nblks | INTEGER | no. of blocks for cells |
max_connectivity | INTEGER | |
num_edges | 2D INTEGER array (nproma, nblks_c) | number of edges |
refin_ctrl | 2D INTEGER array | lateral boundary distance index |
start_index | 1D INTEGER array | start index |
end_index | 1D INTEGER array | end index |
start_block | 1D INTEGER array | start block for cells |
end_block | 1D INTEGER array | end block for cells |
child_id | 2D INTEGER array (nproma, nblks_c) | domain id of child triangles |
parent_glb_idx | 2D INTEGER array (nproma, nblks_c) | global indices of parent triangles |
parent_glb_blk | 2D INTEGER array (nproma, nblks_c) | global blocks of parent triangles |
vertex_idx | 3D INTEGER array (nproma, nblks_c, 3) | indices of vertices |
vertex_blk | 3D INTEGER array (nproma, nblks_c, 3) | blocks of vertices |
neighbor_idx | 3D INTEGER array (nproma, nblks_c, 3) | indices of neighbors |
neighbor_blk | 3D INTEGER array (nproma, nblks_c, 3) | blocks of neighbors |
edge_idx | 3D INTEGER array (nproma, nblks_c, 3) | indices of edges |
edge_blk | 3D INTEGER array (nproma, nblks_c, 3) | blocks of edges |
clon | 2D REAL(wp) array (nproma, nblks_c) | cell center longitude |
clat | 2D REAL(wp) array (nproma, nblks_c) | cell center latitude |
area | 2D REAL(wp) array (nproma, nblks_c) | triangle area |
hhl | 3D REAL(wp) array (nproma, nlev+1, nblks_c) | geometrical height of half levels at cell center |
Structure of type t_comin_descrdata_domain_verts
| name | data type | description |
|---|---|---|
nverts | INTEGER | no. of local verts |
nverts_global | INTEGER | no. of global verts |
nblks | INTEGER | no. of blocks for verts |
refin_ctrl | 2D INTEGER array | lateral boundary distance index |
start_index | 1D INTEGER array | start index |
end_index | 1D INTEGER array | end index |
start_block | 1D INTEGER array | start block |
end_block | 1D INTEGER array | end block |
neighbor_idx | 3D INTEGER array (nproma, nblks_v, 6) | indices of neighbors |
neighbor_blk | 3D INTEGER array (nproma, nblks_v, 6) | blocks of neighbors |
cell_idx | 3D INTEGER array (nproma, nblks_v, 6) | indices of cells |
cell_blk | 3D INTEGER array (nproma, nblks_v, 6) | blocks of cells |
edge_idx | 3D INTEGER array (nproma, nblks_v, 6) | indices of edges |
edge_blk | 3D INTEGER array (nproma, nblks_v, 6) | blocks of edges |
vlon | 2D REAL(wp) (nproma, nblks_v) | longitude vertex |
vlat | 2D REAL(wp) (nproma, nblks_v) | latitude vertex |
Structure of type t_comin_descrdata_domain_edges
| name | data type | description |
|---|---|---|
nedges | INTEGER | no. of local edges |
nedges_global | INTEGER | no. of global edges |
nblks | INTEGER | no. of blocks for edges |
refin_ctrl | 2D INTEGER array | lateral boundary distance index |
start_index | 1D INTEGER array | start index |
end_index | 1D INTEGER array | end index |
start_block | 1D INTEGER array | start block |
end_block | 1D INTEGER array | end block |
child_id | 2D INTEGER array (nproma, nblks_e) | domain id of child edges |
parent_glb_idx | 2D INTEGER array (nproma, nblks_e) | global indices of parent edges |
parent_glb_blk | 2D INTEGER array (nproma, nblks_e) | global blocks of parent edges |
cell_idx | 3D INTEGER array (nproma, nblks_e, 2) | indices of cells |
cell_blk | 3D INTEGER array (nproma, nblks_e, 2) | blocks of cells |
vertex_idx | 3D INTEGER array (nproma, nblks_e, 4) | indices of vertices |
vertex_blk | 3D INTEGER array (nproma, nblks_e, 4) | blocks of vertices |
elon | 2D REAL(wp) (nproma, nblks_e) | longitude edge midpoint |
elat | 2D REAL(wp) (nproma, nblks_e) | latitude edge midpoint |
Geometrical information is provided as horizontal (cell-wise) data fields, e.g. clon, clat, area. Instead of information about the vertical grid, the plugins may access the ICON variable HHL.
Implicitly, the above tables also contain some information on the parallelization: The data structure contains the information whether the local PE is a compute process owning prognostic grid points.
Explicit information on the parallelization of cells is contained for domain jg in the type t_comin_descrdata_domain_cells.
List of data structures related to parallelization:
| name | data type | description |
|---|---|---|
glb_index | 1D INTEGER array | global cell indices |
decomp_domain | 2D INTEGER array (nproma, nblks_c) | domain decomposition flag |
In addition, the function comin_descrdata_index_lookup_glb2loc_cell() can be used to determine the local index to a corresponding global index.
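The idea of such a lookup can be sketched in Python with a reverse table built from glb_index. The dictionary-based approach, the function name lookup_glb2loc_cell and the return value -1 for non-local cells are assumptions of this sketch and do not describe the actual implementation in the adapter library.

```python
# Illustrative global cell indices owned by this process (1-based, as in Fortran).
glb_index = [17, 4, 42, 8]

# Reverse table: global index -> local index (1-based).
glb2loc = {glb: loc for loc, glb in enumerate(glb_index, start=1)}


def lookup_glb2loc_cell(glb):
    """Return the local index of a global cell, or -1 if the cell is not local."""
    return glb2loc.get(glb, -1)
```

The resulting local 1D index would still have to be converted into a block and an in-block index before accessing nproma-blocked fields.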
Access period: The simulation timing info is available for the 3rd party module's primary constructor and all subsequent subroutine callbacks. It is set by a call to comin_descrdata_set_simulation_interval() from the host.
The simulation timing info is provided as ISO 8601 character strings and can be requested with comin_descrdata_get_simulation_interval() (returning a POINTER and aborting the simulation if unsuccessful). Its internal structure may change between different versions of the adapter library.
List of data structures related to the simulation timing info:
| name | data type | description |
|---|---|---|
exp_start | CHARACTER | simulation start time stamp |
exp_stop | CHARACTER | simulation end time stamp |
run_start | CHARACTER | start of this simulation (-> restart) |
run_stop | CHARACTER | stop of this simulation (-> restart) |
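
Since the time stamps are ISO 8601 strings, a plugin can parse them with standard library tools. The following Python sketch uses made-up values for exp_start, exp_stop and dom_start; it also assumes that the elapsed seconds of dom_start are counted from exp_start, which is not stated explicitly here.

```python
from datetime import datetime, timedelta

# Made-up time stamps in ISO 8601 form; real values come from the host model.
exp_start = "2021-07-14T00:00:00.000"
exp_stop = "2021-07-15T12:00:00.000"

start = datetime.fromisoformat(exp_start)
stop = datetime.fromisoformat(exp_stop)

# Length of the experiment in seconds.
elapsed_seconds = (stop - start).total_seconds()

# Convert a domain start time given in elapsed seconds (here 6 h) into a
# time stamp. Counting from exp_start is an assumption of this sketch.
dom_start = 21600.0
dom_start_stamp = (start + timedelta(seconds=dom_start)).isoformat(timespec="milliseconds")
```
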
The current simulation date time stamp can be obtained as an ISO 8601 string from the accessor subroutine
During the simulation, the current date time stamp is updated by a call to comin_current_set_datetime() from the host; it is available beginning with the entry point EP_ATM_TIMELOOP_BEFORE.
ComIn also provides routines to query the entry point currently being processed, the currently executing plugin and the domain currently selected in ICON. comin_current_get_ep can be called from within a plugin, for example when one procedure is registered for several entry points but slight deviations in behavior between the entry points are necessary.
comin_current_get_plugin_info() gives access to components of the data type t_comin_plugin_info. It can for example be used to access the id of the current plugin. The data type also stores information on the plugin name, associated options and, if present, its communicator.
comin_current_get_domain_id() is provided together with descriptive data as part of the adapter library. A C version of this routine is also available. Callbacks might be called from ICON from the global domain or from any nested domain. The currently selected domain can be accessed via this subroutine.
Another small set of auxiliary built-in subroutines does not communicate with the ICON model but provides common functionality (utilities):
List of auxiliary built-in subroutines and functions:
| name | description |
|---|---|
comin_descrdata_get_index(), comin_descrdata_get_block() | convert 1D index into nproma-blocked index |
comin_descrdata_get_cell_npromz | length of last block |
comin_descrdata_get_edge_npromz | length of last block |
comin_descrdata_get_vert_npromz | length of last block |
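
The conversion between a 1D index and the nproma-blocked (index, block) pair, as well as the length of the last block, can be sketched as follows. The Python functions are stand-ins that assume positive, 1-based indices as used in Fortran; they are not the signatures of the ComIn routines listed in the table.

```python
def get_block(j, nproma):
    """Block number (1-based) that contains the 1D index j (1-based)."""
    return (j - 1) // nproma + 1


def get_index(j, nproma):
    """Position (1-based) of the 1D index j inside its block."""
    return (j - 1) % nproma + 1


def npromz(n, nproma):
    """Number of used entries in the last block for n grid points."""
    nblks = get_block(n, nproma)
    return n - (nblks - 1) * nproma


nproma = 8
ncells = 21  # illustrative value: blocks of length 8, 8 and 5
```

With these values, cell 9 is the first entry of the second block, and only the first 5 entries of the third block are in use.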
Verbosity level
Following ICON's parameter msg_level, the verbosity of the log output is controlled by an integer value in the ComIn library as well: By means of the auxiliary routine comin_setup_set_verbosity_level() the host model specifies whether log outputs are generated by the MPI process 0 e.g. when passing the entry points or when registering the callback functions. The higher the specified value, the more extensive the output (0=silent, 20=all log messages are output).
Print utilities
The plugins can use the auxiliary functions comin_print::comin_print_debug, comin_print::comin_print_info and comin_print::comin_print_warning to write messages to the log. These messages are only printed on process 0 (to appear only once in the log) and only if the corresponding namelist settings are made (see comin_plugin_types::t_comin_plugin_description).
The callback register is part of the ComIn library. It fulfils the following tasks:
This section describes the mechanism of registering new 3rd party modules. We distinguish between two setup routines, a primary constructor and a secondary constructor, both described in the following:
The primary constructor is called before the allocation of ICON variable lists and fields. Its call is automatically triggered by the host model through a call to the subroutine
where
where the maximum character string lengths are defined in a file global.inc (also accessible for C and Python programs).
The rationale behind the type t_comin_plugin_description is to provide a Fortran namelist in the host model, e.g.,
in order to enable/disable the ComIn plugins at runtime.
- By name we denote a simple string that is used for output purposes related to this plugin.
- By plugin_library we denote the dynamically loaded library (including its file extension .so). If the plugin has been statically linked to the host model, this argument should be skipped or an empty string should be provided.
- By primary_constructor we denote the name of the primary constructor subroutine; the default value is comin_main.
- By comm we denote the name of the MPI communicator that is created for this particular plugin. This is useful when exchanging data with other running processes, see the section on MPI communicators below. The parameter comm can be left as an empty string if the application does not require a communicator for this plugin.
- The options data offers the possibility to pass a character string (e.g. a Python script filename) to the plugin.

If multiple 3rd party modules are enabled, the primary constructor calls will be added in the same order as they appear in the comin_nml namelist unless specified otherwise (not possible in the first release).
Remark. The runtime configuration of the ComIn callback library is implemented as the simple t_comin_plugin_description data structure instead of using a special file-based input format, in particular Fortran namelists (or YAML, XML, etc.). This I/O abstraction is motivated by the fact that the configuration could be read from a restart file as well as from an ASCII file in ICON. Other ways of reading the configuration could be introduced by the host model in the future and should not affect the ComIn interfaces.
The setup routine returns the t_comin_plugin_info info that has been used by the 3rd party module at compile time.
During execution,
The module handle is basically a ComIn-internal ID that is used to identify a specific plugin during the subsequent operations. Users do not access the module ID explicitly; later on, for example, the calling module for a callback function can be implicitly identified by the wrapping ComIn handler routine.
The options character string mentioned above becomes available as the options member in t_comin_plugin_description.
Important remark: We strongly advise plugin developers to add proper prefixes to global symbols (variables, functions). This ensures that these symbols remain unique in all variations of library linking.
A secondary constructor is called after the allocation of ICON variable lists and fields and before the time loop.
At the last part of the initialization phase, the callback to a final initialization entry point is called. This gives the plugins an additional entry point to finish their initialization. The entry point is named EP_<COMP>_INIT_FINALIZE, reflecting the fact that this is the place to finalize the initial setup in the plugins.
Entry points denote events during the ICON model simulation, which can trigger a subroutine call of the 3rd party module. Entry points are denoted by named integer constants, e.g.
The set of entry points may change between different versions of the adapter library, but integer constants are defined in a backward compatible fashion. The name of an entry point can be determined from its named integer constant with a call to comin_callback_get_ep_name.
Conventions:
- EP_DESTRUCTOR always denotes the last entry in the enumeration. This easily provides the total number of entry points to ComIn.
- Entry points located outside of the domain loop are associated with DOMAIN_OUTSIDE_LOOP instead of the domain id. The information from where in the host code the callback is executed is accessible from ComIn via the comin_current_get_domain_id() routine. It returns the domain id, which can however be DOMAIN_OUTSIDE_LOOP if it encompasses all domains.

Note that the adapter library exposes ICON model variables with respect to these entry points, together with in-/out-semantics (see the section on read/write access). Therefore, after the secondary constructor has been processed, the data flow for each entry point and every 3rd party module is known to the callback registry.
Entry point names and IDs are constructed as follows:
EP_<COMP>_<PROCESS|LOOP>_[BEFORE|AFTER|START|END]
- <COMP>: the model component, e.g. ATM, OCE, LND...
- <PROCESS|LOOP>: name of the entry point's corresponding physical process or loop in the model
- [BEFORE|AFTER]: position of the entry point in the call sequence, before or after the corresponding physical process or loop
- [START|END]: inside a loop, the entry point at the beginning (right after DO) has suffix START, the entry point at the end (right before END DO) has suffix END

The character length of an entry point name cannot exceed MAX_LEN_EP_NAME (currently set to 32), which is defined in include/global.inc.
Exceptions from this naming scheme are EP_SECONDARY_CONSTRUCTOR, EP_FINISH, EP_DESTRUCTOR, and the final entry point of the initialization phase EP_<COMP>_INIT_FINALIZE.
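The naming scheme and its exceptions can be checked mechanically. The following Python sketch decomposes an entry point name into component, process and position; the function parse_ep_name and the two regular expressions are illustrative and not part of ComIn.

```python
import re

MAX_LEN_EP_NAME = 32  # value quoted in the text above

# Regular scheme: EP_<COMP>_<PROCESS|LOOP>_[BEFORE|AFTER|START|END]
REGULAR = re.compile(r"^EP_([A-Z]+)_([A-Z0-9_]+)_(BEFORE|AFTER|START|END)$")
# Documented exceptions from the scheme.
EXCEPTIONS = re.compile(
    r"^EP_(SECONDARY_CONSTRUCTOR|FINISH|DESTRUCTOR|[A-Z]+_INIT_FINALIZE)$"
)


def parse_ep_name(name):
    """Return (component, process, position), or None for an exception name."""
    if len(name) > MAX_LEN_EP_NAME:
        raise ValueError(f"entry point name too long: {name}")
    match = REGULAR.match(name)
    if match:
        return match.groups()
    if EXCEPTIONS.match(name):
        return None
    raise ValueError(f"not a valid entry point name: {name}")
```

For example, parse_ep_name("EP_ATM_WRITE_OUTPUT_BEFORE") splits the name into the component ATM, the process WRITE_OUTPUT and the position BEFORE, whereas EP_ATM_INIT_FINALIZE is recognized as one of the exceptions.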
| Entry point ID | description | call interval |
|---|---|---|
EP_SECONDARY_CONSTRUCTOR | secondary constructor, initial phase | once in simulation |
EP_ATM_YAC_DEFCOMP_BEFORE | just before the component definition of yac | once in simulation |
EP_ATM_YAC_DEFCOMP_AFTER | after the component definition of yac | once in simulation |
EP_ATM_YAC_SYNCDEF_BEFORE | just before the config synchronisation of yac | once in simulation |
EP_ATM_YAC_SYNCDEF_AFTER | after the config synchronisation of yac | once in simulation |
EP_ATM_YAC_ENDDEF_BEFORE | just before the end of the config definition of yac | once in simulation |
EP_ATM_YAC_ENDDEF_AFTER | after the end of the config definition of yac | once in simulation |
EP_ATM_INIT_FINALIZE | end of initial phase | once in simulation |
EP_ATM_TIMELOOP_BEFORE | just before start of the time loop | once in simulation |
EP_ATM_TIMELOOP_START | at the beginning of the time loop | every (global) time step |
EP_ATM_TIMELOOP_END | just before the end of the time loop | every (global) time step |
EP_ATM_TIMELOOP_AFTER | after the time loop is finished | once in simulation |
EP_ATM_INTEGRATE_BEFORE | before the integration is called | every (global) time step |
EP_ATM_INTEGRATE_START | start of the integration loop | every (nested) time step |
EP_ATM_INTEGRATE_END | end of the integration loop | every (nested) time step |
EP_ATM_INTEGRATE_AFTER | after the integration loop | every (global) time step |
EP_ATM_WRITE_OUTPUT_BEFORE | before the call to model output | every (nested) time step |
EP_ATM_WRITE_OUTPUT_AFTER | after the call to model output | every (nested) time step |
EP_ATM_CHECKPOINT_BEFORE | before the call to model's checkpoint writing | checkpoint interval |
EP_ATM_CHECKPOINT_AFTER | after the call to model's checkpoint writing | checkpoint interval |
EP_ATM_ADVECTION_BEFORE | before advection | every (nested) time step |
EP_ATM_ADVECTION_AFTER | after advection | every (nested) time step |
EP_ATM_PHYSICS_BEFORE | before physics | every (nested) time step |
EP_ATM_PHYSICS_AFTER | after physics | every (nested) time step |
EP_ATM_NUDGING_BEFORE | before nudging | every (nested) time step |
EP_ATM_NUDGING_AFTER | after nudging | every (nested) time step |
EP_ATM_SURFACE_BEFORE | before surface scheme | every (nested) time step |
EP_ATM_SURFACE_AFTER | after surface scheme | every (nested) time step |
EP_ATM_TURBULENCE_BEFORE | before turbulence scheme | every (nested) time step |
EP_ATM_TURBULENCE_AFTER | after turbulence scheme | every (nested) time step |
EP_ATM_MICROPHYSICS_BEFORE | before microphysics | every (nested) time step |
EP_ATM_MICROPHYSICS_AFTER | after microphysics | every (nested) time step |
EP_ATM_CONVECTION_BEFORE | before convection | every (nested) time step |
EP_ATM_CONVECTION_AFTER | after convection | every (nested) time step |
EP_ATM_RADIATION_BEFORE | before radiation | every (nested) time step |
EP_ATM_RADIATION_AFTER | after radiation | every (nested) time step |
EP_ATM_RADHEAT_BEFORE | before radiative heating | every (nested) time step |
EP_ATM_RADHEAT_AFTER | after radiative heating | every (nested) time step |
EP_ATM_GWDRAG_BEFORE | before gravity waves | every (nested) time step |
EP_ATM_GWDRAG_AFTER | after gravity waves | every (nested) time step |
EP_FINISH | in the model's finish subroutine | in case of an exception |
EP_DESTRUCTOR | immediately before MPI_Finalize | once in simulation |
Notes:
The primary constructor appends subroutines of the 3rd party module to the callback register via the adapter library subroutine comin_callback_register().
Remarks:
- comin_callback_complete.
- BIND(C) attribute (not recommended).

For a specific entry point, each plugin may register only one callback routine. Allowing multiple callbacks per component would require a complex extension of the relatively simple ComIn interface, especially if components are allowed to intertwine their callbacks. Advice to users: There is still the possibility to write wrappers (summarizing multiple callbacks), or to register the same 3rd party library as multiple independent ComIn components.
The processing order is important when multiple 3rd party modules are present. Currently, the processing order is specified by the order in which plugins are registered. Additional options to set the processing order are not available in the first release but ordering via runtime settings (Fortran namelists) is planned. The ordering may then also differ between individual entry points.
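The two rules stated above (one callback per plugin and entry point, processing in registration order) can be illustrated with a small Python sketch. The class CallbackRegister is hypothetical and greatly simplified; it is not the ComIn callback register.

```python
from collections import defaultdict


class CallbackRegister:
    """Hypothetical register: one callback per plugin and entry point."""

    def __init__(self):
        # entry point -> list of (plugin name, callback) in registration order
        self._callbacks = defaultdict(list)

    def register(self, plugin, entry_point, callback):
        if any(name == plugin for name, _ in self._callbacks[entry_point]):
            raise ValueError(
                f"{plugin} already registered a callback for {entry_point}"
            )
        self._callbacks[entry_point].append((plugin, callback))

    def call(self, entry_point):
        # Plugins are processed in the order in which they registered.
        return [callback() for _, callback in self._callbacks.get(entry_point, [])]


reg = CallbackRegister()
reg.register("plugin_a", "EP_ATM_PHYSICS_BEFORE", lambda: "a")
reg.register("plugin_b", "EP_ATM_PHYSICS_BEFORE", lambda: "b")
order = reg.call("EP_ATM_PHYSICS_BEFORE")
```

A plugin that needs several actions at the same entry point would, in this picture, register a single wrapper function that calls them in turn.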
3rd party plugins may use MPI collective calls to communicate with external processes. To this end, the ComIn library provides dedicated MPI communicators, which are accessible via the two functions comin_parallel_get_plugin_mpi_comm() and comin_parallel_get_host_mpi_comm(). In addition, comin_parallel_get_host_mpi_rank() returns the rank within the MPI communicator of the host model and can be called from within the plugin's callback function.
Here, the different MPI communicators have the following scope:
With the above MPI communicator mpi_comm in combination with the topological data structure above, it is straightforward for 3rd party modules to create other MPI communicators which, e.g., contain all PEs with prognostic grid points (via MPI_COMM_SPLIT).
Note that the C interface for the MPI communicator query functions also provides the (integer/MPI_Fint) Fortran communicator handles instead of the struct MPI_Comm. This solution was chosen deliberately: if MPI_Comm appeared in the signature of a ComIn function, #include <mpi.h> would make MPI a dependency for all plugins. C developers can convert the handles using the function MPI_Comm_f2c(...) (#include <mpi.h>).
The ComIn allows plugins to be set PE-wise. This is deliberately provided as an option, for example to support the following use case: A diagnostic subroutine could be attached to the host model to perform some collective MPI operations. Afterwards it would write/plot them with Python - but only on the first PE. In practice, this PE could be a head node (vector host), and it would only need to support this task, as opposed to the other "worker" PEs. An elegant solution here would be to implement two different plugins, a Python plugin for PE#0 and a C plugin for the remaining PEs, using the same plugin communicator.
Problematic situations may occur when both the ComIn plugins and the host model itself apply a splitting of MPI communicators. For example, this is the case when the ICON model itself couples to external processes via the YAC coupler and, at the same time, uses the ICON ComIn library.
The ComIn setup therefore uses a procedure for the communicator splitting ("MPI handshake") that has been harmonized with the respective algorithm of the YAC coupler software. It is depicted in the following diagram and is compatible with the reference implementation https://gitlab.dkrz.de/dkrz-sw/mpi-handshake. The example summarizes a situation in which the ICON ocean model couples with an external package "FESOM", while the atmospheric part of ICON uses ComIn to communicate with an MPI process "ComInExternal".
The ICON model offers a single configure option to enable the use of the ComIn library:
./configure --with-comin=${ICON_COMIN_DIR}

This option provides the root path of the ComIn adapter library, automatically adding the necessary settings for LIBS and FCFLAGS.

The host model's remaining FCFLAGS (INCLUDE) and LIBS paths are provided as usual to the configure script. As described above, the 3rd party plugins are loaded dynamically at runtime; the respective flags and build options are therefore independent of these settings.
The ComIn library can be built as a static or as a shared library. The behavior is controlled by the CMake flag -DBUILD_SHARED_LIBS.
In the host model, the compilation of ComIn can be (de-)activated with the preprocessor macro
If a user has a ComIn extension, which uses YAC, YAXT or similar, different versions of these libraries could be introduced while building the plugins and the host model itself.
To avoid potential conflicts, the following installation procedure is suggested for, e.g., YAXT library dependencies:
To use YAC from within a plugin a few things must be taken into account:
YAC has an internal lookup table for resolving IDs into actual YAC datastructures, e.g. the component ID. As YAC is linked statically this lookup table is duplicated if it is linked into both the host model and into the plugin. This can be prevented by adding -Wl,--export-dynamic-symbol=yac_* to the LDFLAGS for the host model as well as for the plugin. This has the effect that all YAC symbols are resolved dynamically and are therefore not duplicated.
ComIn provides access to the YAC instance ID of the host model, see comin_descrdata_types::t_comin_descrdata_global::yac_instance_id. Plugins can use it to register their own YAC components.
If ComIn is configured with COMIN_ENABLE_YAC, the replay tool also instantiates YAC and provides access to the YAC instance.
See plugins/python_adapter/examples/yac_example.py for an example of how to use YAC in a Python plugin.
The multi-language implementation of the ComIn interfaces makes some assumptions regarding the data types used.
wp == C_DOUBLE
INTEGER == INTEGER(C_INT)
The kind parameter wp selects the real kind used for global and parallel domain data grids in ICON ComIn. Presently, ICON ComIn assumes C_DOUBLE as the double precision real kind type parameter. Regarding INTEGER == INTEGER(C_INT), note that for most compilers the two should be the same.
The implementation covers the majority of routines for C/C++ plugins equivalent to the Fortran features (see $BASEDIR/comin/src/comin_plugin_interface.F90 for the routines and types accessible to Fortran plugins). The C interface handles nearly all data structures through getter and setter functions. The alternative implementation method, namely the direct exposure of Fortran derived types as C structs via the BIND(C) attribute, has not been chosen because the use of the Fortran ALLOCATABLE, POINTER or SEQUENCE attributes causes subtle problems. The exception is t_comin_var_descriptor, which represents the ubiquitous search key for variables. The routines accessible to C/C++ plugins are listed and explained in this section below (the C/C++ routine access is provided via comin.h and sub-header files).
C enumeration (enum) types are used to give access to lists of constants. These include the list of entry points into ICON that is available to 3rd party plugins via comin.h (ENTRY_POINT), a list of flags (VARACCESS_FLAG), and a list of integer constants that provide an interpretation of the vertical axis. The Fortran constants are made accessible to C/C++ plugins via the BIND(C) attribute given in the Fortran ENUM statement.
Auxiliary pointer access routines expose specific grid data and domain information, such as longitude and latitude data grids, via comin_header_c_ext_descrdata_get_domain.h and comin_header_c_ext_descrdata_get_global.h, which are part of comin.h. These quantities, arrays and structures are part of the global and domain data structures within ICON. The corresponding derived types in ICON ComIn are found in $BASEDIR/comin/src/comin_descrdata.F90. The derived type components are partly allocatable and specified at runtime. Several of them are also defined as Fortran POINTER. Therefore, access is provided to the C/C++ plugins via pointer handles to the overarching data structures, which establish read and, in some cases, write access through query routines. For example, the routine comin_descrdata_get_domain_cells exposes the grid cell coordinates and parameters to C/C++ plugins. Further routines are then employed to provide access to the individual entities. In particular, comin_descrdata_get_domain_cells_clon and comin_descrdata_get_domain_cells_clat provide access to the longitude and latitude coordinates.
The C/C++ interfaces are partly generated automatically by Python scripts (comin_build_header_descrdata_get_domain.py, comin_build_header_descrdata_get_global.py and comin_build_linked_lists.py). These scripts are located in the $BASEDIR/comin/utils directory and have to be run from there whenever code changes affect the descriptive data structures.
The Python interface (import comin) registers new callbacks through decorators (@comin.register_callback(<entrypoint>), or short: @comin.<entrypoint>):
The name of the function doesn't play a role in this case.
The interface also provides the data structures and functions
The Python implementation of the variable metadata is written in such a way that it can be accessed as a dictionary:
Resolution of modules in Python is based on the sys.path variable. Initializing this variable is a somewhat intricate process, as it depends on the path of the interpreter, sys.executable. ComIn, which executes Python code in an embedded environment, requires explicit definition of the path to the Python executable.
By default, ComIn selects the Python executable found by CMake during its configuration phase. This is derived from the PYTHON variable used in ICON's configuration phase. This value can be overridden at runtime by setting the COMIN_PYTHON_EXECUTABLE environment variable.
To use a virtual environment, ensure that sys.executable points to the Python executable located in the bin directory within your virtual environment.
This way, sys.executable refers to the correct Python executable in your virtual environment's bin directory.
Furthermore, ComIn adds the directory of the plugin script to sys.path. This allows the import of modules located in the same directory (in analogy to running a Python script).
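As a rough illustration of the module resolution described above, the following sketch uses only the Python standard library. It shows a diagnostic that a plugin could run to inspect sys.executable and to check whether a virtual environment is active, and it mimics adding a script directory to sys.path. The helper names and the plugin path are hypothetical, and whether ComIn prepends or appends the directory is an assumption here.

```python
import os
import sys


def in_virtual_environment():
    """Return True if the interpreter runs inside a virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def add_script_dir_to_path(script_path):
    """Put the directory of `script_path` at the front of sys.path (only once)."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    if script_dir not in sys.path:
        sys.path.insert(0, script_dir)
    return script_dir


# Hypothetical plugin location, used only for illustration.
plugin_dir = add_script_dir_to_path("/tmp/comin_plugins/my_plugin.py")

print("sys.executable:", sys.executable)
print("virtual environment active:", in_virtual_environment())
print("plugin directory on sys.path:", plugin_dir in sys.path)
```

Printing these values from within a plugin can help to verify that the intended interpreter and environment are picked up.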
The following entry points are mandatory:
EP_SECONDARY_CONSTRUCTOR: in the initialization phase
EP_DESTRUCTOR: before the model returns from execution (usually before MPI_Finalize)
EP_FINISH: before the model returns in case an exception is detected
Additionally:
EP_<COMP>_INIT_FINALIZE: at the final phase of initialization of the model component <COMP>. This gives the plugins the possibility to finalize their initial setup.
The calls of the callback subroutines ("entry points") should be placed outside of any IF or CASE constructs related to the host model's physical processes. The callbacks should be executed even if the corresponding physical process is switched off in the host model. If the physical process in the host model is called at a longer interval than the time step of the corresponding model domain (nest), the callback subroutine should still be called every time step.
For clarity, it is recommended to enclose each entry point in its own #ifdef environment, even if two entry points follow each other directly.
For each physical process of the host model, the corresponding entry points should be included pairwise, before and after the call of the physical process (_BEFORE and _AFTER). For loops, there should be four entry points: before and after the loop (_BEFORE and _AFTER), and at the beginning and the end of the loop body, directly after DO and before END DO (_START and _END).
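The placement rules above can be sketched with a toy loop. This is only an illustration of the call order, not host model code: LOOP and PROC are hypothetical placeholders rather than ICON entry point names, and call_entry_point stands in for the actual callback invocation.

```python
def run_time_loop(n_steps, process_active, call_entry_point):
    """Mimic a host model loop that contains one physical process."""
    call_entry_point("EP_LOOP_BEFORE")
    for _ in range(n_steps):
        call_entry_point("EP_LOOP_START")   # directly after DO

        call_entry_point("EP_PROC_BEFORE")  # outside the IF construct
        if process_active:
            pass  # the physical process would run here
        call_entry_point("EP_PROC_AFTER")   # outside the IF construct

        call_entry_point("EP_LOOP_END")     # directly before END DO
    call_entry_point("EP_LOOP_AFTER")


# Record the order of the entry point calls for a switched-off process.
calls = []
run_time_loop(n_steps=2, process_active=False, call_entry_point=calls.append)
print(calls)
```

Because the _BEFORE and _AFTER calls sit outside the IF construct, they are issued in every step even though the process itself is switched off.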
The descriptive data structures in ComIn are filled by calls of the respective routines from ICON. These are
icon_expose_descrdata_global
icon_expose_descrdata_domain
icon_expose_descrdata_state
icon_expose_descrdata_parallel
icon_expose_timesteplength_domain
and they are called with the input parameters from ICON as
For icon_expose_descrdata_domain, in addition to p_patch as patch, the vgrid_buffer variable is used to get access to z_ifc, which is not stored in p_nh_state(jg)%metrics%z_ifc at the time of the primary constructor.
As the simulation status in t_comin_descrdata_state is stored as ISO 8601 character strings, a conversion using the datetimeToString procedure is required. This conversion uses components of time_config from mo_time_config.
For icon_expose_descrdata_parallel, p_patch is read as patch.
The additional routine expose_timesteplength_domain fills the time steps for each domain.
During a simulation icon_update_descrdata_state() will be called to execute comin_descrdata_update_state() and update sim_current of comin_descrdata_state. In ICON the routine is called from src/atm_dyn_iconam/mo_nh_stepping when mtime_current is updated in perform_nh_timeloop.
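Since the simulation status is stored as ISO 8601 character strings, a plugin typically has to parse such a string before doing date arithmetic. The following is a minimal sketch using the Python standard library. The two date strings are made-up example values, and the exact string format delivered by ICON is an assumption.

```python
from datetime import datetime

# Hypothetical example values; the exact format produced by ICON is assumed.
sim_start = "2021-07-14T00:00:00.000"
sim_current = "2021-07-14T00:10:00.000"

# Parse the ISO 8601 strings into datetime objects.
start = datetime.fromisoformat(sim_start)
current = datetime.fromisoformat(sim_current)

# Time elapsed since the start of the simulation, in seconds.
elapsed_seconds = (current - start).total_seconds()
print(elapsed_seconds)
```

If the strings carry additional elements that datetime.fromisoformat does not accept, a dedicated ISO 8601 parser would be needed instead.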
To finalize the descriptive data from the host model the routine comin_descrdata_finalize can be called. Note that currently the routine does not contain instructions.
By setting the ICON namelist parameter inwp_gscp=-1, ICON expects an external microphysics scheme, e.g. by a ComIn plugin. The ICON variables QV, QI, QC, QR and QS are mandatory. Thus, the microphysics plugin needs to fill these with meaningful prognostic or diagnostic values.
To summarize the previous sections, the adapter library provides the following data structures and library functions. Built-in subroutines and functions of the adapter library do not access data except their respective arguments.
__NO_ICON_COMIN__