ComIn 0.5.1
ICON Community Interface
Advanced Topics

Build instructions for ComIn plugins

ComIn plugins are shared libraries that are attached to ICON by the operating system's dynamic loader. We recommend using CMake to build a ComIn plugin. As a first step, create a separate CMake project and place your plugin source there.

Next, build ComIn itself. We strongly recommend an out-of-source build (the instructions can be found in the next section). ComIn provides a CMake package configuration file (ComInConfig.cmake) so that it can easily be found from your CMake project. Then establish the connection between your CMake project and ComIn:

cd your_project

Plugins link against the header-only (plus Fortran module files) ComIn::Plugin library, which provides prototypes for all ComIn functions but no implementation. The host binary (ICON or the replay tool) is expected to provide the symbols that the dynamic loader resolves. Next, create a CMakeLists.txt in your CMake project with the following lines:

project(name_of_your_project LANGUAGES Fortran)
find_package(ComIn)
add_library(your_plugin MODULE your_plugin.F90)
target_link_libraries(your_plugin ComIn::Plugin)

Note: The example above assumes a Fortran plugin. For a C plugin, the LANGUAGES clause does not need to be specified. Afterwards, create a build directory and build your plugin:

mkdir build && cd build
export ComIn_DIR=path_to_the_comin_build_directory
cmake ..
make

Modifying the model state

Plugins are granted access to the same fields as ICON itself. The pointers that are handed out point to the same memory that ICON's own parametrizations use. That makes it easy to modify any variable to change the trajectory of the simulation. However, there are some exceptions and caveats.

The wind variables u and v, and the temperature temp are not the prognostic variables of ICON's dynamics. Instead, they are diagnosed from the edge-normal wind speed vn, and from the virtual potential temperature theta_v and the Exner function, respectively. This constrains their use for updates.

Direct updates to the temperature can only be made between EP_ATM_SURFACE_BEFORE and EP_ATM_MICROPHYSICS_AFTER. Before this window, they are overwritten by the conversion from the prognostic variables. After it (i.e. in the slow-physics part), the temperature has already been written back, and further changes are not transferred to the prognostic variables. Note that such changes can still have an observable effect because they break the assumed consistency between temp, theta_v and exner. Later updates must use one of the slow-physics tendencies (which are not necessarily reset to zero every time step). The same applies to changes to the water variables qv, qc and qi, which affect the conversion back to virtual temperature.
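
To make the consistency issue concrete, here is a minimal, self-contained sketch (plain NumPy, not ComIn or ICON code) using the standard meteorological relations T_v = theta_v * exner and T_v ≈ T (1 + 0.61 qv - qc - qi); all numerical values are purely illustrative:

```python
import numpy as np

def virtual_temperature(temp, qv, qc, qi):
    # Standard approximation: T_v = T * (1 + 0.61*qv - qc - qi)
    return temp * (1.0 + 0.61 * qv - qc - qi)

def theta_v_from_temp(temp, exner, qv, qc, qi):
    # Invert the diagnostic relation T_v = theta_v * exner at fixed exner.
    return virtual_temperature(temp, qv, qc, qi) / exner

temp = np.array([280.0])      # illustrative temperature [K]
exner = np.array([0.95])      # illustrative Exner function value
qv, qc, qi = 5e-3, 0.0, 0.0   # illustrative moisture mixing ratios

theta_v = theta_v_from_temp(temp, exner, qv, qc, qi)

# A temperature edit made after the write-back window is NOT converted
# back to theta_v: temp and (theta_v, exner) are now mutually inconsistent.
temp_edited = temp + 1.0
theta_v_unchanged = theta_v
assert not np.allclose(theta_v_unchanged,
                       theta_v_from_temp(temp_edited, exner, qv, qc, qi))
```

The same mechanism explains the qv/qc/qi caveat: editing the moisture variables changes the value of `virtual_temperature`, and therefore the result of any later conversion between temp and theta_v.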

Updates to the wind variables must be reflected both in the variables themselves and in the turbulence tendencies ddt_u_turb, etc. This is necessary because the time loop adds the converted tendencies to vn in an effort to minimize artifacts from interpolating back and forth between u, v, and vn. In general, fast-physics updates should also update the corresponding tendency because some parametrizations may depend on it. Slow physics must only update the slow-physics tendencies from the SSO, GWD and convection schemes, not the variables themselves.
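
As a schematic illustration (plain NumPy, not the ComIn API; the time step and field names are stand-ins for the corresponding ICON quantities), a fast-physics wind update applied in the required two places could look like:

```python
import numpy as np

dt = 60.0                          # assumed fast-physics time step [s]
u = np.array([10.0, 12.0])         # stand-in for ICON's diagnostic u field
ddt_u_turb = np.zeros_like(u)      # stand-in for the turbulence tendency

du = np.array([0.5, -0.25])        # wind increment the plugin wants to apply

# Apply the increment to the variable itself AND fold it into the
# turbulence tendency, so the tendency-based update of vn sees it too.
u += du
ddt_u_turb += du / dt
```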

In particular, wind tendencies must be updated in or after EP_ATM_TURBULENCE_AFTER because the scheme overwrites all previous contents. If the plugin changes the wind profile strongly, it may be beneficial to put the update to the variables themselves in EP_ATM_TURBULENCE_BEFORE to smooth out the additional gradients.

The updates are only applied when the respective part of the physics loop actually runs. This is usually not a problem for fast physics (unless all fast-physics processes are disabled), but slow-physics processes run at reduced rates. Thus, there may be time steps in which none of the slow processes runs and no tendencies are added. Users have to make sure that no plugin updates get lost. That can be done by providing the plugin with the calling frequency through, e.g., a configuration file, or by forcing one of the slow-physics parametrizations to run every time step.
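
The compensation for a reduced calling rate can be sketched as follows (plain Python, not ComIn code; the time step, calling interval and forcing values are assumptions for illustration):

```python
dt = 60.0       # assumed model time step [s]
n_slow = 5      # assumed: the slow-physics scheme runs every 5th step
nsteps = 20

intended = 0.01                # tendency the plugin wants applied each step
scaled = intended * n_slow     # compensate for the reduced calling rate

state = 0.0
for step in range(nsteps):
    if step % n_slow == 0:     # tendency is only picked up on these steps
        state += scaled * dt

# The accumulated forcing matches applying `intended` on every step.
assert abs(state - intended * dt * nsteps) < 1e-9
```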

Adding custom wind and temperature tendencies for plugins has been discussed and rejected because of the maintenance load due to the ICON code modifications that would be required to implement them.

Testing Plugins

ComIn provides CMake functions to set up tests with CTest.

comin_add_replay_data

This command registers input data for replay tests. The data must be provided as a download link to a .tar.gz package. The data is downloaded by the target download_test_data, which is automatically added as a FIXTURES_SETUP test to the test suite.

comin_add_replay_data(NAME <name>
URL <url>
MD5HASH <md5hash>
)
  • NAME: name of the dataset. This name must be passed as the REPLAY_DATA argument to comin_add_replay_test.
  • URL: download URL of the data. Must be a compressed tarball (.tar.gz).
  • MD5HASH: MD5 checksum used to verify the download.
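
For illustration, registering a custom dataset might look like this (the name, URL and hash are placeholders, not a real download):

```cmake
comin_add_replay_data(NAME my_replay_data
  URL https://example.com/my_replay_data.tar.gz     # placeholder URL
  MD5HASH 0123456789abcdef0123456789abcdef          # placeholder checksum
)
```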

ComIn provides predefined datasets:

  • test_nwp_R02B04 (1 process, 1 domain)
  • test_nwp_R02B04_R02B05_nest (3 processes, 2 domains)

You can find information on how to create the replay datasets below.

comin_add_replay_test

This is a wrapper around add_test that adds a test to the project and takes care of the configuration of the comin_replay tool.

comin_add_replay_test(NAME <name>
[REPLAY_DATA <name> | REPLAY_DATA_PATH <path>]
[NUM_PROCS <num_procs>]
[REFERENCE_OUTPUT <dir>]
[FILTER_REGEX <regex>...])
  • NAME: name of the test (forwarded to add_test)
  • REPLAY_DATA: NAME of the data created with comin_add_replay_data (alternative to REPLAY_DATA_PATH)
  • REPLAY_DATA_PATH: Path of replay data (alternative to REPLAY_DATA)
  • NUM_PROCS: Number of processes
  • REFERENCE_OUTPUT: Path of reference output (optional)
  • FILTER_REGEX: Regular expressions for filtering the output, in sed style (e.g. s/bad/good/g) (optional, multi-value)

comin_test_add_plugin

Adds a plugin to the comin_nml of a given test. The arguments are forwarded to the corresponding comin_nml entries.

comin_test_add_plugin(TEST <test>
NAME <name>
[PLUGIN_LIBRARY <filename>]
[PRIMARY_CONSTRUCTOR <functionname>]
[OPTIONS <string>]
[COMM <string>]
)
  • TEST: name of the test the plugin should be added to
  • NAME: name of the plugin
  • PLUGIN_LIBRARY: filename of the shared object of the plugin (if any)
  • PRIMARY_CONSTRUCTOR: name of the primary constructor (default: comin_main)
  • OPTIONS: an options string passed to the plugin (default: "")
  • COMM: name of the plugin communicator (default: "", meaning no communicator is created for the plugin)
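
Putting the pieces together, a hypothetical CMakeLists.txt fragment could register a replay test and attach a plugin as follows (my_test and my_plugin are placeholder names; it is assumed that the predefined dataset test_nwp_R02B04 has been made available via comin_add_replay_data):

```cmake
comin_add_replay_test(NAME my_test
  REPLAY_DATA test_nwp_R02B04
  NUM_PROCS 1
)
comin_test_add_plugin(TEST my_test
  NAME my_plugin
  PLUGIN_LIBRARY $<TARGET_FILE:my_plugin>
)
```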

comin_test_add_external_process

Adds an external process to the test. The processes are appended to the mpirun command.

comin_test_add_external_process(TEST <test>
[NUM_PROCS <n>]
COMMAND <command>
)
  • TEST: name of the test the process should be added to
  • NUM_PROCS: Number of external processes
  • COMMAND: command to be executed on the additional processes
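
For example, attaching a hypothetical external tool on two additional MPI ranks might look like this (my_test and my_external_tool are placeholder names):

```cmake
comin_test_add_external_process(TEST my_test
  NUM_PROCS 2
  COMMAND my_external_tool
)
```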

Record & Replay functionality

In principle, the technical development of a plugin can be carried out without the ICON host model being present; see the section "Build instructions for ComIn plugins" above. In general, however, the objectives of a plugin are too complex and involve too many variables to allow development independent of an ICON run. On the other hand, starting a complete model run is resource- and time-intensive, which in turn limits plugin development. For this case, the ICON Community Interface offers the comin_replay tool, which is described in the following.

Replay tool

Located in the replay_tool subdirectory, a small executable comin_replay is distributed together with the ComIn source code. This tool performs the task of making previously recorded data sets (ICON variables, descriptive data) available for ComIn plugins. It therefore acts as a fake host model, which simply reproduces data records that were previously captured with the ICON model.

Let us assume, for the time being, that such data records already exist. They are stored in the NetCDF file format, and it is implicitly assumed that the replay process is executed with as many processes as the ICON simulation itself (each NetCDF file stores the partition of a single MPI task).

To illustrate the replay process, we attach the "simple_python_plugin" explained above to comin_replay. This will enable us to develop additional functionality in the "simple_python_plugin", using real data input recorded from the ICON model.

The replay tool expects a Fortran namelist file as a command-line argument, which contains a definition of the comin_nml and a replay_tool_nml. It looks quite similar to the usual comin_nml for ICON, with an additional plugin, libcomin_var_replay_plugin.so, that loads the recorded variables back into memory. Note that there is no need to specify the list of available variables or the entry point at which the variables were recorded - this is automatically retrieved from the recorded NetCDF files.

&replay_tool_nml
replay_data_path = "path/to/the/replay_data/"
msg_level = 42
/
&comin_nml
plugin_list(1)%name = "var_replay_plugin"
plugin_list(1)%plugin_library = "$COMIN_DIR/build/replay_tool/libcomin_var_replay_plugin.so"
plugin_list(2)%name = "simple_python_plugin"
plugin_list(2)%plugin_library = "libpython_adapter.so"
plugin_list(2)%options = "$PLUGINDIR/simple_python_plugin.py"
/

Execution happens in a run script with the same parallel queue settings as the ICON run. You might, for example, create a copy of the full ICON run script, simply replacing the MODEL=.../icon setting by the comin_replay executable. Note, however, that usually the ICON model run comprises additional MPI tasks, e.g., for asynchronous output writing. Therefore, the number of MPI tasks has to be decreased accordingly for the replay run by adjusting the --ntasks argument of the srun command.

Note: It is currently not supported to add the var_replay_plugin plugin multiple times to the comin_replay run.

Recorder plugins

Two separate plugins are provided which capture the ICON host model data during a full model run. Both are located in the replay_tool subdirectory and compiled during ComIn's build process:

  • build/replay_tool/libcomin_run_recorder_plugin.so: This plugin dumps all descriptive data to disk. It is attached to the ICON model as a secondary constructor callback, which collects most of the descriptive data. During the remaining callbacks, additional time-dependent descriptive data is recorded.
  • build/replay_tool/libcomin_var_recorder_plugin.so: This plugin captures the data arrays of a given set of variables at a given entry point. Before attaching this plugin to the ICON model run, the hard-coded entry point constant ep has to be set in the source file replay_tool/comin_var_recorder_plugin.F90. The list of variables to be recorded is provided as a comma-separated string via the namelist parameter comin_nml :: plugin_list(i)%options.

Example: The default entry point in comin_var_recorder_plugin.F90 is ep = EP_ATM_TIMELOOP_END. We change this to EP_ATM_WRITE_OUTPUT_BEFORE and rebuild the recorder plugins.

Afterwards, we can activate the recorder plugins with the following ICON namelist setting, capturing the pressure field pres:

&comin_nml
plugin_list(1)%name = "run_recorder_plugin"
plugin_list(1)%plugin_library = "$COMIN_DIR/build/replay_tool/libcomin_run_recorder_plugin.so"
plugin_list(2)%name = "var_recorder_plugin"
plugin_list(2)%plugin_library = "$COMIN_DIR/build/replay_tool/libcomin_var_recorder_plugin.so"
plugin_list(2)%options = "pres"
/

During the ICON run, various NetCDF files are created in the experiments folder; the usual path would be build/experiments/....

  • The descriptive data files are named <prefix>XXX.nc, where XXX denotes the MPI rank and <prefix> is an optional name prefix.
  • The variable contents are stored in files vars_XXX.nc, where XXX denotes the MPI rank.

All files contain certain meta-data attributes, e.g. the ComIn version. As described above, they can now be used by the comin_replay tool to facilitate stand-alone plugin runs for development.

Note: Currently, collecting ICON data for multiple entry points requires several independent model runs.

Using GPUs

ICON supports massively parallel accelerator devices such as GPUs (Graphics Processing Units). For a detailed description of this parallelization model, see the ICON tutorial (DOI: 10.5676/DWD_pub/nwv/icon_tutorial2025), Section 8.5 "ICON on Accelerator Devices". In the context of the Community Interface, the most important aspect is the handling of the separate GPU and CPU memory. ComIn provides a set of API functions that organize the transfer of variables between a plugin running on either CPU or GPU and a GPU-based ICON simulation. This section describes the use of this API by means of a practical example, executed on the Levante supercomputer of the DKRZ.

Preparation: GPU-enabled ICON binary

As a first step, we set up a Python virtual environment called venv. We need this environment to install the CuPy library for GPU-accelerated computing. CuPy shares the same API set as NumPy and SciPy, allowing it to be a drop-in replacement for running NumPy/SciPy code on the GPU.

/sw/spack-levante/miniforge3-24.11.3-2-Linux-x86_64-rf4err/bin/python -m venv venv
source venv/bin/activate
pip install numpy setuptools cupy-cuda12x

The next step is to build a GPU-enabled ICON binary on Levante. We clone the ICON repository,

module load git
git clone git@gitlab.dkrz.de:icon/icon-model.git
cd icon-model/
git submodule update --init --recursive

and create a build directory. Here, we already copy a suitable ComIn+GPU test script to the run/ folder. This test exp.atm_tracer_Hadley_comin_portability is provided together with the ICON source code and runs a ComIn plugin written in the Python programming language, externals/comin/plugins/python_adapter/test/gpu_test.py.

mkdir build && cd build/
cp ../run/checksuite.infrastructure/comin/exp.atm_tracer_Hadley_comin_portability ../run/

As usual with ICON, users should run an appropriate platform- or machine-specific configuration wrapper that sets the required compiler and linker flags; in this case, it is the config/dkrz/levante.gpu.nvhpc-24.7 script:

../config/dkrz/levante.gpu.nvhpc-24.7 --enable-comin --disable-jsbach --disable-quincy --disable-rte-rrtmgp --enable-bundled-python=comin --disable-silent-rules
make -j16

Now all the necessary preparations are done and we have a GPU-enabled ICON binary along with a ComIn Python adapter. Note that these preparations only need to be done once.

Generate the run script with the following command:

./make_runscripts --all

Then adjust the account in run/exp.atm_tracer_Hadley_comin_portability.run. Afterwards, to test the setup:

sbatch run/exp.atm_tracer_Hadley_comin_portability.run

GPU-enabled Python plugin: gpu_test.py

In the following, we will focus on the plugin script itself, i.e. the Python script build/externals/comin/plugins/python_adapter/test/gpu_test.py. In particular, we will explain ComIn's host-device copy mechanism, summarized in the following table:

                 CPU plugin (Python: NumPy)   GPU plugin (Python: CuPy)
GPU ICON         auto. host-device copy       no memcopy required
CPU ICON         no memcopy required          model abort
ComIn initially assumes that the plugin is to be executed on the CPU host, regardless of whether the ICON model runs on GPUs or not. This default enables the execution of unmodified "legacy" plugins. If the test script is left unmodified, fields are copied to the host prior to the ComIn callback.

Besides this, the plugin developer can check the availability of a GPU using the has_device entry of the descriptive data. The following statement prints whether an accelerator device is available:

print(f"{glb.has_device=}", file=sys.stderr)

See the Python API documentation for other GPU-related information contained in the ComIn descriptive data.

If the plugin is also to run on GPUs (which is often possible without major adjustments thanks to the NumPy replacement module CuPy), then read access to a variable on the GPU can be specified using a flag for comin.var_get: comin.COMIN_FLAG_READ | comin.COMIN_FLAG_DEVICE.

pres = comin.var_get([comin.EP_ATM_WRITE_OUTPUT_BEFORE], ("pres", 1),
comin.COMIN_FLAG_READ | comin.COMIN_FLAG_DEVICE)

ComIn also catches the error case in which a plugin requests GPU execution although ICON itself was not started on GPUs.

As explained above, we can use the CuPy library as a drop-in replacement for running NumPy code on the GPU. To do this, our Python example replaces

import numpy as xp

with

import cupy as xp

Our test script automatically detects the availability of CuPy, enabling it to run in both execution modes: as a GPU or as a CPU plugin.
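
The detection can be done with a standard import-fallback pattern (a sketch of the idea, not the test script verbatim):

```python
# Prefer CuPy when it is installed and a device is usable; otherwise fall
# back to NumPy. Downstream code uses `xp` and works in both modes.
try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU device is present
except Exception:
    import numpy as xp

a = xp.asarray([1.0, 2.0, 3.0])
total = float(a.sum())
```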

We will now discuss write access to ICON variables. The non-trivial case is when ICON runs on GPUs but a non-GPU plugin needs to write to a variable array. As with read access, the porting effort is minimal: a non-GPU plugin attached to a GPU-enabled ICON run only needs to set comin.COMIN_FLAG_WRITE; comin_callback_context_call then automatically takes care of the data transfers. This corresponds to the "execution in GPU section" column of the following table:

COMIN_FLAG_DEVICE | COMIN_FLAG_READ | COMIN_FLAG_WRITE | execution in GPU section | execution in CPU section (lacc=.FALSE.)
- | x | - | update host memory before callback | -
- | - | x | update device memory after callback | -
- | x | x | update host memory before callback; update device memory after callback | -
x | x | - | - | warning
x | - | x | - | warning
x | x | x | - | warning

Detail: The right column, "execution in CPU section", refers to sections of the ICON code that have not yet been ported to GPUs. In the (rare) case that an entry point is located in such a section, access to GPU-enabled variables triggers a warning in ComIn; no special host-to-device update for read access or device-to-host update for write access has been implemented.

One final remark on descriptive data structures: descriptive data, such as the domain data, is not automatically transferred to the GPU device; plugins only receive CPU pointers. This is because these data structures are complex and plugins do not register for specific items, so all of the data would have to be kept consistent on the GPU even though much of it is never needed there.

On the other hand, it is relatively straightforward to synchronise the descriptive data arrays on the fly on the plugin side, and the runtime overhead is negligible since most plugins only require these structures during initialization. We demonstrate this in the following code snippet:

comin.print_info(f"{np.asarray(domain.cells.clon[:10])=}")

Similarly, it is possible to create a CuPy array and move the data to the current GPU device:

clon = cp.asarray(domain.cells.clon[:10])