How rcdo works

The rcdo package does very little. It merely translates R functions into CDO commands that are then executed via system() calls. This makes the package relatively simple but requires the user to have CDO installed separately.
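As a rough illustration of that translation step, here is a minimal base R sketch. It is not rcdo's actual implementation: the helper build_cdo_args() and the file names in.nc and out.nc are made up for this example, and it uses system2() rather than system() only because it takes the arguments as a vector.

```r
# Hypothetical helper: turn an operator name and its parameters into the
# argument vector of a CDO call ("operator,param1,param2 input output").
build_cdo_args <- function(operator, params = NULL, input, output) {
  op <- paste(c(operator, params), collapse = ",")
  c(op, input, output)
}

args <- build_cdo_args("monmean", input = "in.nc", output = "out.nc")
args

# Only try to run the command if a cdo binary and the input file exist.
if (nzchar(Sys.which("cdo")) && file.exists("in.nc")) {
  system2("cdo", args)
}
```

The point is only that the R side assembles text; all of the actual work is done by the external cdo program.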

Each CDO operator has an equivalent rcdo function, which is prefixed with cdo_. So, if you want to use the monmean CDO operator to resample a time series into monthly values, you would use the cdo_monmean() function.

By default, rcdo will use the system-installed version of CDO. This is safe and convenient, but there’s no guarantee that the CDO version on your system is the same as the CDO version used to generate the current rcdo version. A version mismatch is not critical, since the vast majority of functionality and documentation will be compatible, so rcdo will emit a one-time warning but will otherwise still try to execute commands.

cdo_use("system")  # The default
#> Using system CDO, version 2.4.3.
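The "one-time warning" behaviour can be sketched in base R as follows. This is only an assumption about how such a check might look, not rcdo's code: check_cdo_version() and the warned environment are hypothetical names, and the version strings are the ones shown in this document.

```r
# Hypothetical state used to remember that the warning was already shown.
warned <- new.env()

# Hypothetical check: warn once if the found and supported versions differ.
check_cdo_version <- function(found, supported) {
  mismatch <- numeric_version(found) != numeric_version(supported)
  if (mismatch && !isTRUE(warned$done)) {
    warned$done <- TRUE  # set the flag first so the warning is never repeated
    warning("CDO version ", found, " differs from supported version ",
            supported, call. = FALSE)
  }
  invisible(!mismatch)
}
```

Calling check_cdo_version("2.4.3", "2.5.1") twice warns on the first call only; both calls return FALSE invisibly.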

cdo_install() will try to download, compile and install the “supported” CDO version. We can then use cdo_use("packaged") to tell rcdo to use the packaged version.

# cdo_install()
cdo_use("packaged")
#> Using packaged CDO, version 2.5.1.

Using rcdo

We will use a sample file.

file <- system.file("extdata", "hgt_ncep.nc", package = "rcdo")

We can get a quick look at the contents of the file with the sinfo (short info) operator using the cdo_sinfo() function.

file |> 
  cdo_sinfo() |> 
  cdo_execute()
#>  [1] "   File format : NetCDF4 classic"                                                      
#>  [2] "    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter ID"   
#>  [3] "     1 : NCEP     NCEP/DOE v instant       3   1     10512   1  F32  : -1            " 
#>  [4] "   Grid coordinates :"                                                                 
#>  [5] "     1 : lonlat                   : points=10512 (144x73)"                             
#>  [6] "                              lon : 0 to 357.5 by 2.5 degrees_east  circular"          
#>  [7] "                              lat : 90 to -90 by -2.5 degrees_north"                   
#>  [8] "   Vertical coordinates :"                                                             
#>  [9] "     1 : pressure                 : levels=3"                                          
#> [10] "                            level : 1000 to 500 millibar"                              
#> [11] "   Time coordinate :"                                                                  
#> [12] "                             time : 24 steps"                                          
#> [13] "     RefTime =  1800-01-01 00:00:00  Units = hours  Calendar = standard  Bounds = true"
#> [14] "  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss"  
#> [15] "  2000-01-01 00:00:00  2000-02-01 00:00:00  2000-03-01 00:00:00  2000-04-01 00:00:00"  
#> [16] "  2000-05-01 00:00:00  2000-06-01 00:00:00  2000-07-01 00:00:00  2000-08-01 00:00:00"  
#> [17] "  2000-09-01 00:00:00  2000-10-01 00:00:00  2000-11-01 00:00:00  2000-12-01 00:00:00"  
#> [18] "  2001-01-01 00:00:00  2001-02-01 00:00:00  2001-03-01 00:00:00  2001-04-01 00:00:00"  
#> [19] "  2001-05-01 00:00:00  2001-06-01 00:00:00  2001-07-01 00:00:00  2001-08-01 00:00:00"  
#> [20] "  2001-09-01 00:00:00  2001-10-01 00:00:00  2001-11-01 00:00:00  2001-12-01 00:00:00"

Notice the use of cdo_execute(). Plain rcdo functions return an operation waiting to be executed.

file |> 
  cdo_sinfo() 
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] {{output}}

This could seem a bit cumbersome for just one operation, but it allows operators to be chained together, as we will see later.

sinfo is an operator with zero output files. It returns a string with information. There are other operators like this. For example, if we wanted to know how many vertical levels are in this file, we could use the cdo_nlevel() function.

file |> 
  cdo_nlevel() |> 
  cdo_execute()
#> [1] "3"

For actual data manipulation, we use operators that take one or more files and return one or more files. For instance, let’s select only the Southern Hemisphere in this dataset with the sellonlatbox operator.

sh <- file |> 
  cdo_sellonlatbox(lon1 = 0, lon2 = 360, lat1 = -90, lat2 = 0)
sh
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] {{output}}

At this point, we haven’t done anything; sh is just an operation waiting to be executed. Because it will return a file, cdo_execute() needs to know where to save the output. We can do it explicitly with the output argument.

sh |> 
  cdo_execute(output = tempfile())
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7820da3c"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:31 AEST"
#> attr(,"size")
#> [1] 1580009

(The file size and modification time are attached to the output as attributes. This potentially makes it possible to memoise functions based on them.)
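To make the memoisation idea concrete, here is a small base R sketch that checks whether a file still matches the size and mtime attributes attached to its path. The helpers tag_file() and is_unchanged() are hypothetical and not part of rcdo; the sketch also assumes the attributes hold the values reported by file.info().

```r
# Hypothetical helper: attach the current size and mtime to a path,
# mimicking the attributes shown in the output above.
tag_file <- function(path) {
  info <- file.info(path)
  structure(path, mtime = info$mtime, size = info$size)
}

# Hypothetical helper: TRUE if the file on disk still matches the attributes.
is_unchanged <- function(path) {
  info <- file.info(path)
  !is.na(info$size) &&
    isTRUE(info$size == attr(path, "size")) &&
    isTRUE(info$mtime == attr(path, "mtime"))
}

p <- tempfile()
writeLines("first version", p)
tagged <- tag_file(p)
is_unchanged(tagged)
```

A cache keyed on the path could call something like is_unchanged() before reusing a stored result, and recompute when the file has been modified or deleted.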

If we omit that argument, however, rcdo will save the result into an ephemeral file in a temporary folder.

sh_file <- sh |> 
  cdo_execute()
sh_file
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ebc7fbd2"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 1580009

This file will be deleted once the variable holding the path is removed and garbage collected.
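One common way to get this kind of behaviour in R is a finalizer registered on an environment that travels with the path. The sketch below shows that general technique under the assumption that rcdo does something similar; ephemeral_file() is a hypothetical name and the real implementation may differ.

```r
# Hypothetical helper: tie the lifetime of a file to an R object.
ephemeral_file <- function(path) {
  handle <- new.env()
  handle$path <- path
  # When `handle` is garbage collected (or R exits), delete the file.
  reg.finalizer(handle, function(h) unlink(h$path), onexit = TRUE)
  structure(path, ephemeral = list(handle))
}

p <- tempfile()
writeLines("temporary data", p)
f <- ephemeral_file(p)
file.exists(f)
```

As long as f (or a copy that keeps the attribute) exists, the file stays on disk. After rm(f) the file is removed the next time the garbage collector runs, which is also why code that strips attributes can cause the file to vanish earlier than expected.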

Since sh is not a file, applying another rcdo function will return a chained set of operations.

sh |> 
  cdo_sinfo() 
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ -sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] ] {{output}}

This is the same as

file |> 
  cdo_sellonlatbox(lon1 = 0, lon2 = 360, lat1 = -90, lat2 = 0) |> 
  cdo_sinfo() 
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ -sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] ] {{output}}
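The nesting in the printed command can be reproduced with a toy data structure. The sketch below is not rcdo's internal representation: new_op() and render() are hypothetical names, and the only thing taken from the output above is the bracket syntax with a leading dash for inner operators.

```r
# A toy "operation": a description of what to run, not yet executed.
new_op <- function(operator, input, params = NULL) {
  structure(list(operator = operator, input = input, params = params),
            class = "toy_op")
}

# Render an operation (or a plain file path) as CDO-style text.
render <- function(x, top = TRUE) {
  if (!inherits(x, "toy_op")) {
    return(paste0("'", x, "'"))  # plain file path
  }
  op <- paste(c(x$operator, x$params), collapse = ",")
  inner <- render(x$input, top = FALSE)
  paste0(if (top) "" else "-", op, " [ ", inner, " ]")
}

sh_toy <- new_op("sellonlatbox", "hgt_ncep.nc", c(0, 360, -90, 0))
render(new_op("sinfo", sh_toy))
#> [1] "sinfo [ -sellonlatbox,0,360,-90,0 [ 'hgt_ncep.nc' ] ]"
```

Because the input of an operation can itself be an operation, chaining is just nesting, and nothing needs to run until the whole expression is rendered and handed to CDO.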

We can execute the chain and confirm that sh only selects the Southern Hemisphere.

sh |> 
  cdo_sinfo() |> 
  cdo_execute() |> 
  _[7]
#> [1] "                              lat : 0 to -90 by -2.5 degrees_north"

It’s more interesting to chain multiple data-manipulating operations. For example, let’s select only the 500hPa level.

sh_500 <- sh |> 
  cdo_sellevel(500) |> 
  cdo_execute() 

We can confirm that the result only has 1 level.

sh_500 |> 
  cdo_nlevel() |> 
  cdo_execute()
#> [1] "1"

Other operators take more than one file as arguments. ymonsub subtracts two files matching the same month of year. It’s mainly used to compute monthly anomalies by first computing monthly climatology with ‘ymonmean’.

climatology <- cdo_ymonmean(file)

anomalies <- cdo_ymonsub(file, climatology) |> 
  cdo_execute()
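For readers unfamiliar with monthly anomalies, the arithmetic behind ymonmean and ymonsub can be shown on a plain numeric vector. This is a base R analogue with made-up values, not a call into CDO.

```r
# Two years of made-up monthly values (January 2000 to December 2001).
values <- as.numeric(1:24)
month  <- rep(1:12, times = 2)

# ymonmean analogue: the mean of each calendar month, aligned with `values`.
climatology <- ave(values, month)

# ymonsub analogue: subtract the climatology of the matching month.
anomalies <- values - climatology
anomalies[c(1, 13)]
#> [1] -6  6
```

January has the values 1 and 13, so its climatology is 7 and the two January anomalies are -6 and 6. CDO performs the same calculation for every grid point and level.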

Some operators take one file and return an undetermined number of files. splitmon will return one file per month. Unfortunately, rcdo cannot yet return the list of files created. The returned string is the base prefix shared by all files.

mon_split <- sh_500 |> 
  cdo_splitmon() |> 
  cdo_execute()
mon_split
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a7"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected

We can get a list of all files by globbing with an asterisk.

mon_split <- paste0(mon_split, "*") |> 
  Sys.glob()
mon_split
#>  [1] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a701.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a702.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a703.nc"
#>  [4] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a704.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a705.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a706.nc"
#>  [7] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a707.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a708.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a709.nc"
#> [10] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a710.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a711.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a712.nc"

(Note that this is not entirely reliable, since it assumes that no other files share the same prefix.)
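A slightly stricter alternative to a bare asterisk is to accept only names that consist of the base name, two digits and the .nc extension, as in the listing above. The helper below is a hypothetical base R sketch; the two-digit suffix and the extension are assumptions read off that output and may not hold for other operators or file formats.

```r
# Hypothetical helper: list files named <base><two digits>.nc only.
list_split_files <- function(base) {
  candidates <- list.files(dirname(base), full.names = TRUE)
  file_names <- basename(candidates)
  prefix <- basename(base)
  # Compare the prefix literally, then check what follows it.
  rest <- substring(file_names, nchar(prefix) + 1)
  keep <- startsWith(file_names, prefix) & grepl("^[0-9]{2}\\.nc$", rest)
  sort(candidates[keep])
}
```

With this, list_split_files(base) ignores any other file in the same folder that merely starts with the same base name, where base is the string returned by cdo_execute().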

And now we can use the files normally. These will not be automatically deleted by R (although they will eventually be deleted by your OS if they are in the correct temporary folder).

mon_split[1] |> 
  cdo_sinfo() |> 
  cdo_execute() |> 
  _[11:15]
#> [1] "   Time coordinate :"                                                                  
#> [2] "                             time : 2 steps"                                           
#> [3] "     RefTime =  1800-01-01 00:00:00  Units = hours  Calendar = standard  Bounds = true"
#> [4] "  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss"  
#> [5] "  2000-01-01 00:00:00  2001-01-01 00:00:00"

We can use functional programming to apply one or more operations to each file.

mon_split |> 
  lapply(cdo_deltat)
#> [[1]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a701.nc' ] {{output}} 
#> 
#> [[2]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a702.nc' ] {{output}} 
#> 
#> [[3]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a703.nc' ] {{output}} 
#> 
#> [[4]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a704.nc' ] {{output}} 
#> 
#> [[5]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a705.nc' ] {{output}} 
#> 
#> [[6]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a706.nc' ] {{output}} 
#> 
#> [[7]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a707.nc' ] {{output}} 
#> 
#> [[8]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a708.nc' ] {{output}} 
#> 
#> [[9]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a709.nc' ] {{output}} 
#> 
#> [[10]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a710.nc' ] {{output}} 
#> 
#> [[11]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a711.nc' ] {{output}} 
#> 
#> [[12]]
#> CDO command:
#>    /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a712.nc' ] {{output}}

To execute a list of operations, use cdo_execute_list().

mon_split |> 
  lapply(cdo_deltat) |> 
  cdo_execute_list()
#> [[1]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e2a5212ea"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[2]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7ec35c5"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[3]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e3048379e"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[4]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ef5e0588"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[5]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e10de3ecb"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[6]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e3c103371"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[7]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e16d6377f"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[8]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e56fd5ceb"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[9]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e9ebb1b8"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[10]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e6b70a80a"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[11]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ec9c1f12"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[12]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e11dec2fd"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545

We could also have executed each operation inside the lapply call. However, because of the way lapply combines outputs, the ephemeral files would be deleted. So you need to either use cdo_execute_list() or take care of explicitly creating temporary files.

mon_split |> 
  lapply(function(x) cdo_deltat(x) |> cdo_execute(output = tempfile()))
#> [[1]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e17bdb9ec"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[2]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e617c3903"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[3]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e27553394"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[4]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7e6fdf0b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[5]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e437efdf4"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[6]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e280ffaa8"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[7]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e1898f2bc"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[8]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e72aeb36b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[9]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e6cb2b04f"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[10]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e32470d4b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[11]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7f5c2b0f"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#> 
#> [[12]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e380afe7a"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545

Finally, some operators take a vector of files and return a single file. We can re-merge the list of files with cdo_mergetime().

merged <- mon_split |> 
  cdo_mergetime() |> 
  cdo_execute()
merged |> 
  cdo_ntime() |> 
  cdo_execute()   
#> [1] "24"

Because rcdo can chain operations, there is no need to execute the individual operations and then merge them. cdo_mergetime() can take a list of operations naturally, so we could do this:

mon_split |> 
  lapply(cdo_deltat) |> 
  cdo_mergetime() |> 
  cdo_execute()
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e79085d5b"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#> 
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 295001

Important limitations

The whole rcdo package and its documentation are built automatically from the CDO source. This comes with some limitations.

CDO operators are documented in “families” where all parameters are documented together. Currently there is no way for the build process to correctly attribute each parameter to the correct operator. This unfortunately means that rcdo functions have every argument from a particular family. For example, the functions cdo_selindexbox() and cdo_sellonlatbox() both have lon1 in their signature.

The conversion from rcdo function arguments to CDO command parameters is pretty dumb. Argument names are only for the user’s convenience and are not really used: function arguments are converted to CDO parameters simply in the order they are defined in the function signature. Nor are arguments checked for validity. This means that cdo_sellonlatbox(file, lon1 = 0) will not return an error even though the operator is missing other necessary arguments. Also, cdo_sellonlatbox(file, idx1 = 0) is identical to cdo_sellonlatbox(file, lon1 = 0).

Some CDO operators need named parameters. rcdo currently doesn’t know how to deal with that, so you need to pass the names yourself. cdo_select(file, name = "temperature") doesn’t work; you need cdo_select(file, name = "name=temperature") instead, which is equivalent to cdo_select(file, code = "name=temperature") due to the previously described limitation.

Some parameters need to be quoted. For example, the expression in the expr operator needs to be surrounded by quotes. Again, rcdo doesn’t know this so you must “double quote” the argument yourself; i.e. cdo_expr(file, "'t_celcius=t-273.15'").

Some documentation formatting might be incorrect. If you spot some part of the documentation that didn’t survive the conversion, open an issue!
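The argument-handling limitations above can be mimicked with a toy function in base R. Everything here is an illustrative assumption: toy_sellonlatbox() only imitates the described behaviour (parameters pasted in signature order, names ignored, no validation), and named_param() and quoted_param() are hypothetical conveniences, not rcdo functions.

```r
# Toy imitation of a generated function that shares arguments with its family.
toy_sellonlatbox <- function(lon1 = NULL, lon2 = NULL, lat1 = NULL,
                             lat2 = NULL, idx1 = NULL) {
  # Values are pasted in signature order; the names play no further role.
  params <- c(lon1, lon2, lat1, lat2, idx1)
  paste(c("sellonlatbox", params), collapse = ",")
}

toy_sellonlatbox(lon1 = 0)
#> [1] "sellonlatbox,0"
toy_sellonlatbox(idx1 = 0)
#> [1] "sellonlatbox,0"

# Hypothetical helpers for building the strings you have to pass yourself.
named_param  <- function(name, value) paste0(name, "=", value)
quoted_param <- function(x) paste0("'", x, "'")

named_param("name", "temperature")
#> [1] "name=temperature"
quoted_param("t_celcius=t-273.15")
#> [1] "'t_celcius=t-273.15'"
```

Small wrappers like these keep the workaround in one place, so calls such as cdo_select(file, name = named_param("name", "temperature")) remain readable.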