How rcdo works
The rcdo package does very little. It merely translates R functions
into CDO commands that are then executed via system() calls. This makes
the package relatively simple but requires the user to have cdo
installed separately.
Each CDO operator has an equivalent rcdo function, which is prefixed
with cdo_. So, if you want to use the monmean CDO operator to resample a
time series into monthly values, you would use the cdo_monmean()
function.
By default, rcdo will use the system-installed CDO. This is safe and convenient, but there’s no guarantee that the CDO version on your system is the same as the CDO version used to generate the current rcdo version. A version mismatch is not critical, since the vast majority of functionality and documentation will be compatible, so rcdo will emit a one-time warning but will otherwise still try to execute commands.
cdo_use("system") # The default
#> Using system CDO, version 2.4.3.
cdo_install() will try to download, compile and install the “supported”
CDO version, and then we can use cdo_use("packaged") to tell rcdo to use
the packaged version.
# cdo_install()
cdo_use("packaged")
#> Using packaged CDO, version 2.5.1.
Using rcdo
We will use a sample file.
file <- system.file("extdata", "hgt_ncep.nc", package = "rcdo")
We can get a quick look at the contents of the file with the sinfo
(short info) operator using the cdo_sinfo() function.
file |>
cdo_sinfo() |>
cdo_execute()
#> [1] " File format : NetCDF4 classic"
#> [2] " -1 : Institut Source T Steptype Levels Num Points Num Dtype : Parameter ID"
#> [3] " 1 : NCEP NCEP/DOE v instant 3 1 10512 1 F32 : -1 "
#> [4] " Grid coordinates :"
#> [5] " 1 : lonlat : points=10512 (144x73)"
#> [6] " lon : 0 to 357.5 by 2.5 degrees_east circular"
#> [7] " lat : 90 to -90 by -2.5 degrees_north"
#> [8] " Vertical coordinates :"
#> [9] " 1 : pressure : levels=3"
#> [10] " level : 1000 to 500 millibar"
#> [11] " Time coordinate :"
#> [12] " time : 24 steps"
#> [13] " RefTime = 1800-01-01 00:00:00 Units = hours Calendar = standard Bounds = true"
#> [14] " YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss"
#> [15] " 2000-01-01 00:00:00 2000-02-01 00:00:00 2000-03-01 00:00:00 2000-04-01 00:00:00"
#> [16] " 2000-05-01 00:00:00 2000-06-01 00:00:00 2000-07-01 00:00:00 2000-08-01 00:00:00"
#> [17] " 2000-09-01 00:00:00 2000-10-01 00:00:00 2000-11-01 00:00:00 2000-12-01 00:00:00"
#> [18] " 2001-01-01 00:00:00 2001-02-01 00:00:00 2001-03-01 00:00:00 2001-04-01 00:00:00"
#> [19] " 2001-05-01 00:00:00 2001-06-01 00:00:00 2001-07-01 00:00:00 2001-08-01 00:00:00"
#> [20] " 2001-09-01 00:00:00 2001-10-01 00:00:00 2001-11-01 00:00:00 2001-12-01 00:00:00"
Notice the use of cdo_execute(). Plain rcdo functions return an
operation waiting to be executed.
file |>
cdo_sinfo()
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] {{output}}
This may seem a bit cumbersome for just one operation, but it allows operators to be chained together, as we will see later.
sinfo is an operator with zero output files. It returns a string with
information. There are other operators like this. For example, if we
wanted to know how many vertical levels are in this file, we could use
the cdo_nlevel() function.
file |>
cdo_nlevel() |>
cdo_execute()
#> [1] "3"
For actual data manipulation, we use operators that take one or more
files and return one or more files. For instance, let’s select only the
Southern Hemisphere in this dataset with the sellonlatbox operator.
sh <- file |>
cdo_sellonlatbox(lon1 = 0, lon2 = 360, lat1 = -90, lat2 = 0)
sh
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] {{output}}
At this point, we haven’t done anything; sh is just an operation
waiting to be executed. Because it will return a file, cdo_execute()
needs to know where to save the output. We can do it explicitly with the
output argument.
sh |>
cdo_execute(output = tempfile())
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7820da3c"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:31 AEST"
#> attr(,"size")
#> [1] 1580009
(The file size and modification date are attached as attributes to the output. This potentially makes it possible to memoise functions based on them.)
If we omit that argument, however, rcdo will save the result into an ephemeral file in a temporary folder.
sh_file <- sh |>
cdo_execute()
sh_file
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ebc7fbd2"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 1580009
This file will be deleted once the variable holding the path is removed and R garbage collects it.
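The "deleted when garbage collected" behaviour can be illustrated with a minimal base R sketch. This is not rcdo's actual implementation: the helpers ephemeral_path() and delete_file() are hypothetical names, and the sketch only assumes that the path carries a handle whose finalizer removes the file.

```r
# Hypothetical helpers, not part of rcdo.
# Finalizer: remove the file recorded in the handle.
delete_file <- function(handle) unlink(handle$path)

ephemeral_path <- function(path) {
  handle <- new.env()
  handle$path <- path
  # Run delete_file() when the handle is garbage collected (or at session exit).
  reg.finalizer(handle, delete_file, onexit = TRUE)
  # The path stays valid for as long as something references the handle.
  attr(path, "ephemeral") <- list(handle)
  path
}

tmp <- tempfile()
writeLines("dummy", tmp)
x <- ephemeral_path(tmp)
file.exists(tmp)   # the file is still there while x is alive

rm(x)              # drop the only reference to the handle
invisible(gc())    # force a collection so the finalizer can run
file.exists(tmp)   # expected to be FALSE once the finalizer has run
```

Under this assumption, any operation that copies the path but drops its attributes also drops the handle, which is why the file can disappear earlier than expected.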
Since sh is not a file, applying another rcdo function will return a
chained set of operations.
sh |>
cdo_sinfo()
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ -sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] ] {{output}}
This is the same as
file |>
cdo_sellonlatbox(lon1 = 0, lon2 = 360, lat1 = -90, lat2 = 0) |>
cdo_sinfo()
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo sinfo [ -sellonlatbox,0,360,-90,0 [ '/home/user1/Documents/r-packages/rcdo/inst/extdata/hgt_ncep.nc' ] ] {{output}}
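To make the idea of lazy, chainable operations concrete, here is a toy base R sketch. It is not how rcdo is implemented; new_op() and render() are hypothetical names, and the bracketed command layout simply mimics the commands printed above.

```r
# Hypothetical constructor for a lazy operation: it only records what to run.
new_op <- function(operator, input, params = NULL) {
  structure(
    list(operator = operator, input = input, params = params),
    class = "toy_op"
  )
}

# Render an operation (or a plain file path) as a CDO-style command fragment.
render <- function(x, top = TRUE) {
  if (!inherits(x, "toy_op")) {
    return(shQuote(x, type = "sh"))  # a plain file path, single-quoted
  }
  op <- paste(c(x$operator, x$params), collapse = ",")
  inner <- render(x$input, top = FALSE)
  # Nested operators are prefixed with "-", as in the printed commands.
  prefix <- if (top) "" else "-"
  paste0(prefix, op, " [ ", inner, " ]")
}

sh_op <- new_op("sellonlatbox", "hgt_ncep.nc", c(0, 360, -90, 0))
chain <- new_op("sinfo", sh_op)
render(chain)
#> [1] "sinfo [ -sellonlatbox,0,360,-90,0 [ 'hgt_ncep.nc' ] ]"
```

Nothing is executed until the rendered string is handed to the shell, which is what makes it cheap to keep adding operators to the chain.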
We can execute the chain and confirm that sh only selects the Southern
Hemisphere.
sh |>
cdo_sinfo() |>
cdo_execute() |>
_[7]
#> [1] " lat : 0 to -90 by -2.5 degrees_north"
It’s more interesting to chain multiple data-manipulating operations. For example, let’s select only the 500hPa level.
sh_500 <- sh |>
cdo_sellevel(500) |>
cdo_execute()
We can confirm that the result only has 1 level.
sh_500 |>
cdo_nlevel() |>
cdo_execute()
#> [1] "1"
Other operators take more than one file as arguments. ymonsub subtracts
two files, matching fields by month of the year. It’s mainly used to
compute monthly anomalies by first computing a monthly climatology with
ymonmean.
climatology <- cdo_ymonmean(file)
anomalies <- cdo_ymonsub(file, climatology) |>
cdo_execute()
Some operators take one file and return an undetermined number of
files. splitmon will return one file per month. Unfortunately rcdo
cannot return the list of files created yet. The returned string is the
base prefix shared by all files.
mon_split <- sh_500 |>
cdo_splitmon() |>
cdo_execute()
mon_split
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a7"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
We can get a list of all files by globbing with an asterisk.
mon_split <- paste0(mon_split, "*") |>
Sys.glob()
mon_split
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a701.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a702.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a703.nc"
#> [4] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a704.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a705.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a706.nc"
#> [7] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a707.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a708.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a709.nc"
#> [10] "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a710.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a711.nc" "/tmp/Rtmp9eAVmx/filee0a5e44a2b5a712.nc"
(Note that this is not entirely reliable since it assumes that there are no other files that share the same prefix.)
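A stricter match can reduce the risk of picking up unrelated files. The sketch below is only an illustration: split_files() is a hypothetical helper, and it assumes the output names are the base path followed by a two-digit month and .nc, as in the listing above. The demonstration uses empty placeholder files rather than real CDO output.

```r
# Hypothetical helper: keep only files named <base><two digits>.nc
split_files <- function(base) {
  candidates <- list.files(dirname(base), full.names = TRUE)
  names <- basename(candidates)
  # Whatever follows the shared base name must be exactly "NN.nc".
  rest <- substring(names, nchar(basename(base)) + 1)
  keep <- startsWith(names, basename(base)) & grepl("^[0-9]{2}\\.nc$", rest)
  sort(candidates[keep])
}

# Self-contained demonstration with empty placeholder files.
dir <- tempfile("demo")
dir.create(dir)
base <- file.path(dir, "out")
invisible(file.create(paste0(base, c("01.nc", "02.nc", "_other.nc"))))

basename(split_files(base))
#> [1] "out01.nc" "out02.nc"
```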
And now we can use the files normally. These will not be automatically deleted by R (although they will eventually be deleted by your OS if they are in the correct temporary folder).
mon_split[1] |>
cdo_sinfo() |>
cdo_execute() |>
_[11:15]
#> [1] " Time coordinate :"
#> [2] " time : 2 steps"
#> [3] " RefTime = 1800-01-01 00:00:00 Units = hours Calendar = standard Bounds = true"
#> [4] " YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss"
#> [5] " 2000-01-01 00:00:00 2001-01-01 00:00:00"
We can use functional programming to apply one or more operations to each file.
mon_split |>
lapply(cdo_deltat)
#> [[1]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a701.nc' ] {{output}}
#>
#> [[2]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a702.nc' ] {{output}}
#>
#> [[3]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a703.nc' ] {{output}}
#>
#> [[4]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a704.nc' ] {{output}}
#>
#> [[5]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a705.nc' ] {{output}}
#>
#> [[6]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a706.nc' ] {{output}}
#>
#> [[7]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a707.nc' ] {{output}}
#>
#> [[8]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a708.nc' ] {{output}}
#>
#> [[9]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a709.nc' ] {{output}}
#>
#> [[10]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a710.nc' ] {{output}}
#>
#> [[11]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a711.nc' ] {{output}}
#>
#> [[12]]
#> CDO command:
#> /home/user1/.local/share/R/rcdo/cdo-2.5.1/bin/cdo deltat [ '/tmp/Rtmp9eAVmx/filee0a5e44a2b5a712.nc' ] {{output}}
To execute a list of operations, use cdo_execute_list().
mon_split |>
lapply(cdo_deltat) |>
cdo_execute_list()
#> [[1]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e2a5212ea"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[2]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7ec35c5"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[3]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e3048379e"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[4]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ef5e0588"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[5]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e10de3ecb"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[6]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e3c103371"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[7]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e16d6377f"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[8]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e56fd5ceb"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[9]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e9ebb1b8"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:32 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[10]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e6b70a80a"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[11]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5ec9c1f12"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[12]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e11dec2fd"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
We could also have executed each operation inside the lapply call.
However, because of the way lapply combines outputs, the ephemeral files
will be deleted. So you need to either use cdo_execute_list() or take
care of explicitly creating temporary files.
mon_split |>
lapply(function(x) cdo_deltat(x) |> cdo_execute(output = tempfile()))
#> [[1]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e17bdb9ec"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[2]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e617c3903"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[3]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e27553394"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[4]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7e6fdf0b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[5]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e437efdf4"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[6]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e280ffaa8"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[7]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e1898f2bc"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[8]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e72aeb36b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[9]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e6cb2b04f"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[10]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e32470d4b"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[11]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e7f5c2b0f"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
#>
#> [[12]]
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e380afe7a"
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 59545
Finally, some operators take a vector of files and return a single
file. We can re-merge the list of files with cdo_mergetime().
merged <- mon_split |>
cdo_mergetime() |>
cdo_execute()
merged |>
cdo_ntime() |>
cdo_execute()
#> [1] "24"
Because rcdo can chain operations, there is no need to execute the
individual operations and then merge the results. cdo_mergetime() can
take a list of operations naturally, so we could do this
mon_split |>
lapply(cdo_deltat) |>
cdo_mergetime() |>
cdo_execute()
#> [1] "/tmp/Rtmp9eAVmx/filee0a5e79085d5b"
#> attr(,"ephemeral")
#> attr(,"ephemeral")[[1]]
#> File will be deleted when garbage collected
#>
#> attr(,"mtime")
#> [1] "2025-05-14 10:38:33 AEST"
#> attr(,"size")
#> [1] 295001
Important limitations
The whole rcdo package and its documentation are built automatically from the CDO source. This comes with some limitations.
CDO operators are documented in “families” where all parameters are
documented together. Currently there is no way for the build process to
correctly attribute each parameter to the correct operator. This
unfortunately means that rcdo functions have every argument from a
particular family. For example, the functions cdo_selindexbox() and
cdo_sellonlatbox() both have lon1 in their signature.
The conversion from rcdo function arguments to CDO command parameters
is pretty dumb. Argument names are only for the user's convenience and
are not really used: function arguments are converted to CDO parameters
simply in the order they are defined in the function signature. Nor are
arguments checked for validity. This means that
cdo_sellonlatbox(file, lon1 = 0) will not return an error even though
the operator is missing other necessary arguments. Also,
cdo_sellonlatbox(file, idx1 = 0) is identical to
cdo_sellonlatbox(file, lon1 = 0).
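The effect of this positional conversion can be reproduced with a toy function. This is a sketch under assumptions, not rcdo code: toy_sellonlatbox() is a hypothetical stand-in, and the order of arguments in its signature is made up for illustration.

```r
# Hypothetical stand-in with every argument of the operator family.
toy_sellonlatbox <- function(lon1 = NULL, lon2 = NULL, lat1 = NULL, lat2 = NULL,
                             idx1 = NULL, idx2 = NULL, idy1 = NULL, idy2 = NULL) {
  # Collect the arguments in signature order and drop the ones left unset.
  args <- list(lon1, lon2, lat1, lat2, idx1, idx2, idy1, idy2)
  args <- Filter(Negate(is.null), args)
  # Names play no role: the values are simply joined with commas.
  paste(c("sellonlatbox", unlist(args)), collapse = ",")
}

toy_sellonlatbox(lon1 = 0)
#> [1] "sellonlatbox,0"
toy_sellonlatbox(idx1 = 0)
#> [1] "sellonlatbox,0"
toy_sellonlatbox(lon1 = 0, lon2 = 360, lat1 = -90, lat2 = 0)
#> [1] "sellonlatbox,0,360,-90,0"
```

Both single-argument calls produce the same incomplete parameter string, and nothing in the toy function complains about it, which mirrors the lack of validation described above.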
Some CDO operators need named parameters. rcdo currently doesn’t know
how to deal with that, so you need to pass the names yourself. So
cdo_select(file, name = "temperature") doesn’t work; you need to write
cdo_select(file, name = "name=temperature"), which is equivalent to
cdo_select(file, code = "name=temperature") due to the previously
described limitation.
Some parameters need to be quoted. For example, the expression in the
expr operator needs to be surrounded by quotes. Again, rcdo doesn’t know
this, so you must “double quote” the argument yourself, i.e.
cdo_expr(file, "'t_celcius=t-273.15'").
Some documentation formatting might be incorrect. If you spot some part of the documentation that didn’t survive the conversion, open an issue!
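For the named-parameter and quoting limitations, small wrappers can keep the string handling out of the way. The helpers below are hypothetical and not part of rcdo; they only build the strings that the calls described above expect, using base R's paste0() and shQuote().

```r
# Hypothetical helpers, not part of rcdo.

# Build a "name=value" string for operators that need named parameters.
named_param <- function(name, value) paste0(name, "=", value)

# Wrap an expression in single quotes so the shell passes it through intact.
quoted_param <- function(x) shQuote(x, type = "sh")

named_param("name", "temperature")
#> [1] "name=temperature"
quoted_param("t_celcius=t-273.15")
#> [1] "'t_celcius=t-273.15'"
```

The results could then be passed like any other argument, for example cdo_select(file, name = named_param("name", "temperature")) or cdo_expr(file, quoted_param("t_celcius=t-273.15")).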