The metamer package implements Matejka and Fitzmaurice’s (2017) algorithm for generating datasets with distinct appearances but identical statistical properties. I propose to call them “metamers”, by analogy with the colorimetry concept.
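To give a flavour of the algorithm (this is a toy sketch of the idea, not the package’s actual code; every name below is invented for the example), a metamer search repeatedly perturbs the data by tiny amounts and keeps only the perturbations that leave the chosen statistics essentially unchanged:

```r
# Toy sketch of the Matejka & Fitzmaurice idea: jitter one point at a time and
# accept the change only if the preserved statistics stay (almost) the same.
make_metamer <- function(data, stats_fun, n_iter = 10000, jitter = 0.01, tol = 1e-3) {
  target <- stats_fun(data)
  for (i in seq_len(n_iter)) {
    candidate <- data
    row <- sample(nrow(data), 1)
    candidate$y[row] <- candidate$y[row] + rnorm(1, sd = jitter)
    # Keep the perturbation only if every preserved statistic moved by less than `tol`
    if (all(abs(stats_fun(candidate) - target) < tol)) {
      data <- candidate
    }
  }
  data
}

# Preserve means, standard deviations and the correlation
stats_fun <- function(d) c(mean(d$x), mean(d$y), sd(d$x), sd(d$y), cor(d$x, d$y))
metamer <- make_metamer(data.frame(x = rnorm(100), y = rnorm(100)), stats_fun)
```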
Today I’m extremely happy because I’ve finally been able to fulfil a dream of mine. And yes, by the end of this blogpost you might be worried about me for having such a weird, niche and, frankly, dumb dream, but I swear I’m fine!
My dream was to create an R wrapper for the Climate Data Operators (CDO) automatically from its documentation.
CDO is a command-line utility that provides a plethora of functions commonly needed in climate science.
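Even without a wrapper, CDO can be driven from R by shelling out to the command line. A minimal, hand-rolled illustration (the grid and file names are made up) might look like this:

```r
# Bilinear remapping of a NetCDF file onto a regular 1-degree global grid with CDO.
# "remapbil" is a standard CDO operator; the input and output file names are examples.
system2("cdo", args = c("remapbil,r360x180", "input.nc", "regular_grid.nc"))
```

The dream, of course, is to generate these calls automatically from CDO’s documentation instead of writing them by hand.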
Say you have data measured at different weather stations, which in Argentina might look something like this:

```r
estaciones[data, on = c("nombre" = "station")] |>
  ggplot(aes(lon, lat)) +
  geom_point(aes(color = t)) +
  geom_sf(data = argentina_provincias, inherit.aes = FALSE, fill = NA) +
  scale_color_viridis_c()
```

Because this is not a regular grid, it’s not possible to visualise this data with contours as is. Instead, it’s necessary to interpolate it into a regular grid.
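One way to do that interpolation (sketched here with the akima package, which isn’t necessarily what the post ends up using; `stations` is just a name for the joined data) is to interpolate the station values onto a regular longitude-latitude grid and then draw contours:

```r
library(akima)     # assumed here for the interpolation; other packages would work too
library(ggplot2)

# Join as before, then interpolate the irregular station values onto a regular grid
stations <- estaciones[data, on = c("nombre" = "station")]
regular <- with(stations, akima::interp(
  x = lon, y = lat, z = t,
  xo = seq(min(lon), max(lon), length.out = 100),
  yo = seq(min(lat), max(lat), length.out = 100)
))

# Reshape the interpolated matrix into a long data frame that ggplot2 understands
regular_df <- expand.grid(lon = regular$x, lat = regular$y)
regular_df$t <- as.vector(regular$z)

ggplot(regular_df, aes(lon, lat, z = t)) +
  geom_contour_filled()
```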
One of the recurring debates in some spaces of the R community is about dependencies. After a few posts on Mastodon I wanted to capture my opinions on the subject, to help me understand them better and because long-form articles are much better suited to discussing contentious topics than short-burst posts.
Dependencies are invitations for other people to collaborate with you

Many thinkers have marvelled at the magic inside books.
For a while I’ve wanted to write a post compiling some of the tricks I’ve learnt over the years of using rmarkdown. I also wanted other people’s input, so I asked for suggestions on Mastodon. Here are the 11 tips I decided to include, in no particular order.
Make chunk options non-optional

I use this trick to force myself to write captions for all figures:

```r
knit_plot <- knitr::knit_hooks$get("plot")

knitr::knit_hooks$set(plot = function(x, options) {
  # Assumed completion: refuse to knit a figure chunk with no fig.cap set
  if (is.null(options$fig.cap)) {
    stop("All figures need a caption.")
  }
  knit_plot(x, options)
})
```
An important part of a scientific project, such as a journal paper or a PhD thesis, is accessing datasets. To keep things reproducible, datasets should be accessible, either provided in the repository itself or hosted in a remote location. Also for reproducibility, it’s important to be able to check that the data you get is the same as the data you expect.
I wanted to share my technique for downloading and accessing datasets that strives for maximum reproducibility and user-friendliness.
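The core of the idea can be sketched with base R alone (the function name and arguments below are invented for this illustration): download the file only if it isn’t already there, then verify its checksum against a known value.

```r
# Download a dataset if it isn't already present and check that its MD5 hash
# matches the expected value, so corrupted or stale copies are caught early.
get_dataset <- function(url, destfile, expected_md5) {
  if (!file.exists(destfile)) {
    download.file(url, destfile, mode = "wb")
  }
  actual_md5 <- unname(tools::md5sum(destfile))
  if (actual_md5 != expected_md5) {
    stop("Checksum mismatch for ", destfile,
         ": expected ", expected_md5, " but got ", actual_md5)
  }
  invisible(destfile)
}
```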
ChatGPT seems to be taking the world by storm. This version of the GPT-3 language model, somehow optimised for chat, dominates my Mastodon feed and has inspired countless articles and discussions.

A decent chunk of the discourse has been about how the model’s outputs sound very plausible and even authoritative but lack any connection with reality, because the model is trained to mimic language, not to tell the truth.
I started using R full time for my research about 5 years ago, when I began working on my Master’s thesis, and up until today there was one thing missing: proper contour labels. Now, thanks to the wonderful isoband package, I finally got what I wished for, and it’s bundled in the latest release of metR.
So let’s set the stage for the problem. I have a 2D field that I want to visualise as a contour map.
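As a minimal illustration of labelled contours with metR (using an invented smooth field, and not necessarily the exact feature the post goes on to showcase), geom_text_contour() can place labels on the contour lines drawn by geom_contour2():

```r
library(ggplot2)
library(metR)

# An invented smooth 2D field on a regular grid, just to have something to contour
field <- expand.grid(x = seq(0, 10, by = 0.1), y = seq(0, 10, by = 0.1))
field$z <- with(field, sin(x) * cos(y))

ggplot(field, aes(x, y, z = z)) +
  geom_contour2() +
  geom_text_contour()
```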
For my PhD I’m currently writing a paper using rmarkdown. Since I care about reproducibility, I’m using renv to register the versions of the R packages I use and to manage a local library that doesn’t affect the rest of my system. With that, anyone who wants to reproduce my work could download all the code, run renv::restore() and have an R environment very similar to the one I use.
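For reference, the basic renv workflow that makes this possible looks roughly like this (run from the project root):

```r
# One-time setup: create a project-local library and an renv.lock lockfile
renv::init()

# After installing or updating packages, record their exact versions in the lockfile
renv::snapshot()

# Anyone reproducing the work restores that exact set of packages
renv::restore()
```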
The stop() function allows you to terminate the execution of a function if there is a fatal problem.
For example, imagine this code that calculates the square root of a number, but only if the input number is not negative.
```r
real_root <- function(x) {
  if (x < 0) {
    stop("'x' cannot be negative.")
  }
  sqrt(x)
}

real_root(2)
## [1] 1.414214

real_root(-2)
## Error in real_root(-2): 'x' cannot be negative.
```

If x is negative, the function throws an error.
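Such errors don’t have to kill the whole computation, though: a caller can handle them, for example with tryCatch(). This is just a quick follow-up illustration, not part of the original example:

```r
# Catch the error thrown by real_root() and fall back to NA instead of aborting
safe_root <- function(x) {
  tryCatch(real_root(x), error = function(e) {
    message("Could not compute the root: ", conditionMessage(e))
    NA_real_
  })
}

safe_root(-2)
## Could not compute the root: 'x' cannot be negative.
## [1] NA
```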
R 4.1.0 is out! And if version 4.0.0 made history with the revolutionary change of stringsAsFactors = FALSE, the big splashy news in this new version is the implementation of a native pipe.
The new pipe

The “pipe” is one of the most distinctive qualities of tidyverse/dplyr code. I’m sure you’ve used or seen something like this:
```r
library(dplyr)

mtcars %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))
## # A tibble: 3 x 2
##     cyl   mpg
##   <dbl> <dbl>
## 1     4  26.7
## 2     6  19.7
## 3     8  15.1
```
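For comparison, the same pipeline written with the new native pipe looks almost identical:

```r
# Same summary, but using R >= 4.1.0's native |> instead of magrittr's %>%
mtcars |>
  group_by(cyl) |>
  summarise(mpg = mean(mpg))
```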