Produces very dissimilar datasets with the same statistical properties.
metamerise( data, preserve, minimize = NULL, change = colnames(data), round = truncate_to(2), stop_if = n_tries(100), keep = NULL, annealing = TRUE, K = 0.02, start_probability = 0.5, perturbation = 0.08, name = "", verbose = interactive() ) metamerize( data, preserve, minimize = NULL, change = colnames(data), round = truncate_to(2), stop_if = n_tries(100), keep = NULL, annealing = TRUE, K = 0.02, start_probability = 0.5, perturbation = 0.08, name = "", verbose = interactive() ) new_metamer(data, preserve, round = truncate_to(2))
A function whose result must be kept exactly the same. Must take the data as argument and return a numeric vector.
An optional function to minimize in the process. Must take the data as argument and return a single numeric.
A character vector with the names of the columns that need to be changed.
A function to apply to the result of
A stopping criterium. See stop-conditions.
Max number of metamers to return.
Logical indicating whether to perform annealing.
speed/quality tradeoff parameter.
initial probability of rejecting bad solutions.
Numeric with the magnitude of the random perturbations.
Can be of length 1 or
Character for naming the metamers.
Logical indicating whether to show a progress bar.
metamer_list object (a list of data.frames).
It follows Matejka & Fitzmaurice (2017) method of constructing metamers.
Beginning from a starting dataset, it iteratively adds a small perturbation,
preserve returns the same value (up to
signif significant digits)
minimize has been lowered, and accepts the solution for the next
TRUE, it also accepts solutions with bigger
minimize with an ever decreasing probability to help the algorithm avoid
The annealing scheme is adapted from de Vicente et al. (2003).
data is a
metamer_list, the function will start the algorithm from the
last metamer of the list. Furthermore, if
are missing, the previous functions will be carried over from the previous call.
minimize can be also a vector of functions. In that case, the process minimizes
the product of the functions applied to the data.
Matejka, J., & Fitzmaurice, G. (2017). Same Stats, Different Graphs. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI ’17, 1290–1294. https://doi.org/10.1145/3025453.3025912 de Vicente, Juan, Juan Lanchares, and Román Hermida. (2003). ‘Placement by Thermodynamic Simulated Annealing’. Physics Letters A 317(5): 415–23.
delayed_with() for a convenient way of making functions suitable for
mean_dist_to() for a convenient way of minimizing the distance
to a known target in
mean_self_proximity() for maximizing the
"self distance" to prevent data clumping.
data(cars) # Metamers of `cars` with the same mean speed and dist, and correlation # between the two. means_and_cor <- delayed_with(mean_speed = mean(speed), mean_dist = mean(dist), cor = cor(speed, dist)) set.seed(42) # for reproducibility. metamers <- metamerize(cars, preserve = means_and_cor, round = truncate_to(2), stop_if = n_tries(1000)) print(metamers)#> List of 92 metamerslast <- tail(metamers) # Confirm that the statistics are the same cbind(original = means_and_cor(cars), metamer = means_and_cor(last))#> original metamer #> mean_speed 15.4000000 15.4024926 #> mean_dist 42.9800000 42.9862221 #> cor 0.8068949 0.8011081points(cars, col = "red")