Overview

surreal implements the “Residual (Sur)Realism” algorithm described by Stefanski (2007). This package allows you to generate datasets that reveal hidden images or messages in their residual plots, providing a novel approach to understanding and illustrating statistical concepts.

Installation

You can install the development version of surreal from GitHub with:

# install.packages("remotes")
remotes::install_github("coatless-rpkg/surreal")

Usage

First, load the package:

library(surreal)

Given the x and y coordinates of an image's pixels, surreal generates a dataset whose residual plot reproduces that image.

Importing Data

As an example, let’s use the built-in R logo dataset:

data("r_logo_image_data", package = "surreal")

plot(r_logo_image_data, pch = 16, main = "Original R Logo Data")

The data is a data frame with one row per pixel and two columns, x and y:

str(r_logo_image_data)
#> 'data.frame':    2000 obs. of  2 variables:
#>  $ x: int  54 55 56 57 58 59 34 35 36 49 ...
#>  $ y: int  -9 -9 -9 -9 -9 -9 -10 -10 -10 -10 ...
summary(r_logo_image_data)
#>        x                y         
#>  Min.   :  5.00   Min.   :-75.00  
#>  1st Qu.: 32.00   1st Qu.:-57.00  
#>  Median : 57.00   Median :-39.00  
#>  Mean   : 55.29   Mean   :-40.48  
#>  3rd Qu.: 77.00   3rd Qu.:-24.00  
#>  Max.   :100.00   Max.   : -9.00
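
To embed your own image, you need a data frame with the same shape. The following is a minimal base R sketch that builds one from a binary pixel matrix. The matrix `img` and the name `custom_image_data` are hypothetical, and negating the row index so the image plots upright is an assumption based on the negative y values in the built-in dataset, not a documented requirement.

```r
# Hypothetical 5 x 5 binary image: TRUE marks a dark pixel (a plus sign)
img <- matrix(FALSE, nrow = 5, ncol = 5)
img[3, ] <- TRUE
img[, 3] <- TRUE

# Row and column indices of the dark pixels
idx <- which(img, arr.ind = TRUE)

# Columns become x; rows are negated so that row 1 plots at the top
custom_image_data <- data.frame(x = idx[, "col"], y = -idx[, "row"])

str(custom_image_data)
plot(custom_image_data, pch = 16, main = "Custom Image Data")
```

A data frame like this could presumably be passed to surreal() in place of r_logo_image_data, assuming the function accepts any two-column x/y data frame.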

Applying the Surreal Method

Now, let’s apply the surreal method:

set.seed(114)
transformed_data <- surreal(r_logo_image_data)

The transformation returns a response y and a set of predictors that show no obvious pattern in pairwise plots:

pairs(y ~ ., data = transformed_data, main = "Data After Transformation")

Revealing the Hidden Image

Fit a linear model to the transformed data and plot the residuals:

model <- lm(y ~ ., data = transformed_data)
plot(fitted(model), resid(model), pch = 16, 
     main = "Residual Plot: Hidden R Logo Revealed")

The residual plot reveals the original R logo, framed by a slight border that enhances image recovery.
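
To see why a residual plot can carry an image at all, here is a deliberately simplified one-predictor sketch in base R. It is not Stefanski's algorithm and not what surreal() does internally: it only illustrates that if the target residuals are orthogonal to the intercept and the predictor, least squares returns them exactly. The circle "image" and the coefficients 2 and 3 are arbitrary choices for illustration.

```r
# Toy image: points on a circle (hypothetical stand-in for pixel coordinates)
theta <- seq(0, 2 * pi, length.out = 200)
img_x <- cos(theta)
img_y <- sin(theta)

# Make the target residuals orthogonal to the intercept and to img_x.
# This step can shear the image slightly; for the circle the effect is tiny.
r <- resid(lm(img_y ~ img_x))

# Build a response whose least-squares residuals are exactly r
x <- img_x
y <- 2 + 3 * x + r

fit <- lm(y ~ x)

# The fitted model recovers r as its residuals (up to floating-point error)
stopifnot(isTRUE(all.equal(unname(resid(fit)), unname(r))))

plot(fitted(fit), resid(fit), pch = 16,
     main = "Toy Example: Circle in Residuals")
```

With a single predictor the image is also plainly visible in a scatterplot of y against x. Spreading the structure across several predictors, as the pairs plot above shows for the transformed data, is what keeps it hidden until the model is fitted.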

Creating Custom Hidden Images

[!IMPORTANT]

The function used below, surreal_text(), does not work on Windows because the version of Ghostscript included with R does not support the ppm type.

You can also create datasets with custom hidden images or text. Here’s a quick example using text:

text_data <- surreal_text("R\nis\nawesome!")
model <- lm(y ~ ., data = text_data)
plot(fitted(model), resid(model), pch = 16, main = "Custom Text in Residuals")

References

Stefanski, L. A. (2007). “Residual (Sur)realism”. The American Statistician, 61(2), 163-177. doi:10.1198/000313007X190079

Acknowledgements

This package builds upon the work of John Staudenmayer, Peter Wolf, and Ulrike Gromping, who initially brought these algorithms to R.