This is a demonstration of the ggiraph package’s interactive visualisation capability. Also, since patchwork was recently put on CRAN, “side-by-side interactive plotting” just got easier!
I will perform the PCA, tSNE and UMAP dimensionality reduction on the iris data.
Just so you know that I know:
Yes, iris data is overused, but hey! It is only a quick coding demo!
Yes, I know about plotly. It also has a side-by-side option, but it gets very painful when there are multiple plotting aesthetics.
library(ggiraph)
library(patchwork)
library(tidyverse)
library(Rtsne)
library(umap)
## We:
## + convert the iris data to a tibble
## + remove one duplicated row (Rtsne stops with an error on duplicated rows)
## + add a unique identifier to each row
iris_tbl = iris %>%
  as_tibble() %>%
  distinct() %>%
  tibble::rownames_to_column("id")
iris_tbl
## # A tibble: 149 x 6
##    id    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##    <chr>        <dbl>       <dbl>        <dbl>       <dbl> <fct>
##  1 1              5.1         3.5          1.4         0.2 setosa
##  2 2              4.9         3            1.4         0.2 setosa
##  3 3              4.7         3.2          1.3         0.2 setosa
##  4 4              4.6         3.1          1.5         0.2 setosa
##  5 5              5           3.6          1.4         0.2 setosa
##  6 6              5.4         3.9          1.7         0.4 setosa
##  7 7              4.6         3.4          1.4         0.3 setosa
##  8 8              5           3.4          1.5         0.2 setosa
##  9 9              4.4         2.9          1.4         0.2 setosa
## 10 10             4.9         3.1          1.5         0.1 setosa
## # … with 139 more rows
## Computing the PCA, TSNE and UMAP objects
iris_pca = prcomp(iris_tbl %>% select_if(is.numeric))
iris_tsne = Rtsne(iris_tbl %>% select_if(is.numeric))
iris_umap = umap(iris_tbl %>% select_if(is.numeric))
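One caveat about the code above: tSNE and UMAP are both stochastic, so the embeddings (and the numbers printed later in this post) will differ from run to run. Below is a minimal sketch of a seeded variant. It assumes that Rtsne draws from R's global random number generator, so that set.seed() applies, and that the umap package reads a random_state entry from its umap.defaults configuration. The names numeric_mat, umap_cfg and the *_seeded objects, as well as the seed value 2019, are mine and not part of the original analysis.

```r
library(Rtsne)
library(umap)

# Numeric columns only, with duplicated rows dropped
# (Rtsne refuses duplicated rows by default)
numeric_mat <- unique(as.matrix(iris[, 1:4]))

# Assumption: Rtsne uses R's global RNG, so set.seed() fixes its result
set.seed(2019)  # arbitrary seed chosen for illustration
iris_tsne_seeded <- Rtsne(numeric_mat)

# Assumption: the default (naive) umap implementation honours config$random_state
umap_cfg <- umap.defaults
umap_cfg$random_state <- 2019
iris_umap_seeded <- umap(numeric_mat, config = umap_cfg)

# Both embeddings have one row per observation and two columns
dim(iris_tsne_seeded$Y)
dim(iris_umap_seeded$layout)
```

PCA needs no seed because prcomp is deterministic.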
The main function we will use is geom_point_interactive from the ggiraph package. It behaves just like geom_point, but with two additional aesthetics. The data_id aesthetic is critical: it links observations between plots, so points that share a data_id are highlighted together. The tooltip aesthetic is optional, but nice to have when you mouse over a point. After making these plots, we pass them to the girafe function, combined with the same syntax as in patchwork, to get a pretty interactive plot!
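To make the two aesthetics concrete, here is a small self-contained sketch on the raw iris measurements, without any dimensionality reduction. The demo_df and p_demo names and the label column are made up for illustration. The point is that tooltip can be any character column, so it can show more than a bare row id, while data_id stays a unique key per observation.

```r
library(ggplot2)
library(ggiraph)

# Hypothetical demo data: iris plus a unique id and a multi-line tooltip label
demo_df <- iris
demo_df$id <- as.character(seq_len(nrow(demo_df)))
demo_df$label <- paste0("Row ", demo_df$id, "\nSpecies: ", demo_df$Species)

p_demo <- ggplot(demo_df,
                 aes(x = Sepal.Length, y = Sepal.Width,
                     colour = Species,
                     tooltip = label,   # text shown on mouse over
                     data_id = id)) +   # key used to identify a point
  geom_point_interactive()

# ggobj is an alternative to code = print(...) for a single ggplot object
girafe(ggobj = p_demo)
```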
## Getting two reduced dimensions for each method
iris_plotdf = iris_tbl %>%
  mutate(
    pca_1 = iris_pca$x[,1],
    pca_2 = iris_pca$x[,2],
    tsne_1 = iris_tsne$Y[,1],
    tsne_2 = iris_tsne$Y[,2],
    umap_1 = iris_umap$layout[,1],
    umap_2 = iris_umap$layout[,2]
  )
p1 = iris_plotdf %>%
  ggplot(aes(x = pca_1, y = pca_2,
             colour = Species,
             tooltip = id,
             data_id = id)) +
  geom_point_interactive() +
  labs(title = "PCA")
p2 = iris_plotdf %>%
  ggplot(aes(x = tsne_1, y = tsne_2,
             colour = Species,
             tooltip = id,
             data_id = id)) +
  geom_point_interactive() +
  labs(title = "tSNE")
p3 = iris_plotdf %>%
  ggplot(aes(x = umap_1, y = umap_2,
             colour = Species,
             tooltip = id,
             data_id = id)) +
  geom_point_interactive() +
  labs(title = "UMAP")
p1 + p2 + p3
At this point, p1 + p2 + p3 is combined into a single static plot thanks to patchwork. But making it interactive takes just one line.
girafe(code = print(p1 + p2 + p3),
       width_svg = 15, height_svg = 5)
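If the default hover highlight is hard to spot across panels, girafe also takes a list of options. The sketch below is a hedged, self-contained variant: it uses only PCA (so it runs without Rtsne or umap), links two panels through data_id, and restyles hovered points with opts_hover(). It assumes your ggiraph version provides opts_hover() and the options argument of girafe(); the object names and the CSS string are my own choices.

```r
library(ggplot2)
library(ggiraph)
library(patchwork)

# Hypothetical demo data: first three principal components of iris
pca <- prcomp(iris[, 1:4])
demo_df <- data.frame(
  id = as.character(seq_len(nrow(iris))),
  Species = iris$Species,
  pc1 = pca$x[, 1],
  pc2 = pca$x[, 2],
  pc3 = pca$x[, 3]
)

q1 <- ggplot(demo_df, aes(x = pc1, y = pc2, colour = Species,
                          tooltip = id, data_id = id)) +
  geom_point_interactive()
q2 <- ggplot(demo_df, aes(x = pc1, y = pc3, colour = Species,
                          tooltip = id, data_id = id)) +
  geom_point_interactive()

# The same id appears in both panels, so hovering a point in one panel
# applies the hover CSS to its counterpart in the other panel as well
linked <- girafe(code = print(q1 + q2),
                 width_svg = 10, height_svg = 5,
                 options = list(opts_hover(css = "fill:black;stroke:black;")))
linked
```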
For the tidyverse fanatics (myself included), seeing three plots with near-identical structure will make you cringe a little bit. Here is another way to make a similar plot by pivoting.
iris_long_plotdf = iris_plotdf %>%
  tidyr::pivot_longer(cols = matches("1|2"),
                      names_to = c("method", "dim"),
                      names_sep = "_") %>%
  tidyr::pivot_wider(names_from = dim,
                     values_from = value)
iris_long_plotdf
## # A tibble: 447 x 9
##    id    Sepal.Length Sepal.Width Petal.Length Petal.Width Species method   `1`
##    <chr>        <dbl>       <dbl>        <dbl>       <dbl> <fct>   <chr>  <dbl>
##  1 1              5.1         3.5          1.4         0.2 setosa  pca    -2.67
##  2 1              5.1         3.5          1.4         0.2 setosa  tsne   13.6
##  3 1              5.1         3.5          1.4         0.2 setosa  umap   13.9
##  4 2              4.9         3            1.4         0.2 setosa  pca    -2.70
##  5 2              4.9         3            1.4         0.2 setosa  tsne   16.1
##  6 2              4.9         3            1.4         0.2 setosa  umap   12.0
##  7 3              4.7         3.2          1.3         0.2 setosa  pca    -2.88
##  8 3              4.7         3.2          1.3         0.2 setosa  tsne   16.1
##  9 3              4.7         3.2          1.3         0.2 setosa  umap   11.9
## 10 4              4.6         3.1          1.5         0.2 setosa  pca    -2.74
## # … with 437 more rows, and 1 more variable: `2` <dbl>
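If the two-step pivot looks opaque, a toy version with two rows may help. This is only a sketch: the toy tibble and its values are invented, and it uses names_prefix so that the new columns are called dim1 and dim2 rather than `1` and `2`, which is a deviation from the code above.

```r
library(tibble)
library(tidyr)

# Invented toy data with the same "<method>_<dim>" column naming scheme
toy <- tibble(
  id = c("a", "b"),
  pca_1 = c(1, 2), pca_2 = c(3, 4),
  tsne_1 = c(5, 6), tsne_2 = c(7, 8)
)

# Step 1: one row per (id, method, dim); the column name is split on "_"
toy_long <- pivot_longer(toy, cols = -id,
                         names_to = c("method", "dim"),
                         names_sep = "_")

# Step 2: spread the dim values back out, so each (id, method) pair
# ends up with one column per dimension
toy_wide <- pivot_wider(toy_long,
                        names_from = dim,
                        values_from = value,
                        names_prefix = "dim")
toy_wide
```

Two ids times two methods gives four rows, each holding both coordinates for one method, which is the shape a faceted plot needs.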
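p4 = iris_long_plotdf %>%
  ggplot(aes(x = `1`, y = `2`,
             colour = Species,
             tooltip = id,
             data_id = id)) +
  geom_point_interactive() +
  ## facet_wrap() frees both axes per panel; facet_grid() with a single
  ## row would force all three methods onto one shared y axis
  facet_wrap(~method, scales = "free")
girafe(code = print(p4),
       width_svg = 10, height_svg = 5)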
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] gdtools_0.2.1 umap_0.2.3.1 Rtsne_0.15 forcats_0.4.0
## [5] stringr_1.4.0 dplyr_0.8.3 purrr_0.3.3 readr_1.3.1
## [9] tidyr_1.0.0 tibble_2.1.3 ggplot2_3.2.1 tidyverse_1.3.0
## [13] patchwork_1.0.0 ggiraph_0.7.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.3 lubridate_1.7.4 lattice_0.20-38 utf8_1.1.4
## [5] assertthat_0.2.1 zeallot_0.1.0 digest_0.6.23 RSpectra_0.15-0
## [9] plyr_1.8.4 R6_2.4.1 cellranger_1.1.0 backports_1.1.5
## [13] reprex_0.3.0 evaluate_0.14 httr_1.4.1 pillar_1.4.2
## [17] rlang_0.4.2 lazyeval_0.2.2 uuid_0.1-2 readxl_1.3.1
## [21] rstudioapi_0.10 Matrix_1.2-18 reticulate_1.13 rmarkdown_1.18
## [25] labeling_0.3 htmlwidgets_1.5.1 munsell_0.5.0 broom_0.5.2
## [29] compiler_3.6.1 modelr_0.1.5 xfun_0.11 pkgconfig_2.0.3
## [33] askpass_1.1 systemfonts_0.1.1 base64enc_0.1-3 htmltools_0.4.0
## [37] openssl_1.4.1 tidyselect_0.2.5 fansi_0.4.0 crayon_1.3.4
## [41] dbplyr_1.4.2 withr_2.1.2 grid_3.6.1 nlme_3.1-142
## [45] jsonlite_1.6 gtable_0.3.0 lifecycle_0.1.0 DBI_1.0.0
## [49] magrittr_1.5 scales_1.1.0 cli_1.1.0 stringi_1.4.3
## [53] reshape2_1.4.3 farver_2.0.1 fs_1.3.1 xml2_1.2.2
## [57] vctrs_0.2.0 generics_0.0.2 tools_3.6.1 glue_1.3.1
## [61] hms_0.5.2 yaml_2.2.0 colorspace_1.4-1 rvest_0.3.5
## [65] knitr_1.26 haven_2.2.0