install.packages("tidyverse") # packages for data wrangling and plotting including dplyr, readr, tidyr, stringr, readxl, ggplot2, and others
install.packages("cowplot")
The basics of ggplot2
Workshop materials
Workshop materials can be found at https://github.com/vancleve/data2design_uk.
Download R and RStudio
If you haven’t already downloaded and installed R and RStudio, please do so now.
R
RStudio
Getting a Project going in RStudio
RStudio cheatsheet
Create a new project in RStudio
Using a “project” allows us to keep all our files in one place and helps RStudio understand which files belong to which project
- Go to File -> New Project -> New Directory -> New Project
- Select an easy to find directory
- Name your project something descriptive like “data2design_ggplot_tutorial”.
- Save this qmd file into your project directory. Go to https://github.com/vancleve/data2design_uk/raw/refs/heads/main/ggplot_tutorial.qmd and choose Save As... in your browser.
Packages
Packages are groups of functions, or “actions”, that allow us to do special tasks. People write packages to help others with reproducibility of analyses.
We will be working with the plotting package ggplot2 and a set of packages for data wrangling called tidyverse (https://www.tidyverse.org/). By installing just tidyverse, we can get all those packages at once. We’ll also install a package, cowplot, for helping with finessing and saving our plots.
Once installed, packages must be loaded into our “environment” (it is like bringing a pen if you are planning to write, not only buying it!). To load them we use the function library().
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
library(cowplot)
Attaching package: 'cowplot'
The following object is masked from 'package:lubridate':
stamp
Reading data
The function read_excel from the package readxl allows us to read Excel files and read_csv allows us to read csv (comma separated value) files. We will first load a dataset from Onstein et al. (2019)1 on frugivory-related traits and dispersal in the custard apple family. In the code below, we’ll load the data directly from Dryad (https://datadryad.org/dataset/doi:10.5061/dryad.2hd8b0s), which is a repository for archiving datasets, by first downloading the xlsx file and then loading it with read_excel. We’ll also simplify some of the column names for clarity and turn some of the columns from log values into normal values.
tf = tempfile(fileext = ".xlsx")
curl::curl_download("https://datadryad.org/downloads/file_stream/82494", tf)
fruits <-
  read_excel(tf, sheet = "Matrix for analysis", na = "NA") |>
  rename(Taxon_tree = Species_tree, Taxon = Species_PROTEUS) |>
  mutate(across(starts_with("Log_"), exp, .names = "Exp_{col}")) |>
  rename_with(\(x) gsub("Exp_Log_", "", x, fixed = TRUE))
## Check what is inside the dataset, was it read fine?
fruits
# A tibble: 228 × 27
Taxon_tree Taxon Log_Fruit_length_avg Log_Fruit_width_avg
<chr> <chr> <dbl> <dbl>
1 Alphonsea_boniana_JB57 Alphon… NA NA
2 Alphonsea_elliptica_JB34 Alphon… 0.763 0.580
3 Alphonsea_kinabaluensis_JB85 Alphon… 0.415 0.279
4 Ambavia_gerrardii_PB Ambavi… 0.301 0.114
5 Anaxagorea_javanica_JB Anaxag… 0.544 0.544
6 Anaxagorea_luzonensis_JB Anaxag… 0.415 -0.222
7 Anaxagorea_phaeocarpa_0498 Anaxag… 0.505 0.301
8 Anaxagorea_silvatica_0113 Anaxag… 0.176 0.0414
9 Annickia_chlorantha_0976 Annick… 0.114 -0.155
10 Annickia_kummeriae_MWC7004 Annick… 0.342 0.0414
# ℹ 218 more rows
# ℹ 23 more variables: Log_Fruit_number_avg <dbl>, Log_Stipe_length_avg <dbl>,
# Log_Seed_length_avg <dbl>, Log_Seed_width_avg <dbl>,
# Log_Seed_number_avg <dbl>, Log_Height_max <dbl>, Bright <chr>,
# Pubescent <chr>, Defence <chr>, Fruit_type <chr>, Cauliflory <chr>,
# Moniliform <chr>, Dehiscence <chr>, Shrub <chr>, Conspicuousness <chr>,
# Fruit_length_avg <dbl>, Fruit_width_avg <dbl>, Fruit_number_avg <dbl>, …
There is a lot going on in the data wrangling above that we won’t have time to cover today, but one thing to point out now is the use of the pipe operator, |>, above. The pipe operator helps us string together data wrangling commands; what’s on the left hand side of the pipe goes into the first argument of what’s on the right hand side. This means the above code starts with the read_excel function, which returns the raw table. That table is given to the rename function, which renames a couple of the columns and returns a modified table that is then passed to mutate, and so on; the final result is saved in the fruits variable.
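As a small illustration of the pipe, separate from the workshop data, the sketch below uses a made-up numeric vector to show that a piped chain and the equivalent nested call give the same result. The vector and the variable names are assumptions for illustration only.

```r
# Toy vector (hypothetical values, not from the fruits data)
x <- c(1, 4, 9)

# Nested call: read from the inside out
nested <- sum(sqrt(x))

# Piped call: read left to right; x becomes the first argument of sqrt(),
# and that result becomes the first argument of sum()
piped <- x |> sqrt() |> sum()

nested
piped
identical(nested, piped)
```

The piped version reads in the same order as the steps happen, which is why it is handy for longer wrangling chains like the one above.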
Plotting and understanding our data
Basics of ggplot2: bar plots
The ggplot2 package is built on the idea that graphics can have a “grammar”, or set of rules, that specifies how they can and should be constructed. Implementing these rules not only makes creating graphics easier, but it also makes such graphics consistent and clear. Hadley Wickham, the creator of ggplot2, borrows this idea from the book “The Grammar of Graphics” by Wilkinson, Anand, and Grossman (2005)2. While this structure may seem a bit artificial at first, it makes creating graphics very modular and building up complex graphics much easier.
Our first plot will be one of the simplest, which is a bar plot. Let’s plot the number of species by their Fruit_type.
ggplot(fruits) +
geom_bar(aes(x = Fruit_type))
The above ggplot command has two pieces. The first is a call to ggplot with the name of the data table. This command by itself creates a blank canvas onto which we can plot using data from the data table. The second piece is to “add” a layer to this canvas with +, and that layer is a bar plot. The geom part is short for geometry since all the graphical elements of different plot types are made up of geometric pieces. In the argument to geom_bar, we give an “aesthetic mapping” with aes(x = Fruit_type), which says we want the x-axis to map to the Fruit_type column of our data. Finally, geom_bar automatically builds bars whose height is given by the number of rows in the data table with each x value.
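To make the counting step concrete, here is a minimal sketch on a made-up table (the values "a" and "b" are hypothetical, not real Fruit_type values). It compares a hand count with the bar heights that ggplot2 computes, which can be read back with layer_data().

```r
library(ggplot2)

# Toy table (hypothetical values, not the real fruits data)
toy <- data.frame(Fruit_type = c("a", "a", "a", "b", "b"))

# Count the rows for each x value by hand
table(toy$Fruit_type)

# geom_bar() does the same counting before drawing the bars
p <- ggplot(toy) +
  geom_bar(aes(x = Fruit_type))

# layer_data() returns the values ggplot2 computed for a layer;
# the count column holds the bar heights
bar_heights <- layer_data(p)$count
bar_heights
```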
This plot is a bit boring but we can spice it up with another aesthetic. Let’s try the color of the bar, which is known as the “fill”. We’ll set the fill to the Cauliflory variable.
ggplot(fruits) +
geom_bar(aes(x = Fruit_type, fill = Cauliflory))
ggplot is smart here and automatically adds a legend for the fill aesthetic since you otherwise wouldn’t know which color mapped to which value of Cauliflory. In fact, ggplot also automatically added the tick labels for the x-axis aesthetic for the same reason.
Grids of plots
Looking at our dataset, there are a few other fruit variables we can look at like Dehiscence and Moniliform. What if we wanted to see how the number of each Fruit_type varies as a function of different values of Dehiscence and Moniliform? One way to visualize this would be to replicate the bar plot above for each combination of Dehiscence and Moniliform. It turns out that ggplot lets us do this very easily by adding a facet_wrap layer onto our bar plot.
ggplot(fruits) +
geom_bar(aes(x = Fruit_type, fill = Cauliflory)) +
facet_wrap(vars(Dehiscence, Moniliform))
What can we observe from this?
Notice that the y-axis scales are all the same. This is on purpose and allows us to compare the data across the panels of the plot. However, we miss the variation in the panels with fewer observations. To allow the y-axis scales to adjust in each panel, we can add scales = "free_y" to facet_wrap.
ggplot(fruits) +
geom_bar(aes(x = Fruit_type, fill = Cauliflory)) +
facet_wrap(vars(Dehiscence, Moniliform), scales = "free_y")
Scatter plots
When do we use scatter plots? Scatter plots are great for looking at how variables are correlated to one another and for seeing the full distribution of the data since you typically plot every single point.
Let’s make our first basic scatter plot. Notice that we put the aes command as an argument to the ggplot command. This works just as well as putting it in the geom_point command except that we can add on additional geometries without having to repeat the aes command.
ggplot(fruits, aes(x=Fruit_length_avg, y=Fruit_width_avg)) +
geom_point()
Warning: Removed 24 rows containing missing values or values outside the scale range
(`geom_point()`).
What do we notice?
Now let’s add a regression (linear model) line through the points.
ggplot(fruits, aes(x = Fruit_length_avg, y = Fruit_width_avg)) +
geom_point() +
geom_smooth(method="lm")
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 24 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 24 rows containing missing values or values outside the scale range
(`geom_point()`).
The geom_smooth function is what adds the linear regression line. Note here we have to add method="lm" to tell geom_smooth to add a straight line (i.e., linear); otherwise, it defaults to a more complex method that fits a smooth curve.
ggplot(fruits, aes(x = Fruit_length_avg, y = Fruit_width_avg)) +
geom_point() +
geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Warning: Removed 24 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 24 rows containing missing values or values outside the scale range
(`geom_point()`).
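The straight line from method = "lm" is an ordinary linear model fit of y on x. As a minimal sketch, assuming a made-up table whose points lie exactly on a known line, we can fit the same model with lm() and read off its intercept and slope. The toy data and names are illustrative only.

```r
library(ggplot2)

# Toy data lying exactly on the line y = 1 + 2x (hypothetical values)
toy <- data.frame(x = 1:5, y = 1 + 2 * (1:5))

# Fit the linear model directly; geom_smooth(method = "lm") fits y ~ x too
fit <- lm(y ~ x, data = toy)
coef(fit)  # intercept and slope

# Draw the points and the fitted line together
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```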
By adding a color aesthetic, the points are colored based on the value of the variable we map to color. Here, let’s map Shrub to color and keep the linear regression.
fruits_plt <- ggplot(fruits, aes(x = Fruit_length_avg, y = Fruit_width_avg, color = Shrub)) +
  geom_point() +
  geom_smooth(method = lm, se = FALSE, fullrange = TRUE)
fruits_plt
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 24 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 24 rows containing missing values or values outside the scale range
(`geom_point()`).
Are the groups different?
Volcano plots
Scatter plots can be used to visualize all kinds of data and are especially popular in genomics. For example, differential expression analyses using RNA-seq are very common and a “volcano plot” is often used to show which genes have increased and which have decreased expression in an experiment. We’ll use a dataset common in RNA-seq differential expression tutorials3, airway, which comes from an RNA-Seq experiment on four human airway smooth muscle cell lines treated with dexamethasone4. The data from airway can be loaded from the workshop GitHub site.
airway_de = read_csv("https://raw.githubusercontent.com/vancleve/data2design_uk/refs/heads/main/airway_de_transcript.csv")
Rows: 15926 Columns: 19
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (7): .feature, albut, gene_id, gene_name, gene_biotype, seq_name, symbol
dbl (8): gene_seq_start, gene_seq_end, seq_strand, logFC, logCPM, F, PValue,...
lgl (4): entrezid, seq_coord_system, GRangesList, .abundant
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
airway_de
# A tibble: 15,926 × 19
.feature albut gene_id gene_name entrezid gene_biotype gene_seq_start
<chr> <chr> <chr> <chr> <lgl> <chr> <dbl>
1 ENSG00000000003 untrt ENSG000… TSPAN6 NA protein_cod… 99883667
2 ENSG00000000419 untrt ENSG000… DPM1 NA protein_cod… 49551404
3 ENSG00000000457 untrt ENSG000… SCYL3 NA protein_cod… 169818772
4 ENSG00000000460 untrt ENSG000… C1orf112 NA protein_cod… 169631245
5 ENSG00000000971 untrt ENSG000… CFH NA protein_cod… 196621008
6 ENSG00000001036 untrt ENSG000… FUCA2 NA protein_cod… 143815948
7 ENSG00000001084 untrt ENSG000… GCLC NA protein_cod… 53362139
8 ENSG00000001167 untrt ENSG000… NFYA NA protein_cod… 41040684
9 ENSG00000001460 untrt ENSG000… STPG1 NA protein_cod… 24683489
10 ENSG00000001461 untrt ENSG000… NIPAL3 NA protein_cod… 24742284
# ℹ 15,916 more rows
# ℹ 12 more variables: gene_seq_end <dbl>, seq_name <chr>, seq_strand <dbl>,
# seq_coord_system <lgl>, symbol <chr>, GRangesList <lgl>, .abundant <lgl>,
# logFC <dbl>, logCPM <dbl>, F <dbl>, PValue <dbl>, FDR <dbl>
Each row of the table is a gene whose expression was measured in the experiment, logFC is the log2 of the ratio of the expression in the treatment vs the control, PValue is the significance of the comparison, and FDR is the false discovery rate.
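Because logFC is on the log2 scale, it can help to convert a few values back to plain ratios. The sketch below uses made-up logFC values (not taken from the airway data) to show the conversion.

```r
# Hypothetical log2 fold changes (not taken from the airway data)
logFC <- c(-2, -1, 0, 1, 2)

# Undo the log2 to get the ratio of treated to control expression
ratio <- 2^logFC
ratio
```

So a logFC of 1 is a doubling, a logFC of -1 is a halving, and a logFC of 2 or -2 corresponds to a 4-fold change up or down.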
Plotting the volcano plot is as simple as plotting the logFC on the x-axis and the -log10(PValue) on the y-axis. We can color the points by whether they have an FDR < 0.05.
ggplot(airway_de, aes(x = logFC, y = -log10(PValue), colour = FDR < 0.05)) +
geom_point()
What if we also want to color points by whether they have a large change in expression up or down? The code below uses an absolute logFC of at least 2, which on the log2 scale is at least a 4-fold change. We can do a little wrangling to create a new significance column and use that for the color.
airway_de_sig = airway_de |>
  mutate(significance =
    case_when(
      FDR < 0.05 & abs(logFC) >= 2 ~ "significant FDR and FC",
      FDR < 0.05 & abs(logFC) < 2 ~ "significant FDR",
      # >= so that a gene with FDR exactly 0.05 is not skipped
      FDR >= 0.05 & abs(logFC) >= 2 ~ "significant FC",
      .default = "non-significant"
    ))
ggplot(airway_de_sig, aes(x = logFC, y = -log10(PValue), colour = significance)) +
geom_point()
Finally, what about labeling some of the genes with the most significant (i.e., lowest) p-values? To do this, we need a little more data wrangling and a new geometry called geom_text that places a label on top of a data point.
airway_de_sig_lbl =
  airway_de_sig |>
  mutate(label = ifelse(PValue < 5e-9, gene_name, ""))
airway_plt = ggplot(airway_de_sig_lbl, aes(x = logFC, y = -log10(PValue))) +
  geom_point(aes(colour = significance)) +
  geom_text(aes(label = label))
airway_plt
This looks great except the labels are right on top of the points and they overlap. Maybe we also want to modify the background or add horizontal or vertical lines separating the regions? We can do all these things in Adobe Illustrator! We will save this plot later so we can load it into Illustrator.
ggsave("airway_de_plot.pdf", airway_plt, width = 20, height = 12)
Plotting distributions
Often we are interested in understanding how often observations of a certain size appear in our data. For example, how many species have fruit of a certain length? To answer questions like this, we need to visualize the distribution of the data.
Questions about distributions often come up when our variables are continuous or metric, which means that we can add and subtract values of the variable and create summaries of the variable like its mean value.
Histograms
One of the most common ways of visualizing a distribution is with a histogram, which groups observations into bins (or cajitas) to show us which kinds of observations are more frequent.
Let’s make a basic histogram!
ggplot(fruits, aes(x=Fruit_length_avg)) +
geom_histogram()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 15 rows containing non-finite outside the scale range
(`stat_bin()`).
Let’s interpret it!
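The message about binwidth is a reminder that the bins are a choice we make. As a minimal sketch on made-up measurements (the values and the column name len are hypothetical), we can bin by hand with cut() and then ask geom_histogram for the same bins using binwidth and boundary.

```r
library(ggplot2)

# Toy measurements (hypothetical values, not the fruits data)
toy <- data.frame(len = c(0.2, 0.4, 1.1, 1.3, 1.4, 2.7))

# Binning by hand: bins of width 1 with edges at 0, 1, 2, 3
bin_counts <- table(cut(toy$len, breaks = 0:3))
bin_counts

# The same bins in a histogram: binwidth sets the bin size and
# boundary sets where a bin edge falls
p <- ggplot(toy, aes(x = len)) +
  geom_histogram(binwidth = 1, boundary = 0)
p
```

Trying a few different binwidth values on the same data is a quick way to see how much the shape of a histogram depends on this choice.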
Making histograms pretty and more useful
We saw how to change the fill of bars in a bar chart, but we can also change the color of the surrounding box. Here’s how:
p_histogram <- ggplot(fruits, aes(x = Fruit_length_avg)) +
  geom_histogram(color = "darkblue", fill = "hotpink")
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 15 rows containing non-finite outside the scale range
(`stat_bin()`).
There are some pesky NAs that are causing the warnings above. What happens in the histogram if we remove them? For that we will use the function filter and save the filtered data into fruits_mod.
fruits_mod <- filter(fruits, !is.na(Fruit_length_avg))
p_histogram <- ggplot(fruits_mod, aes(x = Fruit_length_avg)) +
  geom_histogram(color = "darkblue", fill = "hotpink")
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
What if we want to separate the data by the type of plant? We can use the variable Shrub in the dataset, which tells us whether each species is a shrub or not.
p_histogram <- ggplot(fruits_mod, aes(x = Fruit_length_avg, color = Shrub)) +
  geom_histogram(fill = "white")
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
What if we want to add the mean of each group? We can first calculate the mean like below.
## Calculating the mean per group using dplyr
## we are naming the mean of each group mean_fl (short for mean fruit length)
mu <- fruits_mod |>
  filter(!is.na(Shrub)) |>
  group_by(Shrub) |>
  summarize(mean_fl = mean(Fruit_length_avg, na.rm = TRUE),
            .groups = 'drop')
mu
# A tibble: 2 × 2
Shrub mean_fl
<chr> <dbl>
1 not_shrub 1.62
2 shrub 1.56
Ok, so what is happening with the dimensions of mu? What do you notice?
dim(mu)
[1] 2 2
Now we can finally add the group means to the histogram.
p_histogram <- ggplot(fruits_mod, aes(x = Fruit_length_avg, color = Shrub)) +
  geom_histogram(fill = "white", position = "dodge") +
  geom_vline(data = mu, aes(xintercept = mean_fl, color = Shrub),
             linetype = "dashed") +
  theme(legend.position = "top")
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
What if I want my own colors? Let’s check some cool options (aka coolors).
p_histogram <- p_histogram +
  scale_color_manual(values = c("hotpink", "#56B4E9")) # one outline color per Shrub group, by name or hex code
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
But I want to make it publication quality! Let’s make nice labels on the axes and clean the background.
p_histogram <- p_histogram + xlab("Average Fruit Length")
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
p_histogram <- p_histogram + theme_classic()
p_histogram
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Ok, what about a histogram for our airway differential gene expression data? Let’s take a look at the distribution of p-values in the data.
ggplot(airway_de) +
geom_histogram(aes(x = PValue), fill = "gray", color = "black") +
theme_classic()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
What should this distribution look like under the null hypothesis that there is no effect of the drug on gene expression?
Density plots
What are density plots? They are smoothed versions of histograms, scaled so that the total area under the curve is 1. Later, areas under the curve will help us reason about probability!
Let’s make a simple one!
p_density <- ggplot(fruits_mod, aes(x = Fruit_length_avg)) +
  geom_density()
p_density
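The “area under the curve” idea can be checked numerically. The sketch below is an illustration on simulated values rather than the fruits data, and it uses base R’s density() function, a kernel density estimator of the same kind that geom_density draws. Summing height times grid spacing approximates the total area.

```r
set.seed(1)
x <- rnorm(200)  # simulated values, not the fruits data

# Base R kernel density estimate on an evenly spaced grid
d <- density(x)

# Approximate the area under the curve: sum of (height * grid spacing)
area <- sum(d$y) * (d$x[2] - d$x[1])
area
```

The result should be very close to 1, which is what lets us read areas under parts of the curve as proportions of the data.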
Let’s give it some pretty colors and add some transparency or reduce the opacity (called alpha).
p_density <- ggplot(fruits_mod, aes(x = Fruit_length_avg)) +
  geom_density(color = "darkblue", fill = "lightblue", alpha = 0.5)
p_density
We will add the mean.
mean.fruit <- mean(fruits_mod$Fruit_length_avg, na.rm = TRUE)
mean.fruit
[1] 1.596013
p_density <- p_density +
  geom_vline(aes(xintercept = mean.fruit),
             color = "blue", linetype = "dashed", linewidth = 1)
p_density
Change density plot line colors by groups
p_density <- ggplot(fruits_mod, aes(x = Fruit_length_avg, color = Shrub)) +
  geom_density() +
  geom_vline(data = mu, aes(xintercept = mean_fl, color = Shrub),
             linetype = "dashed")
p_density
# Fill them in
p_density <- ggplot(fruits_mod, aes(x = Fruit_length_avg, fill = Shrub)) +
  geom_density(alpha = 0.4) +
  geom_vline(data = mu, aes(xintercept = mean_fl, color = Shrub), linetype = "dashed")
p_density
Adding my color scheme
p_density <- p_density +
  scale_fill_manual(values = c("hotpink", "#56B4E9"))
p_density
Making it publication style
p_density <- p_density +
  xlab("Average Fruit Length") +
  theme_classic()
p_density
Let’s now make a density plot of the log fold change for the gene expression data, looking separately at the distributions for genes with FDR < 0.05 and FDR >= 0.05. What do we see?
ggplot(airway_de) +
geom_density(aes(x = logFC, fill = FDR < 0.05), alpha = 0.4) +
scale_fill_manual(values = c("hotpink", "#56B4E9")) +
theme_classic()
Histogram with density plot
p_histdensity <- ggplot(fruits_mod, aes(x = Fruit_length_avg)) +
  geom_histogram(aes(y = after_stat(density)), colour = "black", fill = "white") +
  geom_density(alpha = .2, fill = "hotpink")
p_histdensity
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Making Box and Whisker plots
What are box and whisker plots? What are quartiles and the interquartile range?
ggplot(data = fruits_mod, aes(x = Fruit_length_avg)) +
geom_boxplot()
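To connect the picture to numbers, here is a minimal sketch on a made-up vector (hypothetical values) that computes the quartiles and the interquartile range with base R. The 1.5 times IQR rule shown at the end is the default geom_boxplot uses to decide how far the whiskers can reach.

```r
# Toy vector (hypothetical values)
x <- 1:9

# Quartiles: the values below which 25%, 50%, and 75% of the data fall
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
q

# Interquartile range: distance between the first and third quartiles
iqr <- IQR(x)
iqr

# By default, whiskers reach the most extreme points inside these limits;
# points beyond them are drawn individually as outliers
lower_limit <- q[["25%"]] - 1.5 * iqr
upper_limit <- q[["75%"]] + 1.5 * iqr
c(lower_limit, upper_limit)
```

The box spans the first to third quartile, and the line inside the box marks the median (the 50% value).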
How do we make it vertical?
p_boxplot <- ggplot(data = fruits_mod, aes(y = Fruit_length_avg)) +
  geom_boxplot()
p_boxplot
How do we make box and whisker plots by group?
p_boxplot <- ggplot(fruits_mod, aes(y = Fruit_length_avg, color = Shrub)) +
  geom_boxplot()
p_boxplot
How do we change the color?
p_boxplot <- ggplot(fruits_mod, aes(x = Shrub, y = Fruit_length_avg, color = Shrub)) +
  geom_boxplot() +
  scale_color_manual(values = c("hotpink", "#56B4E9"))
p_boxplot
Notice that the x-axis tick labels updated, great! But now the legend is redundant, so let’s turn it off.
p_boxplot <- ggplot(fruits_mod, aes(x = Shrub, y = Fruit_length_avg, color = Shrub)) +
  geom_boxplot(show.legend = FALSE) +
  scale_color_manual(values = c("hotpink", "#56B4E9"))
p_boxplot
How do we fill them in?
p2 <- ggplot(fruits_mod, aes(x = Shrub, y = Fruit_length_avg, fill = Shrub)) +
  geom_boxplot(show.legend = FALSE) +
  scale_fill_manual(values = c("hotpink", "#56B4E9"))
p2
How do we add the mean?
p_boxplot <- p_boxplot +
  geom_point(data = mu, aes(x = Shrub, y = mean_fl), shape = 23, size = 4, show.legend = FALSE)
p_boxplot
Adding good labels
p_boxplot <- p_boxplot +
  xlab("Shrub status") +
  ylab("Average fruit length")
p_boxplot
How do we add all the samples?
p_boxplot <- p_boxplot +
  geom_jitter(shape = 16, position = position_jitter(0.2), show.legend = FALSE)
p_boxplot
All pretty and ready for publication
p_boxplot <- p_boxplot + theme_classic()
p_boxplot
Box plots are useful for all kinds of data; let’s create a boxplot for the differential gene expression data that looks at the log fold change as a function of the FDR.
ggplot(airway_de) +
geom_boxplot(aes(y = logFC, x = FDR < 0.05, color = FDR < 0.05), alpha = 0.4, show.legend = FALSE) +
scale_color_manual(values = c("hotpink", "#56B4E9")) +
theme_classic()
Finally, let’s save two of our plots for modification in Adobe Illustrator.
plts = plot_grid(fruits_plt, airway_plt) # plot_grid comes from library(cowplot)
plts
save_plot("Fruit_length_avg_v_airway_de_sig_lbl.pdf", plts, ncol = 2, base_width = 6, base_height = 5) # save_plot comes from library(cowplot)
Further information
There are some recent books on data science and visualization (all written in RMarkdown, which is a predecessor and alternative to Quarto) that cover much of the material in the course.
- Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science (2e). O’Reilly. https://r4ds.hadley.nz/
- Wilke, Claus O. 2018. Fundamentals of Data Visualization. https://clauswilke.com/dataviz/
- Healy, Kieran. 2018. Data Visualization: A Practical Introduction. http://socviz.co/
- Ismay, Chester and Kim, Albert Y. 2018. An Introduction to Statistical and Data Sciences via R. https://moderndive.com/
- Silge, Julia and Robinson, David. 2018. Text Mining with R: A Tidy Approach. https://www.tidytextmining.com/
If you want to become an R wizard in the style of Hadley Wickham, this book is for you.
- Wickham, Hadley. 2019. Advanced R. https://adv-r.hadley.nz/
Footnotes
Onstein, R. E., W. D. Kissling, L. W. Chatrou, T. L. P. Couvreur, H. Morlon, and H. Sauquet. 2019. Which frugivory-related traits facilitated historical long-distance dispersal in the custard apple family (Annonaceae)? Journal of Biogeography 46:1874–1888.↩︎
Wilkinson, L. 2005. The grammar of graphics. Statistics and computing. Springer New York.↩︎
https://stemangiola.github.io/rpharma2020_tidytranscriptomics/articles/tidytranscriptomics.html↩︎
Himes, B. E., X. Jiang, P. Wagner, R. Hu, Q. Wang, B. Klanderman, R. M. Whitaker, et al. 2014. RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells. PLOS ONE 9:e99625.↩︎