---
title: "Customize plots of incidence"
author: "Thibaut Jombart, Zhian N. Kamvar"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
toc: true
toc_depth: 4
vignette: >
%\VignetteIndexEntry{Customise graphics}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width=7,
fig.height=5
)
```
This vignette provides some tips for the most common customisations of graphics produced by
`plot.incidence`. Our graphics use *ggplot2*, which is a distinct graphical system from base
graphics. If you want advanced customisation of your incidence plots, we recommend following an
introduction to *ggplot2*.
# Example data: simulated Ebola outbreak
This example uses the simulated Ebola Virus Disease (EVD) outbreak from the package
[*outbreaks*](https://cran.r-project.org/package=outbreaks): `ebola_sim_clean`.
First, we load the data:
```{r, data}
library(outbreaks)
library(ggplot2)
library(incidence)
onset <- ebola_sim_clean$linelist$date_of_onset
class(onset)
head(onset)
```
We compute the weekly incidence:
```{r, incid1}
i <- incidence(onset, interval = 7)
i
i.sex <- incidence(onset, interval = 7, group = ebola_sim_clean$linelist$gender)
i.sex
i.hosp <- incidence(onset, interval = 7, group = ebola_sim_clean$linelist$hospital)
i.hosp
```
# The `plot.incidence` function
When calling `plot` on an *incidence* object, the function `plot.incidence` is implicitly used. To access its documentation, use `?plot.incidence`. In this section, we illustrate existing customisations.
## Default behaviour
By default, the function uses grey for single time series, and colors from the color palette `incidence_pal1` when incidence is computed by groups:
```{r, default}
plot(i)
plot(i.sex)
plot(i.hosp)
```
However, some of these defaults can be altered through the various arguments of the function:
```{r, args}
args(incidence:::plot.incidence)
```
## Changing colors
### The default palette
A color palette is a function which outputs a specified number of colors. By
default, the color used in *incidence* is called `incidence_pal1`. Its
behaviour is different from usual palettes, in the sense that the first 4
colours are not interpolated:
```{r, incidence_pal1, fig.height = 8}
par(mfrow = c(3, 1), mar = c(4,2,1,1))
barplot(1:2, col = incidence_pal1(2))
barplot(1:4, col = incidence_pal1(4))
barplot(1:20, col = incidence_pal1(20))
```
This palette also has a light and a dark version:
```{r, pal2, fig.height = 8}
par(mfrow = c(3,1))
barplot(1:20, col = incidence_pal1_dark(20), main = "palette: incidence_pal1_dark")
barplot(1:20, col = incidence_pal1(20), main = "palette: incidence_pal1")
barplot(1:20, col = incidence_pal1_light(20), main = "palette: incidence_pal1_light")
```
### Using different palettes
Other color palettes can be provided via `col_pal`. Various palettes are part of the base R distribution, and many more are provided in additional packages. We provide a couple of examples:
```{r, palettes}
plot(i.hosp, col_pal = rainbow)
plot(i.sex, col_pal = cm.colors)
```
### Specifying colors manually
Colors can be specified manually using the argument `color`; note that whenever incidence is computed by groups, the number of colors must match the number of groups, otherwise `color` is ignored.
#### Example 1: changing a single color
```{r, colors1}
plot(i, color = "darkred")
```
#### Example 2: changing several colors (note that naming colors is optional)
```{r, colors2}
plot(i.sex, color = c(m = "orange2", f = "purple3"))
```
#### Example 3: using color to highlight specific groups
```{r, colors3}
plot(i.hosp,
color = c("#ac3973", "#6666ff", "white", "white", "white", "white"))
```
# Useful *ggplot2* tweaks
Numerous tweaks for *ggplot2* are documented online. In the following, we merely provide a few useful tips in the context of *incidence*.
## Changing dates on the *x*-axis
### Changing date format
By default, the dates indicated on the *x*-axis of an incidence plot may not
have the suitable format.
The package *scales* can be used to change the way dates are labeled (see
`?strptime` for possible formats):
```{r, scales1}
library(scales)
plot(i, labels_week = FALSE) +
scale_x_date(labels = date_format("%d %b %Y"))
```
Notice how the labels are all situated at the first of the month? If you want to
make sure the labels are situated in a different orientation, you can use the
`make_breaks()` function to calculate breaks for the plot:
```{r scales_breaks}
b <- make_breaks(i, labels_week = FALSE)
b
plot(i) +
scale_x_date(breaks = b$breaks,
labels = date_format("%d %b %Y"))
```
And for another example, with a subset of the data (first 50 weeks), using more
detailed dates and rotating the annotations:
```{r, scales2}
plot(i[1:50]) +
scale_x_date(breaks = b$breaks, labels = date_format("%a %d %B %Y")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 12))
```
Note that you can save customisations for later use:
```{r, scales3}
rotate.big <- theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 12))
```
### Changing the grid
The last example above illustrates that it can be useful to have denser
annotations of the *x*-axis, especially over short time periods. Here, we
provide an example where we try to zoom on the peak of the epidemic, using the
data by hospital:
```{r, grid1}
plot(i.hosp)
```
Let us look at the data 40 days before and after the 1st of October:
```{r, grid2}
period <- as.Date("2014-10-01") + c(-40, 40)
i.zoom <- subset(i.hosp, from = period[1], to = period[2])
detailed.x <- scale_x_date(labels = date_format("%a %d %B %Y"),
date_breaks = "2 weeks",
date_minor_breaks = "week")
plot(i.zoom, border = "black") + detailed.x + rotate.big
```
### Handling non-ISO weeks
If you have weekly incidence that starts on a day other than monday, then the
above solution may produce breaks that fall inside of the bins:
```{r, saturday-epiweek}
i.sat <- incidence(onset, interval = "1 week: saturday", groups = ebola_sim_clean$linelist$hospital)
i.szoom <- subset(i.sat, from = period[1], to = period[2])
plot(i.szoom, border = "black") + detailed.x + rotate.big
```
In this case, you may want to either calculate breaks using `make_breaks()` or
use the `scale_x_incidence()` function to automatically calculate these for you:
```{r, saturday-epiweek2}
plot(i.szoom, border = "black") +
scale_x_incidence(i.szoom, n_breaks = nrow(i.szoom)/2, labels_week = FALSE) +
rotate.big
```
```{r, saturday-epiweek3}
sat_breaks <- make_breaks(i.szoom, n_breaks = nrow(i.szoom)/2)
plot(i.szoom, border = "black") +
scale_x_date(breaks = sat_breaks$breaks, labels = date_format("%a %d %B %Y")) +
rotate.big
```
### Labelling every bin
Sometimes you may want to label every bin of the incidence object. To do this,
you can simply set `n_breaks` to the number of rows in your incidence object:
```{r label-bins}
plot(i.szoom, n_breaks = nrow(i.szoom), border = "black") + rotate.big
```
## Changing the legend
The previous plot has a fairly large legend which we may want to move around.
Let us save the plot as a new object `p` and customize the legend:
```{r, legend1}
p <- plot(i.zoom, border = "black") + detailed.x + rotate.big
p + theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 12),
legend.position = "top", legend.direction = "horizontal",
legend.title = element_blank())
```
## Applying the style of European Programme for Intervention Epidemiology Training (EPIET)
### Display individual cases
For small datasets it is convention of EPIET to display individual cases as
rectangles. It can be done by doing two things: first, adding using the option
`show_cases = TRUE` with a white border and second, setting the background to
white. We also add `coord_equal()` which forces each case to be a square.
```{r, EPIET1}
i.small <- incidence(onset[160:180])
plot(i.small, border = "white", show_cases = TRUE) +
theme(panel.background = element_rect(fill = "white")) +
rotate.big +
coord_equal()
```