# Specific AE

```{r, include=FALSE}
source("_common.R")
```

Following [ICH E3 guidance](https://database.ich.org/sites/default/files/E3_Guideline.pdf),
we need to summarize the number of participants with each specific AE in Section 12.2, Adverse Events (AEs).

```{r}
library(haven) # Read SAS data
library(dplyr) # Manipulate data
library(tidyr) # Manipulate data
library(r2rtf) # Reporting in RTF format
```

In this chapter, we illustrate how to summarize simplified specific AE
information for a study.

```{r, out.width = "100%", out.height = if (knitr::is_html_output()) "400px", echo = FALSE, fig.align = "center"}
knitr::include_graphics("tlf/tlf_spec_ae.pdf")
```

The data used to summarize AE information is in the `adsl` and `adae` datasets.

```{r}
adsl <- read_sas("data-adam/adsl.sas7bdat")
adae <- read_sas("data-adam/adae.sas7bdat")
```

For illustration purposes, we only provide counts in the simplified
table. The percentage of participants for each AE can be
calculated as shown in @sec-aesummary.

Here, we focus on the analysis script for two advanced
table layout features.

- group content: AEs can be summarized in multiple nested layers
  (e.g., by system organ class (SOC, `AESOC`) and specific AE term
  (`AEDECOD`)).
- pagination: there are too many AE terms to fit on one
  page, so column headers and SOC information need to be repeated on
  every page.

In the code below, we first define a helper, `fmt_num`, that formats
numbers as fixed-width strings. We then count the number of participants
with at least one AE in each SOC by treatment arm, and we create a new
variable `order` set to `0`. The variable `order` is used later to sort
each SOC row ahead of its AE terms.

```{r}
fmt_num <- function(x, digits, width = digits + 4) {
  formatC(
    x,
    digits = digits,
    format = "f",
    width = width
  )
}
```
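As a quick illustration of what `fmt_num` returns, the sketch below calls `formatC` directly with the same arguments the helper would pass. The input values are made up for illustration.

```{r}
# Equivalent to fmt_num(5, digits = 0): width defaults to digits + 4 = 4
formatC(5, digits = 0, format = "f", width = 4)
# "   5"

# Equivalent to fmt_num(12.345, digits = 1): width defaults to 5
formatC(12.345, digits = 1, format = "f", width = 5)
# " 12.3"

# `width` is a minimum: longer values are not truncated
formatC(12345, digits = 0, format = "f", width = 4)
# "12345"
```

Because the result is a character string, columns formatted this way should be treated as display values rather than reused in further arithmetic.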

```{r}
ana <- adae %>%
  mutate(
    AESOC = tools::toTitleCase(tolower(AESOC)),
    AEDECOD = tools::toTitleCase(tolower(AEDECOD))
  )

t1 <- ana %>%
  group_by(TRTAN, AESOC) %>%
  summarise(n = fmt_num(n_distinct(USUBJID), digits = 0)) %>%
  mutate(AEDECOD = AESOC, order = 0)

t1
```
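The counts above use `n_distinct(USUBJID)` rather than the number of records, because a participant may report several events in the same SOC. A minimal sketch on made-up data (the `toy_` objects are hypothetical and not part of the study data) shows the difference:

```{r}
library(dplyr)

# Hypothetical records: participant S01 has two events in the same SOC
toy_ae <- tibble(
  USUBJID = c("S01", "S01", "S02", "S03"),
  TRTAN = c(0, 0, 0, 54),
  AESOC = "Cardiac Disorders"
)

toy_soc <- toy_ae %>%
  group_by(TRTAN, AESOC) %>%
  summarise(
    n_records = n(), # counts events
    n_participants = n_distinct(USUBJID), # counts participants
    .groups = "drop"
  )

toy_soc
```

For arm `0`, there are three records but only two participants, which is the number the table should report.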

In the code below, we count the number of participants with each AE term by
SOC, AE term, and treatment arm. Here we also create a new variable
`order` and set it to `1`.

```{r}
t2 <- ana %>%
  group_by(TRTAN, AESOC, AEDECOD) %>%
  summarise(n = fmt_num(n_distinct(USUBJID), digits = 0)) %>%
  mutate(order = 1)

t2
```
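One consequence of counting participants at both levels is that a SOC row is generally not the sum of its AE term rows. The sketch below uses made-up data (all `toy_` names are hypothetical) in which one participant reports two different terms within the same SOC:

```{r}
library(dplyr)

# Hypothetical records: S01 reports two different terms in one SOC
toy_events <- tibble(
  USUBJID = c("S01", "S01", "S02"),
  AESOC = "Cardiac Disorders",
  AEDECOD = c("Palpitations", "Tachycardia", "Palpitations")
)

# SOC level: each participant is counted once
toy_soc_n <- n_distinct(toy_events$USUBJID)

# Term level: each participant is counted once per term
toy_term <- toy_events %>%
  group_by(AESOC, AEDECOD) %>%
  summarise(n = n_distinct(USUBJID), .groups = "drop")

toy_soc_n
sum(toy_term$n)
```

Here the SOC count is 2 while the term counts add up to 3, which is why `t1` and `t2` are computed separately rather than deriving one from the other.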

We prepare reporting data for AE information using the code below:

```{r}
t_ae <- bind_rows(t1, t2) %>%
  pivot_wider(
    id_cols = c(AESOC, order, AEDECOD),
    names_from = TRTAN,
    names_prefix = "n_",
    values_from = n,
    values_fill = fmt_num(0, digits = 0)
  ) %>%
  arrange(AESOC, order, AEDECOD) %>%
  select(AESOC, AEDECOD, starts_with("n"))

t_ae
```
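The role of `values_fill` in the reshaping step can be seen on a small example. In the made-up data below (the `toy_` objects are hypothetical), one AE term has no record in one arm. Without `values_fill` that cell would be `NA`, whereas here it is shown as a zero count:

```{r}
library(dplyr)
library(tidyr)

# Hypothetical long-format counts: "Nausea" has no row for arm 54
toy_long <- tibble(
  AEDECOD = c("Headache", "Headache", "Nausea"),
  TRTAN = c(0, 54, 0),
  n = c("2", "1", "3")
)

toy_wide <- toy_long %>%
  pivot_wider(
    id_cols = AEDECOD,
    names_from = TRTAN,
    names_prefix = "n_",
    values_from = n,
    values_fill = "0" # fill arm/term combinations with no records
  )

toy_wide
```

The fill value must have the same type as the `n` column, which is character here because the counts were already formatted.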

We prepare reporting data for the analysis population using the code below:

```{r}
count_by <- function(data, # Input data set
                     grp, # Group variable
                     var, # Analysis variable
                     var_label = var, # Analysis variable label
                     id = "USUBJID") { # Subject ID variable
  data <- data %>% rename(grp = !!grp, var = !!var, id = !!id)

  left_join(
    count(data, grp, var),
    count(data, grp, name = "tot"),
    by = "grp"
  ) %>%
    mutate(
      pct = fmt_num(100 * n / tot, digits = 1),
      n = fmt_num(n, digits = 0),
      npct = paste0(n, " (", pct, ")")
    ) %>%
    pivot_wider(
      id_cols = var,
      names_from = grp,
      values_from = c(n, pct, npct),
      values_fill = list(n = "0", pct = fmt_num(0, digits = 0))
    ) %>%
    mutate(var_label = var_label)
}
```
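The two ideas inside `count_by` are renaming columns whose names arrive as strings, and joining per-category counts with per-group totals. The following sketch isolates them on made-up data. The `toy_` objects and the `arm` and `sex` columns are hypothetical and only serve the illustration.

```{r}
library(dplyr)

# Hypothetical subject-level data
toy_adsl <- tibble(
  USUBJID = sprintf("S%02d", 1:5),
  arm = c(0, 0, 0, 54, 54),
  sex = c("F", "M", "F", "F", "F")
)

toy_grp_name <- "arm"
toy_var_name <- "sex"

# `!!` unquotes the strings, so this is rename(grp = "arm", var = "sex")
toy_std <- toy_adsl %>% rename(grp = !!toy_grp_name, var = !!toy_var_name)

# Category counts joined with group totals give the percentages
toy_pct <- left_join(
  count(toy_std, grp, var),
  count(toy_std, grp, name = "tot"),
  by = "grp"
) %>%
  mutate(pct = 100 * n / tot)

toy_pct
```

Standardizing the column names up front lets the rest of the function refer to `grp` and `var` directly, whatever the original variable names were.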

```{r}
t_pop <- adsl %>%
  filter(SAFFL == "Y") %>%
  count_by("TRT01AN", "SAFFL",
    var_label = "Participants in population"
  ) %>%
  mutate(
    AESOC = "pop",
    AEDECOD = var_label
  ) %>%
  select(AESOC, AEDECOD, starts_with("n_"))

t_pop
```

The final report data is saved in `tbl_ae_spec`.
We also add a blank row between population and AE information in the reporting table.

```{r}
tbl_ae_spec <- bind_rows(
  t_pop,
  data.frame(AESOC = "pop"),
  t_ae
) %>%
  mutate(AEDECOD = ifelse(AEDECOD == AESOC,
    AEDECOD, paste0("  ", AEDECOD)
  ))

tbl_ae_spec
```
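The indentation step relies on how `ifelse` treats missing values: when the test is `NA`, the result is `NA`. A minimal base R sketch on made-up rows (the `toy_rows` object is hypothetical) shows the effect on a SOC header row, a regular row, and a blank row:

```{r}
# Hypothetical rows: a population row, a blank row, a SOC header, and an AE term
toy_rows <- data.frame(
  AESOC = c("pop", "pop", "Eye Disorders", "Eye Disorders"),
  AEDECOD = c("Participants in population", NA, "Eye Disorders", "Blurred Vision")
)

toy_rows$label <- ifelse(
  toy_rows$AEDECOD == toy_rows$AESOC,
  toy_rows$AEDECOD, # header rows are kept as is
  paste0("  ", toy_rows$AEDECOD) # other rows are indented by two spaces
)

toy_rows$label
```

The blank row stays `NA` instead of becoming the string `"  NA"`, and the population row is indented because its label differs from its `AESOC` value.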

We define the format of the output as follows.

To obtain the nested layout, we use the `page_by` argument in the
`rtf_body` function. By defining `page_by="AESOC"`, r2rtf recognizes
the variable as a group indicator.

After setting `pageby_row = "first_row"`, the first row of each group is
displayed as the group header. If a group is broken across multiple pages,
the group header row is repeated on each page by default.
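As a minimal sketch of these two arguments in isolation, the code below applies them to a small made-up table and writes the result to a temporary file. The `toy_tbl` data, its column names, and the output path are assumptions for illustration only. The calls mirror those used for the full table in this chapter.

```{r}
library(dplyr)
library(r2rtf)

# Hypothetical table: the first row of each `soc` group repeats the group name
toy_tbl <- data.frame(
  soc = c(rep("Cardiac Disorders", 3), rep("Eye Disorders", 2)),
  term = c(
    "Cardiac Disorders", "  Palpitations", "  Tachycardia",
    "Eye Disorders", "  Blurred Vision"
  ),
  n = c("2", "1", "1", "1", "1")
)

toy_file <- tempfile(fileext = ".rtf")

toy_tbl %>%
  rtf_colheader(" | n", col_rel_width = c(3, 1)) %>%
  rtf_body(
    col_rel_width = c(1, 3, 1), # includes the width of the `soc` column
    text_justification = c("l", "l", "c"),
    page_by = "soc", # group indicator
    pageby_row = "first_row" # first row of each group acts as the header
  ) %>%
  rtf_encode() %>%
  write_rtf(toy_file)
```

Note that `col_rel_width` in `rtf_body` has one entry per column of the input data, including the `page_by` column, while the column header only describes the displayed columns.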

We can also customize the text format by providing a matrix that has the
same dimension as the input dataset (i.e., `tbl_ae_spec`). In the code
below, we illustrate how to display **bold** text for group headers to
highlight the nested structure of the table layout.

```{r}
n_row <- nrow(tbl_ae_spec)
n_col <- ncol(tbl_ae_spec)
id <- tbl_ae_spec$AESOC == tbl_ae_spec$AEDECOD
id <- ifelse(is.na(id), FALSE, id)

text_format <- ifelse(id, "b", "")
```
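The vector `text_format` has one entry per row, and the later call to `matrix()` expands it to one entry per cell. Because `matrix()` fills column by column and recycles its input, every column receives the same per-row flags. A small base R sketch with made-up flags (the `toy_` names are hypothetical):

```{r}
# Hypothetical flags: only the first of three rows is a group header
toy_flag <- c(TRUE, FALSE, FALSE)
toy_fmt <- ifelse(toy_flag, "b", "")

# The length-3 vector is recycled down each of the 4 columns
toy_mat <- matrix(toy_fmt, nrow = 3, ncol = 4)

toy_mat
```

Each row of `toy_mat` carries a single format, so a header row is bold across all of its columns.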

```{r}
tbl_ae_spec %>%
  rtf_title(
    "Analysis of Participants With Specific Adverse Events",
    "(Safety Analysis Population)"
  ) %>%
  rtf_colheader(" | Placebo | Xanomeline Low Dose| Xanomeline High Dose",
    col_rel_width = c(3, rep(1, 3))
  ) %>%
  rtf_colheader(" | n |  n | n ",
    border_top = "",
    border_bottom = "single",
    col_rel_width = c(3, rep(1, 3))
  ) %>%
  rtf_body(
    col_rel_width = c(1, 3, rep(1, 3)),
    text_justification = c("l", "l", rep("c", 3)),
    text_format = matrix(text_format, nrow = n_row, ncol = n_col),
    page_by = "AESOC",
    pageby_row = "first_row"
  ) %>%
  rtf_footnote("Every subject is counted a single time for each applicable row and column.") %>%
  rtf_encode() %>%
  write_rtf("tlf/tlf_spec_ae.rtf")
```

```{r, include=FALSE}
rtf2pdf("tlf/tlf_spec_ae.rtf")
```

```{r, out.width = "100%", out.height = if (knitr::is_html_output()) "400px", echo = FALSE, fig.align = "center"}
knitr::include_graphics("tlf/tlf_spec_ae.pdf")
```

More discussion on `page_by`, `group_by` and `subline_by`
features can be found on the
[r2rtf package website](https://merck.github.io/r2rtf/articles/example-sublineby-pageby-groupby.html).

The procedure to generate a specific AE table can be
summarized as follows:

- Step 1: Read data (i.e., `adae` and `adsl`) into R.
- Step 2: Count the number of participants by SOC and treatment arm
  (rows with bold text) and save into `t1`.
- Step 3: Count the number of participants in each AE term by SOC, AE
  term, and treatment arm (rows without bold text) and save into `t2`.
- Step 4: Bind `t1` and `t2` by row into `t_ae`.
- Step 5: Count the number of participants in each arm as `t_pop`.
- Step 6: Row-wise combine `t_pop` and `t_ae` into `tbl_ae_spec`.
- Step 7: Format the output by using r2rtf.