At times it is important to determine if the event rate in a cohort is statistically different from the event rate in a reference population. This has traditionally been done by calculating a Standardized Mortality Ratio (SMR). The analysis assumes the researcher has a table of event rates for different combinations of ages or categories in a reference population, which are then applied to the collected data to estimate the number of events if the collected data was subject to the same background event rate as the reference population. The goal is generally to determine if the observed events are statistically different than the expected number of events, and to investigate how much of an effect difference covariate have.
One method to approximate this is to fit a model to the SMR. The simplest model is to estimate the true rate (λi) as a multiple of the expected rate (λi*). This provides an SMR for the full cohort, assuming one has an estimated rate for every row of data.
$$ \begin{aligned} \lambda_i = \beta * \lambda^{*}_{i} \end{aligned} $$
For this analysis we will use the veteran lung cancer trial data from the survival package. We will treat events to be status equal to 2, scale the survival time to be per 100 days, and assume every row has an expected event rate of 0.5, roughly double the average rate in the data. In practice this could be a constant population average or a row specific rate taken from mapping a table of external events onto the data.
data(cancer, package = "survival")
cancer$status <- as.integer(cancer$status == 2)
cancer$time <- cancer$time / 100
cancer$erate <- 0.5
The analysis proceeds identically to a standard Poisson regression. We are interested in a linear relationship between the expected event rate and true event rate, and set the initial parameter estimate to 1.
pyr <- "time"
event <- "status"
names <- c("erate")
term_n <- c(0)
tform <- c("lin")
keep_constant <- c(0)
a_n <- c(1)
modelform <- "M"
fir <- 0
der_iden <- 0
control <- list(
"ncores" = 1, "maxiter" = 20, "halfmax" = 5, "epsilon" = 1e-9,
"deriv_epsilon" = 1e-9, "verbose" = 1
)
e <- RunPoissonRegression_Omnibus(
cancer, pyr, event, names, term_n, tform,
keep_constant, a_n, modelform, fir, der_iden, control
)
print(e$beta_0)
#> [1] 0.4741856
In this case we found the SMR to be 0.474, analysis of the confidence interval (either Wald or Likelihood-based) could be performed to check if the results are statistically significant. Lets suppose we wanted to take the analysis a step further and investigate the effect of difference covariates. The SMR equation can be adjusted to either an additive or multiplicative effect, similar to any other Poisson regression model. Let us assume we are interested in the effects of biological sex. We can add another element to our model and rerun the regression. Let us start with holding the constant multiple (β0) constant at 1 and investigate if the difference in SMR is entirely due to biological sex.
$$ \begin{aligned} \lambda_i = \beta_0 * \lambda^{*}_{i} *( 1 + \beta_1 * x_{0,i}) \end{aligned} $$
names <- c("erate", "sex")
term_n <- c(0, 1)
tform <- c("lin", "lin")
keep_constant <- c(1, 0)
a_n <- c(1, 1)
e <- RunPoissonRegression_Omnibus(
cancer, pyr, event, names, term_n, tform,
keep_constant, a_n, modelform, fir, der_iden, control
)
print(e$beta_0)
#> [1] 1 0
In this case the estimated contribution of biological sex was 0, so we could not find any effect of biological sex on the difference in SMR. Suppose we let both parameter be free and investigated the combined effects. Once again we run the regression, this time with neither parameter held constant.
names <- c("erate", "sex")
term_n <- c(0, 1)
tform <- c("lin", "lin")
keep_constant <- c(0, 0)
a_n <- c(1, 1)
e <- RunPoissonRegression_Omnibus(
cancer, pyr, event, names, term_n, tform,
keep_constant, a_n, modelform, fir, der_iden, control
)
print(e$beta_0)
#> [1] 0.3696187 0.1904852
In this case we found the SMR for sex=0 to be 0.37 and the SMR for sex=1 to be 0.56. Once again we could further investigate the confidence intervals to determine if these estimates are statistically significant, however this demonstrates how one can use Poisson regressions and external event rates to estimate differences in event rates between external populations and studied populations.