Two-stage optimal designs with survival endpoint when the follow-up time is restricted

Shan, Guogen; Zhang, Hua

doi:10.1186/s12874-019-0696-x

Research article
Open access
Published: 03 April 2019

BMC Medical Research Methodology volume 19, Article number: 74 (2019)

Abstract

Background

Survival endpoints are frequently used as the primary endpoint in early phase clinical trials to assess the activity of a new treatment. Existing two-stage optimal designs with survival endpoint either overestimate the sample size or compute power outside the alternative hypothesis space.

Methods

We propose a new single-arm two-stage optimal design with survival endpoint by using the one-sample log rank test based on exact variance estimates. This proposed design with survival endpoint is analogous to Simon’s two-stage design with binary endpoint, having restricted follow-up.

Results

We compare the proposed design with the existing two-stage designs, including the two-stage design with survival endpoint based on the nonparametric Nelson-Aalen estimate, and Simon’s two-stage designs with or without interim accrual. The new design always performs better than these competitors with regard to the expected total study length, and requires a smaller expected sample size than Simon’s design with interim accrual.

Conclusions

The proposed two-stage minimax and optimal designs with survival endpoint are recommended for use in practice to shorten the study length of clinical trials.

Background

A multiple-stage design is often preferable in early phase clinical trials to investigate the activity of a new treatment. Compared with the traditional one-stage design, such a design protects patients better by allowing a trial to be stopped early when the new treatment is indeed ineffective. For this reason, early stopping for futility is always allowed in these trials. Among multiple-stage designs, the two-stage design is widely used in phase II clinical trials, whose sample size is smaller than that of the subsequent phase III trial that confirms the effectiveness of the new treatment(s).

When the outcome is binary (e.g., response versus non-response), Simon’s two-stage minimax and optimal designs are widely used in practice [1–8]. Once the required number of patients in the first stage is enrolled, the trial generally has to be suspended temporarily to allow these patients to complete the treatment schedule. The first-stage data are then analyzed to decide whether the trial proceeds to the second stage. This suspension could lead to a longer study time as compared to the modified Simon’s two-stage design with interim accrual [9]. Recently, adaptive versions of Simon’s two-stage design have been proposed to improve the flexibility of trials [3, 4, 10–12]. In such trials, the second stage sample size depends on the outcome from the first stage.

In some other trials (e.g., cytostatic therapies), a survival endpoint serves as the primary outcome to measure the activity of a new treatment. Feldman et al. [13] reviewed seven single-arm phase II trials for patients with refractory germ cell tumors, and recommended a 12-week progression-free survival, rather than the commonly used response rate, to test the activity of novel agents. For such trials, a multiple-stage design with survival endpoint would be appropriate for use in practice. Lin et al. [14] proposed group sequential designs for a trial with survival endpoint by deriving the asymptotic joint distribution of the Nelson-Aalen estimates at different time points. Based on Lin et al.’s work, Case and Morgan [9] developed a two-stage optimal design evaluating survival probabilities with restricted follow-up. They proposed two-stage optimal designs with the smallest expected duration of accrual or the smallest expected total study length. Later, Kwak and Jung [15] proposed a new two-stage optimal design based on the one-sample log-rank test without follow-up restriction. Power of their proposed design was computed under the average of the cumulative hazard function under the null hypothesis and that under the alternative hypothesis. In addition, the asymptotic variance estimate of the one-sample log-rank test was used in type I error rate and power calculation. Recently, Belin et al. [16] proposed a two-stage design based on the design setting as in Kwak and Jung [15], but having restricted follow-up as in Case and Morgan [9].

For a trial with a survival endpoint as the primary outcome, the survival probability at the clinically meaningful follow-up time is often the parameter of interest (e.g., the survival probability at 1 year). We develop a new single-arm two-stage optimal design by using the one-sample log-rank test with exact mean and variance estimates [17, 18]. A trial is allowed to be stopped in the first stage due to futility to protect patients when the treatment under investigation is indeed ineffective. Although exact mean and variance estimates of the one-sample log-rank test are used for sample size calculation, the joint distribution of the test statistic for the first stage and that for the two stages combined is assumed to asymptotically follow a bivariate normal distribution. For this reason, the actual power of the identified study design may not be guaranteed [19]. We propose adjusting the nominal power level in design search to guarantee that the new designs meet the power requirement. The proposed two-stage minimax and optimal designs with survival endpoint are compared with the design by Belin et al. [16] and Simon’s two-stage designs with or without interim accrual.

The rest of this article is organized as follows. In Section Methods, we present the type I error rate and power calculation for a two-stage design with survival endpoint by using the one-sample log-rank test, and provide a detailed search method for two-stage minimax and optimal designs. In Section Results, we compare the performance of the new proposed two-stage designs with the existing Belin’s design with survival endpoint and Simon’s two-stage design with binary endpoint. At the end of that section, we revisit two trials to illustrate the application of the proposed two-stage designs with survival endpoint. Lastly, we provide some comments in Section Discussion.

Methods

Suppose S(t) is the survival function of the survival time T. In a single-arm study, the survival probability of a new treatment at the clinically meaningful follow-up time t_c, S(t_c), is compared to the estimated historical survival probability, S₀(t_c). Then the hypotheses are presented as

$$ H_{0}: S(t_{c})\leq S_{0}(t_{c}) \ \ \text{against} \ \ H_{1}: S(t_{c})> S_{0}(t_{c}). $$

(1)

In this article, the survival function S(t) is assumed to follow the Weibull distribution with the shape parameter k and the scale parameter λ, specifically,

$$S(t)=\exp\left\{-(t/\lambda)^{k}\right\},$$

where k>0 and λ>0. The widely used exponential distribution is a special case of the Weibull distribution when k=1.

Under the Weibull distribution for survival outcome, suppose the failure rate under the null hypothesis is the same as that under the alternative hypothesis (the same shape parameter k), but scale parameters are different with λ₀ and λ₁ under the null hypothesis and the alternative hypothesis, respectively. Then, Δ=(λ₀/λ₁)^k is the hazard ratio (HR), which is always less than 1 under the alternative. The hypotheses in Eq. (1) can be specifically rewritten as

$$ H_{0}: \Delta\geq 1 \ \ \text{against} \ \ H_{1}: \Delta<1. $$

(2)

When a new study is assumed to have a failure rate different from that of the historical data, the HR depends on the time t and is calculated as $\Delta =\frac {\lambda _{0}^{k_{0}}}{\lambda _{1}^{k_{1}}} \times \frac {k_{1} t^{k_{1}-1}}{k_{0} t^{k_{0}-1}}$, where k₀ and k₁ are the shape parameters under the null hypothesis and the alternative hypothesis, respectively.
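As a small illustration of the Weibull parameterization above, the scale parameter can be recovered from a target survival probability at t_c, and the hazard ratio follows from the two scales under a common shape. The function names below are illustrative only and are not taken from the authors' software.

```python
import math

def weibull_scale(surv_prob, t_c, k=1.0):
    """Scale lambda solving exp(-(t_c / lambda) ** k) == surv_prob."""
    return t_c / (-math.log(surv_prob)) ** (1.0 / k)

def hazard_ratio(lam0, lam1, k=1.0):
    """Delta = (lambda0 / lambda1) ** k when both hypotheses share the shape k."""
    return (lam0 / lam1) ** k

# Exponential case (k = 1) with S0(1) = 50%: lambda0 is about 1.44.
lam0 = weibull_scale(0.50, t_c=1.0, k=1.0)
# Scale under the alternative implied by a hazard ratio of 0.5 when k = 1.
lam1 = lam0 / 0.5
print(round(lam0, 2), round(hazard_ratio(lam0, lam1, k=1.0), 2))
```

With a common shape, the same hazard ratio can also be written as ln S₁(t_c)/ln S₀(t_c), which is often the more convenient form when a design is specified through survival probabilities.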

Simon’s two-stage designs with binary endpoint

In Simon’s two-stage optimal designs, a trial is allowed to be stopped in the first stage when the number of responses is insufficient. Suppose X₁ and X are the number of responses out of n₁ and n participants from the first stage and the two stages combined, respectively. The sample size in the second stage is n₂=n−n₁. The null hypothesis is rejected when X₁>r₁ and X>r, where r₁ and r are the critical values for the number of responses from the first stage and both stages, respectively.

In a pancreatic cancer trial with a combination of Gemcitabine and external beam radiation as the new treatment [9], the clinically meaningful follow-up time is 1 year, t_c=1. The unacceptable one-year survival rate is S₀(1)=35%, and the new treatment is considered promising for further investigation when S₁(1)=50% or more. To attain 90% power at the significance level of 10%, Simon’s two-stage minimax design [1] is calculated as:

$$(n_{1},r_{1},n,r)=(43,14,72,30),$$

with the expected sample size under the null hypothesis ESS₀=n₁+(1−PET)n₂=59.3, where PET is the probability of early termination under the null hypothesis which is defined as PET=p(X₁≤r₁|S₀(1)=35%)=43.65%. Suppose this is a 3 year study with the patient accrual rate of θ=24 patients per year. Then the enrollment time for the first stage and the second stage is calculated t₁=n₁/θ and t₂=n₂/θ, respectively. The expected total study length (ETSL) under the null hypothesis is calculated as

$${ETSL}_{0}=(t_{1}+t_{c})+(1-PET)(t_{2}+t_{c})=4.0 \ \text{years}$$

The two-stage optimal design needs ESS₀=53.2 and ETSL₀=3.6 years (see Table 1). The maximum possible sample size for Simon’s optimal design n=81 is much larger than n=72 for Simon’s minimax design.
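The minimax figures quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the number of first-stage responses is binomial with success probability S₀(1); the helper names are illustrative.

```python
import math

def binom_cdf(r, n, p):
    """P(X <= r) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(r + 1))

def simon_operating_chars(n1, r1, n, p0, theta, t_c):
    """PET, ESS0 and ETSL0 for Simon's design without interim accrual."""
    pet = binom_cdf(r1, n1, p0)          # stop early when X1 <= r1
    n2 = n - n1
    ess0 = n1 + (1 - pet) * n2
    t1, t2 = n1 / theta, n2 / theta      # accrual time of each stage
    etsl0 = (t1 + t_c) + (1 - pet) * (t2 + t_c)
    return pet, ess0, etsl0

# Minimax design (n1, r1, n, r) = (43, 14, 72, 30), 24 patients per year, t_c = 1.
pet, ess0, etsl0 = simon_operating_chars(43, 14, 72, 0.35, 24, 1.0)
print(f"PET={pet:.4f}, ESS0={ess0:.1f}, ETSL0={etsl0:.1f}")
```

The second-stage critical value r does not enter these quantities, since ESS₀ and ETSL₀ depend only on the probability of stopping after the first stage.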

Table 1 The resectable pancreatic cancer clinical trial with S₀(t_c=1)=35%, and S₁(t_c=1)=50% to attain 90% power at the significance level of 10%

When Simon’s two-stage design allows interim accrual at the end of the first stage, the expected sample size under the null hypothesis is calculated as

$${ESS}_{0}=n_{1}+\theta t_{c} +(1-PET) (n_{2}-\theta t_{c}),$$

and the expected total study length under the null hypothesis is

$${\begin{aligned} {ETSL}_{0}&=(t_{1}+t_{c})+(1-PET) \left[(t_{2}-t_{c})+t_{c}\right]\\&\quad=(t_{1}+t_{c})+(1-PET) t_{2} \end{aligned}} $$

The results of Simon’s two-stage designs with interim accrual are presented in Table 1. As compared to the traditional Simon’s two-stage design without interim accrual, the modified design with interim accrual requires a shorter ETSL₀ but a larger ESS₀.
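The two interim-accrual formulas can be evaluated in the same way. The sketch below plugs in the minimax design and the PET value quoted earlier (43.65%); it assumes n₂ ≥ θt_c, so that the patients accrued during the waiting period do not exceed the second-stage sample size. The function name is illustrative.

```python
def simon_interim_accrual(n1, n, pet, theta, t_c):
    """ESS0 and ETSL0 when accrual continues during the first-stage follow-up."""
    n2 = n - n1
    t1, t2 = n1 / theta, n2 / theta
    ess0 = n1 + theta * t_c + (1 - pet) * (n2 - theta * t_c)
    etsl0 = (t1 + t_c) + (1 - pet) * t2
    return ess0, etsl0

# Minimax design (n1, n) = (43, 72), theta = 24 per year, t_c = 1, PET = 43.65%.
ess0_ia, etsl0_ia = simon_interim_accrual(43, 72, 0.4365, 24, 1.0)

# Same design without interim accrual, for comparison.
ess0_std = 43 + (1 - 0.4365) * 29
etsl0_std = (43 / 24 + 1.0) + (1 - 0.4365) * (29 / 24 + 1.0)
print(round(ess0_ia, 1), round(etsl0_ia, 2), round(ess0_std, 1), round(etsl0_std, 2))
```

The comparison makes the trade-off explicit: the θt_c patients enrolled while waiting are counted even if the trial stops, which raises ESS₀, while the second stage no longer pays the waiting time twice, which lowers ETSL₀.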

Two-stage optimal designs with survival endpoint when the follow-up time is limited

In a two-stage design with sample sizes of n₁ in the first stage and n₂ in the second stage, the maximum possible sample size in the study is n=n₁+n₂. Given the patient accrual rate of θ, the accrual time for the first stage is t₁=n₁/θ. When the trial goes to the second stage, the total accrual time of the study is t_a=n/θ, and the total study time for all patients to complete the study is t=t_a+t_c.

We assume that patients are uniformly enrolled in the study, with the entering times of τ₁,τ₂,⋯,τ_n. They have the survival times of T₁,T₂,⋯,T_n and the censoring times of C₁,C₂,⋯,C_n. At the end of the first stage t₁, the observed time for the i-th patient is the smallest of the following three measurements: (1) event time; (2) censoring time; and (3) time that this patient is followed so far in the study, specifically,

$$O_{i}=\min(T_{i}, C_{i}, \max(0,t_{1}-\tau_{i})).$$
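The definition of O_i can be made concrete with a short helper. This is only a sketch with made-up patient values; the interim analysis time of 1.75 years and the entry and event times are hypothetical.

```python
def observed_time(event_time, censor_time, entry_time, analysis_time):
    """O_i = min(T_i, C_i, max(0, t_1 - tau_i)) at the analysis time t_1."""
    followed_so_far = max(0.0, analysis_time - entry_time)
    return min(event_time, censor_time, followed_so_far)

t1 = 1.75   # hypothetical interim analysis time (years)
t_c = 1.0   # restricted follow-up, used as the censoring time C_i

print(observed_time(2.0, t_c, 0.5, t1))   # entered early: capped by t_c
print(observed_time(2.0, t_c, 1.5, t1))   # entered late: followed for 0.25 years only
print(observed_time(0.1, t_c, 1.5, t1))   # event observed before the analysis
print(observed_time(2.0, t_c, 2.0, t1))   # not yet enrolled at t1: contributes 0
```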

By using the observed time and the censoring information of the first n₁ patients, the one-sample log-rank test can be calculated as

$$Z_{1}=\frac{W_{1}}{\hat\sigma_{1}},$$

where W₁ is a function of the difference between the observed number of events and the expected number of events, and $\hat \sigma _{1}$ is its standard deviation estimate. The detailed formulas for Z₁ under the null hypothesis and the alternative hypothesis are given in the Appendix.

The null hypothesis is rejected when a small test statistic is observed. Suppose the critical value for Z₁ is c₁. When the calculated Z₁ is larger than or equal to c₁, the trial is stopped for futility and no further investigation is warranted. Otherwise, the trial goes to the second stage with additional n₂=n−n₁ patients treated by the new treatment. At the end of study when all n patients complete the study, the one-sample log-rank test is calculated as

$$Z=\frac{W}{\hat\sigma}.$$

It can be seen that Z₁ and Z are not independent from each other since the data of the first n₁ patients is used in both Z₁ and Z. The type I error (TIE) rate of the study is calculated as

$$TIE=P(Z_{1}\leq c_{1}, Z\leq c | H_{0}),$$

where c is the critical value for Z.

Following Kwak and Jung [15], the joint distribution of (Z₁,Z) is a bivariate normal distribution asymptotically. Then, the TIE can be specifically written as

$$ TIE=\int_{-\infty}^{c} \phi(t) \Phi\left(\frac{c_{1}-\rho_{0} t}{\sqrt{1-\rho_{0}^{2}}}\right) d t, $$

(3)

where ϕ and Φ are the probability density function and the cumulative distribution function of the standard normal distribution, and ρ₀ is the correlation coefficient estimate between Z₁ and Z under the null hypothesis, see Appendix for the detailed formula for ρ₀. The actual power of the study can be computed similarly with ρ₀ being replaced by the ρ estimate under the alternative hypothesis.
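Eq. (3) is a one-dimensional integral and can be evaluated numerically. The sketch below uses composite Simpson's rule with the standard library only; the truncation point of the lower limit, the number of steps, and the example inputs (c₁=0.5, c=−1.0, ρ₀=0.6) are assumptions chosen for illustration, not values from the article.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def joint_prob(c1, c, rho, lower=-8.0, steps=4000):
    """P(Z1 <= c1, Z <= c) for a standard bivariate normal with correlation rho.

    Integrates phi(t) * Phi((c1 - rho * t) / sqrt(1 - rho^2)) over t from
    `lower` (standing in for minus infinity) to c with Simpson's rule.
    """
    if abs(rho) >= 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if c <= lower:
        return 0.0
    h = (c - lower) / steps          # steps must be even for Simpson's rule
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(t):
        return std_normal_pdf(t) * std_normal_cdf((c1 - rho * t) / scale)

    total = integrand(lower) + integrand(c)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0

# Hypothetical critical values and correlation, for illustration only.
print(joint_prob(c1=0.5, c=-1.0, rho=0.6))
```

Two sanity checks are useful when adapting such a routine: with ρ=0 the result factorizes into Φ(c₁)Φ(c), and with c₁=c=0 it equals 1/4 + arcsin(ρ)/(2π).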

Optimal design search

Similar to the search for Simon’s two-stage design, the two-stage optimal design with survival endpoint has to be searched over all the possible sample sizes (n₁ and n) and critical values (c₁ and c), given the design parameters (α,β,t_c,S₀(t_c),S₁(t_c),θ).

Although the exact variances of Z₁ and Z are available for use in sample size determination, the exact joint distribution of Z₁ and Z is not that straightforward. For this reason, we utilize the limiting distribution of (Z₁,Z) in searching for the two-stage optimal design for a study with the design parameters (α,β,t_c,S₀(t_c),S₁(t_c),θ), then use a simulation study to calculate the actual TIE and power of the optimal design. The following three steps are used to search for the two-stage minimax and optimal designs.

Step 1: Given the total sample size n, the range of the first stage sample size n₁ is from 1 to n−1. The critical value c₁ from -0.3 to 1.6 with an increment of 0.005 is used in the design search. Similar to Kwak and Jung [15], the range of c₁ is chosen based on the simulation studies for all the configurations studied in this article. The range of c₁ is modifiable in the software program for design search.

For each combination of n₁ and c₁, the critical value c can be determined as the largest c value such that TIE(c)≤α from Eq. (3). Power of the study is then computed by using Eq. (4) in Appendix. If power is above the nominal level, this set of sample sizes and critical values, (n₁,c₁,n,c), is saved as a candidate for the optimal two-stage design. Among all the sets satisfying the power requirement, the one with the smallest ESS₀ is the optimal two-stage design when the total sample size is n, and it is denoted as B(n)=(n₁,c₁,n,c) whose expected sample size is ESS₀(n).

Step 2: The design search starts with a relatively small n (e.g., 5) and increases n by 1 at a time; B(n) could be an empty set when n is small. The two-stage minimax design is the one with the smallest n, denoted n_minimax, such that B(n) is not empty. The optimal two-stage design is the one with the smallest ESS₀. The search may be stopped at n_u when its ESS₀(n_u) is at least 10% more than the smallest ESS₀ from the identified optimal designs with n from n_minimax to n_u: ESS₀(n_u)≥110%× min{ESS₀(n):n_minimax≤n≤n_u}.

Step 3: Once the minimax and optimal two-stage designs are identified from Step 1 and Step 2, we use a simulation study to calculate the actual TIE and power based on 100,000 simulations. We find that the actual TIE of the optimal design B(n)=(n₁,c₁,n,c) is always guaranteed, while power may not be preserved in some cases. If the simulated power of each of the two designs meets the nominal level, they are the final two-stage minimax and optimal designs. Otherwise, we repeat Step 1 and Step 2 with the nominal power level increased by 1%, that is, with (α,β−1%). This process is stopped when both minimax and optimal two-stage designs meet the power requirement.
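Parts of Step 1 and Step 2 can be sketched in code. The block below is not the authors' R program: it shows, under the bivariate normal approximation of Eq. (3), how the largest c with TIE(c)≤α might be found by bisection for a given c₁, how the c₁ grid is laid out, and how the 110% stopping rule can be checked. The inputs (c₁=0.5, ρ₀=0.6, the ESS₀ sequence) are hypothetical, and the variance and power calculations needed for a full search are omitted.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def joint_prob(c1, c, rho, lower=-8.0, steps=2000):
    """P(Z1 <= c1, Z <= c) for a standard bivariate normal (Simpson's rule)."""
    if c <= lower:
        return 0.0
    h = (c - lower) / steps
    scale = math.sqrt(1.0 - rho * rho)
    f = lambda t: std_normal_pdf(t) * std_normal_cdf((c1 - rho * t) / scale)
    total = f(lower) + f(c)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3.0

def largest_c(c1, rho0, alpha, lo=-8.0, hi=8.0, tol=1e-8):
    """Largest c in [lo, hi] with joint_prob(c1, c, rho0) <= alpha.

    Bisection is valid because the probability is increasing in c.
    """
    if joint_prob(c1, hi, rho0) <= alpha:
        return hi  # the constraint does not bind on this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if joint_prob(c1, mid, rho0) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo

def c1_grid(start=-0.3, stop=1.6, step=0.005):
    """First-stage critical values searched in Step 1."""
    m = round((stop - start) / step)
    return [round(start + i * step, 3) for i in range(m + 1)]

def search_may_stop(ess0_by_n):
    """Step 2 rule: the latest ESS0 is at least 110% of the smallest so far."""
    return ess0_by_n[-1] >= 1.10 * min(ess0_by_n)

c = largest_c(c1=0.5, rho0=0.6, alpha=0.05)   # hypothetical inputs
print(round(c, 3), len(c1_grid()), search_may_stop([30.0, 28.0, 27.0, 29.8]))
```

In a full search, `largest_c` would be called once per (n₁, c₁) pair, with ρ₀ recomputed from the null variances for each n₁, before power is checked against the nominal level.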

Results

We first compare the proposed two-stage minimax and optimal designs with survival endpoint when the follow-up time is restricted, with the designs developed by Belin et al. [16] (referred to as Belin’s design). They developed a two-stage optimal design as a modification of the design by Kwak and Jung [15] by adding restricted follow-up in the study design [9]. In Belin’s design, power of the study is computed at the average of the cumulative hazard functions under the null and the alternative, which lies closer to the null than the cumulative hazard function under the alternative, the value at which the actual power should be computed. This leads to a decreased effect size in sample size calculation; thus, the computed sample size may be over-estimated. As a result of the over-estimated sample size, the actual power is often above the nominal level.

Table 2 shows the comparison between the proposed designs and Belin’s design when the survival distribution follows an exponential distribution. Belin et al. [16] investigated the performance of two-stage optimal designs with restricted follow-up under exponential distributions only (the shape parameter k=1 in the Weibull distribution). The clinically meaningful follow-up time t_c is assumed to be 1 year. Under the null hypothesis, the survival rate at t_c=1 is S₀(t_c)=50% (λ₀=1.44) as studied in Table 2. The hazard ratio is assumed to be 0.5, that is, Δ=λ₀/λ₁=0.5. Then the scale parameter under the alternative is λ₁=2.88. The nominal power level is set as either 90% or 95%. The accrual rate θ is 15, 30, or 50. The ESS₀ of the proposed minimax or optimal designs is often less than that of Belin’s design, which may be due to the fact that power of Belin’s design is computed outside the alternative hypothesis space. The simulated TIE and power of the developed two-stage minimax and optimal designs are shown in Table 3. In Table 3, we also report the 95% confidence interval for the TIE and power based on 1,000 simulated TIE and power values, where each simulated TIE and power value is computed using 10,000 simulations. It can be seen that the proposed designs control the TIE and preserve power.

Table 2 Comparison between the proposed two-stage minimax and optimal designs with survival endpoint and Belin’s two-stage optimal design with survival endpoint, when the follow-up time is restricted to the clinically meaningful follow-up time t_c=1 year

Table 3 Simulated TIE and power of the proposed two-stage minimax and optimal designs in Table 2

We further compare the proposed two-stage minimax and optimal designs with survival endpoint with Simon’s two-stage designs, with or without interim accrual, for a trial with binary endpoint. Table 4 shows the comparison when the survival distribution follows the Weibull distribution with a common shape parameter of k=0.5. The significance level is set as 5%, and the nominal power level is 80%. Null survival probabilities at the clinically meaningful follow-up time t_c=1 of S₀(t_c)=10% and 60% are studied in Table 4. We consider medium to large effect sizes of S₁(t_c)−S₀(t_c)= 10%, 15%, and 20%. For each configuration of S₀(t_c) and S₁(t_c), the scale parameters λ₀ and λ₁ in the Weibull distribution are calculated, and the ESS₀ and ETSL₀ of the proposed minimax design and Simon’s minimax design are computed. The patient accrual rate θ is calculated by assuming a 3 year study when Simon’s two-stage minimax design is used. In the table, percentage (%) is the ESS₀ or ETSL₀ percentage saving of the proposed two-stage design with survival endpoint as compared to Simon’s two-stage design, which is computed as (Simon-New)/Simon. When the percentage saving is positive, the new design requires a smaller ESS₀ or a shorter ETSL₀ than the existing Simon’s design. When the null survival probability S₀(t_c) is low, say 10%, the proposed two-stage design with survival endpoint saves sample size as compared to Simon’s two-stage minimax design. This trend is reversed when S₀(t_c)=60%. In Table 4, we also present the results of Simon’s two-stage minimax design with interim accrual. It can be seen that the new design always requires a smaller ESS₀ than Simon’s design with interim accrual. The new design always has a shorter ETSL₀ than Simon’s design with or without interim accrual. The saving becomes smaller as the null survival probability goes up from 10% to 60%. Similar results are observed in Table 5 for the two-stage optimal designs.

Table 4 Comparison between the proposed two-stage minimax design with survival endpoint and Simon’s two-stage minimax design with binary endpoint with or without interim accrual, when α=5%, β=20%, and the shape parameter k=0.5 in the Weibull distribution

Table 5 Comparison between the proposed two-stage optimal design with survival endpoint and Simon’s two-stage optimal design with binary endpoint with or without interim accrual, when α=5%, β=20%, and the shape parameter k=0.5 in the Weibull distribution

We further compare the new two-stage minimax design with Simon’s two-stage minimax design with the shape parameter k from 0.25 to 2 in Fig. 1 for a trial to attain 90% power at the significance level of 5%. When S₀(t_c) is low, the new design needs a smaller expected sample size than Simon’s minimax design, and this trend is reversed when S₀(t_c) is high, e.g., 40% and 75%. The saving of the new design often decreases as k goes up. The new design always requires a shorter expected total study length than Simon’s minimax design. Similar results are observed in Fig. 2 where the new two-stage optimal design is compared with Simon’s optimal design. We also compare the new design with Simon’s two-stage minimax and optimal designs with interim accrual in Fig. 3 and Fig. 4, respectively. The results indicate that the new design performs better than Simon’s design with interim accrual with regard to both ESS₀ and ETSL₀.

Examples

We revisit the cancer trial discussed by Case and Morgan [9] in “Simon’s two-stage designs with binary endpoint” subsection to investigate the effectiveness of a combination of Gemcitabine and external beam radiation for patients with resectable pancreatic cancer. The clinically meaningful follow-up time is assumed to be 1 year, t_c=1. The survival probabilities under the null and the alternative are S₀(1)=35% and S₁(1)=50%, respectively. The survival function is assumed to follow an exponential distribution. This trial is designed to attain 90% power at the significance level of 10%. We compute the detailed two-stage designs with survival endpoint, including sample sizes and critical values for each stage, in Table 1. The ESS₀ of the new design is slightly larger than that of Simon’s design, but much smaller than that of Simon’s design with interim accrual. The ETSL₀ of the new design is always shorter than that of Simon’s designs with or without interim accrual, and the study time saving is substantial.

We also consider a second clinical trial evaluating the activity of a combination of irinotecan and cisplatin for patients with refractory or recurrent non-small cell lung cancer [20]. The response rates are 10% and 25% under the null and the alternative hypotheses. Suppose the clinically meaningful follow-up time is 1 year. For Simon’s two-stage optimal design when α=5% and β=20%, the maximum possible sample size is n=43 and the expected sample size under the null hypothesis is ESS₀=24.7, see Table 5 for the case with S₀(t_c)=10% and S₁(t_c)=25%. The proposed new two-stage optimal design with survival endpoint needs a slightly smaller ESS₀ of 24.0, and can shorten the expected total study length by almost 1 year (2.2 versus 3.1 years from Simon’s design). A 95% two-sided confidence interval of the response rate was reported in the original research article by Takiguchi et al. [20]. The hypothesis is one sided in both Simon’s design and the proposed design. Therefore, a 90% two-sided confidence interval for the response rate or the survival rate should be reported when α=5%.
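As a quick check, applying the percentage-saving definition from the Results, (Simon-New)/Simon, to the figures quoted for this trial gives approximately (values rounded):

$$\frac{3.1-2.2}{3.1}\approx 29\% \ \ \text{for} \ \ {ETSL}_{0}, \qquad \frac{24.7-24.0}{24.7}\approx 2.8\% \ \ \text{for} \ \ {ESS}_{0}.$$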

Discussion

In the design search process, we search for the minimax and optimal designs such that both designs have power above the nominal level. In practice, when only one type of design is of interest (e.g., the two-stage minimax design), we suggest searching for the design such that the power of that particular design is above the nominal level. The R program, which computes the designs so that both the minimax design and the optimal design meet the nominal power level, is available upon request from the first author.

Conclusions

The commonly used Simon’s two-stage design has to suspend enrollment temporarily after n₁ patients are enrolled in the first stage [5, 11, 21–28]. The research team has to wait for a period of t_c until all n₁ patients complete the study. The calculated test statistic from the first stage is then compared to the pre-determined critical value to make a go or no-go decision on the second stage. In contrast, the proposed two-stage designs with survival endpoint do not have to suspend the trial, so a comparison between the proposed design and Simon’s two-stage design without interim accrual is not entirely appropriate. Due to the popularity of Simon’s two-stage design, we include this design as a reference. Simon’s two-stage design with interim accrual is a reasonable competitor for the proposed two-stage design with survival endpoint.

Appendix

Test statistics Z₁ and Z

At the end of the first stage t₁, the observed time for the i-th patient is O_i= min(T_i,C_i, max(0,t₁−τ_i)), where C_i=t_c with restricted follow-up, and i=1,2,⋯,n₁. Let N_i(t)=I(T_i≤ min(C_i, max(0,t−τ_i)))I(T_i≤t) and Y_i(t)=I(T_i≥t,T_i≥t_c) be the event process and the at-risk process, respectively. The one-sample log-rank test at the end of the first stage is expressed as:

$$Z_{1}=\frac{O-E}{\sqrt{E}}, $$

where $O=\sum _{i=1}^{n} \int _{0}^{\infty } d N_{i}(t)$ and $E=\sum _{i=1}^{n} \int _{0}^{\infty } Y_{i} (t) d \Lambda _{0}(t)$ are the observed number of events and the expected number of events, respectively. The one-sample log-rank test can be alternatively written as

$$Z_{1}=\frac{W_{1}}{\hat\sigma_{1}}, $$

where $W_{1}=(O-E)/\sqrt {n}$ and $\hat \sigma _{1}^{2}=E/n$ is the variance estimate of W₁. The one-sample log-rank test Z at the end of the study can be derived similarly by replacing N_i(t) with N_i(t)=I(T_i≤C_i)I(T_i≤t).
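The statistic (O−E)/√E is straightforward to compute from observed times and event indicators. The sketch below assumes a Weibull null with cumulative hazard Λ₀(t)=(t/λ₀)^k, and it assumes the at-risk indicator equals 1 while a patient is still under observation, so that each patient's contribution to E reduces to Λ₀ evaluated at the observed time. The data are made up for illustration.

```python
import math

def weibull_cum_hazard(t, lam, k=1.0):
    """Cumulative hazard of the Weibull survival function exp(-(t / lam) ** k)."""
    return (t / lam) ** k

def one_sample_logrank(times, events, lam0, k=1.0):
    """Return (O, E, Z) with Z = (O - E) / sqrt(E) under the null Weibull(lam0, k).

    times  : observed time O_i of each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    observed = sum(events)
    expected = sum(weibull_cum_hazard(t, lam0, k) for t in times)
    return observed, expected, (observed - expected) / math.sqrt(expected)

# Four hypothetical patients, exponential null with lambda0 = 1.
times = [0.5, 1.0, 0.25, 1.0]
events = [1, 0, 1, 0]
o, e, z = one_sample_logrank(times, events, lam0=1.0, k=1.0)
print(o, e, round(z, 4))
```

A negative Z means fewer events were observed than the null survival curve predicts, which is the direction in which the null hypothesis is rejected.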

Mean and variance estimates of W₁ and W under the null hypothesis

The mean of W₁ or W under the null hypothesis is 0. The clinically meaningful follow-up time t_c is the upper bound follow-up time for each patient, then the censoring distribution is G(t)=I(t≤t_c). The censoring distribution for the first stage is G₁(t)=U(0,t₁)I(t≤t_c) due to a possible short follow-up time at the data analysis time t₁. Then, the variances of W₁ and W are estimated as

$${\begin{aligned} \sigma_{01}^2=Var(W_{1})=-\int_{0}^{t_{c}} G_{1}(t)d S_{0}(t)\ \text{and} \\ \ \sigma_{02}^2=Var(W)=-\int_{0}^{t_{c}} G(t)d S_{0}(t). \end{aligned}} $$

It follows that the correlation between W₁ and W under H₀ is ρ₀=σ₀₁/σ₀₂. The TIE in Eq. (3) can then be computed once the correlation coefficient ρ₀ has been estimated.
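For the full-study variance the integral has a closed form. Reading G(t)=I(t≤t_c) as equal to 1 over the whole integration range and using S₀(0)=1,

$$\sigma_{02}^2=-\int_{0}^{t_{c}} d S_{0}(t)=S_{0}(0)-S_{0}(t_{c})=1-S_{0}(t_{c}).$$

For instance, in the pancreatic cancer example with S₀(1)=35%, this would give $\sigma_{02}^2=0.65$.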

Mean and variance estimates of W₁ and W under the alternative hypothesis

Under the alternative hypothesis, the mean values of W₁ and W are

$$E(W_{1})=\sqrt{n_{1}} \omega_{1}\ \text{and} \ \ E(W)=\sqrt{n} \omega $$

where ω=p₁−p₀, $p_{1}=\int _{0}^{t_{c}} G(t)S_{1}(t) d \Lambda _{1}(t)$, $p_{0}=\int _{0}^{t_{c}} G(t)S_{1}(t) d \Lambda _{0}(t)$, and $\omega_{1}=p_{1f}-p_{0f}$, $p_{1f}=\int _{0}^{t_{c}} G_{1}(t)S_{1}(t) d \Lambda _{1}(t)$, $p_{0f}=\int _{0}^{t_{c}} G_{1}(t)S_{1}(t) d \Lambda _{0}(t)$. Recently, Wu [17] derived the exact variance of W under the alternative hypothesis as

$$\sigma_{12}^2=Var(W)=p_{1}-p_{1}^2-p_{0}^2+2p_{0} p_{1} +2 p_{00}-2 p_{01}, $$

where $p_{00}=\int _{0}^{t_{c}} G(t)S_{1}(t) \Lambda _{0}(t) d \Lambda _{0}(t)$ and $p_{01}=\int _{0}^{t_c} G(t)S_{1}(t) \Lambda _{0}(t) d \Lambda _{1}(t)$. The exact variance of W₁, $\sigma _{11}^2=Var(W_{1})$, can be derived similarly. It follows that the correlation between W₁ and W under H₁ is ρ₁=σ₁₁/σ₁₂, and power of a two-stage design is

$$ Power=\int_{-\infty}^{\tilde{c}} \phi(t) \Phi\left(\frac{\tilde{c_{1}}-\rho_{1} t}{\sqrt{1-\rho_{1}^{2}}}\right) d t, $$

(4)

where $\tilde {c_{1}}=\frac {\sigma _{01}}{\sigma _{11}}\left (c_{1}-\frac {\omega _{1} \sqrt {n_{1}}}{\sigma _{01}}\right)$, and $\tilde {c}=\frac {\sigma _{02}}{\sigma _{12}}\left (c-\frac {\omega \sqrt {n}}{\sigma _{02}}\right)$, so that the shifts match the means $E(W_{1})=\sqrt{n_{1}} \omega_{1}$ and $E(W)=\sqrt{n} \omega$ given above.
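Eq. (4) has the same form as Eq. (3) once the critical values are shifted and rescaled. The following is a minimal sketch of that calculation, taking the full-sample shift as ω√n in line with E(W)=√n ω; all numerical inputs are hypothetical placeholders rather than values derived from a real design, and the integration routine is an illustrative Simpson's rule, not the authors' implementation.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def joint_prob(c1, c, rho, lower=-8.0, steps=2000):
    """P(Z1 <= c1, Z <= c) for a standard bivariate normal (Simpson's rule)."""
    if c <= lower:
        return 0.0
    h = (c - lower) / steps
    scale = math.sqrt(1.0 - rho * rho)
    f = lambda t: std_normal_pdf(t) * std_normal_cdf((c1 - rho * t) / scale)
    total = f(lower) + f(c)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3.0

def power_two_stage(c1, c, n1, n, omega1, omega, s01, s11, s02, s12):
    """Power from Eq. (4): s0x are null and s1x alternative standard deviations."""
    rho1 = s11 / s12
    c1_tilde = (s01 / s11) * (c1 - omega1 * math.sqrt(n1) / s01)
    c_tilde = (s02 / s12) * (c - omega * math.sqrt(n) / s02)
    return joint_prob(c1_tilde, c_tilde, rho1)

# Hypothetical inputs: a negative omega means fewer events than expected under H0.
pw = power_two_stage(c1=0.5, c=-1.5, n1=20, n=40,
                     omega1=-0.10, omega=-0.20,
                     s01=0.5, s11=0.5, s02=0.8, s12=0.8)
print(round(pw, 3))
```

When ω₁=ω=0 and the alternative variances equal the null variances, the function returns the same value as Eq. (3), which is a convenient consistency check.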

Abbreviations

ESS: Expected sample size
ETSL: Expected total study length
PET: Probability of early termination
TIE: Type I error

References

Simon R. Optimal two-stage designs for phase II clinical trials. Control Clin Trials. 1989; 10(1):1–10.
Fleming TR. One-sample multiple testing procedure for phase II clinical trials. Biometrics. 1982; 38(1):143–51.
Shan G, Wilding GE, Hutson AD, Gerstenberger S. Optimal adaptive two-stage designs for early phase II clinical trials. Stat Med. 2016; 35(8):1257–66. https://doi.org/10.1002/sim.6794.
Shan G, Zhang H, Jiang T. Minimax and admissible adaptive two-stage designs in phase II clinical trials. BMC Med Res Methodol. 2016; 16(1):90. https://doi.org/10.1186/s12874-016-0194-3.
Shan G. Exact confidence limits for the response rate in two-stage designs with over- or under-enrollment in the second stage. Stat Methods Med Res. 2018; 27(4):1045–55.
Shan G, Hutson AD, Wilding GE. Two-stage k-sample designs for the ordered alternative problem. Pharmaceut Statist. 2012; 11(4):287–94. https://doi.org/10.1002/pst.1499.
Wilding GE, Shan G, Hutson AD. Exact two-stage designs for phase II activity trials with rank-based endpoints. Contemp Clin Trials. 2012; 33(2):332–41. https://doi.org/10.1016/j.cct.2011.10.008.
Shan G, Wilding GE, Hutson AD. Computationally Intensive Two-Stage Designs for Clinical Trials In: Balakrishnan N, Colton T, Everitt B, Piegorsch W, Ruggeri F, Teugels JL, editors. Wiley StatsRef: Statistics Reference Online: 2017. p. 1–7. https://doi.org/10.1002/9781118445112.stat07986.
Case DD, Morgan TM. Design of Phase II cancer trials evaluating survival probabilities. BMC Med Res Methodol. 2003; 3:6. https://doi.org/10.1186/1471-2288-3-6.
Berry DA. Adaptive clinical trials: the promise and the caution. J Clin Oncol. 2011; 29(6):606–9. https://doi.org/10.1200/jco.2010.32.2685.
Shan G, Chen JJ. Optimal inference for Simon’s two-stage design with over or under enrollment at the second stage. Commun Stat Simul Comput. 2017:1–11. https://doi.org/10.1080/03610918.2017.1307398.
Shan G, Wang W. Exact one-sided confidence limits for Cohen’s kappa as a measurement of agreement. Stat Methods Med Res. 2017; 26(2):615–32. https://doi.org/10.1177/0962280214552881.
Feldman DR, Patil S, Trinos MJ, Carousso M, Ginsberg MS, Sheinfeld J, Bajorin DF, Bosl GJ, Motzer RJ. Progression-free and overall survival in patients with relapsed/refractory germ cell tumors treated with single-agent chemotherapy: endpoints for clinical trial design. Cancer. 2012; 118(4):981–6.
Lin DY, Shen L, Ying Z, Breslow NE. Group sequential designs for monitoring survival probabilities. Biometrics. 1996; 52(3):1033–41.
Kwak M, Jung S-H. Phase II clinical trials with time-to-event endpoints: optimal two-stage designs with one-sample log-rank test. Stat Med. 2014; 33(12):2004–16.
Belin L, De Rycke Y, Broët P. A two-stage design for phase II trials with time-to-event endpoint using restricted follow-up. Contemp Clin Trials Commun. 2017; 8:127–34.
Wu J. Sample size calculation for the one-sample log-rank test. Pharm Stat. 2015; 14(1):26–33.
Huang B, Talukder E, Thomas N. Optimal Two-Stage Phase II Designs with Long-Term Endpoints. Stat Biopharm Res. 2010; 2(1):51–61.
Whitehead J. One-stage and two-stage designs for phase II clinical trials with survival endpoints. Stat Med. 2014; 33(22):3830–43.
Takiguchi Y, Moriya T, Asaka-Amano Y, Kawashima T, Kurosu K, Tada Y, Nagao K, Kuriyama T. Phase II study of weekly irinotecan and cisplatin for refractory or recurrent non-small cell lung cancer. Lung Cancer (Amst, Neth). 2007; 58(2):253–9.
Shan G, Ma C. Unconditional tests for comparing two ordered multinomials. Stat Methods Med Res. 2016; 25(1):241–54. https://doi.org/10.1177/0962280212450957.
Zhang H, Shan G. Letter to the Editor: A novel confidence interval for a single proportion in the presence of clustered binary outcome data (SMMR, 2019). Stat Methods Med Res. 2019. https://doi.org/10.1177/0962280219840056.
Shan G, Kang L, Xiao M, Zhang H, Jiang T. Accurate unconditional p-values for a two-arm study with binary endpoints. J Stat Comput Simul. 2018; 88(6):1200–10.
Shan G, Zhang H, Jiang T. Efficient confidence limits for adaptive one-arm two-stage clinical trials with binary endpoints. BMC Med Res Methodol. 2017; 17(1):22. https://doi.org/10.1186/s12874-017-0297-5.
Shan G, Banks S, Miller JB, Ritter A, Bernick C, Lombardo J, Cummings JL. Statistical advances in clinical trials and clinical research. Alzheim Dement (NY). 2018; 4:366–71.
Shan G. Exact confidence limits for the probability of response in two-stage designs. Statistics. 2018; 52(5):1086–95. https://doi.org/10.1080/02331888.2018.1469023.
Shan G. Exact Statistical Inference for Categorical Data, 1st edn. San Diego: Academic Press; 2015. http://www.worldcat.org/isbn/0081006810.
Wilding GE, Consiglio JD, Shan G. Exact approaches for testing hypotheses based on the intra-class kappa coefficient. Stat Med. 2014; 33(17):2998–3012. https://doi.org/10.1002/sim.6135.

Acknowledgment

We thank Dr. Jianrong Wu and Dr. Lisa Belin for sharing their R code with us. We also thank the Associate Editor and two referees for their valuable comments and suggestions, which helped to improve this manuscript.

Funding

Shan’s research is partially supported by grants from the National Institute of General Medical Sciences of the National Institutes of Health (P20GM109025). Zhang’s work is supported by the Zhejiang Provincial Natural Science Foundation of China (grant no. LY19F020003) and the National Natural Science Foundation of China (grant no. 61672459).

Availability of data and materials

Not applicable. This manuscript develops novel statistical approaches; therefore, no real data are involved.

Author information

Guogen Shan and Hua Zhang contributed equally to this work and are considered co-first authors.

Authors and Affiliations

Epidemiology and Biostatistics Program, Department of Environmental and Occupational Health, School of Community Health Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
Guogen Shan
School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou, Zhejiang, China
Hua Zhang

Authors

Guogen Shan
Hua Zhang

Contributions

The idea for the paper was originally developed by GS. GS and HZ computed the required sample size for a two-stage design with a survival endpoint. GS and HZ drafted the manuscript and approved the final version.

Corresponding authors

Correspondence to Guogen Shan or Hua Zhang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Shan, G., Zhang, H. Two-stage optimal designs with survival endpoint when the follow-up time is restricted. BMC Med Res Methodol 19, 74 (2019). https://doi.org/10.1186/s12874-019-0696-x

Received: 26 July 2018
Accepted: 26 February 2019
Published: 03 April 2019
DOI: https://doi.org/10.1186/s12874-019-0696-x
