# Solved – Linear regression sample size advice

I'm involved in the sample size calculation for an oncology clinical trial. Our primary outcome is quality of life (QOL; quantitative, normal).
There are two treatment regimes, so I started with a simple two-sample t-test sample size calculation. With an effect size of 0.55, power = 0.8 and alpha = 0.05, we need 106 participants overall. Allowing for loss to follow-up and death before the QOL measurement, the accrual target rises to 152 participants.
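As a sanity check on those numbers, here is a minimal sketch of the calculation using only the Python standard library. It assumes a two-sided test and uses a normal approximation with a small-sample correction term (often attributed to Guenther), so it is not the exact noncentral-t calculation that dedicated software performs. The 30% attrition rate is an assumption, back-calculated from the 106 → 152 inflation; the post does not state it.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t-test.

    Normal approximation plus a z^2/4 correction term; an approximation,
    not the exact noncentral-t solution.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4)


def inflate_for_attrition(n_total, attrition):
    """Accrual target so that n_total evaluable patients are expected to remain."""
    return ceil(n_total / (1 - attrition))


n_total = 2 * n_per_group(0.55)                    # both arms together
n_accrual = inflate_for_attrition(n_total, 0.30)   # 0.30 is an assumed rate
print(n_total, n_accrual)
```

Under these assumptions the sketch gives the same totals as quoted above.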

The trial is multicentre, with randomization stratified by centre. Patients can have different types of tumor (NSCLC, cholangiocarcinoma, stomach, pancreatic).

The funding institution's referees asked us to (freely translating) “increase the sample size to account for the heterogeneity of the patients involved”. They did not request randomization stratified by centre × tumor type (maybe because of the likely imbalance with a small sample size), but suggested minimization as a possible option.

I'm starting to think about a linear model like the following to fulfil the request:

QOL = f(Treatment.dummy, TumorType.dummies)

Typically the effect size relies on the R^2 with and without the covariate of interest. However, given the state of the literature, I don't know how to make an educated guess about these quantities.
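For reference, the R^2-based effect size usually meant here is Cohen's f^2 for the added covariate. The subscripts "full" (model with Treatment.dummy) and "reduced" (model without it) are labels chosen for this note:

$$
f^2 = \frac{R^2_{\text{full}} - R^2_{\text{reduced}}}{1 - R^2_{\text{full}}}
$$

In the simplest case of two equal-sized arms and no other covariates, this reduces to a function of the standardized difference d alone, which gives at least one anchor point:

$$
f^2 = \frac{d^2}{4} = \frac{0.55^2}{4} \approx 0.076
$$

How much R^2 the tumor-type dummies add on top of this is exactly the quantity that is hard to guess from the literature.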

The only idea that remains (at least to me) is Monte Carlo simulation. In the end, the test would be an F test of the model with vs. without Treatment.dummy.

Given a certain n (growing between simulation steps):

• I would simulate the proportions of patients recruited with each tumor type, using information from our clinical database.
• Given the tumor type, I would simulate the proportion of patients randomized to treatment in that stratum (I'm thinking of a uniform distribution with mean 0.5, e.g. U(0.35, 0.65)).
• For patients in the control group, given tumor type, I would simulate QOL from the normative data of the instrument.
• For patients in the treatment group, given tumor type, I would simulate QOL from the normative data of the instrument plus the effect size of interest.

Then fit the regression model, run the test, and estimate the power as the proportion of simulated trials in which the test is significant.
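The simulation steps above could be sketched roughly as follows, using only the Python standard library. Every number in the constants (tumor-type proportions, control-arm means, SD) is a placeholder rather than trial or normative data, and the names `TUMOR_PROBS`, `CONTROL_MEAN`, `simulate_trial`, `treatment_significant` and `estimate_power` are made up for this illustration. The F test for Treatment.dummy is computed by centring treatment and QOL within each tumor type, which gives the same treatment coefficient as the dummy-variable regression. The critical value uses a first-order normal expansion for the t quantile rather than an exact F quantile, so treat the result as approximate.

```python
import random
from statistics import NormalDist, fmean

# Placeholder inputs: replace with clinical-database and normative values.
TUMOR_PROBS = {"nsclc": 0.40, "cholangio": 0.15, "stomach": 0.25, "pancreatic": 0.20}
CONTROL_MEAN = {"nsclc": 60.0, "cholangio": 55.0, "stomach": 58.0, "pancreatic": 50.0}
SD = 20.0            # assumed common residual SD of QOL
EFFECT_SIZE = 0.55   # standardized treatment effect of interest


def simulate_trial(n, rng):
    """Draw one trial: tumor type, treatment arm (0/1) and QOL per patient."""
    types = rng.choices(list(TUMOR_PROBS), weights=list(TUMOR_PROBS.values()), k=n)
    # Allocation probability per stratum, drawn from U(0.35, 0.65).
    p_treat = {t: rng.uniform(0.35, 0.65) for t in TUMOR_PROBS}
    rows = []
    for t in types:
        arm = 1.0 if rng.random() < p_treat[t] else 0.0
        qol = rng.gauss(CONTROL_MEAN[t] + arm * EFFECT_SIZE * SD, SD)
        rows.append((t, arm, qol))
    return rows


def treatment_significant(rows, alpha=0.05):
    """Test Treatment.dummy in QOL ~ treatment + tumor-type dummies."""
    by_type = {}
    for t, arm, qol in rows:
        by_type.setdefault(t, []).append((arm, qol))
    x, y = [], []
    for obs in by_type.values():
        # Centring within tumor type removes the tumor-type dummies.
        arm_mean = fmean([a for a, _ in obs])
        qol_mean = fmean([q for _, q in obs])
        x += [a - arm_mean for a, _ in obs]
        y += [q - qol_mean for _, q in obs]
    sxx = sum(v * v for v in x)
    df = len(rows) - len(by_type) - 1   # n minus (observed types + treatment)
    if sxx == 0 or df <= 0:
        return False                    # treatment effect not estimable
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    rss = sum((b - beta * a) ** 2 for a, b in zip(x, y))
    f_stat = beta ** 2 * sxx / (rss / df)
    # Approximate t critical value; F(1, df) critical value is its square.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    t_crit = z + (z ** 3 + z) / (4 * df)
    return f_stat > t_crit ** 2


def estimate_power(n, n_sims=2000, seed=1):
    """Share of simulated trials of size n with a significant treatment test."""
    rng = random.Random(seed)
    hits = sum(treatment_significant(simulate_trial(n, rng)) for _ in range(n_sims))
    return hits / n_sims
```

Calling `estimate_power(n)` over a grid of n (for example 80, 100, 120, ...) and taking the smallest n that reaches the target power mirrors the "growing n" loop described above. Minimization or centre effects are not modelled in this sketch.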

Another way would be to start from the t-test sample size and then increase it using a rule of thumb for model sample size (e.g. 20 patients per added variable); but I don't like it very much because you lose the link with power analysis.

Any other approach? Any suggestion (even “yes, do the Monte Carlo simulation”) would be much appreciated.
