How to use random seed appropriately

I am writing a paper in which I will analyze 6 designs. Responses for each design can be sampled from the main dataset. My question: should I set one random seed for each sampling, or one random seed for the whole analysis and then sample the designs in order?
A) set random seed(1) and then sample design I,
set random seed(2) and then sample design II, …
B) set random seed(1) and then sample design I, design II … design VI?

I recommend setting a separate seed each time you analyze a new design.

Why? For convenience and debugging. If design IV has its own seed, you can debug it in isolation. If you have set a "global" seed, then each time you want to do something to design IV that relies on a specific random number, you first have to run through designs I-III.

In addition, this allows you to add designs IIa, IIIa and IIIb at their places without disturbing the replicability of the results you already got for design IV. With a "global" seed, you would need to add designs IIa, IIIa and IIIb at the end of your code, at the cost of legibility.
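A minimal sketch of the per-design approach, in Python. The dataset, the design labels, the seed values and the `sample_design` helper are all hypothetical placeholders, not anything from the question; the point is only that each design gets its own generator, so design IV can be reproduced on its own and is unaffected when a new design is added.

```python
import random

# Hypothetical stand-in for the main dataset of responses.
MAIN_DATASET = list(range(1000))

# Hypothetical design labels, each mapped to its own fixed seed.
DESIGN_SEEDS = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}


def sample_design(design, n=20):
    """Draw n responses for one design using that design's own seed."""
    # A separate generator per design, so no design depends on
    # how many random numbers the earlier designs consumed.
    rng = random.Random(DESIGN_SEEDS[design])
    return rng.sample(MAIN_DATASET, n)


# Design IV can be sampled directly, without running designs I-III first.
first_run = sample_design("IV")

# Adding design IIa later does not change the sample drawn for design IV.
DESIGN_SEEDS["IIa"] = 7
second_run = sample_design("IV")

print(first_run == second_run)
```

With a single global seed, the sample for design IV would instead depend on every draw made before it, which is why inserting design IIa in the middle would change it.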

Also: try your experiments with different seeds. If the conclusions change appreciably depending on the seed, something is likely broken, e.g., you may need more data to start with. Thus, if the answer to this question makes a substantive difference (as opposed to one of programming convenience), you are doing something wrong. See also "If so many people use set.seed(123) doesn't that affect randomness of world's reporting?"
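One way to check seed sensitivity is to repeat the same sampling step under several seeds and look at how much the resulting estimate moves. The sketch below assumes a made-up dataset and uses the sample mean as a hypothetical stand-in for whatever statistic the paper actually reports.

```python
import random
import statistics

# Hypothetical stand-in for the main dataset of responses.
data_rng = random.Random(0)
MAIN_DATASET = [data_rng.gauss(50, 10) for _ in range(5000)]


def estimate(seed, n=200):
    """Sample n responses with the given seed and return their mean."""
    rng = random.Random(seed)
    return statistics.mean(rng.sample(MAIN_DATASET, n))


# Repeat the same analysis under 20 different seeds.
seeds = range(1, 21)
estimates = [estimate(s) for s in seeds]

# The range of the estimates shows how much the result depends on the seed.
spread = max(estimates) - min(estimates)
print(f"min={min(estimates):.2f} max={max(estimates):.2f} spread={spread:.2f}")
```

If the spread is large relative to the effect you want to report, the sample size is probably too small for the conclusion to be trusted, whichever seeding scheme you use.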
