The problem: In recent years, high-throughput methodologies and the accumulation of patients’ information and health histories in large databases have dramatically increased the number of potential discoveries scanned in a typical experiment. No such study can avoid some selection at the stage of analysis, discussion or summary: no one can digest the results of such an experiment without selecting, directly or indirectly, which inferences to highlight. Statistical inference following selection (selective inference) is thus unavoidable.
However, in medical research not aimed at drug registration, the danger to replicability posed by selective inference across multiple questions is rarely addressed. An examination of a sample of 100 papers from the leading New England Journal of Medicine (2000-2010) revealed that 79 of the 100 studies made no multiplicity adjustment at all, even though every one of them needed it in one form or another because multiple endpoints were analyzed. Even the other 21 did not address the full scope of the multiplicity present, ignoring subset analyses and the like. Unless the selection problem is addressed, the warnings of Soric and Ioannidis that ‘most research findings may be false’ become appropriate (Soric 1989, Ioannidis 2005). See Jager & Leek for a recent data-based study of the problem, and our discussion of this work.
Our approach: There is a vast gap between what is required and done in research for regulatory purposes, where multiplicity issues are strenuously addressed, and the practice of medical researchers more generally. The reason may be that the available methods, designed for regulatory purposes, are needlessly conservative. Less conservative methods, specifically tailored to medical research and the characteristics of clinical trials, should be developed.
Our solutions: As a first step towards a solution, this project is developing a weighted hierarchical procedure for controlling the false discovery rate (FDR) that treats primary and secondary endpoints differently. These issues are also discussed in the context of the European open data initiative for clinical trials.
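To give a flavour of how weights can favour primary endpoints, here is a minimal sketch of a weighted Benjamini-Hochberg step-up procedure. It is not the project's weighted hierarchical procedure, which is not specified here and whose hierarchical component is not modelled. The function name `weighted_bh`, the weight values and the p-values are all hypothetical, chosen only for illustration.

```python
def weighted_bh(pvalues, weights, q=0.05):
    """Weighted Benjamini-Hochberg step-up procedure (illustrative sketch).

    Each p-value is divided by its weight (weights are rescaled to average 1),
    then the usual BH step-up rule is applied at level q.
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvalues)
    if m == 0:
        return []
    if len(weights) != m:
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    # Rescale weights so that they average to 1.
    mean_w = sum(weights) / m
    scaled = [p / (w / mean_w) for p, w in zip(pvalues, weights)]

    # BH step-up: find the largest rank k with scaled p_(k) <= k * q / m.
    order = sorted(range(m), key=lambda i: scaled[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if scaled[i] <= rank * q / m:
            k = rank

    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical data: two primary endpoints followed by four secondary ones.
pvals = [0.020, 0.030, 0.001, 0.200, 0.500, 0.800]
# Assumed weights: primary endpoints weigh four times as much as secondary.
wts = [2.0, 2.0, 0.5, 0.5, 0.5, 0.5]

print(weighted_bh(pvals, wts))        # weighted: primaries are up-weighted
print(weighted_bh(pvals, [1.0] * 6))  # equal weights: ordinary BH
```

With these made-up numbers, ordinary BH at q = 0.05 rejects only the smallest secondary p-value, while the weighted version also rejects both primary endpoints. The extra power for the primary endpoints is paid for by a stricter threshold on the secondary ones.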