This item is a good antidote to the recent item on the precognition claims made by scientists. It suggests that the theoretical methodology was not tight enough, and this is actually far more common than even scientists want to suppose. Bias does creep in, and the data sets are always far too small for comfort.
The obvious problem is separation. This type of test shows only a small separation between the observed results and chance, and that means a little bit of data error can knock the results all over the place.
It is not decisive in the way a simple test I conducted to show the efficacy of a product in promoting burn healing was. In that case all treated burns sitting next to untreated burns showed major improvement. There were no overlapping signals at all, and any such overlap would have sent me back to the drawing board.
Thus a lack of decisive separation means generally that one must increase the sample size by an order of magnitude or two.
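To put rough numbers on that, here is a minimal sketch of how the required number of trials grows as the separation from chance shrinks. It uses the standard normal approximation for a one-sided test of a hit rate against 50-50 chance. The function name `required_trials`, the hit rates in the loop, and the choices of alpha = 0.05 and power = 0.8 are my own illustrative assumptions, not figures from the paper.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_trials(hit_rate, chance=0.5, alpha=0.05, power=0.8):
    """Approximate trials needed to detect hit_rate vs chance.

    One-sided test, normal approximation to the binomial.
    All default values are illustrative assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # significance cutoff
    z_beta = NormalDist().inv_cdf(power)       # desired power
    spread = (z_alpha * sqrt(chance * (1 - chance))
              + z_beta * sqrt(hit_rate * (1 - hit_rate)))
    return ceil((spread / (hit_rate - chance)) ** 2)


# Hypothetical hit rates: a decisive effect, a modest one, a tiny one.
for rate in (0.75, 0.55, 0.51):
    print(f"hit rate {rate:.2f}: about {required_trials(rate)} trials")
```

Because the separation appears squared in the denominator, cutting it to a fifth multiplies the required sample by roughly twenty-five, which is where the order of magnitude or two comes from.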
At least the paper is suggestive and supports an effort to fund a much larger study. It also provides a protocol that could actually winkle out scientific bias in this work. Trying to predict bias as part of the experimental design must be helpful.
Precognition experiments show that academic standards of evidence are too low
NOVEMBER 22, 2010
Re-examining the statistical methods used in the studies of precognition. Conclusions should not be drawn by running a series of 50-50 tests and then sifting the data for anomalies. That might be the first step, but then you have to run new tests and analysis to confirm the anomalies.
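A small simulation can illustrate why sifting is risky. The sketch below generates pure 50-50 data with no effect at all, then compares a single planned test against a study that looks at ten slices of data and reports any slice that passes. Everything here is an illustrative assumption: the function names, the ten slices, the 100 trials per slice, and the one-sided z cutoff of 1.645. It is not a reconstruction of the procedure used in any of the experiments.

```python
import random
from math import sqrt


def looks_significant(hits, trials, z_cutoff=1.645):
    """One-sided z-test of a hit count against 50% chance."""
    z = (hits - 0.5 * trials) / sqrt(0.25 * trials)
    return z > z_cutoff


def false_alarm_rate(subgroups, studies=2000, trials=100, seed=0):
    """Fraction of pure-chance studies where at least one subgroup passes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    alarms = 0
    for _ in range(studies):
        for _ in range(subgroups):
            # Simulate coin-flip guesses: there is no real effect here.
            hits = sum(rng.random() < 0.5 for _ in range(trials))
            if looks_significant(hits, trials):
                alarms += 1
                break  # one "anomaly" is enough to report the study
    return alarms / studies


print("one planned test:  ", false_alarm_rate(subgroups=1))
print("best of ten slices:", false_alarm_rate(subgroups=10))
```

With one planned test the false alarm rate stays near the nominal 5 percent, while picking the best of ten slices pushes it far higher. That is why an anomaly found by sifting needs a fresh confirming experiment.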
In eight out of nine studies, Bem reported evidence in favor of precognition. As we have argued above, this evidence may well be illusory; in several experiments it is evident that Bem’s Exploration Method should have resulted in a correction of the statistical results. Also, we have provided an alternative, Bayesian reanalysis of Bem’s experiments; this alternative analysis demonstrated that the statistical evidence was, if anything, slightly in favor of the null hypothesis. One can argue about the relative merits of classical t-tests versus Bayesian t-tests, but this is not our goal; instead, we want to point out that the two tests yield very different conclusions, something that casts doubt on the conclusiveness of the statistical findings.
Although the Bem experiments themselves do not provide evidence for precognition, they do suggest that our academic standards of evidence may currently be set at a level that is too low.
It would therefore be mistaken to interpret our assessment of the Bem experiments as an attack on research of unlikely phenomena; instead, our assessment suggests that something is deeply wrong with the way experimental psychologists design their studies and report their statistical results. It is a disturbing thought that many experimental findings, proudly and confidently reported in the literature as real, might in fact be based on statistical tests that are explorative and biased. We hope the Bem article will become a signpost for change, a writing on the wall: psychologists must change the way they analyze their data.