Overreacting to patterns generated at random -- Part 1
My colleague Pat Whitcomb passed along the book Freakonomics to me earlier this month. I read a story there about how Steven D. Levitt, the University of Chicago economist featured in the book, used statistical analysis "To Catch a Cheat": teachers who improved their students’ answers on a multiple-choice skills assessment (Iowa Test). The book provides evidence in the form of obviously repeated segments in otherwise apparently random answer patterns from presumably clueless students.
Coincidentally, the morning after I read this, Pat told me he had discovered a 'mistake' in our DX7 user guide: subplot factor C (Temp) did not appear to be displayed in random run order. The data are on page 12 of this Design-Expert software tutorial on design and analysis of split plots. They begin with 275, 250, 200, 225 and 275, 250, 200, 225 in the first two groupings. Four out of the remaining six groupings start with 275. Therefore, at first glance at this number series, I could not disagree with Pat’s contention, but upon further inspection it became clear that the numbers are not orderly. On the other hand, are they truly random? I thought not. My hunch was that the original experimenter simply ordered the numbers arbitrarily rather than using a random number generator.*
I asked Stat-Ease advisor Gary Oehlert. He says "There are 4 levels, so 4!=24 possible orders. You have done the random ordering 9 times. From these 9 you have 7 unique ones; two orders are repeated twice. The probability of no repeats is 24!/(15!*24^9). This equates to a probability value of about .18. Seven unique patterns, as seen in your case, is about the median number of unique orders."
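To make that arithmetic easy to check, here is a minimal Python sketch. It rests on assumptions of mine, not on anything in the tutorial: each grouping's order is drawn independently and uniformly from the 4! = 24 possible orders, and there are nine groupings. The function name unique_count_distribution is purely illustrative. It computes the exact chance of seeing each number of unique orders.

```python
from fractions import Fraction
from math import factorial, perm


def unique_count_distribution(n_orders, n_draws):
    """Exact P(k distinct orders) for n_draws uniform draws with replacement."""
    # dist[k] = probability that exactly k distinct orders have appeared so far
    dist = {0: Fraction(1)}
    for _ in range(n_draws):
        nxt = {}
        for k, p in dist.items():
            if k:
                # The next draw repeats one of the k orders already seen.
                nxt[k] = nxt.get(k, Fraction(0)) + p * Fraction(k, n_orders)
            if k < n_orders:
                # The next draw is one of the n_orders - k orders not yet seen.
                nxt[k + 1] = nxt.get(k + 1, Fraction(0)) + p * Fraction(n_orders - k, n_orders)
        dist = nxt
    return dist


n_orders = factorial(4)  # 24 ways to order the four temperature levels
n_draws = 9              # nine random orderings (assumed independent)

dist = unique_count_distribution(n_orders, n_draws)

# Closed form for "no repeats": 24!/(15! * 24^9)
p_no_repeats = Fraction(perm(n_orders, n_draws), n_orders ** n_draws)

print(f"P(no repeats) = {float(p_no_repeats):.4f}")
for k in sorted(dist):
    print(f"P({k} unique orders) = {float(dist[k]):.4f}")
```

Under these assumptions, no repeats at all happens only about 18% of the time, and seven or fewer unique orders turns up roughly 42% of the time, so a couple of repeated patterns is nothing out of the ordinary.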
Of course, I accept Professor Oehlert’s advice that I should not concern myself with the patterns exhibited in our suspect data. One wonders how much time mankind as a whole would save by worrying less over what really are chance occurrences.
*The National Institute of Standards and Technology (NIST) provides comprehensive guidelines on random number generation and testing, a vital aspect of cryptographic applications.