Stats Made Easy

Practical Tools for Effective Experimentation

Monday, September 29, 2008

Round and round on how to round

Tom Murphy and Peter Fortini recently published a great answer to the question of how many significant digits to use when reporting test results relative to manufacturing specifications.* All engineers (such as me) know not to round the intermediate results of a multistage calculation. Nevertheless, it's good to be reminded of this. However, I was unaware that, when rounding a test result for reporting purposes, the rounding interval should be between 0.05 and 0.5 sigma. Murphy and Fortini offer the example of a test result of 1.45729 with a standard deviation of 0.0052, which leads to a rounding of 1.457 (the nearest thousandth, because 0.001 falls between 0.05 sigma = 0.00026 and 0.5 sigma = 0.0026). That's good to know!
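
For the programmers in the audience, here is a minimal Python sketch of that rule of thumb. It assumes the rounding interval is a power of ten and picks the largest one that does not exceed 0.5 sigma. The function names are hypothetical (mine, not from the article), and the sigma of 0.0052 in the usage lines is an assumed illustrative value for which the nearest thousandth qualifies.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN

def rounding_unit(sigma):
    """Largest power of ten that does not exceed 0.5 * sigma.

    Any such power of ten is automatically above 0.05 * sigma,
    so it lands inside the 0.05-to-0.5 sigma window.
    """
    exponent = math.floor(math.log10(0.5 * sigma))
    return Decimal(1).scaleb(exponent)

def round_for_report(result, sigma):
    """Round a test result (string or number) to the unit chosen above."""
    return Decimal(str(result)).quantize(rounding_unit(sigma), rounding=ROUND_HALF_EVEN)

# Assumed illustrative inputs: result 1.45729, sigma 0.0052
unit = rounding_unit(0.0052)
reported = round_for_report("1.45729", 0.0052)
print(unit, reported)
```

Passing the result as a string keeps the digits exactly as reported, which sidesteps binary floating-point surprises before the rounding step.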

I guess that I’ve been slacking off on this rounding deal because I was also ignorant of the “five-even” rule that these two authors note as being de rigueur for “most standards for science and technology.” For example, this rule causes 98.5 to be rounded down to 98, whereas 99.5 gets rounded up to 100.
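
To see the five-even rule next to the five-up rule most of us learned in school, here is a small Python sketch using the standard decimal module. The helper names five_even and five_up are hypothetical labels of my own.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def five_even(value):
    """Round to a whole number, sending an exact half to the nearest even digit."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))

def five_up(value):
    """Round to a whole number, always sending an exact half upward."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

for reading in ("98.5", "99.5"):
    print(reading, "five-even:", five_even(reading), "five-up:", five_up(reading))
```

Python's built-in round() also follows the five-even convention, so round(98.5) gives 98 and round(99.5) gives 100.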

My informal survey of math-literate acquaintances revealed that most had learned only to round 5 and higher up (and 4 and lower down). However, my son Hank, a programmer by profession, was familiar with the five-even rule, which this Wikipedia entry on rounding says is also known as “statistician's rounding.” That makes sense because when dealing with large sets of scientific data, where trends are important, the traditional “five-up” rule biases the data upwards.
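
The upward bias is easy to demonstrate. The Python sketch below uses a made-up set of readings (0.5, 1.5, and so on up to 9.5), deliberately chosen as a worst case in which every value ends in exactly 5, and compares the average before and after rounding under each rule.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical readings, not real data: 0.5, 1.5, ..., 9.5
readings = [Decimal(n) + Decimal("0.5") for n in range(10)]

def round_all(values, mode):
    """Round every value to a whole number using the given rounding mode."""
    return [v.quantize(Decimal("1"), rounding=mode) for v in values]

def average(values):
    """Plain arithmetic mean, returned as a float."""
    return float(sum(values)) / len(values)

mean_raw = average(readings)
mean_five_up = average(round_all(readings, ROUND_HALF_UP))
mean_five_even = average(round_all(readings, ROUND_HALF_EVEN))

print("unrounded:", mean_raw)
print("five-up:  ", mean_five_up)
print("five-even:", mean_five_even)
```

With five-up every half moves the same direction, so the errors pile up. With five-even the halves alternate between going up and going down, so the errors cancel in this example.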

When I worked in R&D, I noticed that my fellow engineers seemed to be scared to death of rounding – even when reporting their results to non-technical management – marketing folks and the like. Reporting data to a dozen decimal places generally blunted their spear, whereas rounding their numbers to no more than three significant digits would have made their point a lot sharper. Isn’t that ironic?

* “Reporting Test Results, Determining Significant Digits and Rounding Properly,” ASTM Standardization News, September/October 2008 (link for article content may require subscription)

Saturday, September 20, 2008

ahRrrrggg-Squared – Talk Like a Pirate Day

Yesterday, which happened to be Talk Like a Pirate Day, I did a pro bono webinar for a crew of food science students assembled by their teacher Tyre at North Carolina State University. They are located in Raleigh – not far from where the notorious Blackbeard hung out in his heyday. Evidently he hosted some very wild parties with his bloodthirsty cohorts, as detailed at this Pirates Realm. (Their webmaster warns that copyright “thieves shall be gullied and fed to the sharks!”)

Tyre and his NC State crew concocted a punch that purportedly imitated an orange drink similar to Kool-Aid® -- a brand of artificially-flavored drink mix now owned by the Kraft Foods Company but originally invented by Edwin and Kitty Perkins of Hastings, Nebraska.

Now I know that no Carolinian would touch such a tame Midwestern beverage. Thus I strongly suspect that NCSU keeps at least a firkin of rum handy for their apprentice galley slaves – oops, I meant to say food scientists. I am thinking that rum may be the principal component in the mysteriously unidentified “flavor” in the recipe sent to me by Tyre. Given that a proper rum-laced pirate grog often included lime juice to help stave off scurvy and a measure of cane sugar to help kill the bitterness of the water, it stands to reason that this NCSU “orange drink” also contains citric acid and sucrose. However, being as I was without a spyglass for this webinar, who knows what these piratical Carolinians were up to.

The treasure these tasters seek is 5 on a 0 to 10 intensity scale. Notice on the graph how they favor this so-called “flavor.” As a statistician turned pirate I say ahRrrrggg-Squared to that. They’d best send me a hogshead of this so-called “kool-aid” or I will be forced to send Tyre the Black Spot in lieu of my usual report.

Sunday, September 14, 2008

Battle with the Black Box

Jim Alloway, founder of EMSQ Associates, dropped off a prototype of his Black Box Simulator for me to experiment on the other day. If you look Jim up on the New York State part of this speaker list of statistics experts, you will see that he specializes in design of experiments (DOE) and process management. What I like about Jim is that, despite achieving a PhD and teaching at the university level, he never lost his love for toys.

The Black Box is an ingenious idea for teaching DOE via a hands-on exercise – far easier than other approaches like catapults, trebuchets, paper helicopters, or golfing toys (been there and done that as you can see via the links). In less than half an hour I experimented on the upper left 'sextant' of the Black Box. Originally I'd planned to get help from my son Hank, but I discovered it was easy enough to do by myself. I think it's a blast!

What's great about doing an actual (not simulated) experiment is running into practical issues of having to do pre-experimental range-finding, dealing with measurement issues (two different scales on the ruler, where to measure to, how hard to push down, etc.) and so forth. Other aspects are more subtle, such as the difficulty, when running an experiment, of not looking at the prior result of a replicated run and cheating to make each one match. For example, I swear that I did not cheat on the repeats, but maybe I did unconsciously, because so many agreed exactly. Also, I realized when talking with Jim afterwards that I misread the 64ths scale as 60ths! Doh!!! (For the record, I corrected the numbers.)

I set up a 2^2 (two-level factorial) with 3 center points in a fully-replicated, blocked design -- see results attached. Just for fun, I tried analyzing the first block -- very educational -- it reminded me not to try analyzing an unreplicated 2^2! (Four runs provide nothing for statistical testing unless one makes the dangerous assumption that the two-factor interaction (2FI) effect must be a measure of experimental error.) As shown by the 3D surface, my 2FI model fell short of the center points (notice how they all 'lollipop' up) – thus the ANOVA revealed significant curvature (p = 0.0001).
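
For anyone who wants to see the arithmetic behind one block of such a design, here is a minimal Python sketch. The response values are hypothetical numbers made up for illustration (they are NOT my Black Box measurements), and the variable names are my own. It computes the two main effects, the 2FI effect, and the gap between the center-point average and the factorial average, which is the raw estimate of curvature.

```python
from statistics import mean

# Coded factor levels for one block: four factorial corners plus three center points.
factorial_runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

# Hypothetical responses (NOT real data), listed in the same order as the runs.
y_factorial = [10.0, 14.0, 12.0, 20.0]
y_center = [16.0, 16.5, 15.5]

def effect(contrast, y):
    """Average response at the +1 level minus the average at the -1 level."""
    high = [yi for c, yi in zip(contrast, y) if c > 0]
    low = [yi for c, yi in zip(contrast, y) if c < 0]
    return mean(high) - mean(low)

a = [run[0] for run in factorial_runs]
b = [run[1] for run in factorial_runs]
ab = [x * z for x, z in zip(a, b)]  # interaction contrast

effect_a = effect(a, y_factorial)
effect_b = effect(b, y_factorial)
effect_ab = effect(ab, y_factorial)

# Curvature estimate: how far the center points sit above (or below) the corner average.
curvature = mean(y_center) - mean(y_factorial)

# Four corner runs fit four terms (intercept, A, B, AB), leaving nothing for error.
residual_df_corners_only = len(factorial_runs) - 4
# The replicated center points supply the pure-error degrees of freedom.
pure_error_df = len(y_center) - 1

print(effect_a, effect_b, effect_ab, curvature)
print(residual_df_corners_only, pure_error_df)
```

The zero residual degrees of freedom from the corners alone is the reason an unreplicated 2^2 cannot be tested statistically. In this sketch the only estimate of error within a block comes from the repeated center points.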

I gave Jim back his Black Box before I could probe its mysteries any further by augmenting my initial experiment design into a response surface method (RSM), for example by simply checking the centers of the edges of the square region. Jim says that he hopes to go into production with his Black Box by year end. At that time he will offer us one to evaluate for our training. Then my battle with the Black Box can be continued.
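
Here is a sketch, in Python, of what that augmentation could look like in coded units. Adding the centers of the four edges to the corners and center point gives what is commonly called a face-centered central composite design. The layout below is an assumption on my part about how the follow-up runs would be arranged, not a record of an experiment actually performed.

```python
from itertools import product

# Original two-level factorial corners and the center point, in coded units.
corners = list(product((-1, 1), repeat=2))
center = [(0, 0)]

# Proposed augmentation: the centers of the four edges of the square region.
edge_centers = [(-1, 0), (1, 0), (0, -1), (0, 1)]

design = corners + center + edge_centers

# Each factor now appears at three levels, which is what a quadratic model needs.
levels_per_factor = [sorted({point[i] for point in design}) for i in range(2)]

# Terms in the full quadratic (RSM) model for two factors.
quadratic_terms = ["intercept", "A", "B", "AB", "A^2", "B^2"]

print(len(design), "distinct points for", len(quadratic_terms), "model terms")
print(levels_per_factor)
```

Nine distinct points comfortably cover the six terms of the quadratic model, which is what makes it possible to fit the curvature that the center points only hinted at.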

Sunday, September 07, 2008

Fantasy football stats tracked with great interest -- $100s of millions worth

Tomorrow night my Minnesota Vikings kick off their National Football League season against their biggest rival – the Green Bay Packers of our neighboring State of Wisconsin. Dubbed the “Border Battle,” this game creates a civil war where I live – only a few miles from the line where Packer-mania runs rampant. I will watch the game with my son-in-law, who hails from Wisconsin and bleeds Packer green. Neither one of us is shy about rubbing in a victory by our favorite club, which will no doubt be the Purple, due to the hated Packers being without Brett Favre at quarterback.

However, with the huge interest in fantasy football, many fans pay more attention to stats than the game outcomes. In some cases they end up rooting against their home team and for an opposing player who could earn them significant prize money in a fantasy league. A year ago, the Vikings rookie running back Adrian Peterson (“AP”) broke out with a single-game NFL rushing record. CNBC Sports Biz blogger Darren Rovell estimated that fantasy team-owners who picked up AP earned $600 million from his stellar 2007 season.

For my chapter on “Extrapolation Can Be Hazardous to Your Health” in RSM Simplified, I analyzed quarterback sacks – a component in most fantasy scoring systems (the more the better for your defensive team). Based on attributes collected for 167 defensive players in the 2002 season who got at least one sack, my regression analysis predicted that the ideal sacker would be a 7-footer, weighing only 100 pounds, who would produce over 60 sacks per year! These fanciful figures, generated by applying statistical tools incorrectly (that was my point!), better describe an overgrown Velociraptor than a human being (except possibly for famed Cowboy sacker Ed “Too Tall” Jones).

Who knows – this could be the shape of things to come via genetic engineering done for the sake of sports. Meanwhile, I am banking on our new Purple People Eater -- 2007 sack leader Jared Allen, who is very tall at 6’-6”, but weighs an appreciable 270 pounds. I’d put him up against a Velociraptor (provided he gets to wear all his football gear – helmet and all!).