Stats Made Easy

Practical Tools for Effective Experimentation

Monday, February 27, 2006

Musings from Mazatlan Mexico

I just returned from a week of surf and sun -- a welcome respite from the coldest spell this winter in Minnesota. I went from -15 to 85 F from takeoff to touchdown over the four-hour flight south -- a span of 100 degrees! Mazatlan is home to the family of a young lady named Clarissa who spent a year with us as an exchange student -- joining my three daughters in our residence in Stillwater. (You cannot imagine the pileup in the bathroom each morning before school!) Clarissa frequented the local health food store where, if I had not forbidden it, she would have purchased a scary collection of unregulated medicines for losing weight. It's bad enough that the US Food and Drug Administration (FDA) provides no protection against the snake oils and such that are peddled to naive consumers, but south of the border in Mexico it is far worse. There, for example, many desperately ill Americans find "alternative" medicines and treatment, although as a colleague at Stat-Ease pointed out, either it is medicine or the alternative. I see in Mexico an unabashed belief in the supernatural, which I suppose is just as prevalent here in the USA, but relegated to tabloid and cable television features about haunted houses. Coincidentally, I brought along for beach-reading the latest issue of American Scientist, which included an article titled "The Cognitive Psychology of Belief in the Supernatural" that explains why humans harbor an innate faith in the unreal. I found this quote enlightening:

"...our brains have evolved so that science eludes us but religion comes naturally."

PS. This issue of American Scientist also features an article telling why "Three statistical strategies -- replicating, blocking and modeling -- can help scientists improve accuracy and accelerate progress." I agree! See Volume 94, Number 2, March-April 2006.

Friday, February 17, 2006

Demo of Latin Square design for high school chemistry class

James N. Cawse of GE Global Research emailed me this week with this question.

"Mark, has anyone developed a demonstration/experiment suitable for high school students that illustrates a Latin Square design? I've been asked to give a high school chemistry class on combinatorial chemistry; as I thought about it, the simplest "combinatorial" type design is a Latin Square. It actually has a chance of being understood because of the current craze for Sudoku."

My response was:

"James, Funny you should mention Sudoku and Latin Squares because this morning I was thinking how the same structures apply to Latin Hypercube Designs (LHD) that are popular for DOE on computer sims, for example ones based on finite element analysis that GE uses for designing jet and power turbines. (An aside -- I just got the Jan-Feb issue of American Scientist featuring in their Computer Science column an article titled "Unwed Numbers" on the mathematics of Sudoku. This name stems from the Japanese firm Nikoli who called these puzzles "the numbers must be single" in the sense of being unmarried.) To answer your question, no I don't know of a classroom experiment that illustrates the Latin Square structure."

If any of you StatsMadeEasy blog readers know of good in-class chemistry experiments that illustrate Latin Square or other principles of DOE, post a comment.

PS. I see that Wikipedia offers a very extensive entry for Sudoku that includes this comment about the mathematics of it:
"A valid Sudoku solution grid is also a Latin square. There are significantly fewer valid Sudoku solution grids than Latin squares because Sudoku imposes the additional regional constraint."

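The Wikipedia comment can be checked with a few lines of Python. This is only a minimal sketch for illustration: the function names are made up here, and the cyclic construction is just one convenient way to produce a Latin square, not a recommended classroom demo. It builds a square, tests the row-and-column rule, and then adds Sudoku's extra 3 x 3 box rule.

```python
def cyclic_latin_square(n, first_symbol=0):
    """Build an n x n Latin square by shifting each row one step to the left."""
    return [[(row + col) % n + first_symbol for col in range(n)] for row in range(n)]


def is_latin_square(grid):
    """True if every row and every column holds each symbol exactly once."""
    n = len(grid)
    if n == 0 or any(len(row) != n for row in grid):
        return False
    symbols = set(grid[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def is_sudoku_solution(grid):
    """True if a 9 x 9 grid is a Latin square on 1..9 whose 3 x 3 boxes also hold 1..9."""
    digits = set(range(1, 10))
    if len(grid) != 9 or not is_latin_square(grid) or set(grid[0]) != digits:
        return False
    for top in range(0, 9, 3):
        for left in range(0, 9, 3):
            box = {grid[top + i][left + j] for i in range(3) for j in range(3)}
            if box != digits:
                return False
    return True


# A 4 x 4 square as it might be laid out for an experiment: rows and columns
# are two blocking variables, letters are the four treatments.
square = cyclic_latin_square(4)
for row in square:
    print(" ".join("ABCD"[k] for k in row))

# A 9 x 9 cyclic square is Latin but repeats digits inside its 3 x 3 boxes.
cyclic9 = cyclic_latin_square(9, first_symbol=1)

# Shifting successive rows by 3, plus one extra step for each band of three
# rows, gives a grid that also satisfies the box rule.
sudoku = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

print(is_latin_square(cyclic9), is_sudoku_solution(cyclic9))
print(is_latin_square(sudoku), is_sudoku_solution(sudoku))
```

The cyclic 9 x 9 grid passes the Latin square test yet fails the Sudoku test, which is the "additional regional constraint" at work.
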
Monday, February 13, 2006

Proof that sparsity of effects is not a good assumption?

As a dues-paying member of the American Institute of Chemical Engineers (AIChE), I got my Chemical Engineering Progress (CEP) magazine today -- the Feb 2006 issue. I see in the article "Designing Experiments for the Modern Micro Industries" that author Phillip H. Williams claims that in his semiconductor industry the processes are so complex that engineers canNOT assume that the sparsity of effects principle* rules. He then supports this contention by showing numerous three-factor interactions (3FIs) from Minitab software analysis of a full 32-run two-level factorial on 5 factors. After realizing that Table 1 showing response data got out of standard order (1,2,3,...19,20,21,24,22,23,25,26,27...32), I got Design-Expert version 7 software to agree with Williams' results. However, from the Box-Cox plot it is evident that an inverse square root transformation helps. Oh, and by the way, Williams uses the relatively risky p-value of 0.1 as the cut-off for significance. As a practical matter I would say that main effects predominate as predicted by sparsity of effects. However, it appears Williams does have some basis for saying that this principle falls down in his case, which produces a number of apparently significant 3FIs. Nevertheless, I am not swayed (as he is) from the advice (quoted from the article) that "if you have five factors ... never do the full factorial since the 2^(5-1) is a resolution-five design." This still makes sense to me -- why do 32 runs when 16 will normally do? I will email my DX7 file to anyone interested in playing with this data.

*From Wikipedia: "The sparsity of effects principle states that a system is usually dominated by main effects and low-order interactions. Thus it is most likely that main (single factor) effects and two-factor interactions are the most significant responses. In other words, higher order interactions such as three-factor interactions are very rare."
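
For readers who want to see what the 16-run half-fraction gives up, here is a minimal Python sketch. It assumes the usual generator E = ABCD (the article's own design table is not reproduced here, and the function names are invented for this example). It confirms that main effects and two-factor interactions stay clear of one another, and lists which two-factor interaction each 3FI is aliased with.

```python
from itertools import combinations, product

FACTORS = "ABCDE"


def half_fraction_runs():
    """Return the 16 runs of a 2^(5-1) design: full factorial in A-D, E = ABCD."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        runs.append({"A": a, "B": b, "C": c, "D": d, "E": a * b * c * d})
    return runs


def contrast(runs, term):
    """Column of +/-1 values for a term such as 'A' or 'BD' (product of its factors)."""
    column = []
    for run in runs:
        value = 1
        for factor in term:
            value *= run[factor]
        column.append(value)
    return column


runs = half_fraction_runs()
main_effects = list(FACTORS)
two_fis = ["".join(pair) for pair in combinations(FACTORS, 2)]
three_fis = ["".join(trio) for trio in combinations(FACTORS, 3)]

# Resolution V: main effects and 2FIs are estimable free of one another,
# which shows up as every pair of their columns being orthogonal.
low_order = main_effects + two_fis
clear_of_each_other = all(
    sum(x * y for x, y in zip(contrast(runs, s), contrast(runs, t))) == 0
    for s, t in combinations(low_order, 2)
)

# The price of halving the runs: each 3FI shares its column with one 2FI.
aliases = {}
for three in three_fis:
    for two in two_fis:
        if contrast(runs, three) == contrast(runs, two):
            aliases[three] = two

print(len(runs), clear_of_each_other)
for three, two in aliases.items():
    print(three, "=", two)
```

So if 3FIs really are active, as Williams reports, the half-fraction cannot separate them from the 2FIs that share their columns; the 16-run advice leans on sparsity of effects to attribute each estimate to the two-factor interaction.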

Sunday, February 12, 2006

Rambling on about hockey sticks and global warming

Being at the Xcel Energy Center rink to see the Minnesota Wild clobber the Kings of LA at their game last week in Saint Paul, and then watching the USA women's hockey team beat up their first two foes, Switzerland and Germany, at the Turin (aka Torino) Olympics, got me thinking about the technology of hockey sticks. I have fiddled with small-scale hockey sticks for in-class experiments* but I wondered if anyone had posted technical detail on the real thing, so I searched the internet. I was very surprised to get numerous hits on global warming! According to BBC News "The hockey stick was a term coined for a chart of temperature variation over the last 1,000 years, which suggested a recent sharp rise in temperature caused by human activities." But then I came across numerous web entries disputing the "hockey stick" as an artifact of principal component analysis, or PCA, which, evidently due to an improper normalization procedure, tends to emphasize any data that do have the hockey stick shape and to suppress all data that do not. For one detailed opinion on all this, see Being a native of what we Minnesota residents often call the "State of Hockey" and a fan of statistics, I find this all very interesting. I do not care to get snowed under with arguments about global warming, but I have put off plans to buy new hockey skates this year -- it was too warm this year to produce good ice at the local outdoor public rink. However, the record temperatures here in Minnesota did not stop the 27th Annual International Eelpout Festival this weekend at Leech Lake. The sheriff did prohibit fishermen from driving their SUVs out on the ice. Earlier this season a bunch of ice anglers lost their vehicles after parking them too close together. SCUBA divers fished the SUVs out from under the ice. Perhaps if the windows were left open they might have trapped some fish hoping for a joy ride.

*See "Tabletop Hockey Meets Goals for Teaching Experimental Design"

Wednesday, February 08, 2006

Stat on blogs

The article "Corporate Blogs for Technology Businesses"* published by says that according to Technorati, a popular 'blogosphere' search engine, a new blog is created every second. Here are other stats from the article:
- 16% of U.S. adults (32 million) are blog readers, a 58% increase over 2004 (from "The State of Blogging" by Pew Internet & American Life Project)
- 6% of American adults have created a blog (11 million people)
These figures stem from early to middle 2005, so by now they may be considerably higher based on what I am seeing in the media -- a huge buzz on blogs.

*Authored by David Meerman Scott, whose blog is

Tuesday, February 07, 2006

Neat things seen in Boston -- a top town for techie types

I flew out from Stat-Ease headquarters in Minneapolis last Wednesday for a dinner meeting with two fellows from the UK in the States for a conference in Boston -- sort of a halfway point for all of us. The business part was successful, but the fun things for me were:
1. The lean, mean coffee-making machine in my hotel room -- CV1 (tm) by Hamilton Beach. The ground coffee is contained in a disposable tray which you insert into the machine. It drips directly into a disposable styrofoam cup. The hotel was kind enough to provide several coffee packages for caffiends like me. FYI, see
Lodging Magazine article
Funny Business blog on CV1
2. In the hotel gift shop a kiosk featuring the "Stikky" series of how-to softcover books with "lay-flat" binding, such as interpreting stock charts for wannabe investors. As co-author (with Pat) of two "Simplified" books, the idea of something "stikky" is intriguing. See Stikky book web site

Thursday, February 02, 2006

Montgomery's Textbook "Design and Analysis of Experiments"

Example 8-6. After using a log transformation Montgomery examines the normal probability plot of the residuals and says: "This plot is suggestive of slightly heavier than normal tails, so possibly other transformations should be considered."

The Box-Cox plot for Montgomery's model (A, B, AD) suggests an inverse transform, which seems to clear up all the residual plots and clarify the model. Give it a try.
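
For anyone curious how a Box-Cox plot comes about, here is a minimal Python sketch of the calculation: the natural log of the residual sum of squares versus the power lambda. Two loud assumptions: the responses are made-up numbers (the reciprocals of 1 through 5, not Montgomery's data), and the model is intercept-only, whereas in a real analysis the residuals would come from the fitted factorial model. The function name is invented for this example.

```python
import math


def boxcox_profile(y, lambdas):
    """Return (lambda, ln residual SS) pairs for an intercept-only model.

    Each response is transformed with the Box-Cox power formula, scaled by
    the geometric mean so that sums of squares are comparable across lambdas.
    """
    n = len(y)
    log_y = [math.log(v) for v in y]
    gm = math.exp(sum(log_y) / n)  # geometric mean of the responses
    profile = []
    for lam in lambdas:
        if lam == 0:
            w = [gm * ly for ly in log_y]  # limiting case: log transform
        else:
            w = [(v ** lam - 1) / (lam * gm ** (lam - 1)) for v in y]
        mean_w = sum(w) / n
        ss = sum((v - mean_w) ** 2 for v in w)  # residual SS about the mean
        profile.append((lam, math.log(ss)))
    return profile


# Hypothetical responses for illustration only: reciprocals of 1..5.
y = [1 / z for z in range(1, 6)]
lambdas = [k / 10 for k in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1

profile = boxcox_profile(y, lambdas)
best_lambda, best_ln_ss = min(profile, key=lambda pair: pair[1])
print("lambda with the smallest ln(residual SS):", best_lambda)
```

The lambda at the bottom of the curve is the suggested power: 1 means no transformation, 0 the log, -0.5 the inverse square root, and -1 the inverse. With these made-up responses the minimum should land at a negative lambda, pointing toward an inverse-type transformation.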