Latin Squares in Practice and in Theory I

3. The statistical analysis of a latin-square experiment

Ronald A. Fisher realized that latin squares could be abstracted from the partition of growing plots and applied to the elimination of systematic error in a much more general context.

In the 2-dimensional plot of land, the systematic error due to variation in soil, etc. can be minimized by a suitable latin square partition of the plot. More generally, whenever there are two independent factors that may introduce systematic error into an experiment, a latin square arrangement in ``experiment space'' can compensate for these errors. (Fisher also showed how graeco-latin and ``hyper graeco-latin'' squares could be applied to more complex experiments; see Fisher.)

The following example is adapted from D. H. Kim's former Stat 470 website at the University of Michigan. The data set is taken from The Design and Analysis of Experiments by Douglas C. Montgomery (Wiley).

The experiment is to study the burning rate of five different formulations of a rocket propellant. The formulations are mixed from raw material that comes in batches whose composition may vary. Furthermore, the formulations are prepared by several operators, and there may be differences in the skills and experience of the operators. So in this experiment there are two presumably unrelated sources of systematic error: different batches and different operators.

To compensate for these systematic errors by a latin square design, five operators are chosen at random, and five batches of raw material are selected at random, each one large enough for samples of all five formulations to be prepared. A sample from one of the five batches (labelled at random I, II, III, IV, V) is assigned to one of the five operators (labelled at random 1, 2, 3, 4, 5) for preparation of one of the five formulations (labelled at random A, B, C, D, E) according to the following latin square arrangement. Each cell of the table gives the formulation together with the burning rate observed for that sample.

Rows: batch (I-V). Columns: operator (1-5).

	1	2	3	4	5
I	A 24	B 20	C 19	D 24	E 24
II	B 17	C 24	D 30	E 27	A 36
III	C 18	D 38	E 26	A 27	B 21
IV	D 26	E 31	A 26	B 23	C 22
V	E 22	A 30	B 20	C 29	D 31
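The defining property of the arrangement above, that each formulation appears exactly once in every row (batch) and every column (operator), can be checked mechanically. The following is a minimal sketch; the function name `is_latin_square` and the string encoding of the table are illustrative choices, not part of the original example.

```python
# Formulation layout copied from the table: rows are batches I-V,
# columns are operators 1-5.
design = [
    "ABCDE",
    "BCDEA",
    "CDEAB",
    "DEABC",
    "EABCD",
]


def is_latin_square(rows):
    """Return True if every symbol occurs exactly once in each row and column."""
    n = len(rows)
    symbols = set(rows[0])
    # A latin square of order n uses exactly n symbols in an n-by-n grid.
    if len(symbols) != n or any(len(r) != n for r in rows):
        return False
    rows_ok = all(set(r) == symbols for r in rows)
    cols_ok = all({r[j] for r in rows} == symbols for j in range(n))
    return rows_ok and cols_ok


print(is_latin_square(design))
```

Because every operator and every batch meets each formulation once, operator and batch differences are spread evenly across the formulation averages.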

The calculational scheme used to analyze these data goes as follows.

Normalization: The average of all the observations is 25.4. We first normalize by subtracting this average from each observation. This gives a new set of data with average 0:

	1	2	3	4	5
I	-1.4	-5.4	-6.4	-1.4	-1.4
II	-8.4	-1.4	4.6	1.6	10.6
III	-7.4	12.6	0.6	1.6	-4.4
IV	0.6	5.6	0.6	-2.4	-3.4
V	-3.4	4.6	-5.4	3.6	5.6
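The normalization step is simple arithmetic and can be reproduced in a few lines. This is a sketch only; the variable names (`rates`, `grand_mean`, `centered`) are illustrative, and values are rounded to one decimal to match the table.

```python
# Observed burning rates: rows are batches I-V, columns are operators 1-5.
rates = [
    [24, 20, 19, 24, 24],
    [17, 24, 30, 27, 36],
    [18, 38, 26, 27, 21],
    [26, 31, 26, 23, 22],
    [22, 30, 20, 29, 31],
]

n_obs = sum(len(row) for row in rates)
grand_mean = sum(sum(row) for row in rates) / n_obs

# Subtract the grand mean from every observation; rounding only removes
# floating-point noise, since the data are integers and the mean is 25.4.
centered = [[round(y - grand_mean, 1) for y in row] for row in rates]

print(grand_mean)
for row in centered:
    print(row)
```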

Separation of signals: We want to think of these data as representing the superposition of four signals:

- Effect of batch: the matrix of row averages
- Effect of operator: the matrix of column averages
- Effect of formulation: the matrix of A, B, etc. averages
- Nonsystematic error: whatever is left

Accordingly we write the normalized data matrix as the sum of four matrices:

Batch effect:

-3.2	-3.2	-3.2	-3.2	-3.2
1.4	1.4	1.4	1.4	1.4
0.6	0.6	0.6	0.6	0.6
0.2	0.2	0.2	0.2	0.2
1.0	1.0	1.0	1.0	1.0

+

Operator effect:

-4.0	3.2	-1.2	0.6	1.4
-4.0	3.2	-1.2	0.6	1.4
-4.0	3.2	-1.2	0.6	1.4
-4.0	3.2	-1.2	0.6	1.4
-4.0	3.2	-1.2	0.6	1.4

+

Formulation effect:

3.2	-5.2	-3.0	4.4	0.6
-5.2	-3.0	4.4	0.6	3.2
-3.0	4.4	0.6	3.2	-5.2
4.4	0.6	3.2	-5.2	-3.0
0.6	3.2	-5.2	-3.0	4.4

+

Nonsystematic error:

2.6	-0.2	1.0	-3.2	-0.2
-0.6	-3.0	0.0	-1.0	4.6
-1.0	4.4	0.6	-2.8	-1.2
0.0	1.6	-1.6	2.0	-2.0
-1.0	-2.8	0.0	5.0	-1.2
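The decomposition into four matrices can be recomputed directly from the raw table. The sketch below makes no claim to be the original author's procedure; it simply takes row, column and formulation averages of the data, subtracts the grand mean, and leaves the remainder as the error term. All names (`batch_effect`, `operator_effect`, `formulation_effect`, `residual`) are illustrative.

```python
design = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]
rates = [
    [24, 20, 19, 24, 24],
    [17, 24, 30, 27, 36],
    [18, 38, 26, 27, 21],
    [26, 31, 26, 23, 22],
    [22, 30, 20, 29, 31],
]
n = 5
grand_mean = sum(map(sum, rates)) / n**2


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Each effect is a group average minus the grand mean.
batch_effect = [mean(rates[i]) - grand_mean for i in range(n)]
operator_effect = [mean(rates[i][j] for i in range(n)) - grand_mean
                   for j in range(n)]
formulation_effect = {
    f: mean(rates[i][j] for i in range(n) for j in range(n)
            if design[i][j] == f) - grand_mean
    for f in "ABCDE"
}

# Whatever the three effects do not explain is the nonsystematic error.
residual = [
    [rates[i][j] - grand_mean - batch_effect[i] - operator_effect[j]
     - formulation_effect[design[i][j]] for j in range(n)]
    for i in range(n)
]

print([round(e, 1) for e in batch_effect])
print([round(e, 1) for e in operator_effect])
print({f: round(e, 1) for f, e in formulation_effect.items()})
for row in residual:
    print([round(r, 1) for r in row])
```

Each row and each column of the residual matrix sums to zero, which is a convenient check on the arithmetic.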

The Null Hypothesis. The entries in the third matrix represent the signal we are looking for: the differences between the five formulations being tested. The entries in the fourth matrix represent errors that cannot be accounted for by operator or batch effect. The significance of the experiment depends on the relations between these two sets of numbers. More precisely, we suppose that there is no formulation effect (``the null hypothesis'') and estimate the probability of a set of numbers like that in the third matrix being observed. This calculation is made using what can be called the basic axiom of statistics: Nonsystematic error is normally distributed.

Analysis of variance. This is based on the following observation, adapting Fisher's words to this context: ``On the null hypothesis the mean squares for formulation and error have the particularly simple interpretation that each may be regarded as an independent estimate of the same single quantity, the variance due to error of a single observation.''
- The sample variance represented by the third (formulation) matrix is the sum of the squares of the entries divided by the number of independent entries (``number of degrees of freedom''). Here that number is 4: entries belonging to the same formulation are identical, so there are only five distinct values, and each column sums to zero, so only four of those five are free. The sum of squares is 330, so the sample variance is s²_f = 330/4 = 82.5.
- Similarly the fourth (error) matrix has 12 degrees of freedom: it has 25 entries, but the entries must sum to zero; the top four rows must each sum to zero (the fifth is then automatic); the first four columns must each sum to zero; and the entries for each of the first four formulations must sum to zero, a total of 13 constraints. The sum of squares is 128, so the sample variance is s²_e = 128/12 ≈ 10.67.
- It is remarkable that the ratio of two such sample variances is a random variable with a known distribution (here is where our ``basic axiom'' is used). Fisher and Yates call it the ``Variance Ratio'' distribution; it is now called the F distribution F_{k,n}, where here k = 4 and n = 12. Interpolating from the table in Fisher and Yates (Table V) or in a suitable text (for example Box, Hunter and Hunter) shows that the probability of that ratio being as large as or larger than the value in this example (82.5/10.67 = 7.73) is 0.28%.
- This analysis shows that if the null hypothesis were true, experimental data like these would be extremely unlikely. On this basis we reject the null hypothesis, and report that the experiment has detected a difference between formulations, statistically significant at the 0.0028 level.
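The variance-ratio calculation in the list above can be sketched in code as a cross-check. The text obtains the tail probability by interpolating in printed tables; the sketch below swaps in a different route, a closed form for the upper tail of the F distribution that is valid only when both degrees of freedom are even (as they are here, 4 and 12), where the regularized incomplete beta function reduces to a finite binomial sum. The helper name `f_upper_tail` and the other variable names are illustrative assumptions.

```python
from math import comb

design = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]
rates = [
    [24, 20, 19, 24, 24],
    [17, 24, 30, 27, 36],
    [18, 38, 26, 27, 21],
    [26, 31, 26, 23, 22],
    [22, 30, 20, 29, 31],
]
n = 5
cells = [(i, j) for i in range(n) for j in range(n)]
grand_mean = sum(rates[i][j] for i, j in cells) / n**2


def effect(group):
    """Average of the observations in `group`, minus the grand mean."""
    return sum(rates[i][j] for i, j in group) / len(group) - grand_mean


batch = [effect([c for c in cells if c[0] == i]) for i in range(n)]
operator = [effect([c for c in cells if c[1] == j]) for j in range(n)]
formulation = {f: effect([(i, j) for i, j in cells if design[i][j] == f])
               for f in "ABCDE"}

# Sums of squares of the formulation matrix and of the error matrix.
ss_formulation = n * sum(e**2 for e in formulation.values())
ss_error = sum((rates[i][j] - grand_mean - batch[i] - operator[j]
                - formulation[design[i][j]])**2 for i, j in cells)

df_formulation = n - 1            # 4
df_error = (n - 1) * (n - 2)      # 12 = 25 entries - 13 constraints
ms_formulation = ss_formulation / df_formulation
ms_error = ss_error / df_error
f_ratio = ms_formulation / ms_error


def f_upper_tail(f, d1, d2):
    """P(F >= f) for an F(d1, d2) variable; assumes d1 and d2 are both even."""
    a, b = d2 // 2, d1 // 2
    x = d2 / (d2 + d1 * f)
    m = a + b - 1
    # For integer a, b the incomplete beta I_x(a, b) is a binomial tail sum.
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))


p_value = f_upper_tail(f_ratio, df_formulation, df_error)
print(ms_formulation, ms_error, f_ratio, p_value)
```

Under these assumptions the tail probability comes out at roughly a quarter of a percent, the same order as the figure read from the tables by interpolation. For odd degrees of freedom this shortcut does not apply and a statistics library or a table is the safer choice.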
