Design of Experiments
I am currently designing modeling software at work, and as part of that I had to review some concepts from Design of Experiments (DOE). It can be a potent tool when carrying out real-life experiments in the lab.
I just don't remember any of the experiments I did as an undergrad. Maybe they were just single-run experiments and hence did not require a DOE as such. When I look back, I really feel that the experiments, and the education around them, were far removed from real-world scenarios. Perhaps only the surveying I learnt in engineering came anywhere close to making one understand the real difficulties of experimentation and drawing conclusions.
The next time I came across DOE was during a training session, when I was working in the Quality department of an MNC. However, I never got a chance to carry out a DOE in a real-life application. Reviewing the fundamentals today, the concepts are definitely robust and take a lot of aspects into consideration. One thing I am hungry for is to work on a real-life experiment and use the concepts of DOE.
Anyway, here are some basic aspects of DOE:

Model a continuous dependent variable Y which depends on a set of continuous and discrete variables. Uncontrolled factors such as machines, day of the week, etc. can have an effect on the experiment outcome. Hence blocking designs are incorporated so that the blocking factor has minimal impact on the experimental outcome.
Y is continuous
The objective is to fit a linear or quadratic equation relating Y to the X’s.
A linear model yields main effects and interaction effects plots
A quadratic model yields response surface plots, which can be used to study how Y depends on the X’s
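To make main effects and interaction effects concrete, here is a minimal sketch for a two-factor, two-level experiment. The response values are made up for illustration, and the names `effect` and `predict` are hypothetical helpers, not part of any DOE package.

```python
# Hypothetical 2x2 factorial in coded units (-1 = low, +1 = high).
# The response values are invented purely for illustration.
runs = [
    # (x1, x2, y)
    (-1, -1, 20.0),
    (+1, -1, 30.0),
    (-1, +1, 25.0),
    (+1, +1, 45.0),
]

def effect(runs, contrast):
    """Mean response where the contrast is +1 minus the mean where it is -1."""
    high = [y for x1, x2, y in runs if contrast(x1, x2) > 0]
    low = [y for x1, x2, y in runs if contrast(x1, x2) < 0]
    return sum(high) / len(high) - sum(low) / len(low)

main_x1 = effect(runs, lambda x1, x2: x1)           # main effect of X1
main_x2 = effect(runs, lambda x1, x2: x2)           # main effect of X2
interaction = effect(runs, lambda x1, x2: x1 * x2)  # X1*X2 interaction

grand_mean = sum(y for _, _, y in runs) / len(runs)

def predict(x1, x2):
    """Linear model with interaction; each coefficient is half the effect."""
    return (grand_mean
            + (main_x1 / 2) * x1
            + (main_x2 / 2) * x2
            + (interaction / 2) * x1 * x2)
```

With coded levels of -1 and +1, each model coefficient is half the corresponding effect, which is why the fitted equation reproduces the four observations exactly in this small example.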
Common Designs
- Comparative Designs
  - Completely Randomized Designs
  - Blocked Randomized Designs
- Screening Designs
  - Full Factorial
  - Fractional Factorial
  - Plackett-Burman designs
- Advanced Modeling
  - Response Surface Modeling
  - Regression Modeling
Main Steps in a DOE
- Set objectives
- Select process variables
- Select an experimental design
- Execute the design
- Check that the data are consistent with the experimental assumptions
- Analyze and interpret the results
- Use/present the results (may lead to further runs or DOE’s)
Step 1: Objectives
The first step is to decide the type of DOE. Is it going to be a sequential/iterative DOE? It is not necessary to complete the experiment in one go; the iterative approach is the one most widely followed.
Assumptions in DOE:
Are the measurement systems capable for all of your responses?
- Measurement systems in place for Y
Is your process stable?
- Use run charts to gauge this
Are your responses likely to be approximated well by simple polynomial models?
- If not, it's useful to break up the experiment into separate experiments
Are the residuals (the difference between the model predictions and the actual observations) well behaved?
- Residuals need to be (roughly) normal and (approximately) independently distributed with a mean of 0 and some constant variance.
Tests for Residual Normality
Histograms
Normal Probability plots
Independence of residuals over time
Independence of Residuals from Factor Settings
Plot of Residuals Versus Corresponding Predicted Values
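The residual checks listed above can be sketched numerically before drawing any plots. This is only an illustration: the observed and predicted values are hypothetical, and `normal_plot_points`, `pearson` and `lag1_autocorrelation` are helper names invented here.

```python
from statistics import NormalDist, mean

def normal_plot_points(residuals):
    """Pair each sorted residual with the normal quantile expected at its rank."""
    n = len(residuals)
    ordered = sorted(residuals)
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(quantiles, ordered))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def lag1_autocorrelation(residuals):
    """Correlation between each residual and the next one in run order."""
    return pearson(residuals[:-1], residuals[1:])

# Hypothetical observed and predicted responses, listed in run order.
observed = [10.2, 11.9, 9.7, 12.4, 10.1, 11.6, 9.9, 12.2]
predicted = [10.0, 12.0, 10.0, 12.0, 10.0, 12.0, 10.0, 12.0]
residuals = [o - p for o, p in zip(observed, predicted)]

residual_mean = mean(residuals)                    # should be close to 0
points = normal_plot_points(residuals)             # coordinates for a normal plot
straightness = pearson([q for q, _ in points],
                       [r for _, r in points])     # near 1 if roughly normal
time_dependence = lag1_autocorrelation(residuals)  # near 0 if independent over time
```

These numbers do not replace the plots; they are a quick screen, and plotting `points` or the residuals against predicted values is still the more informative check.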
Step 2: Select Process Variables
Process variables include both inputs and outputs - i.e., factors and responses. The selection of these variables is best done as a team effort. The team should
- Include all important factors (based on engineering judgment).
- Be bold, but not foolish, in choosing the low and high factor levels.
- Check the factor settings for impractical or impossible combinations, e.g., very low pressure and very high gas flows.
- Include all relevant responses.
- Avoid using only responses that combine two or more measurements of the process. For example, if interested in selectivity (the ratio of two etch rates), measure both rates, not just the ratio.
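The check for impractical combinations can be automated once the team has written down the rule. The sketch below is hypothetical throughout: the factor names, units, levels and the feasibility rule are assumptions chosen to mirror the pressure and gas flow example.

```python
from itertools import product

# Hypothetical factors and low/high levels; names and units are assumptions.
levels = {
    "pressure_torr": [0.5, 5.0],
    "gas_flow_sccm": [50, 500],
    "power_w": [200, 400],
}

def is_feasible(setting):
    """Assumed rule: the lowest pressure cannot be held at the highest gas flow."""
    return not (setting["pressure_torr"] <= 0.5
                and setting["gas_flow_sccm"] >= 500)

names = list(levels)
candidates = [dict(zip(names, combo)) for combo in product(*levels.values())]
infeasible = [s for s in candidates if not is_feasible(s)]
```

If `infeasible` is not empty, the usual options are to narrow the factor ranges or to pick a design that avoids those corners.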
Step 3: Select Experimental Design
There is a whole host of designs to choose from. The choice depends on a number of aspects such as the number of factors and levels, budget and time constraints.
Completely randomized designs
Randomized block designs
Latin squares
Graeco-Latin squares
Hyper-Graeco-Latin squares
Full factorial designs
Two-level full factorial designs
Full factorial example
Blocking of full factorial designs
Fractional factorial designs
A 2^(3-1) half-fraction design
How to construct a 2^(3-1) design
Confounding
Design resolution
Use of fractional factorial designs
Screening designs
Fractional factorial designs summary tables
Plackett-Burman designs
Response surface (second-order) designs
Central composite designs
Box-Behnken designs
Response surface design comparisons
Blocking a response surface design
Adding center points
Improving fractional design resolution
Mirror-image foldover designs
Alternative foldover designs
Three-level full factorial designs
Three-level, mixed level and fractional factorial designs
Description:
Completely randomized designs
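As the name suggests, the order of the runs is randomized. The total number of runs is N = k * L * n, where k is the number of factors (1 for these designs), L the number of levels and n the number of replications.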
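A minimal sketch of a completely randomized plan for a single factor, assuming three hypothetical treatment levels "A", "B" and "C" with four replications each. The function name is made up for this example.

```python
import random

def completely_randomized_design(levels, replications, seed=None):
    """List every level `replications` times, then shuffle the run order."""
    runs = [level for level in levels for _ in range(replications)]
    random.Random(seed).shuffle(runs)  # a seed makes the plan reproducible
    return runs

# 3 levels x 4 replications = 12 runs in random order.
plan = completely_randomized_design(["A", "B", "C"], replications=4, seed=1)
```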
Randomized block designs
When some nuisance factor affects the experiment, one can use a randomized block design to overcome its interference. In Latin square and related designs there is a single factor of primary interest, typically called the treatment factor, and several nuisance factors: for Latin square designs there are 2 nuisance factors, for Graeco-Latin square designs there are 3, and for Hyper-Graeco-Latin square designs there are 4.
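The two ideas can be sketched as follows, under the assumption of hypothetical treatments and blocks. In a randomized block design every treatment is run once per block in a fresh random order; a Latin square arranges the treatments so that each appears once in every row and every column, where rows and columns stand for the two nuisance factors.

```python
import random

def randomized_block_design(treatments, blocks, seed=None):
    """Run every treatment once per block, in a new random order in each block."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)
        plan[block] = order
    return plan

def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated by i places."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

# Hypothetical example: blocks are days, treatments are labelled A to D.
block_plan = randomized_block_design(["A", "B", "C"], ["Mon", "Tue", "Wed"], seed=0)
square = latin_square(["A", "B", "C", "D"])
```

The cyclic square is only a starting point; in practice its rows, columns and treatment labels would also be assigned at random before use.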
Full factorial designs
If there are k factors, each at 2 levels, a full factorial design has 2^k runs. In practice the runs are replicated and randomized, and center points are often added.
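A two-level full factorial is easy to enumerate. The sketch below uses coded units (-1 for low, +1 for high, 0 for a center point); the function names and the particular replication and center point counts are assumptions for illustration.

```python
import random
from itertools import product

def full_factorial(k):
    """All 2**k combinations of k factors in coded units (-1 = low, +1 = high)."""
    return list(product((-1, +1), repeat=k))

def run_plan(k, replications=1, center_points=0, seed=None):
    """Replicate the factorial runs, add center points, and randomize the order."""
    runs = full_factorial(k) * replications
    runs += [(0,) * k] * center_points
    random.Random(seed).shuffle(runs)
    return runs

design = full_factorial(3)                                     # 8 distinct runs
plan = run_plan(3, replications=2, center_points=3, seed=42)   # 16 + 3 runs
```

Center points only make sense when all the factors are continuous, since a coded value of 0 means the midpoint between the low and high settings.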
Fractional factorial designs
Choose a fraction and then run only that fraction of the total number of possible runs. Deciding which runs to include is of prime importance. Whatever strategy one chooses in the definition phase, it is better to choose a design which is balanced (each factor is run equally often at its low and high level) and orthogonal (the factor columns are uncorrelated with one another).
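As a small illustration, a half fraction of a three-factor, two-level design can be built by running a full factorial in two factors and setting the third to their product (the generator C = AB). The helper names below are made up, and the checks express balance and orthogonality in coded units.

```python
from itertools import combinations, product

def half_fraction_2_3():
    """Four-run half fraction of a 2^3 design, using the generator C = A*B."""
    return [(a, b, a * b) for a, b in product((-1, +1), repeat=2)]

def is_balanced(design):
    """Each factor column has as many -1 settings as +1 settings."""
    return all(sum(column) == 0 for column in zip(*design))

def is_orthogonal(design):
    """Every pair of factor columns has a zero dot product."""
    columns = list(zip(*design))
    return all(sum(x * y for x, y in zip(c1, c2)) == 0
               for c1, c2 in combinations(columns, 2))

design = half_fraction_2_3()
```

The price of halving the runs is confounding: because C was set equal to AB, the main effect of C cannot be separated from the A*B interaction in this design.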
Plackett-Burman designs
A screening design used when only the main effects are of interest.
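For illustration, the 12-run Plackett-Burman design (up to 11 two-level factors) can be generated by cyclically shifting a single row and appending a row of all low settings. The first row below is the commonly tabulated generator for 12 runs; the function name and the checks are assumptions added for this sketch.

```python
from itertools import combinations

# Commonly tabulated first row for the 12-run Plackett-Burman design.
PB12_FIRST_ROW = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """11 cyclic shifts of the first row, plus a final row of all -1."""
    n = len(PB12_FIRST_ROW)
    rows = [PB12_FIRST_ROW[-shift:] + PB12_FIRST_ROW[:-shift]
            for shift in range(n)]
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()
columns = list(zip(*design))

# Each column is balanced, and every pair of columns is orthogonal.
balanced = all(sum(col) == 0 for col in columns)
orthogonal = all(sum(x * y for x, y in zip(c1, c2)) == 0
                 for c1, c2 in combinations(columns, 2))
```

With fewer than 11 factors, one simply uses as many columns as there are factors and leaves the rest unassigned.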
Step 4: Analysis of DOE Data
Look at the data. Examine it for outliers, typos and obvious problems. Construct as many graphs as you can to get the big picture.
Response distributions (histograms, box plots, etc.)
Responses versus time order scatter plot (a check for possible time effects)
Responses versus factor levels (first look at magnitude of factor effects)
Typical DOE plots (which assume standard models for effects and errors)
Main effects mean plots
Block plots
Normal or half-normal plots of the effects
Interaction plots
Sometimes the right graphs and plots of the data lead to obvious answers for your experimental objective questions and you can skip straight to the final step of using the results. In most cases, however, you will want to continue by fitting and validating a model that can be used to answer your questions.
Create the theoretical model (the experiment should have been designed with this model in mind!).
Create a model from the data. Simplify the model, if possible, using stepwise regression methods and/or parameter p-value significance information.
Normal plot of all the effects: the effects that the regression identifies as significant are clearly seen to lie away from the straight line of the normal plot, while negligible effects fall along it. This is another way to validate the effects that one is considering in the model.
Test the model assumptions using residual graphs.
If none of the model assumptions were violated, examine the ANOVA.
Simplify the model further, if appropriate. If reduction is appropriate, then return to the model-fitting step with a new model.
If model assumptions were violated, try to find a cause.
Are necessary terms missing from the model?
Will a transformation of the response help? If a transformation is used, return to the model-fitting step with a new model.
Use the results to answer the questions in your experimental objectives – finding important factors, finding optimum settings, etc.
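The normal or half-normal plot of effects mentioned in the analysis steps can be prepared with a few lines. This sketch computes half-normal plotting coordinates for a set of hypothetical effect estimates; the effect values, the helper name and the plotting position formula (rank - 0.5) / n are assumptions.

```python
from statistics import NormalDist

def half_normal_points(effects):
    """Pair each |effect|, sorted ascending, with a half-normal quantile."""
    ordered = sorted(effects.items(), key=lambda item: abs(item[1]))
    n = len(ordered)
    points = []
    for rank, (name, value) in enumerate(ordered, start=1):
        p = 0.5 + 0.5 * (rank - 0.5) / n  # plotting position in (0.5, 1)
        points.append((name, NormalDist().inv_cdf(p), abs(value)))
    return points

# Hypothetical effect estimates from a 2^3 experiment.
effects = {"A": 11.2, "B": -0.6, "C": 0.4, "AB": 0.9,
           "AC": -7.8, "BC": 0.3, "ABC": -0.5}
points = half_normal_points(effects)
```

Plotting the absolute effect against the quantile, the small effects should line up near a straight line through the origin, while the large ones (here A and AC, by construction) stand apart at the upper right.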
Software that can be used to get one’s hands dirty applying DOE: camo.com
I hope that some day I will put to use the knowledge I have gained over the past week.