ezANOVA Cognitive Neuroscience 

Introduction:

ezANOVA is a free program for analyzing data. I developed this program for a statistics course I taught. It is not a particularly powerful tool, but it is useful for illustrating the basics of Analysis of Variance. You can download this software from the web. The software is available either as a Zip-compressed file or a raw executable. You only need to download one of these files:

 

Between-groups ANOVA

As a first example, consider that we collect data examining how someone's environment influences their typing speed. We measure the words-per-minute a participant can type. Each of the 15 participants is tested in only one of three conditions: in a room where Bach's classical music is played, in a room where contemporary rock and roll is played or in a silent room. Hypothetical data might look like this:

  Sound
  Bach          Rock          Silent
  Alice   48    Nancy   47    Ray     64
  Bob     40    Carl    38    Andy    44
  Donna   31    Karen   32    Heidi   41
  Nick    26    Tom     33    Emily   32
  Sandra  58    Betty   55    Sally   59

Let's now analyze this data with ezANOVA. Start by launching the program (double-click on the ezANOVA icon). Then press the 'design a new experiment' button. You will be shown a design window like the one marked by the red '1' in the figure below.

At this point you will see the data entry window (shown by the red '2' in the illustration). Enter the values for each participant by clicking on the cell in the spreadsheet and typing the observed typing speed, as shown. Note that the first column is labeled "Bach": enter the five values for the people who typed in the room with classical music into this column. You probably want to save your hard work to disk, so you can re-analyze the data any time you want. Choose 'Save' from the 'File' menu to save your data to disk.

When you are done entering values, press the 'Sigma' button. You will be shown the results window (shown by the red '3' in the illustration). Note that the top of the window shows an ANOVA results table. The factor 'Sound' is not significant (the "p<0.5672" suggests that the effects observed could likely be expected by chance).

The lower portion of the results window shows you the descriptive statistics: the mean, standard deviation (StDev), variance (var), number of observations (N), Skew and Z-Score for the Skew (zSkew) are all shown. Choosing 'Copy' from the 'Edit' menu will allow you to copy these results to Excel or any other program for drawing graphs.
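The ANOVA table for this example can be checked by hand. The sketch below is not ezANOVA's code; it is a minimal Python illustration of the textbook one-way calculation, using the typing speeds from the table above. The function names are hypothetical, and the p-value shortcut is an assumption that only holds when the factor has three levels (two degrees of freedom in the numerator).

```python
# Typing speeds (words per minute) from the between-groups table above.
groups = {
    "Bach":   [48, 40, 31, 26, 58],
    "Rock":   [47, 38, 32, 33, 55],
    "Silent": [64, 44, 41, 32, 59],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a between-groups design."""
    scores = [x for g in groups.values() for x in g]
    grand_mean = sum(scores) / len(scores)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        group_mean = sum(g) / len(g)
        # Variability of the group means around the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Variability of individuals around their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

def p_value_for_two_numerator_df(f, df_within):
    """Upper-tail p for an F statistic; valid ONLY when df_between == 2."""
    return (1 + 2 * f / df_within) ** (-df_within / 2)

f_stat, df_b, df_w = one_way_anova(groups)
p = p_value_for_two_numerator_df(f_stat, df_w)
print(f"F({df_b},{df_w}) = {f_stat:.3f}, p = {p:.4f}")
```

The printed p-value should agree closely with the 0.5672 shown in the results window. For designs with a different number of levels, a statistics library would be needed to look up the F distribution.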

Within-groups ANOVA

In the previous (between groups) example, each participant was only tested in a single condition. For example, Alice was only tested listening to Bach, and in no other situations. Note that some people are much better typists than others. Therefore, individual differences may be adding a lot of variance to our data. Since we only tested 5 people in each condition, our study has very little statistical power to find real effects. One way to increase the statistical power is to test the same individuals in each condition. That way, we can take into account each participant's overall typing speed.

Again, let's consider a hypothetical experiment. The design is similar to the between-subjects example, except that we only test 5 people, and each person is tested in each condition. The data is shown below. Note that Nick is pretty slow in all conditions (never typing faster than 33wpm) while Sandra is generally a fast typist (never slower than 55wpm).

  Sound
          Bach  Rock  Silent
  Alice   48    47    64
  Bob     40    38    44
  Donna   31    32    41
  Nick    26    33    32
  Sandra  58    55    59

Analyze the data exactly as the between-subjects data, except make sure that the 'repeated design' checkbox is checked in the design window. Note that if you have already analyzed the between-subjects data described above, you can simply open that dataset and choose 'Describe design' from the 'Data' menu of the data entry window and switch on the 'repeated design' checkbox. Notice when you enter the data in this design you need to make sure that each row of data refers to the same participant. In our example, the first row of data are the typing speeds for Alice, the second row are the speeds for Bob, etc.

Once you have described the design and entered the data, choose 'Calculate ANOVA' from the Data Entry Window's 'Data' menu. The results should look similar to those presented in this figure:

Inspection of this figure shows that there is a significant effect of the sound in the room of the typist (the ANOVA reports a p value of "0.0311", indicating that if sound truly had no effect, differences this large would arise from random noise alone only about 3% of the time). Because the ANOVA looks at all 3 levels of your experiment, you probably want to look at the pairwise comparisons (repeated measures t-tests) to see which conditions are driving the effect.
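To see why the same fifteen numbers now give a significant result, it helps to work through the arithmetic. The sketch below is a minimal Python illustration of the textbook one-factor repeated-measures calculation (not ezANOVA's own code): the variability explained by each participant's overall speed is removed from the error term before F is computed. The function name is hypothetical, and the closed-form p-value is an assumption that only applies because this factor has three levels.

```python
# Rows are participants; columns are Bach, Rock, Silent (table above).
data = {
    "Alice":  [48, 47, 64],
    "Bob":    [40, 38, 44],
    "Donna":  [31, 32, 41],
    "Nick":   [26, 33, 32],
    "Sandra": [58, 55, 59],
}

def repeated_measures_anova(data):
    """Return (F, df_condition, df_error) for a one-factor repeated design."""
    rows = list(data.values())
    n_subjects = len(rows)
    n_conditions = len(rows[0])
    grand_mean = sum(sum(r) for r in rows) / (n_subjects * n_conditions)
    condition_means = [sum(r[c] for r in rows) / n_subjects
                       for c in range(n_conditions)]
    subject_means = [sum(r) / n_conditions for r in rows]

    ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
    ss_condition = n_subjects * sum((m - grand_mean) ** 2
                                    for m in condition_means)
    # Variability explained by knowing which participant is typing.
    ss_subject = n_conditions * sum((m - grand_mean) ** 2
                                    for m in subject_means)
    ss_error = ss_total - ss_condition - ss_subject

    df_condition = n_conditions - 1
    df_error = (n_conditions - 1) * (n_subjects - 1)
    f = (ss_condition / df_condition) / (ss_error / df_error)
    return f, df_condition, df_error

f_stat, df_c, df_e = repeated_measures_anova(data)
# Upper-tail p for an F statistic; this shortcut is valid ONLY when df_c == 2.
p = (1 + 2 * f_stat / df_e) ** (-df_e / 2)
print(f"F({df_c},{df_e}) = {f_stat:.3f}, p = {p:.4f}")
```

The effect of sound is the same size as in the between-groups example; what shrinks is the error term, because most of the scatter was due to differences between typists.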

Multifactorial ANOVA

One of the powerful aspects of ANOVA is that you can tease apart how different factors influence your data. My software allows you to analyze up to three factors, with either between-subject or within-subject designs.

Let's consider a hypothetical experiment where we are interested in how time of day as well as consumption of coffee in the morning influence exam scores. We have 20 participants, half of whom are tested in the morning and half in the afternoon. Half of the members of each of these groups are given a caffeinated coffee in the morning, while the other half receive a decaffeinated coffee. Therefore, there are a total of two factors (time of test and type of coffee) each with two levels (AM versus PM test time, caffeinated versus decaffeinated coffee). Note that each individual only takes the test once (it is a between groups design) and that the dependent measure is the score on the exam.

The figure shows each stage for analyzing this data. We begin by using the Design window to describe the setup of this experiment. Next we use the Data entry window to report the scores for each participant in each condition. Finally, we can choose 'Calculate ANOVA' from the 'Data' menu to see the results of an ANOVA. In this example, we find no main effect of time of day, nor a main effect of the type of coffee the participant drank. However, we do find an interaction. In this case, looking at the pairwise comparisons, we can observe that those who drank caffeinated coffee performed better in the morning but worse in the afternoon than their counterparts who drank decaffeinated coffee.

Graphing results

This software can also show you a graphical image of your data. Once you are in the results window, you can choose 'Line Graph' from the 'View' menu. You will be shown a graph of the mean results for each condition with confidence interval error bars. You can customize the appearance of the graph (e.g. choosing the font and data range). You can also copy or save these images to disk (in the standard '.EMF' format) so you can edit the images with Microsoft Word or many other programs. This graph is a vector-based graphic, so it should not appear jagged if you print out the image.

Note that you can only make graphs for one or two-factor ANOVAs, and that this graphing tool is fairly basic. Another option is to copy the data from the Results Window into a program that has more powerful tools for generating graphics (e.g. Microsoft Excel).

Notes

This software is provided as is. The software was designed for instructional purposes only. The listing below describes some of the terms used by this software.

Term Notes
Arcsin
This data transform is often applied to proportion data (values 0..1) or percent data (values 0..100). In these cases, the data often has both floor (scores cannot be below 0) and ceiling (scores cannot be more than 100%) effects. The Arcsin transform often makes this type of data more suitable for analysis with ANOVA (which assumes a normal distribution of data). Essentially, the arcsin transform recognizes that the difference between scoring 99% and 95% on a test is typically greater than the difference between 59% and 55%. ezANOVA's transform is equivalent to using the Excel formula "=(2*ASIN(POWER(value,0.5)))/PI()", where 'value' is in the range 0..1.
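As a rough illustration, the Excel formula quoted above translates directly into Python. This is only a sketch of that formula (the function name `arcsin_transform` is my own, not part of ezANOVA); it also shows how the transform stretches differences near the ceiling more than differences in the middle of the range.

```python
import math

def arcsin_transform(value):
    """Python version of =(2*ASIN(POWER(value,0.5)))/PI() for 0 <= value <= 1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be a proportion between 0 and 1")
    return 2 * math.asin(math.sqrt(value)) / math.pi

# The same 4-point difference, near the ceiling and near the middle.
high_gap = arcsin_transform(0.99) - arcsin_transform(0.95)
mid_gap = arcsin_transform(0.59) - arcsin_transform(0.55)
print(f"99% vs 95%: {high_gap:.4f}")
print(f"59% vs 55%: {mid_gap:.4f}")
```

Percent data (0..100) would need to be divided by 100 before calling this function.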
Between-subject design Also called a 'completely randomized design'. Each participant is only involved with one condition of the experiment. Compare to within-subject designs.
CI95% The confidence interval (CI) predicts the location of the population mean. For example, an observed mean of 12 with a CI95% of 2 means that we believe there is a 95% probability that the population mean lies between 10 and 14. Confidence intervals are useful error bars for graphs, as they give the viewer a sense of the variability for each mean. The CI is calculated with the formula CI = T * (SD / sqrt(N)), where SD is the standard deviation, N is the number of observations and T is the t-value with a given probability p lying beyond it. For example, if we have 8 observations (degrees of freedom = 7), then t for .025 is 2.365. Note that for CI95% we use the t-value for p = .025, as the confidence interval extends to 47.5% above and 47.5% below the mean, leaving 2.5% in each tail. See nCI95% for how the CI95% values can be adjusted based on group variability. One can either compute separate confidence intervals for each condition, or compute a single global confidence interval. My software generates a single global confidence interval, as suggested by:
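The formula CI = T * (SD / sqrt(N)) can be tried out with a few lines of Python. This is a minimal sketch for a single condition, not the global interval that ezANOVA reports. The eight scores are hypothetical values made up for illustration, and the t-value of 2.365 (7 degrees of freedom) is taken from the text rather than computed.

```python
import math
import statistics

def ci95(values, t_critical):
    """Half-width of the confidence interval: T * (SD / sqrt(N))."""
    sd = statistics.stdev(values)  # sample standard deviation (N - 1)
    return t_critical * sd / math.sqrt(len(values))

# Hypothetical data: 8 observations, so degrees of freedom = 7.
scores = [10, 12, 11, 14, 13, 12, 15, 9]
mean = statistics.mean(scores)
half_width = ci95(scores, t_critical=2.365)
print(f"mean = {mean}, CI95% = {half_width:.2f}")
print(f"interval: {mean - half_width:.2f} to {mean + half_width:.2f}")
```

For other sample sizes, the t-value has to be looked up for the matching degrees of freedom (N - 1).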
Data transform ANOVA and other parametric tests (such as the t-test) assume the data is normally distributed (a 'bell shaped curve'), with many scores near the mean and relatively fewer scores far above or below the mean. However, data is often 'skewed', and this can cause problems. A common rule of thumb is to apply a data transform if the zSkew of the data is greater than 1.96 or less than -1.96. The type of transform applied depends on the level of the skew. ezANOVA allows you to apply the reciprocal, log, sqrt, and arcsin transforms. To apply a transform, select the desired formula from the 'Transform' item of the 'Data' menu.
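The zSkew ranges listed under the Sqrt, Log and Reciprocal entries can be summarized as a small decision rule. The sketch below is one way to write that rule in Python; the function name is hypothetical, and the handling of values that fall exactly on a boundary is my own assumption, since the notes do not say which side a boundary value belongs to.

```python
def suggest_transform(z_skew):
    """Suggest a transform name from the zSkew rule of thumb in these notes."""
    magnitude = abs(z_skew)
    if magnitude <= 1.96:
        return "none"            # skew is within the acceptable range
    if magnitude <= 2.33:
        name = "sqrt"            # moderate skew
    elif magnitude <= 2.58:
        name = "log"             # substantial skew
    else:
        name = "reciprocal"      # severe skew
    # Negative skew calls for the 'inverse' version of the same transform.
    return name if z_skew > 0 else "inverse " + name

for z in (1.5, 2.1, 2.4, 3.0, -2.7):
    print(z, "->", suggest_transform(z))
```

This is only a rule of thumb; it is worth checking the zSkew again after the transform has been applied.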
Factor A 'factor' is a category of independent variable. For example, in the within-subjects example above, the sound in the room we place the participants into is a factor (in this example the factor has three levels: the rooms are either silent, have classical music or have rock music). Each factor must have at least two levels (e.g. we need to compare one setting to another).
Homogeneity of variance In addition to assuming that your data is normally distributed (not skewed), ANOVA also assumes that the variance between conditions is similar. Violating this assumption will reduce your statistical power (you will be less likely to detect differences between your conditions). Therefore, ANOVA tends to fail gracefully: it becomes more conservative if this assumption is broken rather than causing false alarms.
Level A level is a setting of the independent variable. Each factor of a study has at least two levels. For example, the factor in the within-subjects example above has three levels: the participants are in rooms that are either silent, have classical music or have rock music.
Log The log data transform is used when our data is substantially positively skewed (e.g. a zSkew in the range 2.33..2.58). We apply an inverse log if the data is substantially negatively skewed (zSkew in the range -2.33..-2.58). For example, the log (base 10) transforms of the values 10, 100, 1000 are 1, 2, 3 respectively.
Mean The mean is a measure of the central tendency (average) of a distribution. With this software 'mean' refers to the arithmetic mean of a distribution (the sum of all values divided by the number of values). So the mean of 6, 10, 12, 14 is equal to 10.5.
Mixed-measures design Also known as a 'split-plot' design. An ANOVA where at least one factor is a between-subject factor (with different individuals in each level) and at least one factor is a within-subject factor (with the same individual participating in each level). This software currently does not support mixed designs.
Multiple Comparisons As we conduct many statistical tests, we are increasingly likely to make at least one false alarm. Therefore, during multiple comparisons, our familywise error (FWE) rate increases. If we conduct 20 tests, each with a 1/20 (p<0.05) criterion, we will on average make one accidental false alarm (reporting an effect that is actually due to random chance). If we compute many unplanned pairwise comparisons, we can use the TukeyHSD to try to control for the rise in familywise error. Alternatively, we can apply Bonferroni correction to our t-tests (e.g. if we want to compute 10 tests with an overall 0.05 chance of a false alarm, we should use (0.05)/10 = 0.005 as our critical cutoff value).
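The arithmetic behind these numbers is short enough to check directly. The sketch below uses hypothetical function names; the familywise error calculation additionally assumes that the tests are independent and that no real effects exist, which the notes above do not state explicitly.

```python
def expected_false_alarms(n_tests, alpha=0.05):
    """Average number of false alarms if no real effects exist."""
    return n_tests * alpha

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false alarm, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_cutoff(n_tests, overall_alpha=0.05):
    """Per-test cutoff that keeps the overall false alarm rate near overall_alpha."""
    return overall_alpha / n_tests

print(f"expected false alarms in 20 tests: {expected_false_alarms(20):.2f}")
print(f"familywise error for 20 tests:     {familywise_error(20):.3f}")
print(f"Bonferroni cutoff for 10 tests:    {bonferroni_cutoff(10):.4f}")
```

With 20 uncorrected tests, the chance of at least one false alarm is well over one half, which is why a correction is worth applying.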
n The number of values that compose a distribution.
nStDev, nCI95%, nSE, nVar When ezANOVA computes the variability measures for repeated-measures designs, it removes the variability that can be explained by knowing which subject is being tested. The resulting values are referred to as normalized values, e.g. nCI95%. The raw CI95% is not very meaningful in repeated measures designs, as it combines both within and between subject variability. Therefore, the nCI95% is more appropriate for error bars when you graph repeated measures data. For details:

Below is an example of this technique. Consider a study where we measure the speed of typing for different individuals. Each person is tested in both a noisy and a quiet room (our independent measure). The dependent measure is the words per minute the people type. The raw data might look like this:

  Silent Noisy    
Anna 65 55 AnnaMean 60
Alice 56 50 AliceMean 53
Alex 62 54 AlexMean 58
Mean 61 53   57
SE 2.65 1.53    

Some people are better typists than others, so between subject variability is a major contributor to the scatter reported by the standard deviation. However, consider what happens if we adjust each individual's observations so their scores are standardized to the grand mean of 57 words per minute (the bottom-right value in the table above). Anna types an average of 60wpm, so we will reduce each of her scores by 3. Alice types 4wpm slower than the grand average, so we will increase her scores by 4. Finally, Alex types at 58wpm, so we reduce her scores by 1. The table below shows the effect of this correction: the overall grand mean (57), silent mean (61) and noisy mean (53) remain the same, but the normalized Standard Error now only measures the within subject variability, and therefore gives a more accurate measure of the variability as measured by a repeated measures test (also note that the normalized SE values are identical across conditions):

  Silent Noisy    
nAnna 62 52 nAnnaMean 57
nAlice 60 54 nAliceMean 57
nAlex 61 53 nAlexMean 57
Mean 61 53   57
nSE 0.58 0.58    
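The normalization described above can be reproduced in a few lines. This is a minimal Python sketch of the adjustment as described in these notes (shift each person's scores so that their personal mean equals the grand mean), not ezANOVA's actual source; the function names are hypothetical.

```python
import math
import statistics

# Raw words per minute: [Silent, Noisy] for each participant (table above).
raw = {"Anna": [65, 55], "Alice": [56, 50], "Alex": [62, 54]}

def normalize(raw):
    """Shift each participant's scores so their mean equals the grand mean."""
    all_scores = [x for scores in raw.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    normalized = {}
    for name, scores in raw.items():
        shift = grand_mean - sum(scores) / len(scores)
        normalized[name] = [x + shift for x in scores]
    return normalized

def standard_error(values):
    """SE = SD / sqrt(N), using the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))

normalized = normalize(raw)
se = {}
for col, label in enumerate(["Silent", "Noisy"]):
    raw_se = standard_error([scores[col] for scores in raw.values()])
    n_se = standard_error([scores[col] for scores in normalized.values()])
    se[label] = (raw_se, n_se)
    print(f"{label}: SE = {raw_se:.2f}, nSE = {n_se:.2f}")
```

The condition means are untouched by the shift; only the spread around them changes.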
Q See TukeyHSD
Reciprocal The reciprocal data transform is used when our data is severely skewed (e.g. a zSkew > 2.58). We apply an inverse reciprocal if the data is severely negatively skewed (zSkew < -2.58). For example, the reciprocal transforms of the values 10, 100, 1000 are 0.1, 0.01, 0.001 respectively. Note this transform flips the direction of the values, so you need to bear this in mind when interpreting results.
Repeated-measures design See within-subject design.
SE Like variance (Var) and standard deviation (StDev), the Standard Error (SE) is a measure of the variability of the data (the spread of the distribution). Standard Deviation gives us a measure of the variability of single observations. On the other hand, Standard Error is a measure of the variability of the mean. Since ANOVA and t-tests look for differences in the mean in different conditions, the SE is usually a meaningful value to use for error bars, reflecting the variability in the estimate of the mean. The SE is simple to calculate: SE = SD/sqrt(N), where N is the number of observations and SD is the standard deviation. Another estimate of the variability of the means is the Confidence Interval (CI95%). See also nSE.
StDev The "Standard Deviation" is a measure of the variability in a distribution. It is equal to the square root of the variance. When the StDev is small, most of the scores lie very close to the mean. With a larger StDev, there is much more spread in the scores.
Skew
The ANOVA assumes that data is normally distributed (a symmetrical bell-shaped curve). However, in real life, data is often skewed. For example, when looking at response times to stimuli, participants often show positively skewed data: participants cannot physically respond faster than about 200ms, but there is no limit on the slower responses. Therefore, there is typically a large clump of responses slightly faster than the mean with a few very slow outlier responses. If the data is heavily skewed, you should consider a data transform (for details, see the zSkew notes). There are several ways to compute Skew, and these often give different values. ezANOVA uses the same formula as Excel: this formula is shown on the right.
Sqrt The square root (sqrt) data transform is used when our data is moderately positively skewed (e.g. a zSkew in the range 1.96..2.33). We apply an inverse sqrt if the data is moderately negatively skewed (zSkew in the range -1.96..-2.33). For example, the sqrt transforms of the values 10, 100, 1000 are 3.16, 10, 31.6 respectively.
t See t-test
t-test A 'pairwise comparison' used to directly test two conditions. Note that this test does not automatically adjust for multiple comparisons.
Tukey HSD Tukey's Honestly Significant Difference test is a pairwise comparison that attempts to control for multiple comparisons. It computes a standardized Q score.
Var
Variance is a measure of the scatter in a distribution. If the variance is low, most of the values are near the mean. On the other hand, a high variance indicates that the scores are distributed across a broad range of values. Note that ANOVA assumes that the variance is similar across conditions. As a rule of thumb, you should avoid conducting ANOVA if the variance of any one condition is more than 4 times the variance of another. Correcting for skew often corrects for differences in variance.
Within-subject design Also known as repeated measures designs, these use the same individuals for all conditions of an experiment. For example, in the example above we have the same individual type in three rooms with different music (none, classical or rock). These designs can substantially reduce the variability in your data. There are four facts that must be noted:
  • Not all designs are amenable to repeated measures designs. For example, the same participants cannot be male in one condition and female in the next.
  • Repeated measures designs need to counterbalance against training and fatigue effects.
  • Sometimes you can assign matched people, who are similar but not identical, to different conditions. For example, if we wanted to examine how different diets influenced weight loss, we might pair people who had similar body-mass-indexes.
  • Repeated measures ANOVA introduces additional assumptions regarding the data, in particular the assumption of sphericity.
zSkew The Z-Score of the skew is used to see if data transformation is required. Typically, one should be wary of conducting an ANOVA if the zSkew for any condition is outside the range -1.96..1.96. The zSkew is calculated by dividing the Skew by the Standard Error of the Skew (SEskew). SEskew is typically approximated as sqrt(6/n), i.e. the square root of (6 divided by n).
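The Skew and zSkew calculations can be sketched in Python as follows. The skew formula used here is the sample skewness that Excel's SKEW function documents, n / ((n-1)(n-2)) * sum(((x - mean) / SD)^3); I am assuming this is the formula these notes refer to, since the notes say ezANOVA matches Excel. The function names and the example values are hypothetical.

```python
import math
import statistics

def excel_skew(values):
    """Sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/sd)**3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skew needs at least three observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def z_skew(values):
    """Skew divided by its approximate standard error, sqrt(6/n)."""
    return excel_skew(values) / math.sqrt(6 / len(values))

# Hypothetical scores with one slow outlier (positive skew).
skewed = [1, 2, 3, 4, 10]
symmetric = [1, 2, 3, 4, 5]
print(f"skewed:    Skew = {excel_skew(skewed):.3f}, zSkew = {z_skew(skewed):.3f}")
print(f"symmetric: Skew = {excel_skew(symmetric):.3f}, zSkew = {z_skew(symmetric):.3f}")
```

With only five observations the standard error of the skew is large, so even a visibly lopsided sample can stay inside the -1.96..1.96 range.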