16|ANOVA part 2

Overview

ANOVA terminology

  • Factor
    • The independent variable
    • Designates the groups being compared
    • Between-participants / within-participants
  • Levels
    • Conditions that make up a factor
  • E.g.
    • “1x3 between-participants ANOVA”: one factor with three levels, each participant in only one condition

Manipulation

| 🍌 Banana | 🍬 Candy | 😐 Control |
| --- | --- | --- |
| 9 | 3 | 5 |
| 11 | 5 | 6 |
| 13 | 4 | 7 |
| \(M = 11\) | \(M = 4\) | \(M = 6\) |

Calculating ANOVA

  • Step 1: Calculate \(SS\) and \(df\) (total, within, between)

\(SS_{total}\)

\(SS_{within}\) \(SS_{between}\)

\(df_{total}\)

\(df_{within}\) \(df_{between}\)

  • Step 2: Calculate variances (Mean Squares)

\(MS_{between} = \dfrac{SS_{between}}{df_{between}}\)

\(MS_{within} = \dfrac{SS_{within}}{df_{within}}\)

  • Step 3: Calculate \(F\)-ratio

\(F = \dfrac{MS_{between}}{MS_{within}}\)

\(SS\) and \(df\)

\(SS_{total} = \Sigma X^2 - \dfrac{G^2}{N}\) \(df_{total} = N-1\)

\(SS_{within} = \Sigma SS_{each \ treatment}\) \(df_{within} = N-k\)

\(SS_{between} = \Sigma \dfrac{T^2}{n} - \dfrac{G^2}{N}\) \(df_{between} = k-1\)

| Symbol | Meaning |
| --- | --- |
| \(k\) | Number of treatment conditions |
| \(n_1, n_2, \ldots\) | Number of scores in each treatment |
| \(N\) | Total number of scores |
| \(T_1, T_2, \ldots\) | Sum of scores \((\Sigma X)\) for each treatment |
| \(G\) | Grand total of all scores in the study |
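
The computational formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `anova_sums_of_squares` and the dictionary keys are made up for this example, and the data are the banana/candy/control scores from the manipulation table.

```python
def anova_sums_of_squares(groups):
    """Return SS and df (total, within, between) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    N = len(all_scores)                      # total number of scores
    k = len(groups)                          # number of treatment conditions
    G = sum(all_scores)                      # grand total of all scores
    sum_x2 = sum(x * x for x in all_scores)  # sum of squared scores

    ss_total = sum_x2 - G**2 / N
    # SS within: add up each treatment's own SS, computed as sum(X^2) - T^2/n
    ss_within = sum(sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups)
    # SS between: sum of T^2/n over treatments, minus G^2/N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G**2 / N

    return {
        "ss_total": ss_total,
        "ss_within": ss_within,
        "ss_between": ss_between,
        "df_total": N - 1,
        "df_within": N - k,
        "df_between": k - 1,
    }


# Banana, candy, and control scores from the manipulation table
groups = [[9, 11, 13], [3, 5, 4], [5, 6, 7]]
result = anova_sums_of_squares(groups)
print(result)
```

A useful sanity check is that the within and between components add up to the total, for both \(SS\) and \(df\).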

Summary table

| Source | \(SS\) | \(df\) | \(MS\) | \(F\) |
| --- | --- | --- | --- | --- |
| Between treatments | | | | |
| Within treatments | | | | |
| Total | | | | |

Hypothesis test

Step 1. State hypotheses

  • \(H_0: \mu_1 = \mu_2 = \mu_3\)
    • No treatment effect
    • Numerator & denominator should be about the same
    • \(F\) should be near \(1.00\)
  • \(H_1\) : At least one population mean differs from another
    • There is some treatment effect
    • Numerator bigger than denominator
    • \(F\) should be noticeably larger than \(1.00\)

Step 2. Critical region

  • Like \(t\) distributions, there is a different \(F\) distribution for each value of \(df\)
    • Now we have two different \(df\) values
    • \(df\) numerator \((df_{between})\)
    • \(df\) denominator \((df_{within})\)
    • Note: the distribution is not symmetrical (it is positively skewed)
    • \(F\) values are always positive

\(F\) table

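Critical values of \(F\) for \(\alpha = .05\) (columns: \(df_{numerator}\); rows: \(df_{denominator}\))

| \(df_{denominator}\) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 161.45 | 199.50 | 215.71 | 224.58 | 230.16 | 233.99 | 236.77 | 238.88 | 240.54 | 241.88 |
| 2 | 18.51 | 19.00 | 19.16 | 19.25 | 19.30 | 19.33 | 19.35 | 19.37 | 19.39 | 19.40 |
| 3 | 10.13 | 9.55 | 9.28 | 9.12 | 9.01 | 8.94 | 8.89 | 8.85 | 8.81 | 8.79 |
| 4 | 7.71 | 6.94 | 6.59 | 6.39 | 6.26 | 6.16 | 6.09 | 6.04 | 6.00 | 5.96 |
| 5 | 6.61 | 5.79 | 5.41 | 5.19 | 5.05 | 4.95 | 4.88 | 4.82 | 4.77 | 4.74 |
| 6 | 5.99 | 5.14 | 4.76 | 4.53 | 4.39 | 4.28 | 4.21 | 4.15 | 4.10 | 4.06 |
| 7 | 5.59 | 4.74 | 4.35 | 4.12 | 3.97 | 3.87 | 3.79 | 3.73 | 3.68 | 3.64 |
| 8 | 5.32 | 4.46 | 4.07 | 3.84 | 3.69 | 3.58 | 3.50 | 3.44 | 3.39 | 3.35 |
| 9 | 5.12 | 4.26 | 3.86 | 3.63 | 3.48 | 3.37 | 3.29 | 3.23 | 3.18 | 3.14 |
| 10 | 4.96 | 4.10 | 3.71 | 3.48 | 3.33 | 3.22 | 3.13 | 3.07 | 3.02 | 2.98 |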
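
Reading the table amounts to picking a column by \(df_{numerator}\) and a row by \(df_{denominator}\). A small Python sketch of that lookup, assuming only the first three columns of the \(\alpha = .05\) table are needed (the names `F_CRITICAL_05` and `critical_f` are hypothetical, chosen for this example):

```python
# Critical F values at alpha = .05, copied from the F table.
# Keys are df_numerator; list position i holds df_denominator = i + 1.
F_CRITICAL_05 = {
    1: [161.45, 18.51, 10.13, 7.71, 6.61, 5.99, 5.59, 5.32, 5.12, 4.96],
    2: [199.50, 19.00, 9.55, 6.94, 5.79, 5.14, 4.74, 4.46, 4.26, 4.10],
    3: [215.71, 19.16, 9.28, 6.59, 5.41, 4.76, 4.35, 4.07, 3.86, 3.71],
}


def critical_f(df_numerator, df_denominator):
    """Look up the critical F value for the given pair of df values."""
    if df_numerator not in F_CRITICAL_05:
        raise ValueError("df_numerator not included in this partial table")
    column = F_CRITICAL_05[df_numerator]
    if not 1 <= df_denominator <= len(column):
        raise ValueError("df_denominator not included in this partial table")
    return column[df_denominator - 1]


# For k = 3 treatments and N = 9 scores: df_between = 2, df_within = 6
print(critical_f(2, 6))
```

For the three-group example with nine scores, the lookup uses \(df_{between} = 2\) and \(df_{within} = 6\).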

Step 3. Calculate test statistic

Manipulation

| 🍌 Banana | 🍬 Candy | 😐 Control |
| --- | --- | --- |
| 9 | 3 | 5 |
| 11 | 5 | 6 |
| 13 | 4 | 7 |
| \(M = 11\) | \(M = 4\) | \(M = 6\) |

\(N = 9\)
\(n = 3\)
\(k = 3\)
\(G = 63\)
\(\Sigma X^2 = 531\)

Step 3. \(SS\) and \(df\)s

\(SS_{total} = \Sigma X^2 - \dfrac{G^2}{N} = 531 - \dfrac{63^2}{9} = 90\)

\(SS_{within} = \Sigma SS_{each \ treatment} = 8+2+2 = 12\)

\(SS_{between} = \Sigma \dfrac{T^2}{n} - \dfrac{G^2}{N} = \dfrac{33^2}{3}+\dfrac{12^2}{3}+\dfrac{18^2}{3} - \dfrac{63^2}{9} = 78\)

\(df_{total} = N-1 = 8\)

\(df_{within} = N-k = 6\)

\(df_{between} = k-1 = 2\)
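
The per-treatment values \(8\), \(2\), and \(2\) in \(SS_{within}\) are stated without derivation. One way to obtain them, assuming the definitional formula \(SS = \Sigma (X - M)^2\) applied within each treatment:

\[\begin{align} SS_{banana} &= (9-11)^2 + (11-11)^2 + (13-11)^2 = 4 + 0 + 4 = 8 \\ SS_{candy} &= (3-4)^2 + (5-4)^2 + (4-4)^2 = 1 + 1 + 0 = 2 \\ SS_{control} &= (5-6)^2 + (6-6)^2 + (7-6)^2 = 1 + 0 + 1 = 2 \end{align}\]

As a check, the components add up to the totals: \(SS_{within} + SS_{between} = 12 + 78 = 90 = SS_{total}\) and \(df_{within} + df_{between} = 6 + 2 = 8 = df_{total}\).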

Step 3. \(MS\) and \(F\)

\(MS_{between} = \dfrac{SS_{between}}{df_{between}} = 39\)

\(MS_{within} = \dfrac{SS_{within}}{df_{within}} = 2\)

\(F = \dfrac{MS_{between}}{MS_{within}} = \dfrac{39}{2} = 19.5\)
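
The same arithmetic can be laid out as the ANOVA summary table. A short Python sketch, using the \(SS\) and \(df\) values from the worked example (variable names are illustrative only):

```python
# SS and df values from the worked example
ss_between, ss_within, ss_total = 78, 12, 90
df_between, df_within, df_total = 2, 6, 8

# Mean squares are SS divided by the matching df
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the two variance estimates
f_ratio = ms_between / ms_within

# Print the summary table
print(f"{'Source':<20}{'SS':>6}{'df':>4}{'MS':>8}{'F':>8}")
print(f"{'Between treatments':<20}{ss_between:>6}{df_between:>4}{ms_between:>8.2f}{f_ratio:>8.2f}")
print(f"{'Within treatments':<20}{ss_within:>6}{df_within:>4}{ms_within:>8.2f}")
print(f"{'Total':<20}{ss_total:>6}{df_total:>4}")
```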

Step 4. Make decision

  • \(F > F_{critical}\)?
    • Reject or fail to reject \(H_0\)
  • Step 4b. Effect size
    • Compute percentage of variance accounted for by treatment
    • \(r^2\) concept (proportion of variance explained)
    • For ANOVA called \(\eta^2\) (“eta squared”)

\(\eta^2 = \dfrac{SS_{between}}{SS_{total}} = 0.87\)
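
The decision and effect size for the worked example can be sketched as follows. The critical value \(5.14\) is read from the \(\alpha = .05\) \(F\) table at \(df = (2, 6)\); the variable names are illustrative only.

```python
f_ratio = 19.5      # from the worked example
f_critical = 5.14   # F table, alpha = .05, df_numerator = 2, df_denominator = 6

# Reject H0 when the obtained F exceeds the critical value
reject_null = f_ratio > f_critical
print("Reject H0" if reject_null else "Fail to reject H0")

# Eta squared: proportion of total variability accounted for by treatment
ss_between, ss_total = 78, 90
eta_squared = ss_between / ss_total
print(round(eta_squared, 2))
```

Here \(F = 19.5\) exceeds \(5.14\), so \(H_0\) is rejected, and \(78/90\) rounds to \(0.87\).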

Report results

  • Descriptives
    • Treatment means and standard deviations are presented in text, table and/or graph
  • Hypothesis test outcome
    • Results of ANOVA are summarized, including
    • \(F\) and \(df\) values, \(p\), \(\eta^2\) (if significant)

A single-factor, independent-samples ANOVA revealed a significant difference among the banana (\(M = 11\), \(SD = 2\)), candy bar (\(M = 4\), \(SD = 1\)), and control (\(M = 6\), \(SD = 1\)) conditions, \(F(2, 6) = 19.5\), \(p < .05\), \(\eta^2 = 0.87\).

Post-hoc tests

  • Post hoc tests compare two means at a time
    • Pairwise comparisons
    • Each comparison includes risk of a Type I error
    • Risk of Type I error accumulates
    • Experimentwise alpha level, \(\alpha_{experimentwise}\)
  • Post-hoc tests use special methods to control experimentwise Type I error rate
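
To see how the risk accumulates, a rough sketch in Python: with \(k\) treatments there are \(k(k-1)/2\) pairwise comparisons, and if each were an independent test at level \(\alpha\) (a simplifying assumption that is not stated in the slides, since pairwise comparisons share data), the chance of at least one Type I error would be \(1 - (1 - \alpha)^c\). The function name is hypothetical.

```python
from math import comb


def experimentwise_alpha(alpha, k):
    """Approximate chance of at least one Type I error across all pairwise
    comparisons among k treatments, assuming independent comparisons."""
    c = comb(k, 2)  # number of pairwise comparisons
    return 1 - (1 - alpha) ** c


# Three treatments -> three pairwise comparisons
print(round(experimentwise_alpha(0.05, 3), 4))
```

With three treatments and \(\alpha = .05\), the approximate experimentwise rate is about \(.14\), nearly three times the per-comparison rate.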

Tukey’s \(HSD\)

  • Tukey’s Honestly Significant Difference
    • Minimum difference between pairs of treatment means so that \(p < \alpha_{experimentwise}\)
    • \(q\) is the Studentized Range statistic
    • Depends on \(\alpha\), \(k\), and \(df\) for denominator
    • Find \(q\) in table or R

\[\begin{align} HSD &= q \sqrt{\dfrac{MS_{within}}{n}} \\ &= 4.34 \sqrt{\dfrac{2}{3}} \\ &= 3.54 \end{align}\]

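Studentized range statistic \(q\) for \(\alpha = .05\) (columns: number of conditions; rows: \(df\) for denominator)

| \(df\) | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- |
| 5 | 3.64 | 4.60 | 5.22 | 5.67 | 6.03 |
| 6 | 3.46 | 4.34 | 4.90 | 5.30 | 5.63 |
| 7 | 3.34 | 4.16 | 4.68 | 5.06 | 5.36 |
| 8 | 3.26 | 4.04 | 4.53 | 4.89 | 5.17 |
| 9 | 3.20 | 3.95 | 4.41 | 4.76 | 5.02 |
| 10 | 3.15 | 3.88 | 4.33 | 4.65 | 4.91 |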

Manipulation

| 🍌 Banana | 🍬 Candy | 😐 Control |
| --- | --- | --- |
| 9 | 3 | 5 |
| 11 | 5 | 6 |
| 13 | 4 | 7 |
| \(M = 11\) | \(M = 4\) | \(M = 6\) |
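
Applying Tukey's \(HSD\) means comparing each pairwise difference between treatment means against the \(HSD\) value. A minimal Python sketch, using \(q = 4.34\) (table value for 3 conditions and \(df = 6\)), \(MS_{within} = 2\), and \(n = 3\) from the example; the variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt

means = {"banana": 11, "candy": 4, "control": 6}
q = 4.34        # Studentized range statistic for 3 conditions, df = 6
ms_within = 2
n = 3           # scores per treatment

hsd = q * sqrt(ms_within / n)
print(f"HSD = {hsd:.2f}")

# A pair differs significantly when its mean difference exceeds HSD
significant = {}
for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
    difference = abs(mean_a - mean_b)
    significant[(name_a, name_b)] = difference > hsd
    verdict = "significant" if difference > hsd else "not significant"
    print(f"{name_a} vs {name_b}: difference = {difference}, {verdict}")
```

With these numbers, the banana mean differs from both the candy mean (difference of 7) and the control mean (difference of 5), while candy and control (difference of 2) do not differ significantly.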

Assumptions

  1. Observations within each sample must be independent
  2. Populations from which the samples are selected must be normally distributed
  3. Populations from which the samples are selected must have equal variances (homogeneity of variance)
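
For the equal-variance assumption, a quick descriptive look is to compare the sample variances across treatments. The sketch below does only that; the ratio of largest to smallest variance is an informal check added here for illustration, not a formal test and not part of the slides.

```python
from statistics import variance

groups = {
    "banana": [9, 11, 13],
    "candy": [3, 5, 4],
    "control": [5, 6, 7],
}

# Sample variance (n - 1 in the denominator) for each treatment
variances = {name: variance(scores) for name, scores in groups.items()}
print(variances)

# Informal check: ratio of the largest to the smallest sample variance
variance_ratio = max(variances.values()) / min(variances.values())
print(variance_ratio)
```

With only three scores per treatment, these sample variances are very noisy estimates, so the comparison is suggestive at best.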

Learning checks

  • For independent-samples (between-participants) ANOVA, what do the following represent?
    • \(N\)
    • \(n_1, n_2, n_3\), etc.
    • \(G\)
    • \(T\)
    • \(MS_{between}\)
    • \(MS_{within}\)