INDEPENDENT-SAMPLES \(t\)-TEST

Overview

Research design

Single-sample \(t\)-test design

  • Compare sample against expected population mean based on logic/theory/scale design
  • E.g. give everyone $10 💵

What is your current level of happiness? (1–5 rating scale)

\(\mu = 3\)

Single-sample \(t\)-test logic

  • Partially known original population: \(\mu\)
  • Unknown treated population
  • Treated sample: \(n, M, SD\)

Independent-samples design

What if… give everyone $10

Group A:
Spend this on yourself 💵

What is your current level of happiness? (1–5 rating scale)

Group B:
Spend this on someone else 💵

What is your current level of happiness? (1–5 rating scale)
Dunn, E. W., Aknin, L. B., & Norton, M. I. (2014). Prosocial spending and happiness: Using money to benefit others pays off. Current Directions in Psychological Science, 23(1), 41-47. https://doi.org/10.1177/0963721413512503

Independent-samples \(t\)-test logic

  • Sample A \((n, M, SD)\), drawn from an unknown treated population A
  • Sample B \((n, M, SD)\), drawn from an unknown treated population B

Calculation

Basic structure of all \(t\)-tests:

\(t = \dfrac{ \text{sample statistic} - \text{population parameter} }{\text{estimated standard error}}\)

\(t = \dfrac{ \text{how different was observed from predicted?} }{\text{how big a difference would we expect by chance?}}\)

\(t = \dfrac{ \text{data} - \text{hypothesis} }{\text{error}}\)

Single-sample \(t\)

\(t = \dfrac{M-\mu}{s_M}\)

Independent-samples \(t\)

\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)

Equations

  • Denominator: \(s_{M_1-M_2}\)
    • Estimated standard error of the mean difference

\(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}}\)

  • \(s_p^2\): Pooled variance
    • Weighted average of two sample variances

\[\begin{align} s_p^2 &= \dfrac{SS_1+SS_2}{df_1+df_2} \\ \text{or equivalently: } &\dfrac{df_1 \cdot s_1^2 + df_2 \cdot s_2^2}{df_1+df_2} \\ \text{because } s^2 &= \dfrac{SS}{df}, \text{ so } SS = df \cdot s^2 \end{align}\]

Calculating independent-samples \(t\)

  1. Pooled variance: \(s_p^2 = \dfrac{SS_1+SS_2}{df_1+df_2}\)
  2. Estimated standard error of mean difference: \(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}}\)
  3. \(t\) statistic: \(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)
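As a sketch, these three steps can be carried out directly in base R, using the two five-person samples analyzed later in this lecture:

```r
# Happiness ratings (same values as in the t.test() demo later)
conditionA <- c(1, 5, 2, 4, 3)  # spend on self
conditionB <- c(5, 5, 2, 5, 3)  # spend on other

n1 <- length(conditionA); n2 <- length(conditionB)
SS1 <- sum((conditionA - mean(conditionA))^2)  # sum of squares, group A
SS2 <- sum((conditionB - mean(conditionB))^2)  # sum of squares, group B

# Step 1: pooled variance
sp2 <- (SS1 + SS2) / ((n1 - 1) + (n2 - 1))  # 2.25

# Step 2: estimated standard error of the mean difference
se <- sqrt(sp2 / n1 + sp2 / n2)  # about 0.95

# Step 3: t statistic (null hypothesis: mu1 - mu2 = 0)
t_stat <- ((mean(conditionA) - mean(conditionB)) - 0) / se
round(t_stat, 2)  # -1.05
```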

Hypothesis test

Step 1: State hypotheses

  • Null: There is no difference between groups
    • The treatment has no effect
    • \(\mu_1 - \mu_2 = 0\)
  • Alternative: There is a difference
    • The treatment has an effect
    • Directional: \(\mu_1 - \mu_2 < 0\) or \(\mu_1 - \mu_2 > 0\)
    • Nondirectional: \(\mu_1 - \mu_2 \ne 0\)

Step 2: Define critical region

  • Depends on \(\alpha\) and \(df\)

\[ \begin{align} df &= df_1 + df_2 \\ &= (n_1 - 1) + (n_2 - 1) \\ &= N - 2 \end{align} \]

Critical values of \(t\) (proportion in 1 tail: 0.025; proportion in 2 tails: 0.05):

\(df\)   critical \(t\)
1        12.706
2        4.303
3        3.182
4        2.776
5        2.571
6        2.447
7        2.365
8        2.306
9        2.262
10       2.228
11       2.201
12       2.179
13       2.160
14       2.145
15       2.131
...      ...
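These critical values can be reproduced in R with the qt() quantile function (shown here for the \(df = 8\) row used in this example):

```r
# Two-tailed alpha = .05 leaves .025 in the upper tail,
# so the critical value is the .975 quantile of the t distribution
crit <- qt(0.975, df = 8)
round(crit, 3)  # 2.306
```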

Step 3: Calculate \(t\) statistic

Spend $10 on self

\(X\) \(X-M\) \((X-M)^2\)
1 -2 4
5 2 4
2 -1 1
4 1 1
3 0 0
\(M = 3.00\) \(SS = 10.00\)
\(s^2 = 2.50\)
\(s = 1.58\)

Spend $10 on other

\(X\) \(X-M\) \((X-M)^2\)
5 1 1
5 1 1
2 -2 4
5 1 1
3 -1 1
\(M = 4.00\) \(SS = 8.00\)
\(s^2 = 2.00\)
\(s = 1.41\)

Step 3: Calculate \(t\) statistic

\(s_p^2 = \dfrac{SS_1+SS_2}{df_1+df_2} = \dfrac{10 + 8}{4 + 4} = 2.25\)


\(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}} = \sqrt{\dfrac{2.25}{5}+\dfrac{2.25}{5}} = 0.95\)


\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}} = \dfrac{3 - 4}{0.95} = -1.05\)

Step 4: Make decision

  • The observed \(t = -1.05\) does not exceed the critical value of \(\pm 2.306\), so it does not fall in the critical region
  • Fail to reject the null hypothesis

Step 4b: Effect size

  • Cohen’s \(d\) for independent samples

\[\begin{align} d &= \dfrac{\text{difference between means}}{\text{pooled standard deviation}} \\ &= \dfrac{(M_1 - M_2) - (\mu_1 - \mu_2)}{\sqrt{s^2_p}} \\ &= \dfrac{3 - 4}{\sqrt{2.25}} \\ &= -0.67 \end{align}\]

  • Note: not required for nonsignificant differences
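As a quick check, the Cohen's \(d\) derivation above can be reproduced in R (values taken from the worked example):

```r
M1 <- 3; M2 <- 4  # group means ("spend on self", "spend on other")
sp2 <- 2.25       # pooled variance from Step 1

# Cohen's d: mean difference divided by pooled standard deviation
d <- (M1 - M2) / sqrt(sp2)
round(d, 2)  # -0.67
```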

Step 5: Report results

A two-tailed independent-samples \(t\)-test suggested that the difference in average happiness between people in the “Spend on self” group \((M = 3\), \(SD = 1.58)\) and the “Spend on other” group \((M = 4\), \(SD = 1.41)\) was nonsignificant, \(t(8) = -1.05\), \(p > .05\).

Assumptions

Assumptions for independent-samples \(t\)-tests

  1. The observations within each sample must be independent
  2. The two populations from which the samples are selected must be normal
    • Can be ignored with large enough sample size
  3. The two populations from which the samples are selected must have equal variances
    • Homogeneity of variance
    • Because pooled variance is a weighted average

Homogeneity of variance

  • Testing the homogeneity of variance assumption
    • Hartley’s F-max test

\(F_{max} = \dfrac{s^2_{largest}}{s^2_{smallest}}\)

  • A value near 1 indicates similar sample variances; larger values indicate a greater difference between them
  • Look up the associated critical value for the \(F\)-max test
  • If the computed value exceeds the critical value, the homogeneity assumption has been violated
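For the two samples in this example, the \(F\)-max statistic is easy to compute by hand; a minimal sketch in base R:

```r
s2_A <- 2.50  # sample variance, "spend on self" group
s2_B <- 2.00  # sample variance, "spend on other" group

# Ratio of largest to smallest sample variance
Fmax <- max(s2_A, s2_B) / min(s2_A, s2_B)
Fmax  # 1.25 -- near 1, so no evidence the assumption is violated
```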

Homogeneity of variance correction

  • If homogeneity of variance assumption is violated…
    • Calculate standard error without pooled variance
    • Adjust \(df\) using equation:

\[df = \dfrac{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)^2} {\dfrac{\left(\dfrac{s_1^2}{n_1}\right)^2}{n_1-1} + \dfrac{\left(\dfrac{s_2^2}{n_2}\right)^2}{n_2-1} }\]
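Plugging the example variances into this adjusted-\(df\) (Welch–Satterthwaite) formula reproduces the fractional \(df\) reported by t.test() below:

```r
s2_1 <- 2.50; n1 <- 5  # "spend on self": variance and sample size
s2_2 <- 2.00; n2 <- 5  # "spend on other": variance and sample size

# Welch-Satterthwaite adjusted degrees of freedom
df_welch <- (s2_1 / n1 + s2_2 / n2)^2 /
  ((s2_1 / n1)^2 / (n1 - 1) + (s2_2 / n2)^2 / (n2 - 1))
round(df_welch, 4)  # 7.9024
```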

Homogeneity of variance correction

  • …or let R do the work for you
    • t.test() function automatically applies correction
    • Specify var.equal = TRUE to override
conditionA <- c(1, 5, 2, 4, 3)
conditionB <- c(5, 5, 2, 5, 3)

t.test(x = conditionA, y = conditionB)

    Welch Two Sample t-test

data:  conditionA and conditionB
t = -1.0541, df = 7.9024, p-value = 0.323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.192378  1.192378
sample estimates:
mean of x mean of y 
        3         4 
conditionA <- c(1, 5, 2, 4, 3)
conditionB <- c(5, 5, 2, 5, 3)

t.test(x = conditionA, y = conditionB, 
       var.equal = TRUE)

    Two Sample t-test

data:  conditionA and conditionB
t = -1.0541, df = 8, p-value = 0.3226
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.187668  1.187668
sample estimates:
mean of x mean of y 
        3         4 

Confidence interval

  • Quantifies precision of point estimate
    • Distribution of differences between two samples drawn from identical populations

Rearrange \(t\) equation:

\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)

To solve for parameter \((\mu_1 - \mu_2)\):

\((\mu_1-\mu_2) = (M_1-M_2) \pm t \cdot s_{M_1-M_2}\)

Confidence interval

\[\begin{align} (\mu_1 - \mu_2) &= (M_1 - M_2) \pm t \cdot s_{M_1 - M_2} \\ &= -1 \pm 2.31 \cdot 0.95 \\ &= -3.19, 1.19 \end{align}\]
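Under the equal-variance assumption, this interval can be reproduced in R with qt() supplying the critical \(t\) (values from the worked example):

```r
mean_diff <- 3 - 4            # M1 - M2
se <- sqrt(2.25/5 + 2.25/5)   # estimated standard error of the mean difference
crit <- qt(0.975, df = 8)     # critical t for a two-tailed 95% interval

ci <- mean_diff + c(-1, 1) * crit * se
round(ci, 2)  # -3.19  1.19
```

These limits match the 95 percent confidence interval from t.test(..., var.equal = TRUE) shown above.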

Learning checks

  1. A researcher performing an independent-samples \(t\)-test finds a difference between groups of 2.3, and calculates a 95% confidence interval of [0.3, 4.3]. Can you predict whether the hypothesis test will reject the null hypothesis?
    • It will reject the null
    • It will fail to reject the null
    • Cannot make a prediction
  2. A confidence interval for an independent-samples test that includes a value of zero within its range establishes that the true population parameter is 95% certain to be zero.
    • True
    • False