INDEPENDENT-SAMPLES \(t\)-TEST

Overview

Research design

Single-sample \(t\)-test design

  • Compare sample against expected population mean based on logic/theory/scale design
  • E.g. give everyone $10 💵

What is your current level of happiness? (1–5 rating scale)

\(\mu = 3\)

Single-sample \(t\)-test logic

  • Partially known original population: \(\mu\)
  • Unknown treated population
  • Treated sample: \(n, M, SD\)

Independent-samples design

What if… give everyone $10

Group A:
Spend this on yourself 💵

What is your current level of happiness? (1–5 rating scale)

Group B:
Spend this on someone else 💵

What is your current level of happiness? (1–5 rating scale)
Dunn, E. W., Aknin, L. B., & Norton, M. I. (2014). Prosocial spending and happiness: Using money to benefit others pays off. Current Directions in Psychological Science, 23(1), 41-47. https://doi.org/10.1177/0963721413512503

Independent-samples \(t\)-test logic

  • Sample A \((n, M, SD)\), drawn from an unknown treated population A
  • Sample B \((n, M, SD)\), drawn from an unknown treated population B

Calculation

Basic structure of all \(t\)-tests:

\(t = \dfrac{ \text{sample statistic} - \text{population parameter} }{\text{estimated standard error}}\)

\(t = \dfrac{ \text{how different was observed from predicted?} }{\text{how big a difference would we expect by chance?}}\)

\(t = \dfrac{ \text{data} - \text{hypothesis} }{\text{error}}\)

Single-sample \(t\)

\(t = \dfrac{M-\mu}{s_M}\)

Independent-samples \(t\)

\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)

Equations

  • Denominator: \(s_{M_1-M_2}\)
    • Estimated standard error of the mean difference

\(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}}\)

  • \(s_p^2\): Pooled variance
    • Weighted average of two sample variances

\[\begin{align} s_p^2 &= \dfrac{SS_1+SS_2}{df_1+df_2} \\ \text{or equivalently: } &\dfrac{df_1 \cdot s_1^2 + df_2 \cdot s_2^2}{df_1+df_2} \\ \text{because } s^2 &= \dfrac{SS}{df}, \text{ so } SS = df \cdot s^2 \end{align}\]

Calculating independent-samples \(t\)

  1. Pooled variance: \(s_p^2 = \dfrac{SS_1+SS_2}{df_1+df_2}\)
  2. Estimated standard error of mean difference: \(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}}\)
  3. \(t\) statistic: \(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)
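As a sketch, these three steps can be carried out directly in base R, using the two five-person samples analyzed later in this lecture:

```r
# Happiness ratings (same values as in the t.test() demo later)
conditionA <- c(1, 5, 2, 4, 3)  # spend on self
conditionB <- c(5, 5, 2, 5, 3)  # spend on other

n1 <- length(conditionA); n2 <- length(conditionB)
SS1 <- sum((conditionA - mean(conditionA))^2)  # sum of squares, group A
SS2 <- sum((conditionB - mean(conditionB))^2)  # sum of squares, group B

# Step 1: pooled variance
sp2 <- (SS1 + SS2) / ((n1 - 1) + (n2 - 1))  # 2.25

# Step 2: estimated standard error of the mean difference
se <- sqrt(sp2 / n1 + sp2 / n2)  # about 0.95

# Step 3: t statistic (null hypothesis: mu1 - mu2 = 0)
t_stat <- ((mean(conditionA) - mean(conditionB)) - 0) / se
round(t_stat, 2)  # -1.05
```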

Hypothesis test

Step 1: State hypotheses

  • Null: There is no difference between groups
    • The treatment has no effect
    • \(\mu_1 - \mu_2 = 0\)
  • Alternative: There is a difference
    • The treatment has an effect
    • Directional: \(\mu_1 - \mu_2 < 0\) or \(\mu_1 - \mu_2 > 0\)
    • Nondirectional: \(\mu_1 - \mu_2 \ne 0\)

Step 2: Define critical region

  • Depends on \(\alpha\) and \(df\)

\[ \begin{align} df &= df_1 + df_2 \\ &= (n_1 - 1) + (n_2 - 1) \\ &= N - 2 \end{align} \]

Critical values of \(t\) (proportion in 1 tail: 0.025; proportion in 2 tails: 0.05):

\(df\)   critical \(t\)
1        12.706
2        4.303
3        3.182
4        2.776
5        2.571
6        2.447
7        2.365
8        2.306
9        2.262
10       2.228
11       2.201
12       2.179
13       2.160
14       2.145
15       2.131
...      ...
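These critical values can be reproduced in R with the qt() quantile function (shown here for the \(df = 8\) row used in this example):

```r
# Two-tailed alpha = .05 leaves .025 in the upper tail,
# so the critical value is the .975 quantile of the t distribution
crit <- qt(0.975, df = 8)
round(crit, 3)  # 2.306
```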

Step 3: Calculate \(t\) statistic

Spend $10 on self

\(X\) \(X-M\) \((X-M)^2\)
1 -2 4
5 2 4
2 -1 1
4 1 1
3 0 0
\(M = 3.00\) \(SS = 10.00\)
\(s^2 = 2.50\)
\(s = 1.58\)

Spend $10 on other

\(X\) \(X-M\) \((X-M)^2\)
5 1 1
5 1 1
2 -2 4
5 1 1
3 -1 1
\(M = 4.00\) \(SS = 8.00\)
\(s^2 = 2.00\)
\(s = 1.41\)

Step 3: Calculate \(t\) statistic

\(s_p^2 = \dfrac{SS_1+SS_2}{df_1+df_2} = \dfrac{10 + 8}{4 + 4} = 2.25\)


\(s_{M_1-M_2} = \sqrt{\dfrac{s_p^2}{n_1}+\dfrac{s_p^2}{n_2}} = \sqrt{\dfrac{2.25}{5}+\dfrac{2.25}{5}} = 0.95\)


\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}} = \dfrac{3 - 4}{0.95} = -1.05\)

Step 4: Make decision

  • The observed \(t = -1.05\) does not exceed the critical value of \(\pm 2.306\), so it does not fall in the critical region
  • Fail to reject the null hypothesis

Step 4b: Effect size

  • Cohen’s \(d\) for independent samples

\[\begin{align} d &= \dfrac{\text{difference between means}}{\text{pooled standard deviation}} \\ &= \dfrac{(M_1 - M_2) - (\mu_1 - \mu_2)}{\sqrt{s^2_p}} \\ &= \dfrac{3 - 4}{\sqrt{2.25}} \\ &= -0.67 \end{align}\]

  • Note: not required for nonsignificant differences
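As a quick check, the Cohen's \(d\) derivation above can be reproduced in R (values taken from the worked example):

```r
M1 <- 3; M2 <- 4  # group means ("spend on self", "spend on other")
sp2 <- 2.25       # pooled variance from Step 1

# Cohen's d: mean difference divided by pooled standard deviation
d <- (M1 - M2) / sqrt(sp2)
round(d, 2)  # -0.67
```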

Step 5: Report results

A two-tailed independent-samples \(t\)-test suggested that the difference in average happiness between people in the “Spend on self” group \((M = 3\), \(SD = 1.58)\) and the “Spend on other” group \((M = 4\), \(SD = 1.41)\) was nonsignificant, \(t(8) = -1.05\), \(p > .05\).

Assumptions

Assumptions for independent-samples \(t\)-tests

  1. The observations within each sample must be independent
  2. The two populations from which the samples are selected must be normal
    • Can be ignored with large enough sample size
  3. The two populations from which the samples are selected must have equal variances
    • Homogeneity of variance
    • Because pooled variance is a weighted average

Homogeneity of variance

  • Testing the homogeneity of variance assumption
    • Hartley’s F-max test

\(F_{max} = \dfrac{s^2_{largest}}{s^2_{smallest}}\)

  • A value near 1 indicates similar sample variances; larger values indicate a greater difference between them
  • Look up the associated critical value for the \(F\)-max test
  • If the computed value exceeds the critical value, the homogeneity assumption has been violated
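For the two samples in this example, the \(F\)-max statistic is easy to compute by hand; a minimal sketch in base R:

```r
s2_A <- 2.50  # sample variance, "spend on self" group
s2_B <- 2.00  # sample variance, "spend on other" group

# Ratio of largest to smallest sample variance
Fmax <- max(s2_A, s2_B) / min(s2_A, s2_B)
Fmax  # 1.25 -- near 1, so no evidence the assumption is violated
```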

Homogeneity of variance correction

  • If homogeneity of variance assumption is violated…
    • Calculate standard error without pooled variance
    • Adjust \(df\) using equation:

\[df = \dfrac{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)^2} {\dfrac{\left(\dfrac{s_1^2}{n_1}\right)^2}{n_1-1} + \dfrac{\left(\dfrac{s_2^2}{n_2}\right)^2}{n_2-1} }\]
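Plugging the example variances into this adjusted-\(df\) (Welch–Satterthwaite) formula reproduces the fractional \(df\) reported by t.test() below:

```r
s2_1 <- 2.50; n1 <- 5  # "spend on self": variance and sample size
s2_2 <- 2.00; n2 <- 5  # "spend on other": variance and sample size

# Welch-Satterthwaite adjusted degrees of freedom
df_welch <- (s2_1 / n1 + s2_2 / n2)^2 /
  ((s2_1 / n1)^2 / (n1 - 1) + (s2_2 / n2)^2 / (n2 - 1))
round(df_welch, 4)  # 7.9024
```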

Homogeneity of variance correction

  • …or let R do the work for you
    • t.test() function automatically applies correction
    • Specify var.equal = TRUE to override
conditionA <- c(1, 5, 2, 4, 3)
conditionB <- c(5, 5, 2, 5, 3)

t.test(x = conditionA, y = conditionB)

    Welch Two Sample t-test

data:  conditionA and conditionB
t = -1.0541, df = 7.9024, p-value = 0.323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.192378  1.192378
sample estimates:
mean of x mean of y 
        3         4 
conditionA <- c(1, 5, 2, 4, 3)
conditionB <- c(5, 5, 2, 5, 3)

t.test(x = conditionA, y = conditionB, 
       var.equal = TRUE)

    Two Sample t-test

data:  conditionA and conditionB
t = -1.0541, df = 8, p-value = 0.3226
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.187668  1.187668
sample estimates:
mean of x mean of y 
        3         4 

Confidence interval

  • Quantifies precision of point estimate
    • Distribution of differences between two samples drawn from identical populations

Rearrange \(t\) equation:

\(t = \dfrac{(M_1-M_2)-(\mu_1-\mu_2)}{s_{M_1-M_2}}\)

To solve for parameter \((\mu_1 - \mu_2)\):

\((\mu_1-\mu_2) = (M_1-M_2) \pm t \cdot s_{M_1-M_2}\)

Confidence interval

\[\begin{align} (\mu_1 - \mu_2) &= (M_1 - M_2) \pm t \cdot s_{M_1 - M_2} \\ &= -1 \pm 2.31 \cdot 0.95 \\ &= -3.19, 1.19 \end{align}\]
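Under the equal-variance assumption, this interval can be reproduced in R with qt() supplying the critical \(t\) (values from the worked example):

```r
mean_diff <- 3 - 4            # M1 - M2
se <- sqrt(2.25/5 + 2.25/5)   # estimated standard error of the mean difference
crit <- qt(0.975, df = 8)     # critical t for a two-tailed 95% interval

ci <- mean_diff + c(-1, 1) * crit * se
round(ci, 2)  # -3.19  1.19
```

These limits match the 95 percent confidence interval from t.test(..., var.equal = TRUE) shown above.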

Learning checks

  1. A researcher performing an independent-samples \(t\)-test finds a difference between groups of 2.3, and calculates a 95% confidence interval of [0.3, 4.3]. Can you predict whether the hypothesis test will reject the null hypothesis?
    • It will reject the null
    • It will fail to reject the null
    • Cannot make a prediction
  2. A confidence interval for an independent-samples test that includes a value of zero within its range establishes that the true population parameter is 95% certain to be zero.
    • True
    • False