The \(t\)-test, part 2

Overview

Research designs

  • Single-sample \(t\)-test
    • Compare sample against expected population mean based on logic/theory/scale design
    • E.g. ‘common sense’ theory
      • Population average amount of sleep

\(\mu = 8\) hours

Research designs

  • E.g. measure of happiness

What is your current level of happiness?

[Figure: rating scale]

\(\mu = 3\)

Assumptions

  • Assumptions for single-sample \(t\)-tests
    1. Independence
      • Independent random sampling
      • Values in the sample are independent observations
    2. Normality
      • The population sampled is normally distributed
      • With large samples \((n > 30)\), this assumption can be violated without affecting the validity of the hypothesis test
    3. Homogeneity of variance
      • Variability in the original and treated populations is the same

Effect size

  • Hypothesis test Step 4: Make decision (reject null?)
  • Step 4b: Evaluate effect size
    • Cohen’s \(d\) for single-sample \(t\)-test
    • Original equation included population SD, \(\sigma\)
    • Estimated Cohen’s \(d\) uses sample SD, \(s\)

\(\text{Estimated } d = \dfrac{\text{mean difference}}{\text{sample standard deviation}} = \dfrac{M - \mu}{s}\)

\(\text{For class RT data, } d = \dfrac{322.59 - 284}{45.31} = 0.85\)
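The worked example above can be checked in a couple of lines of Python, using only the summary statistics from the slide:

```python
# Estimated Cohen's d for the class reaction-time data,
# using the summary statistics reported on the slide.
M = 322.59    # sample mean (ms)
mu = 284      # population mean (ms)
s = 45.31     # sample standard deviation (ms)

d = (M - mu) / s
print(round(d, 2))  # 0.85
```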

Effect size: \(r^2\)

  • Proportion of all variability in the data attributable to treatment effect
  • Simplifying assumption: Treatment adds or subtracts a constant to each score
  • E.g. 1 point on a scale of 1 to 5
  • \(r^2\) separates the variability due to treatment from the natural variability between scores

\(r^2 = \dfrac{SS_{treatment}}{SS_{total}}\)

\(r^2\)

  1. Calculate sum of squared deviations from sample \(M\)
    • Variability excluding treatment effect
    • \(SS_{without \ treatment}\)
  2. Calculate sum of squared deviations from the \(H_0\) mean, \(\mu\)
    • This is total variability
    • \(SS_{total}\)
  3. Subtract \(SS_{without \ treatment}\) from \(SS_{total}\) to find \(SS_{treatment}\)
    • Variability attributable to treatment effect

\[\begin{align} r^2 = \dfrac{SS_{treatment}}{SS_{total}} &= \dfrac{SS_{total} - SS_{without \ treatment}}{SS_{total}} \\ &= \dfrac{10-6}{10} = 0.4 \end{align}\]
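The three steps reduce to a short computation; this sketch uses the toy numbers from the slide (\(SS_{total} = 10\), \(SS_{without \ treatment} = 6\)):

```python
# r-squared from sums of squares (toy numbers from the slide)
ss_total = 10     # SS of scores around the H0 mean, mu (total variability)
ss_without = 6    # SS of scores around the sample mean, M (without treatment)

ss_treatment = ss_total - ss_without   # variability attributable to treatment
r_squared = ss_treatment / ss_total
print(r_squared)  # 0.4
```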

\(r^2\)

  • If we already calculated \(t\)

\(r^2 = \dfrac{t^2}{t^2 + df}\)

  • Works for any kind of \(t\)-test
    • Single / related / independent-samples
  • Interpreting \(r^2\)
    • \(r^2 = 0.01\): small effect
    • \(r^2 = 0.09\): medium effect
    • \(r^2 = 0.25\): large effect
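Applying the \(t\)-based formula to the class reaction-time result (\(t = 4.08\), \(df = 22\), as reported below) gives a value well past the large-effect benchmark:

```python
# r-squared computed from an already-calculated t statistic
t, df = 4.08, 22               # class reaction-time t-test

r_squared = t**2 / (t**2 + df)
print(round(r_squared, 2))     # 0.43 -- a large effect by the guidelines above
```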

Reporting results

Given the average reaction time of \(\mu = 284\) ms for the population, according to humanbenchmark.com, a two-tailed single-sample \(t\)-test suggests that BC1101 students have significantly different reaction times \((M = 322.59\), \(SD = 45.31)\) from the general population; \(t(22) = 4.08\), \(p < .05\), \(d = 0.85\).
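The reported statistics can be reproduced from raw scores. The slide does not include the class's raw data, so the `rt` values below are invented purely to illustrate the computation:

```python
import math
import statistics

# Hypothetical reaction-time sample (ms): the raw BC1101 data are not
# shown on the slide, so these 23 values are made up for illustration.
rt = [310, 345, 290, 360, 330, 275, 400, 320, 315, 350,
      285, 370, 305, 340, 295, 380, 325, 310, 335, 300,
      355, 315, 345]
mu = 284                       # population mean (humanbenchmark.com)

n = len(rt)
M = statistics.mean(rt)        # sample mean
s = statistics.stdev(rt)       # sample SD (n - 1 in the denominator)
s_M = s / math.sqrt(n)         # estimated standard error of the mean
t = (M - mu) / s_M
df = n - 1

print(f"t({df}) = {t:.2f}")
# Compare |t| with the two-tailed critical value from a t table
# (about 2.074 for df = 22, alpha = .05) to make the decision.
```

With SciPy available, `scipy.stats.ttest_1samp(rt, mu)` returns the same \(t\) along with an exact \(p\) value.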

Confidence intervals

  • Complementary to significance & effect size
  • Quantifies precision of sample estimate
  • Consists of:
    • The point estimate
      • Our best guess of the population parameter
    • Margin of error
      • A range on either side of the point estimate
      • Indicates the amount of uncertainty surrounding the estimate of the population mean
      • Based on desired ‘confidence’, i.e. range of the distribution
      • E.g. 95%, 99%, 80%, etc.

Calculating CI boundaries

  • So far, we have been specifying \(\mu\), calculating \(M\) and \(s_M\), solving for \(t\)
  • For CI, rearrange to solve for \(\mu\)
    • Calculate \(M\) and \(s_M\), specify \(t\) (based on desired width of CI: 99%, 95%, 90%, 80%, etc.), solve for \(\mu\)

\(t = \dfrac{M - \mu}{s_M}\)

\(\mu = M \pm t \times s_M\)
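A minimal sketch of this rearrangement, using the class summary statistics; the critical value 2.074 (for \(df = 22\), 95% confidence, two-tailed) is taken from a \(t\) table:

```python
import math

# 95% CI for mu from the class RT summary statistics
M, s, n = 322.59, 45.31, 23
s_M = s / math.sqrt(n)     # estimated standard error

t_crit = 2.074             # t table value: df = 22, two-tailed, 95%

lower = M - t_crit * s_M
upper = M + t_crit * s_M
print(f"95% CI: [{lower:.1f}, {upper:.1f}]")  # 95% CI: [303.0, 342.2]
```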

Confidence interval interpretation

  • What does a confidence interval tell us?
    • Indicates precision of parameter estimate
    • Quantifies variability around a single point estimate
    • NOT “we are 95% sure the true population mean is within this range”
    • NOT “sample means from this population will fall within this range 95% of the time”

“The parameter is an unknown constant and no probability statement concerning its value may be made.”

Factors that affect CI width

[Interactive demonstration: sample size = 30, confidence level = 95%]

CI & NHST

  • \(p\) value and CI always agree about statistical significance if the confidence level is \(1 - \alpha\)
    • E.g. \(\alpha = .05\) and a 95% confidence interval
  • If \(p < \alpha\), the confidence interval will not contain the null hypothesis value
  • If the confidence interval does not contain the null hypothesis value, the results are statistically significant
  • Both significance level and confidence level define a distance from a mean to a limit
    • The distances in both cases are exactly the same
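This agreement can be checked directly. The sketch below reuses the class summary statistics and the tabled critical value \(t_{crit} = 2.074\) (\(df = 22\), \(\alpha = .05\), two-tailed):

```python
import math

M, s, n, mu0 = 322.59, 45.31, 23, 284   # class RT data vs. null value
s_M = s / math.sqrt(n)
t_crit = 2.074                          # t(22), alpha = .05, two-tailed

# NHST decision: is |t| beyond the critical value?
t = (M - mu0) / s_M
significant = abs(t) > t_crit

# 95% CI decision: does the interval exclude the null value?
ci_excludes_null = not (M - t_crit * s_M <= mu0 <= M + t_crit * s_M)

print(significant, ci_excludes_null)    # both True -- the two always agree
```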

CI & NHST demonstration

[Interactive demonstration]

Learning checks

  1. What value of \(t\) would you expect to see if the null hypothesis is true?
  2. Which combination of factors is most likely to produce a significant value for the \(t\) statistic?
    • Small mean difference and large sample variability
    • Small mean difference and small sample variability
    • Large mean difference and large sample variability
    • Large mean difference and small sample variability