Mastering One-Way ANOVA in R: A STAT1201 Study Guide with MDMA Therapy Example

Learn how to perform and interpret one-way ANOVA in R using the STAT1201 MDMA therapy dataset. This guide covers hypothesis testing, assumptions, and R code with real exam-style questions.

Introduction: Why ANOVA Matters in Data Analysis

Comparing means across multiple groups is a fundamental task in scientific data analysis. One-way ANOVA (Analysis of Variance) is the standard method when you have a continuous outcome and a single categorical predictor with three or more levels. This tutorial uses the STAT1201 Analysis of Scientific Data exam scenario on MDMA therapy for PTSD to walk you through the entire process, from loading data in RStudio to interpreting ANOVA output. Whether you're preparing for your final exam or just brushing up on ANOVA, this guide provides clear explanations and R code you can adapt to any dataset.

The MDMA and PTSD Study: A Real-World Context

Imagine a study investigating whether MDMA-assisted psychotherapy can reduce PTSD symptoms. Researchers measured CAPS-IV scores (higher = more severe symptoms) before and after treatment. Patients were randomly assigned to three dosage groups: Low (40 mg), Medium (100 mg), and High (125 mg). The primary outcome was the reduction in CAPS-IV score (Change = Before - After), so a positive Change means symptoms became less severe. This is a randomised comparative experiment, as confirmed in STAT1201 Question 1. Understanding the study design is crucial: it's not an observational study, because the researchers assigned the treatments, and it's not a block design, because the only factor is dosage. The random allocation helps ensure groups are comparable, allowing us to attribute differences in mean change to the dosage level.
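To make the idea of random allocation concrete, here is a minimal sketch of how 54 patients could be shuffled into three dose groups. The patient IDs, the seed, and the equal group sizes of 18 are assumptions made for illustration; the guide does not say how the real study allocated patients.

```r
# Hypothetical illustration: randomly allocate 54 patient IDs to three dose groups.
# Equal group sizes (18 each) are an assumption, not a detail from the study.
set.seed(42)                                   # makes the shuffle reproducible
dose_labels <- rep(c("Low", "Medium", "High"), each = 18)

allocation <- data.frame(
  patient = 1:54,
  Dose = factor(sample(dose_labels),           # sample() shuffles the labels
                levels = c("Low", "Medium", "High"))
)

table(allocation$Dose)                         # number of patients per group
```

Because chance alone decides who receives which dose, patient characteristics should balance out across groups on average.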

Loading the Data in RStudio

First, download the MDMA.csv file from your course site and read it into RStudio. Use the following code:

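mdma <- read.csv("MDMA.csv")
# Store Dose as a factor ordered Low, Medium, High (read.csv keeps it as text in R >= 4.0)
mdma$Dose <- factor(mdma$Dose, levels = c("Low", "Medium", "High"))
head(mdma)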

The dataset contains variables: Before, After, Change, Drop20 (TRUE/FALSE), and Dose (Low, Medium, High). We'll focus on Change and Dose.

Step 1: Descriptive Statistics

Before running ANOVA, explore the data. Compute mean change per dose group:

aggregate(Change ~ Dose, data = mdma, FUN = mean)

You'll find that the mean reduction in CAPS-IV score varies by dose. For example, the low dose group has a mean change of about 78.5 (as in Question 2). Descriptive statistics give you a preliminary sense of group differences.
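Group means alone hide the spread and the group sizes. The sketch below adds standard deviations, counts, and a boxplot. It runs on a simulated stand-in data frame called sim, whose values are invented for illustration; swap in mdma once MDMA.csv is loaded.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

# Mean, standard deviation and group size for each dose
desc <- aggregate(Change ~ Dose, data = sim,
                  FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
desc <- do.call(data.frame, desc)   # flatten the matrix column into plain columns
print(desc)

# Side-by-side boxplots give a quick visual check of centre and spread
boxplot(Change ~ Dose, data = sim, ylab = "Reduction in CAPS-IV score")
```

Roughly similar standard deviations across groups are an early hint that the equal-variance assumption is reasonable.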

Step 2: Checking ANOVA Assumptions

One-way ANOVA relies on three key assumptions: independence of observations, normality of residuals, and homogeneity of variances. Since this is a randomised experiment, independence is plausible. Check normality with a Q-Q plot and Shapiro-Wilk test on residuals. Check equal variances with Levene's test:

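# Levene's test comes from the car package; run install.packages("car") once if needed
library(car)
# Fit the model
model <- lm(Change ~ Dose, data = mdma)
# Normality of residuals: Q-Q plot with reference line, then Shapiro-Wilk test
qqnorm(residuals(model))
qqline(residuals(model))
shapiro.test(residuals(model))
# Homogeneity of variances
leveneTest(Change ~ Dose, data = mdma)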

If assumptions are violated, consider transformations or non-parametric alternatives like Kruskal-Wallis. In the exam, you may be asked to assume they hold.
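As a rough sketch of the fallback options, the code below runs the Kruskal-Wallis test and also Welch's ANOVA, which is a common choice when only the equal-variance assumption is in doubt. Welch's test is not mentioned in the exam scenario and is included here as an extra option. Both functions are in base R, and the data are simulated stand-ins rather than the real MDMA values.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

# Welch's one-way ANOVA: does not assume equal variances across groups
welch <- oneway.test(Change ~ Dose, data = sim, var.equal = FALSE)

# Kruskal-Wallis rank-sum test: non-parametric, no normality assumption
kw <- kruskal.test(Change ~ Dose, data = sim)

print(welch)
print(kw)
```

Kruskal-Wallis works on ranks, so it compares the groups' distributions rather than their means.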

Step 3: Performing One-Way ANOVA

Use the aov() function in R:

anova_result <- aov(Change ~ Dose, data = mdma)
summary(anova_result)

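The output shows the Sum of Squares, degrees of freedom, Mean Square, F-statistic, and p-value for Dose, followed by a Residuals row. summary() does not print a Total row, so obtain the Total Sum of Squares by adding the Dose and Residuals sums of squares. For the MDMA data, the Total Sum of Squares is 297.43 (Question 5) and the residual degrees of freedom are 51 (Question 4). The F-test assesses the null hypothesis that the mean change in CAPS-IV score is the same for all dosage levels (Question 3), against the alternative that at least one dosage level has a different mean. If the p-value is less than the chosen significance level (commonly 0.05), you reject the null and conclude that at least one group mean differs.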
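To see where the numbers in the ANOVA table come from, the following sketch computes the sums of squares by hand and checks them against aov(). It uses simulated stand-in data, so the values will not match the exam figures.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

fit <- aov(Change ~ Dose, data = sim)
tab <- summary(fit)[[1]]                   # ANOVA table: rows Dose and Residuals

grand_mean <- mean(sim$Change)
group_mean <- ave(sim$Change, sim$Dose)    # each observation's own group mean

ss_total     <- sum((sim$Change - grand_mean)^2)   # total variation
ss_treatment <- sum((group_mean - grand_mean)^2)   # between-group variation
ss_residual  <- sum((sim$Change - group_mean)^2)   # within-group variation

# The Total SS is the sum of the two rows that summary() prints
all.equal(ss_total, sum(tab[["Sum Sq"]]))
```

The same subsetting, tab[["Df"]], returns the treatment and residual degrees of freedom.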

Step 4: Interpreting the Results

In the exam, you might be asked to interpret the F-test. For Question 6, the correct conclusion is: strong evidence to suggest that there is an effect of MDMA dosage level on the mean change in patients' CAPS-IV score (p < 0.01). This means the data provide strong evidence against the null hypothesis of equal means. Always report the p-value range (e.g., p < 0.01, p < 0.05) and avoid saying "prove": statistical tests provide evidence, not proof.
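If an exam question gives only part of the ANOVA table, the F-statistic and p-value can be rebuilt from the mean squares and degrees of freedom. This is a minimal sketch on simulated stand-in data. The object names are illustrative.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

tab <- summary(aov(Change ~ Dose, data = sim))[[1]]

# F = Mean Square for Dose divided by Mean Square for Residuals
f_value <- tab[["Mean Sq"]][1] / tab[["Mean Sq"]][2]

# p-value = upper-tail area of the F distribution with (k - 1, n - k) df
p_value <- pf(f_value,
              df1 = tab[["Df"]][1],
              df2 = tab[["Df"]][2],
              lower.tail = FALSE)

c(F = f_value, p = p_value)
```

Only the upper tail is used because a large F, meaning between-group variation well above within-group variation, is what counts as evidence against the null.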

Post-Hoc Tests

If ANOVA is significant, follow up with post-hoc tests to find which groups differ. Tukey's HSD is common:

TukeyHSD(anova_result)

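This will compare Low vs Medium, Low vs High, and Medium vs High. For each pair, the output gives the difference in means (diff), a 95% family-wise confidence interval (lwr, upr), and an adjusted p-value (p adj). Look for confidence intervals that do not include zero; equivalently, look for adjusted p-values below 0.05.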
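The Tukey output can also be turned into a data frame so the "interval excludes zero" check is done in code. This is a sketch on simulated stand-in data. The excludes_zero column is a helper added for illustration.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

fit <- aov(Change ~ Dose, data = sim)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# One row per pairwise comparison; columns are diff, lwr, upr and "p adj"
pairs <- as.data.frame(tukey$Dose)

# Flag comparisons whose confidence interval lies entirely above or below zero
pairs$excludes_zero <- pairs$lwr > 0 | pairs$upr < 0
print(pairs)

plot(tukey)   # intervals that do not cross the vertical zero line are significant
```

Row names such as Medium-Low show the direction of subtraction, so a positive diff there means the Medium group had the larger mean change.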

Common Pitfalls and Exam Tips

  • Don't confuse null and alternative hypotheses. The null is that all group means are equal; the alternative is that at least one group mean differs from the others, not that every mean is different.
  • Degrees of freedom: For a one-way ANOVA with k groups and n total observations, treatment df = k-1, residual df = n-k, total df = n-1. In our example, k=3, n=54, so residual df = 51.
  • Sum of Squares: Total SS = SS_treatment + SS_residual. Check that your numbers add up.
  • Effect size: Consider reporting eta-squared (SS_treatment / SS_total) to measure the proportion of variance explained.
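The degrees-of-freedom rules and the eta-squared formula from the list above can be checked with a few lines of R. This is a minimal sketch on simulated stand-in data with k = 3 groups and n = 54 observations. The simulated values are invented, so eta-squared here says nothing about the real study.

```r
# Simulated stand-in for the MDMA data (values are invented, not the real dataset)
set.seed(1)
sim <- data.frame(
  Dose = factor(rep(c("Low", "Medium", "High"), each = 18),
                levels = c("Low", "Medium", "High")),
  Change = rnorm(54, mean = rep(c(10, 20, 30), each = 18), sd = 8)
)

k <- nlevels(sim$Dose)   # number of groups
n <- nrow(sim)           # total number of observations

# Degrees of freedom: treatment, residual, total
df_check <- c(treatment = k - 1, residual = n - k, total = n - 1)
print(df_check)

# Eta-squared = SS_treatment / SS_total, taken from the ANOVA table
tab <- summary(aov(Change ~ Dose, data = sim))[[1]]
eta_sq <- tab[["Sum Sq"]][1] / sum(tab[["Sum Sq"]])
print(eta_sq)
```

For a one-way design, eta-squared equals the R-squared reported by summary(lm(Change ~ Dose, data = sim)), which gives a quick cross-check.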

Connecting to Trends: ANOVA in Everyday Data Science

ANOVA isn't just for medical studies. In 2026, data analysts use ANOVA to compare user engagement across multiple app versions, test ad campaign performance across platforms, or evaluate student test scores across different teaching methods. For example, a gaming company might use ANOVA to compare average playtime across three game genres. The logic remains the same: is the variation between groups larger than the variation within groups?

Conclusion

One-way ANOVA is a powerful tool for comparing multiple group means. By working through the STAT1201 MDMA example, you've learned how to load data, check assumptions, run the test, and interpret output in R. Practice with other datasets to solidify your understanding. For your exam, remember the key steps: state hypotheses, check assumptions, compute ANOVA, and draw conclusions based on the p-value. Good luck!

Further Practice

To reinforce learning, try these exercises: (1) Modify the code to use the Drop20 binary outcome with a chi-square test. (2) Simulate your own dataset with three groups and see how ANOVA behaves. (3) Explore two-way ANOVA if you have a second factor like gender.
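As a possible starting point for exercises (1) and (2), the sketch below uses simulated data only. The Drop20 probabilities, the group sizes, and the number of replications are arbitrary assumptions chosen for illustration.

```r
set.seed(123)
dose <- factor(rep(c("Low", "Medium", "High"), each = 18),
               levels = c("Low", "Medium", "High"))

# Exercise (1): chi-square test of independence between dose and a binary outcome.
# drop20 is simulated; the success probabilities are invented for illustration.
drop20 <- factor(rbinom(54, size = 1, prob = rep(c(0.3, 0.5, 0.7), each = 18)) == 1,
                 levels = c(FALSE, TRUE))
counts <- table(dose, drop20)      # 3 x 2 contingency table
chisq <- chisq.test(counts)
print(chisq)

# Exercise (2): simulate data where the null is TRUE (all group means equal)
# and record the ANOVA p-value each time.
p_null <- replicate(1000, {
  y <- rnorm(54)                   # same mean in every group
  summary(aov(y ~ dose))[[1]][["Pr(>F)"]][1]
})

# Proportion of false rejections at the 5% level; should be near 0.05
mean(p_null < 0.05)
```

Changing the group means inside rnorm() turns the second part into a power simulation, showing how often ANOVA detects a real difference.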