Hypothesis testing
Drive: https://drive.google.com/drive/folders/1XSJoE-3rCDDJ6S4mRb8rYCgw5CQ_PYxP?usp=share_link
Problem Statement: Paired dataset
Suppose we have a dataset of 20 students who took a math test before and after attending a tutoring program. We want to test whether the tutoring program had a significant effect on the students' math scores. Our null hypothesis is that the mean score is the same before and after the tutoring program (the mean difference is zero), and our alternative hypothesis is that the mean scores differ.
The data is stored in a file called "math_scores.csv" with two columns: "pre_tutoring_scores" and "post_tutoring_scores". Each row holds the two scores of one student.
Load the data:
math_scores <- read.csv("math_scores.csv")
Calculate the sample mean and standard deviation for both pre-tutoring and post-tutoring scores:
pre_tutoring_mean <- mean(math_scores$pre_tutoring_scores)
pre_tutoring_sd <- sd(math_scores$pre_tutoring_scores)
post_tutoring_mean <- mean(math_scores$post_tutoring_scores)
post_tutoring_sd <- sd(math_scores$post_tutoring_scores)
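Because the samples are paired, the quantity that matters is the per-student change in score rather than the two columns taken separately. The sketch below shows one way to compute the mean and standard deviation of those differences and the t statistic by hand. Since "math_scores.csv" is not reproduced here, the data frame is filled with simulated values; the seed, the distribution parameters, and the names score_diff, t_manual and p_manual are assumptions made only for illustration.

```r
# Hypothetical stand-in for math_scores.csv (simulated values, not real data)
set.seed(42)
pre  <- round(rnorm(20, mean = 65, sd = 10))
post <- pre + round(rnorm(20, mean = 5, sd = 4))
math_scores <- data.frame(pre_tutoring_scores = pre,
                          post_tutoring_scores = post)

# Per-student change in score (post minus pre)
score_diff <- math_scores$post_tutoring_scores - math_scores$pre_tutoring_scores
diff_mean  <- mean(score_diff)
diff_sd    <- sd(score_diff)
n          <- length(score_diff)

# Paired t statistic: mean difference divided by its standard error
t_manual <- diff_mean / (diff_sd / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

print(c(mean_diff = diff_mean, sd_diff = diff_sd, t = t_manual, p = p_manual))
```

The differences here are taken as post minus pre, so a positive mean difference means scores went up. Calling t.test() with the pre-tutoring column first gives the same magnitude of t with the opposite sign.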
Conduct the t-test using the t.test() function:
t.test(math_scores$pre_tutoring_scores, math_scores$post_tutoring_scores, paired=TRUE, alternative="two.sided", conf.level=0.95)
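In this function, we specify the two columns of data to compare. The paired=TRUE argument tells R that the samples are paired, so the test is run on the per-student differences. R computes these differences as the first argument minus the second, so a negative mean difference here means that scores rose after tutoring. The alternative="two.sided" argument specifies that we are testing a two-tailed alternative hypothesis. The conf.level argument specifies the confidence level for the confidence interval of the mean difference.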
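The object returned by t.test() can be stored and its components read individually, which makes it easier to report the result. The following is a minimal sketch under the same assumption as before: the data frame is simulated rather than read from "math_scores.csv", and the 0.05 significance level (alpha) is an assumed choice that matches conf.level=0.95.

```r
# Hypothetical stand-in for math_scores.csv (simulated values, not real data)
set.seed(42)
pre  <- round(rnorm(20, mean = 65, sd = 10))
post <- pre + round(rnorm(20, mean = 5, sd = 4))
math_scores <- data.frame(pre_tutoring_scores = pre,
                          post_tutoring_scores = post)

# Store the test result instead of only printing it
result <- t.test(math_scores$pre_tutoring_scores,
                 math_scores$post_tutoring_scores,
                 paired = TRUE, alternative = "two.sided", conf.level = 0.95)

t_value <- unname(result$statistic)  # t statistic
df      <- unname(result$parameter)  # degrees of freedom (n - 1 for paired data)
p_value <- result$p.value            # two-sided p-value
ci      <- result$conf.int           # confidence interval for the mean difference

# Assumed significance level; reject the null hypothesis when p < alpha
alpha <- 0.05
reject_null <- p_value < alpha

print(list(t = t_value, df = df, p = p_value, ci = ci, reject_null = reject_null))
```

If reject_null is TRUE, the data give evidence that the mean score changed; if it is FALSE, the test did not detect a difference, which is not the same as showing that there is none.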
Interpret the results:
Write the interpretation in your own words.
Repeat the same steps for an unpaired dataset: assume that "pre_tutoring_scores" and "post_tutoring_scores" were recorded for different students, and set paired=FALSE in t.test().
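For the unpaired case, one possible sketch is shown below. The two groups are simulated independently to mimic scores from different students; the seed, group means, and the names pre_group, post_group, welch and pooled are assumptions for illustration only. With paired=FALSE, R runs Welch's two-sample t-test by default (var.equal=FALSE), which does not assume equal variances in the two groups; setting var.equal=TRUE gives the classical pooled-variance test instead.

```r
# Hypothetical scores from two different groups of students (simulated, not real data)
set.seed(7)
pre_group  <- round(rnorm(20, mean = 65, sd = 10))  # students tested without tutoring
post_group <- round(rnorm(20, mean = 70, sd = 10))  # different students, tested after tutoring

# Welch two-sample t-test: groups are independent, variances not assumed equal
welch <- t.test(pre_group, post_group,
                paired = FALSE, var.equal = FALSE,
                alternative = "two.sided", conf.level = 0.95)

# Pooled-variance (Student) t-test, for comparison; assumes equal variances
pooled <- t.test(pre_group, post_group,
                 paired = FALSE, var.equal = TRUE,
                 alternative = "two.sided", conf.level = 0.95)

print(welch)
print(pooled)
```

The unpaired test compares the two group means directly, so it cannot remove student-to-student variation the way the paired test does. With the same sample size it is usually less sensitive when the data really are paired.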