Posts

Hypothesis testing

Drive: https://drive.google.com/drive/folders/1XSJoE-3rCDDJ6S4mRb8rYCgw5CQ_PYxP?usp=share_link
Problem Statement: Paired dataset. Suppose we have a dataset of 20 students who took a math test before and after attending a tutoring program, and we want to test whether the program had a significant effect on their math scores. The null hypothesis is that the mean scores before and after the tutoring program are equal (the mean paired difference is zero); the alternative hypothesis is that they differ. The data is stored in a file called "math_scores.csv" with two columns: "pre_tutoring_scores" and "post_tutoring_scores".
Load the data:
    math_scores <- read.csv("math_scores.csv")
Calculate the sample mean and standard deviation for both pre-tutoring and post-tutoring scores:
    pre_tutoring_mean <- mean(math_scores$pre_tutoring_scores)
    pre_tutoring_sd <- sd(math_scores$pre_tutoring_scores)
    post_tutoring_mean <- ...
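A paired comparison like the one described above can be sketched as follows. This is only an illustration under stated assumptions: the scores are simulated stand-ins for "math_scores.csv" (the real file is not reproduced here), and `t.test(..., paired = TRUE)` is one common way to run the test in R, not necessarily the approach the post itself takes. The names `d` and `t_manual` are hypothetical.

```r
# ASSUMPTION: simulated scores stand in for math_scores.csv.
set.seed(42)
pre_tutoring_scores  <- round(rnorm(20, mean = 70, sd = 10))
post_tutoring_scores <- round(pre_tutoring_scores + rnorm(20, mean = 5, sd = 4))
math_scores <- data.frame(pre_tutoring_scores, post_tutoring_scores)

# Paired t-test: H0 is that the mean per-student difference is zero.
result <- t.test(math_scores$post_tutoring_scores,
                 math_scores$pre_tutoring_scores,
                 paired = TRUE)

# The same t statistic computed by hand from the paired differences.
d <- math_scores$post_tutoring_scores - math_scores$pre_tutoring_scores
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

print(result)
```

If `result$p.value` falls below the chosen significance level (0.05 is a common choice), the null hypothesis of no mean difference is rejected.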

Confidence Interval

Problem Statement: Work through the foundations of statistical inference for confidence intervals: calculating the standard error of the mean, finding the t-score, calculating the margin of error, and constructing the confidence interval.
Code: First, load a dataset to work with. This exercise uses "mtcars", a dataset built into R.
    data(mtcars)
Suppose we want the confidence interval for the mean miles per gallon (mpg) of the cars in the dataset. Start by calculating the mean and the standard error of the mean:
    n <- length(mtcars$mpg)    # sample size
    xbar <- mean(mtcars$mpg)   # sample mean
    s <- sd(mtcars$mpg)        # sample standard deviation
    se...
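The four steps named in the problem statement (standard error, t-score, margin of error, interval) can be sketched end to end as below. The 95% confidence level and the names `conf_level`, `std_error`, `t_score`, `margin_of_error`, and `ci` are assumptions made for this illustration; the post may use a different level or different names.

```r
data(mtcars)

n    <- length(mtcars$mpg)   # sample size
xbar <- mean(mtcars$mpg)     # sample mean
s    <- sd(mtcars$mpg)       # sample standard deviation

conf_level <- 0.95           # ASSUMPTION: 95% confidence level

# Step 1: standard error of the mean.
std_error <- s / sqrt(n)

# Step 2: two-sided critical t value with n - 1 degrees of freedom.
t_score <- qt(1 - (1 - conf_level) / 2, df = n - 1)

# Step 3: margin of error.
margin_of_error <- t_score * std_error

# Step 4: confidence interval for the mean mpg.
ci <- c(lower = xbar - margin_of_error, upper = xbar + margin_of_error)
print(ci)
```

As a cross-check, `t.test(mtcars$mpg)$conf.int` should report the same interval, since it uses the same t-based construction at the default 95% level.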

Probability: Normal Distribution

Normal Distribution DataSet: https://drive.google.com/file/d/1eggLsBKx_RIK8_m0CHLHu2zAtZlrR3aG/view?usp=share_link
Generate a Normal Distribution
We can use the rnorm() function in R to generate random values from a normal distribution with a specified mean and standard deviation. The syntax of the function is:
    rnorm(n, mean = 0, sd = 1)
where n is the number of values to generate, mean is the mean of the distribution, and sd is the standard deviation of the distribution. For example, to generate 1000 random values from a normal distribution with a mean of 0 and a standard deviation of 1, we can use the following code:
    set.seed(123)  # set a seed for reproducibility
    x <- rnorm(1000, mean = 0, sd = 1)
The set.seed() function sets the seed of the random number generator, so the same random numbers are generated each time the code is run. This is useful for reproducibility.
Plot a Histogram of the Distribution
We can us...
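A histogram of the generated values could be drawn along the following lines. This is a sketch only: `hist()` and `curve()` are base R functions, but the choice of 30 bins, the density scale, the overlaid theoretical curve, and the title text are assumptions for illustration and may differ from the post's own plot.

```r
set.seed(123)                        # same seed as in the example above
x <- rnorm(1000, mean = 0, sd = 1)

# Histogram on the density scale so it is comparable to the normal curve.
hist(x,
     breaks = 30,                    # ASSUMPTION: 30 bins
     freq   = FALSE,
     main   = "Histogram of 1000 standard normal draws",
     xlab   = "x")

# Overlay the theoretical N(0, 1) density for comparison.
curve(dnorm(x, mean = 0, sd = 1), add = TRUE, lwd = 2)

# The sample mean and standard deviation should be close to 0 and 1.
sample_mean <- mean(x)
sample_sd   <- sd(x)
```

Setting `freq = FALSE` plots densities rather than counts, which is what lets the `dnorm()` curve sit on the same vertical scale as the bars.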