1. Create null and alternative hypothesis pairs (in words and symbols) that are always both mutually exclusive and collectively exhaustive.
2. Explain how to use a conditional probability to determine whether events are independent or dependent.
3. Describe a sampling distribution of the mean as outlined in the central limit theorem (shape, mean, and standard error).
1.) Let's explain it with the help of an example:
Suppose we are going to decide the colour of our website based on the number of clicks it receives. With the default colour, the website receives 30 clicks per day. If changing the colour increases the number of clicks, we will keep the new colour permanently. Otherwise, we will stick with the default one.
Here, we need to decide whether to change the colour of the website based on the sample data that we have collected, whose mean is 63.875 clicks per day. We can’t make the decision just by comparing two numbers (63.875 per day > 30 per day), because the number of clicks is a random variable and the sample mean varies from sample to sample. This is where hypothesis testing comes in handy: it uses the sample data to make an inference about the population parameter.
In our problem, the null and alternative hypotheses are
H0: Changing the colour of the website doesn’t influence the number of clicks it receives.
H1: Changing the colour of the website influences the number of clicks it receives.
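Mathematically, with μ denoting the mean number of clicks per day after the colour change, this is equivalent to saying
H0: μ = 30
H1: μ ≠ 30
Please note the following things about our hypotheses: they are mutually exclusive (μ cannot both equal 30 and differ from 30), and they are collectively exhaustive (every possible value of μ falls under exactly one of them).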
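To make the decision concrete, the hypotheses can be checked with a one-sample t-test. The sketch below is only an illustration under stated assumptions: the eight daily click counts are made-up values chosen so that their mean equals the 63.875 quoted above (the original data is not given), the two-sided form H0: μ = 30 versus H1: μ ≠ 30 and the 5% significance level are assumed, and 2.365 is the standard two-sided t critical value for 7 degrees of freedom.

```python
import math
import statistics

# Hypothetical daily click counts (NOT the original data); chosen only so
# that the sample mean matches the 63.875 clicks per day quoted in the text.
clicks = [58, 62, 70, 65, 61, 66, 64, 65]

mu0 = 30            # mean clicks per day under H0 (default colour)
t_critical = 2.365  # two-sided 5% critical value of Student's t, df = 7

n = len(clicks)
sample_mean = statistics.mean(clicks)
sample_sd = statistics.stdev(clicks)        # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - mu0) / standard_error

# Reject H0 when the statistic lies in either tail beyond the critical value.
reject_h0 = abs(t_statistic) > t_critical

print(f"sample mean = {sample_mean}")
print(f"t statistic = {t_statistic:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With these illustrative numbers the test statistic is far beyond the critical value, so H0 would be rejected. With real data the conclusion depends on the observed spread as well as the mean.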
2.) In probability, we say two events are independent if knowing one event occurred doesn't change the probability of the other event.
Two events, A and B, are independent if P(A|B) = P(A) and P(B|A) = P(B)
The probability of A, given that B has happened, is the same as the probability of A. Likewise, the probability of B, given that A has happened, is the same as the probability of B. This shouldn’t be a surprise, as one event doesn’t affect the other. Conversely, if P(A|B) differs from P(A), knowing that B occurred changes the probability of A, so the events are dependent.
For independent events, the probability that both occur is the product of the individual probabilities:
P(A ∩ B) = P(A) · P(B).
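As a small worked check of this rule, the sketch below compares P(A|B) with P(A) for events on a single roll of a fair die. The events and the helper names (`prob`, `conditional`, `independent`) are chosen here purely for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die, all outcomes equally likely.
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event & outcomes), len(outcomes))

def conditional(a, b):
    """P(A | B) = P(A and B) / P(B); requires P(B) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """A and B are independent exactly when P(A | B) equals P(A)."""
    return conditional(a, b) == prob(a)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_most_3 = {1, 2, 3}

print(conditional(even, at_most_4), prob(even))   # both 1/2: independent
print(conditional(even, at_most_3), prob(even))   # 1/3 vs 1/2: dependent
```

Knowing the roll is at most 4 leaves the chance of an even number at 1/2, so those events are independent. Knowing the roll is at most 3 lowers it to 1/3, so those events are dependent.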
3.)
Central Limit Theorem:
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, no matter what the shape of the population distribution (provided the population has a finite variance). As a common rule of thumb, the approximation is considered good for sample sizes over 30. In other words, if you repeatedly draw samples and plot their means, the graph looks more like a normal distribution the larger each sample is.
Roughly stated, the central limit theorem tells us that if we have a large number of independent, identically distributed variables, the distribution of their sum (or average) will approximately follow a normal distribution. It doesn’t matter what the underlying distribution is.
One of the simplest illustrations is rolling a fair die. The more rolls you average each time, the more closely the distribution of the means resembles a normal distribution graph.
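The die example can be reproduced with a short simulation. This is a minimal sketch, not a definitive demonstration: the function names, the 5000 repetitions, the seed, and the 0.25 bin width are all arbitrary choices made for illustration.

```python
import random
from collections import Counter

def simulate_means(sample_size, repetitions=5000, seed=0):
    """Return the mean of `sample_size` fair-die rolls, repeated many times."""
    rng = random.Random(seed)
    return [
        sum(rng.randint(1, 6) for _ in range(sample_size)) / sample_size
        for _ in range(repetitions)
    ]

def text_histogram(means, bin_width=0.25):
    """Print a rough histogram of the means using rows of '#' characters."""
    counts = Counter(round(m / bin_width) * bin_width for m in means)
    for value in sorted(counts):
        print(f"{value:5.2f} {'#' * (counts[value] * 200 // len(means))}")

for size in (1, 2, 30):
    print(f"sample size = {size}")
    text_histogram(simulate_means(size))
```

For a sample size of 1 the histogram is roughly flat, for 2 it is triangular, and for 30 it should look close to a bell curve centred near 3.5.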
Sampling Distribution of Sample Mean
The central limit theorem states that: Given a population with a finite mean μ and a finite non-zero variance σ², the sampling distribution of the mean approaches a normal distribution with a mean of μ and a variance of σ²/N as N, the sample size, increases.
Mean
The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean μ, then the mean of the sampling distribution of the mean is also μ. The symbol μ_M is used to refer to the mean of the sampling distribution of the mean. Therefore, the formula for the mean of the sampling distribution of the mean can be written as:
μ_M = μ
Variance
The variance of the sampling distribution of the mean is computed as follows:
σ_M² = σ²/N
That is, the variance of the sampling distribution of the mean is the population variance divided by N, the sample size (the number of scores used to compute a mean). Thus, the larger the sample size, the smaller the variance of the sampling distribution of the mean.
Standard Error
The standard error of the mean is the standard deviation of the sampling distribution of the mean. It is therefore the square root of the variance of the sampling distribution of the mean and can be written as:
σ_M = σ/√N
The standard error is represented by a σ because it is a standard deviation. The subscript (M) indicates that the standard error in question is the standard error of the mean.
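The relation σ_M = σ/√N can be checked numerically. The sketch below assumes a fair six-sided die as the population (an illustrative choice) and compares the theoretical standard error with the standard deviation of simulated sample means. The function names, repetition count, and seed are arbitrary.

```python
import math
import random
import statistics

# Population: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
mu = statistics.mean(faces)             # population mean, 3.5
sigma2 = statistics.pvariance(faces)    # population variance, about 35/12
sigma = math.sqrt(sigma2)

def standard_error(n):
    """Theoretical standard error of the mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)

def simulated_standard_error(n, repetitions=10000, seed=1):
    """Standard deviation of many simulated sample means of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.choices(faces, k=n)) for _ in range(repetitions)]
    return statistics.pstdev(means)

# Columns: sample size, theoretical standard error, simulated standard error.
for n in (5, 30, 100):
    print(n, round(standard_error(n), 4), round(simulated_standard_error(n), 4))
```

The two columns should agree closely, and both shrink as N grows, which is the point made above: larger samples give a tighter sampling distribution of the mean.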
Shape:
As stated above, the sampling distribution of the mean is approximately normal, so its shape is the familiar bell curve, and the approximation improves as the sample size N increases.