Question

In: Math

According to the text, Statistics is the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data. Write a paper at least 3 pages long detailing how specific concepts covered in this class fit into the definition and how they can apply to your area of study.

The concepts that need to be covered are the five-number summary, probability distributions, and hypothesis testing. For each concept be sure to explain the details of the concept and its importance, how it fits into the definition of statistics, and give a detailed example of how that concept can be used in your field of study. You can, but don’t have to, research your field of study to find examples. If you do, be sure to cite your sources.

Write the paper with a strong opening paragraph that makes the reader interested in reading the rest of the paper. It should have a strong closing paragraph that ties the paper together. In other words, don’t just answer questions, write a paper that informs and happens to cover the questions along the way. If you use your text or other sources for information in your paper be sure to cite those sources and include a works cited page at the end.

Solutions

Expert Solution

Statistics:

Statistics is the branch of science concerned with obtaining data and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.

"You can convince people with numbers, not with words alone."

Suppose you want to tell people that the average sleeping time in the USA is 8 hours. If you simply state this, people will not listen. If you back the statement with numbers and analysis, they will. That is how powerful a tool statistics is.

In statistics we draw conclusions about either a sample or a population. The corresponding methods are called descriptive and inferential statistics. In both, the goal is to find a statistic (computed from a sample) or a parameter (describing the population).

Collect the data:

To analyze the example statement above, the first step is to collect the data. We should gather data from different sources, including fields such as name, age, gender, zip code, etc.

Organize the data:

After gathering, the data may not be in a structured format and some values may be missing, so we should organize the data into a well-structured format.

Analyze:

This is the most important step: analyzing the data. There are many techniques for analyzing data and drawing conclusions from it.

We should answer questions such as the following:

(i) What are the types of the variables: numerical or categorical?

(ii) If the gathered data contain missing values, impute them using the mean, median, or mode.

(iii) Before imputing, analyze the central tendency of the data (mean, median, and mode).

(iv) How are the data distributed? Are there any outliers? If so, how will you deal with them?

(v) Which transformation will you apply to reduce the effect of outliers, for example log or square root?

(vi) What is the standard deviation, how should it be interpreted, and do the data justify applying the empirical rule?
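
Steps (ii), (iii), and (v) can be sketched in a few lines of Python. This is only an illustration: the sleep_hours values are made up, and choosing the median for imputation is one option among those listed above, not a rule.

```python
import math
import statistics

# Hypothetical sample: hours of sleep, with None marking a missing answer.
sleep_hours = [7.5, 8.0, None, 6.5, 8.0, 9.0, None, 7.0]

# Keep only the values that were actually observed.
observed = [x for x in sleep_hours if x is not None]

# (iii) Central tendency of the observed values.
mean_value = statistics.mean(observed)
median_value = statistics.median(observed)
mode_value = statistics.mode(observed)

# (ii) Impute each missing value with the median of the observed values.
imputed = [median_value if x is None else x for x in sleep_hours]

# (v) A log transform compresses large values (it needs positive data).
log_transformed = [math.log(x) for x in imputed]

print(mean_value, median_value, mode_value)
print(imputed)
```

The median is used here because it is less sensitive to extreme values than the mean.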

Example 1:

Suppose a house broker wants to show house prices like these:

h1 = 1 cr, h2 = 2 cr, h3 = 3 cr, h4 = 4 cr, h5 = 50 cr

If you include the h5 price in the data, the mean house price = (1+2+3+4+50)/5 = 60/5 = 12 cr.

If you remove h5 from the data, the mean = (1+2+3+4)/4 = 10/4 = 2.5 cr.

See the difference: the h5 price is so high compared with the others that it acts as an outlier and the data are skewed. In this situation we do not use the mean; we use the median instead.

We should draw many insights like this from the data.
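
The house price example can be checked with Python's statistics module. The variable names are chosen here for illustration; the prices are the ones given in the example.

```python
import statistics

# House prices in crore, taken from Example 1.
prices_cr = [1, 2, 3, 4, 50]
prices_without_h5 = prices_cr[:-1]

# The mean moves a lot when the outlier h5 is included.
mean_with_h5 = statistics.mean(prices_cr)             # 60 / 5
mean_without_h5 = statistics.mean(prices_without_h5)  # 10 / 4

# The median barely moves, which is why it suits skewed data.
median_with_h5 = statistics.median(prices_cr)
median_without_h5 = statistics.median(prices_without_h5)

print(mean_with_h5, mean_without_h5)
print(median_with_h5, median_without_h5)
```

Including h5 changes the mean from 2.5 to 12, while the median only changes from 2.5 to 3.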

Example 2:

Why do we use the standard deviation instead of the variance?

There are many measures of dispersion, such as the range, mean deviation, absolute deviation, variance, and standard deviation (SD).

Each has some disadvantage. Let us see:

a) Range:

Suppose the data are 1, 2, 3, 4, 5, 6, ..., 100.

range = (100 - 1) = 99. The drawback is that it does not consider the middle values.

b) Mean deviation:

1, 2, 3, 4, 5 ----------> mean = 3

(1-3)+(2-3)+(3-3)+(4-3)+(5-3) = 0. If we add all the deviations from the mean, the sum is always zero.

c) Absolute deviation:

To avoid this zero, we take the modulus (absolute value) of each deviation.

But |x| is not differentiable at zero, so it is less convenient for mathematical calculations.

d) Variance:

Next comes the variance. It avoids the drawbacks of both the mean deviation and the absolute deviation, but because of the squared term it also squares the units. To avoid this drawback, we take the square root and use the standard deviation.
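
The measures above can be compared on the data 1, 2, 3, 4, 5 with a short Python sketch. It treats the five values as a whole population, which is an assumption made here so that the variance divides by n.

```python
import statistics

data = [1, 2, 3, 4, 5]
mean_value = statistics.mean(data)

# a) Range: uses only the two extreme values.
value_range = max(data) - min(data)

# b) Deviations from the mean always sum to zero.
deviations = [x - mean_value for x in data]
sum_of_deviations = sum(deviations)

# c) Mean absolute deviation: average of |x - mean|.
mean_abs_deviation = sum(abs(d) for d in deviations) / len(data)

# d) Population variance (squared units) and standard deviation (original units).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(value_range, sum_of_deviations, mean_abs_deviation, variance, std_dev)
```

The variance is 2 in squared units, and the standard deviation, its square root, is back in the units of the data.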

In this way, statistics is a beautiful subject for analyzing data and drawing conclusions from it.

Apart from this, statistical analysis depends on distributions and hypothesis testing.

Distributions:

Example:

In statistical experiments involving chance, outcomes occur randomly. As an example of such an experiment, a teacher randomly selects three students from a large batch of students to be tested on whether they will pass the exam.

Each selected student is rated as good or poor. The students are numbered from 1 to 3; a poor student is designated with a P and a good student with a G. The expression P1 G2 P3 denotes one particular outcome in which the first and third students are poor and the second student is good. Experiments like this produce discrete distributions. Important discrete distributions include the binomial distribution, the Poisson distribution, and the hypergeometric distribution.
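
The three-student experiment can be sketched in Python. The probability P_GOOD = 0.7 is an assumed value for illustration only, and the binomial model assumes the batch is large enough that the three selections are effectively independent.

```python
from itertools import product
from math import comb

# All possible outcomes for students 1 to 3: G = good, P = poor.
outcomes = ["".join(o) for o in product("GP", repeat=3)]

# Assumed probability that a randomly selected student is good.
P_GOOD = 0.7


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Distribution of the number of good students among the three selected.
pmf = {k: binomial_pmf(k, 3, P_GOOD) for k in range(4)}

print(outcomes)
print(pmf)
```

There are 2 x 2 x 2 = 8 outcomes, including PGP (the P1 G2 P3 case), and the four probabilities in pmf add up to 1.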

Alongside discrete distributions, we have continuous distributions.

Continuous Distributions:

Continuous distributions are constructed from continuous random variables in which values are taken on for every point over a given interval and are usually generated from experiments

Continuous distributions -----> experiments where outcomes are "measured"

Discrete distributions -----> experiments where outcomes are "counted"

In continuous distributions, probabilities are calculated as areas under the curve, and the total area under the curve = 1.
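
The area statement can be checked numerically for the standard normal distribution. This sketch uses a simple trapezoid rule written for illustration; the integration limits of -8 to 8 are an assumption that covers practically all of the curve.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def trapezoid_area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


# Total area under the density curve: should be very close to 1.
total_area = trapezoid_area(standard_normal.pdf, -8, 8)

# Probability of falling within one standard deviation of the mean.
area_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(total_area, area_within_one_sd)
```

The second value, roughly 0.68, is the first part of the empirical rule mentioned earlier.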

The many continuous distributions in statistics include the uniform distribution, the normal distribution, the exponential distribution, the t distribution, the chi-square distribution, and the F distribution.

Sampling and Sampling distributions:

Here we estimate population parameters such as the mean and standard deviation by using sample statistics. We draw many different samples, and each sample gives a value of the statistic. The probability distribution of this statistic is called its sampling distribution.
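
A small simulation shows the idea of a sampling distribution. Everything here is hypothetical: the population is generated at random, and the sample size of 30 and the 1,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 sleep times between 4 and 10 hours.
population = [random.uniform(4, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw many samples and record the mean of each one.
sample_size = 30
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(1_000)
]

# The sample means cluster around the population mean ...
mean_of_sample_means = statistics.mean(sample_means)

# ... and their spread is close to sigma / sqrt(n), the standard error.
standard_error_observed = statistics.stdev(sample_means)
standard_error_theory = statistics.pstdev(population) / sample_size**0.5

print(population_mean, mean_of_sample_means)
print(standard_error_observed, standard_error_theory)
```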

We estimate the population mean using the z statistic or the t statistic. Here we also learn about the point estimate and the interval estimate, also called the confidence interval.

Explanation:

A point estimate is a statistic taken from a sample that is used to estimate a population parameter. A point estimate is only as good as its sample. Because the sample statistic varies from sample to sample, we use an interval estimate.

CI = point estimate ± 2 × (standard error), which gives approximately a 95% confidence interval.
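
The interval can be computed as follows. The sample values are invented for illustration, and the multiplier 2 is the rough rule of thumb for about 95% confidence; for a sample this small, a t-based multiplier would be more appropriate.

```python
import math
import statistics

# Hypothetical sample of nightly sleep times in hours.
sample = [7.1, 6.8, 8.2, 7.5, 6.9, 7.8, 8.0, 7.2, 6.5, 7.6]

# Point estimate of the population mean.
point_estimate = statistics.mean(sample)

# Standard error of the mean: sample SD divided by sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Interval estimate: point estimate plus or minus 2 standard errors.
lower = point_estimate - 2 * standard_error
upper = point_estimate + 2 * standard_error

print(point_estimate, (lower, upper))
```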

Hypothesis Testing:

A hypothesis test asks whether or not a claim is valid. We state hypotheses about the population, not about the sample. We test the probability of the observed results under our assumptions; if that probability is small enough, the assumptions are probably wrong. This is called the RARE EVENT RULE.

Example:

Statistics never proves a claim right. We cannot say the person is innocent; we can only say NOT GUILTY.

So we have two possible decisions about the null hypothesis (Ho):

a) Reject the null hypothesis: I have enough evidence to conclude that Ho is wrong.

b) Fail to reject the null hypothesis: I do not have enough evidence to conclude that Ho is wrong.

The evidence comes from the sample data. Failing to reject does not mean accepting: the person may have committed the crime, but there is not enough evidence to claim it. Not guilty does not necessarily mean innocent.

Alternative hypothesis (H1): the claim we support when Ho is rejected.

We have to make the decision using evidence, and for that we need to know the significance level.

Significance level: alpha = 0.10 corresponds to a 90% confidence level, alpha = 0.05 to 95%, and alpha = 0.01 to 99%.

Critical Values:

Critical values separate the rejection region from the fail-to-reject region. For alpha = 0.05 in a right-tailed test, the critical value is Z = 1.645. Rejection region: if our test statistic z falls in this region, we can reject the null hypothesis. The critical value and the test statistic must be compared in the same units, such as z-scores, probabilities, or mean intervals.
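
The critical value quoted above can be reproduced with Python's NormalDist. The helper reject_null is a hypothetical name introduced here for a right-tailed z test.

```python
from statistics import NormalDist

alpha = 0.05

# For a right-tailed test, the critical value leaves an area of alpha
# in the upper tail of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - alpha)


def reject_null(z_statistic, critical=z_critical):
    """Return True when the test statistic falls in the rejection region."""
    return z_statistic > critical


print(round(z_critical, 3))
print(reject_null(2.0), reject_null(1.0))
```

A test statistic of 2.0 lies in the rejection region, while 1.0 does not.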

P-value Method:

The p-value is the probability associated with the test statistic (for example, with Z). In the traditional method:

(i) According to the significance level, we find the critical value.

(ii) Then, from the sample, we find the test statistic.

(iii) Next, we compare the critical value with the test statistic. In the p-value method we work only with the test statistic: we find the area in the tail beyond the test statistic, and this area is the p-value.

(iv) We compare this p-value with alpha.

(v) If p < alpha, reject Ho.

(vi) Instead of comparing the critical value with the test statistic, we compare the corresponding areas or probabilities:

alpha ---> area associated with the critical value, p ---> area associated with the test statistic

(vii) Reject Ho if p <= alpha; fail to reject Ho if p > alpha.

The larger the p-value, the further the test statistic lies from the rejection region.
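
The p-value steps can be sketched for a right-tailed z test. All the numbers are hypothetical: the claim Ho: mean = 8 hours against H1: mean > 8 hours, a sample mean of 8.4 from n = 36, and a population standard deviation of 1.2 that is assumed to be known.

```python
import math
from statistics import NormalDist

# Hypothetical inputs for a right-tailed z test.
mu_0 = 8.0         # mean claimed under Ho
sample_mean = 8.4  # observed sample mean
sigma = 1.2        # population standard deviation, assumed known
n = 36             # sample size
alpha = 0.05       # significance level

# Test statistic: how many standard errors the sample mean is above mu_0.
z_statistic = (sample_mean - mu_0) / (sigma / math.sqrt(n))

# p-value: area in the upper tail beyond the test statistic.
p_value = 1 - NormalDist().cdf(z_statistic)

# Decision rule from step (vii).
decision = "reject Ho" if p_value <= alpha else "fail to reject Ho"

print(round(z_statistic, 2), round(p_value, 4), decision)
```

Here z is about 2.0 and the p-value is about 0.023, which is below alpha = 0.05, so Ho is rejected.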

In this way, many statistical techniques are used to get good insights from data.

