Question

In: Statistics and Probability

Instructions:

  1. Choose a quantitative variable to study, with values such that the range is greater than or equal to 10. An example that you can't use might be "How many pairs of shoes do you own?" Do not give answer choices; use the answers you collect as your data. Note that the range (the largest number of shoes minus the smallest number of shoes) must be greater than or equal to 10. Make a plan that describes the population, the sample size, the type of sampling (no convenience sampling) and how it was conducted, the type of survey and how it was conducted, and the question to ask or what to observe. Then conduct a survey or observation and collect data consisting of at least 30 values.
  2. Find the 5-number summary and construct a boxplot. Find the mean and standard deviation.

Turn in:

1. Typed paragraph with your name and answers to the following questions: What is the quantitative variable you studied and the range of its values? What is your population and what is the size of your sample? Name the type of sampling you used and explain how you conducted the sampling. Name the type of survey you used and explain how you conducted your study. What question did you ask in your survey or how did you conduct your observations? (For your typed paragraph, do not write a list of questions and answers. The paragraph needs to be written in an essay format. For example: "In this survey, the population was xxxx, and the sample size was xxxx..." You will be graded on providing appropriate answers to each question and on the clarity, quality, and explanation of each answer.)

2. Include a sheet with your name, list of raw data, 5-number summary, boxplot, mean, and standard deviation. Show all work to find the numbers in the 5-number summary.

My survey results were as follows:

How much money do you spend weekly on groceries?

$25 or less: 5 people

$50 or less: 19 people

$100 or less: 3 people

$150 or less: 3 people

Solutions

Expert Solution

Quantitative data are measures of values or counts and are expressed as numbers. Some examples of quantitative data are your height, your weight, your shoe size, the length of your fingernails, and your exam score.

Suppose we choose the weight of boys aged 18 - 20 years in city A.

We wish to record the weights of boys aged 18 - 20 years and check how many of them are healthy, overweight or underweight.

"In this survey, the population was the 200 boys aged 18 - 20 years in city A, and the sample size was 30."

A simple random sample of size 30 is drawn from this population: 30 of the 200 boys are selected at random, and the weight of each selected boy (in Kg) is recorded.

The sample of size 30 is then used to estimate the weight of boys aged 18 - 20 in the city.

{Note: since we cannot conduct a real survey here, we need to generate some data for it. You can use a random number table, or software such as R or Excel, to generate random numbers and draw a sample from them.}

I will use R to generate the random data and to draw a random sample from the generated data.

R-code

{

# to generate data we use runif() so that data is generated uniformly between 40 kg and 110 kg
> G_Data=runif(200,40,110)            # generated data for survey

# let us round off our data to zero decimals
> Population=round(G_Data,0)          # will be our data for weight of all boys

> Population                          # can be used as our raw data

[1] 91 89 45 75 109 80 96 82 82 44 98 87 48 58 60 97 106 102
[19] 72 61 68 57 70 84 51 82 44 47 110 65 101 102 101 100 45 64
[37] 46 67 53 44 91 99 76 55 63 49 57 80 95 76 78 95 42 94
[55] 110 102 52 55 44 43 82 78 61 89 104 65 47 66 59 42 77 69
[73] 48 88 62 66 76 67 99 49 105 74 83 44 93 91 83 108 108 106
[91] 81 72 72 91 76 88 63 46 69 70 62 88 75 73 51 98 48 50
[109] 88 95 91 103 55 45 47 108 92 108 75 97 59 84 52 99 46 103
[127] 47 76 71 56 72 60 91 76 95 47 66 70 69 46 57 108 88 96
[145] 70 110 72 93 87 100 86 98 69 96 67 70 87 71 42 88 96 53
[163] 64 105 65 100 98 54 82 87 61 76 71 51 100 50 62 49 94 87
[181] 89 66 48 62 93 104 49 44 92 109 81 75 107 67 93 66 81 86
[199] 103 87

# now we calculate the range of our data to verify that it is at least 10

> max(Population)                   # highest observation
[1] 110

> min(Population)                    # lowest observation
[1] 42

# note: range = highest observation - lowest observation

> Range=max(Population) - min(Population)

> Range
[1] 68

}

Thus the range is 68, which is greater than 10. The variable "Population" will act as the population for our study.

Now we draw a simple random sample of size 30 from it.

From R

{

# sample(x, n) draws a random sample of size n from x (without replacement by default)

> spl=sample(Population , 30)      # to draw a sample of 30 from the population

> spl                                          # sample of size 30
[1] 47 91 68 44 63 42 87 48 89 70 105 100 110 101 65 50 89 69 44
[20] 76 87 91 91 106 70 75 58 76 74 97

}

These are the sampled weights (Kg) of the boys used for our survey.

Sr No.  Weight (Kg)    Sr No.  Weight (Kg)    Sr No.  Weight (Kg)
1       47             11      105            21      87
2       91             12      100            22      91
3       68             13      110            23      91
4       44             14      101            24      106
5       63             15      65             25      70
6       42             16      50             26      75
7       87             17      89             27      58
8       48             18      69             28      76
9       89             19      44             29      74
10      70             20      76             30      97
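The draw above cannot be repeated exactly, because no random seed was recorded. A minimal sketch of a repeatable version follows, assuming an arbitrary seed: the seed value 2024 and the names `population`, `spl_check` and `sample_range` are illustrative and not part of the original run. It also checks the two assignment requirements (at least 30 values, range of at least 10).

```r
# Illustrative seed (an assumption); any fixed integer makes the run repeatable.
set.seed(2024)

# Simulated population: 200 weights, uniform between 40 and 110 kg, rounded.
population <- round(runif(200, min = 40, max = 110), 0)

# Simple random sample of 30, drawn without replacement (the default).
spl_check <- sample(population, size = 30)

# Assignment requirements: at least 30 values and a range of at least 10.
sample_range <- max(spl_check) - min(spl_check)
print(length(spl_check) >= 30)
print(sample_range >= 10)
```

The values produced by this sketch will differ from the table above; only the procedure is the same.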

To compute 5-number summary, box-plot, mean, and standard deviation

i)

A 5-number summary consists of five values: the most extreme values in the data set (the maximum and minimum values), the lower and upper quartiles, and the median.

Sort the data given above in ascending order ( n = 30, which is even ):

42 44 44 47 48 50 58 63 65 68 69 70 70 74 75

76 76 87 87 89 89 91 91 91 97 100 101 105 106 110

From the sorted data we can see

Extreme values: maximum value = 110

                        minimum value = 42

Since n is even,

Median = { (n/2)th obs + (n/2 + 1)th obs } / 2 = { 15th obs + 16th obs } / 2

       = { 75 + 76 } / 2

       = 75.5

Lower quartile = median of the data below the median value, i.e. median of the values less than 75.5 ( n = 15, odd )

data less than 75.5 :-   42 44 44 47 48 50 58 63 65 68 69 70 70 74 75

Lower quartile = ( (15 + 1) / 2 )th obs = 8th obs = 63

Upper quartile = median of the data above the median value, i.e. median of the values greater than 75.5 ( n = 15, odd )

data greater than 75.5 :-   76 76 87 87 89 89 91 91 91 97 100 101 105 106 110

Upper quartile = ( (15 + 1) / 2 )th obs = 8th obs = 91

5-number summary is

Minimum    Lower Quartile    Median    Upper Quartile    Maximum
42         63                75.5      91                110
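The hand calculation can be cross-checked in R. A minimal sketch, assuming the 30 sampled weights are re-entered as the vector `spl`: `fivenum()` returns Tukey's five numbers, which for this sample coincide with the median-of-halves method used above. The default `quantile()` interpolates between observations, so it reports a slightly different lower quartile.

```r
# The 30 sampled weights (Kg), re-entered from the table above.
spl <- c(47, 91, 68, 44, 63, 42, 87, 48, 89, 70, 105, 100, 110, 101, 65,
         50, 89, 69, 44, 76, 87, 91, 91, 106, 70, 75, 58, 76, 74, 97)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max.
five <- fivenum(spl)
names(five) <- c("Min", "Q1", "Median", "Q3", "Max")
print(five)
# Min = 42, Q1 = 63, Median = 75.5, Q3 = 91, Max = 110

# Default quantile() (type 7) interpolates: 25% is 63.5, 75% is 91.
print(quantile(spl, c(0.25, 0.75)))
```

Either convention is acceptable as long as the method used is stated with the answer.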

ii)

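To draw the box plot of the sample

{Note: the plot can be drawn manually, or with software as follows.}

{

# box plot of the 30 sampled weights
> boxplot(spl,horizontal=TRUE,col=2,xlab="Weight")

# optional: bar plot of the individual observations
> SR_No=1:30
> barplot(spl,col=2,names.arg=SR_No,xlab="Sample_No",ylab="Weight")

# optional: histogram of the weights
> hist(spl,col=2,xlab="Weight")

}

Note: the box plot, bar plot and histogram can also be drawn manually.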
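A box plot usually flags as outliers any points lying more than 1.5 times the interquartile range beyond the quartiles. As a rough check, assuming the sample is stored in `spl` and using the quartiles found in part i), the fences can be computed as follows (the names `lower_fence`, `upper_fence` and `outliers` are illustrative):

```r
# The 30 sampled weights (Kg).
spl <- c(47, 91, 68, 44, 63, 42, 87, 48, 89, 70, 105, 100, 110, 101, 65,
         50, 89, 69, 44, 76, 87, 91, 91, 106, 70, 75, 58, 76, 74, 97)

# Quartiles taken from the 5-number summary in part i).
q1 <- 63
q3 <- 91

iqr <- q3 - q1                     # 28
lower_fence <- q1 - 1.5 * iqr      # 21
upper_fence <- q3 + 1.5 * iqr      # 133

# Observations outside the fences would be drawn as separate points.
outliers <- spl[spl < lower_fence | spl > upper_fence]
print(outliers)                    # numeric(0): no outliers
```

Since no value falls outside the fences, the whiskers of the box plot extend to the minimum (42) and the maximum (110).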

iii) Mean

mean = $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ ; n = 30

where $x_i$ are the sample values drawn above

mean = ( 47 + 91 + 68 + 44 + 63 + ..... + 75 + 58 + 76 + 74 + 97 ) / 30

          = 2283 / 30

          = 76.1

Thus Mean = 76.1
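The arithmetic can be confirmed in R. A short sketch, assuming the sample is re-entered as `spl` (the names `total`, `n` and `xbar` are illustrative):

```r
# The 30 sampled weights (Kg).
spl <- c(47, 91, 68, 44, 63, 42, 87, 48, 89, 70, 105, 100, 110, 101, 65,
         50, 89, 69, 44, 76, 87, 91, 91, 106, 70, 75, 58, 76, 74, 97)

total <- sum(spl)       # 2283
n <- length(spl)        # 30
xbar <- total / n       # 76.1, the same value mean(spl) returns
print(c(total = total, n = n, mean = xbar))
```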

iv) Standard deviation

Standard deviation = $\sqrt{\mathrm{var}(x)}$

var(x) = $\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ ; where n = 30 and $\bar{x}$ = 76.1

          = { ( 47 - 76.1 )^2 + ( 91 - 76.1 )^2 + ( 68 - 76.1 )^2 + ( 44 - 76.1 )^2 + ....... + ( 74 - 76.1 )^2 + ( 97 - 76.1 )^2 } / ( 30 - 1 )

          = 12102.7 / 29

var(x) = 417.3345

Standard deviation = $\sqrt{\mathrm{var}(x)} = \sqrt{417.3345}$ = 20.42877
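The same steps can be reproduced in R. A minimal sketch, assuming the sample is re-entered as `spl` (the names `ss`, `v` and `s` are illustrative); the built-in `var()` and `sd()` use the same n - 1 divisor:

```r
# The 30 sampled weights (Kg).
spl <- c(47, 91, 68, 44, 63, 42, 87, 48, 89, 70, 105, 100, 110, 101, 65,
         50, 89, 69, 44, 76, 87, 91, 91, 106, 70, 75, 58, 76, 74, 97)

xbar <- mean(spl)               # 76.1
ss <- sum((spl - xbar)^2)       # sum of squared deviations, 12102.7
v <- ss / (length(spl) - 1)     # sample variance, about 417.3345
s <- sqrt(v)                    # sample standard deviation, about 20.42877
print(c(sum_sq = ss, variance = v, sd = s))
```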

My survey results for the sample were as follows:

Weight of boys in city A aged 18 - 20 years

                 50 Kg or less : 6 Boys            ( underweight )

Between 55 Kg and 80 Kg : 11 Boys           ( average, can be considered healthy )

      90 Kg or more : 9 Boys          ( overweight )

The remaining 4 boys ( 87, 87, 89 and 89 Kg ) fall between 80 Kg and 90 Kg and are not covered by these categories.
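The tallies can be recomputed directly from the sample. A small sketch, assuming `spl` holds the 30 weights and that each band includes its end points (the original does not say whether the boundaries are inclusive; the name `counts` is illustrative):

```r
# The 30 sampled weights (Kg).
spl <- c(47, 91, 68, 44, 63, 42, 87, 48, 89, 70, 105, 100, 110, 101, 65,
         50, 89, 69, 44, 76, 87, 91, 91, 106, 70, 75, 58, 76, 74, 97)

# Count the observations in each weight band (end points included).
counts <- c(
  "50 Kg or less" = sum(spl <= 50),
  "55 to 80 Kg"   = sum(spl >= 55 & spl <= 80),
  "90 Kg or more" = sum(spl >= 90)
)
print(counts)

# Observations that fall in none of the three bands.
print(length(spl) - sum(counts))
```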

The mean estimate of weight is 76.1 Kg.

5-number summary is

Minimum    Lower Quartile    Median    Upper Quartile    Maximum
42         63                75.5      91                110
