What is the relevant outcome of running two or more tests on a particular experimental research design?
In: Math
3. Use `sample()` to generate tosses of a biased coin with $Pr(Head) = 0.6$.
i) get a sample of size 10 tosses and tally the results
ii) get a sample of size 30 tosses and tally the results
iii) get a sample of size 100 tosses and tally the results
iv) what do you notice about the proportion of heads in each sample?
### Code chunk
```{r}
# start your code
# last R code line
```
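One possible way to approach parts (i) to (iii) is sketched below. The helper name `toss`, the labels `"H"`/`"T"`, and the seed are illustrative choices, not part of the exercise.

```r
set.seed(1)  # arbitrary seed, only so that a rerun gives the same tosses

# Hypothetical helper: n tosses of a coin with Pr(Head) = p_head
toss <- function(n, p_head = 0.6) {
  sample(c("H", "T"), size = n, replace = TRUE,
         prob = c(p_head, 1 - p_head))
}

sizes <- c(10, 30, 100)

# Tally each sample and keep its proportion of heads
prop_heads <- sapply(sizes, function(n) {
  x <- toss(n)
  print(table(x))   # tally of heads and tails for this sample size
  mean(x == "H")    # proportion of heads
})
names(prop_heads) <- sizes
prop_heads
```

For part (iv), the proportions in `prop_heads` can be compared against 0.6; larger samples would typically land closer to it, though any single run can deviate.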
In: Math
For this exercise, you will need to use the package `mosaic` to find numerical and graphical summaries.
```{r warning=FALSE, message=FALSE}
# install packages if necessary
if (!require(mosaic)) install.packages("mosaic")
if (!require(dplyr)) install.packages("dplyr")
if (!require(gapminder)) install.packages("gapminder")
# load the packages in R
library(mosaic) # load the package mosaic to use its functions
library(dplyr) # load the package dplyr to use its functions
library(gapminder) # load the package gapminder for question 1
```
1. Using the gapminder data in the lesson, do the following:
i) use `filter` to select all countries with the following
arguments:
a) life expectancy larger than 60 years.
b) United Kingdom and Vietnam and years greater than 1990.
ii) use `arrange` and `slice` to select the countries with the top
15 GDP per capita `gdpPercap`. Use the pipe `%>%` operator to
string multiple functions.
iii) use `mutate` to create a new variable called
`gdpPercap_lifeExp` which is the quotient of `gdpPercap` and
`lifeExp` and display the output.
iv) use `summarise` to find the average or mean value of the
variable `gdpPercap_lifeExp` created in part (iii).
v) use `group_by` to group the countries by `continent`; and
`summarise` to compute the average life expectancy `lifeExp` within
each continent. Use the pipe `%>%` operator to string multiple
functions.
### Code chunk
```{r}
# load the necessary packages
library(mosaic)
library(dplyr)
library(gapminder)
# last R code line
```
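A minimal sketch of how the five parts might be written with `dplyr` is shown here. The object names (`long_lived`, `uk_vn`, `top15`, `with_ratio`, `by_continent`) are made up for illustration. Since each row of `gapminder` is a country-year, "top 15" is read here as the 15 rows with the highest `gdpPercap`, which is one interpretation of part (ii).

```r
library(dplyr)
library(gapminder)

# i-a) rows with life expectancy above 60 years
long_lived <- gapminder %>% filter(lifeExp > 60)

# i-b) United Kingdom and Vietnam, years after 1990
uk_vn <- gapminder %>%
  filter(country %in% c("United Kingdom", "Vietnam"), year > 1990)

# ii) the 15 rows with the highest GDP per capita
top15 <- gapminder %>%
  arrange(desc(gdpPercap)) %>%
  slice(1:15)

# iii) new variable: quotient of gdpPercap and lifeExp
with_ratio <- gapminder %>%
  mutate(gdpPercap_lifeExp = gdpPercap / lifeExp)
head(with_ratio)

# iv) mean of the new variable
with_ratio %>% summarise(mean_ratio = mean(gdpPercap_lifeExp))

# v) average life expectancy within each continent
by_continent <- gapminder %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp))
by_continent
```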
In: Math
2. The data set `MLB-TeamBatting-S16.csv` contains MLB Team Batting
Data for selected variables. Load the data set from the given url
using the code below. This data set was obtained from [Baseball
Reference](https://www.baseball-reference.com/leagues/MLB/2016-standard-batting.shtml).
* Tm - Team
* Lg - League: American League (AL), National League (NL)
* BatAge - Batters’ average age
* RPG - Runs Scored Per Game
* G - Games Played or Pitched
* AB - At Bats
* R - Runs Scored/Allowed
* H - Hits/Hits Allowed
* HR - Home Runs Hit/Allowed
* RBI - Runs Batted In
* SO - Strikeouts
* BA - Hits/At Bats
* SH - Sacrifice Hits (Sacrifice Bunts)
* SF - Sacrifice Flies
Using the `mlb16.data` data, do the following:
i) use `filter` to select teams with the following arguments:
a) Cardinals team `STL`.
b) teams with Hits `H` more than 1400 in the 2016 season.
c) team league `Lg` is National League `NL`.
ii) use `arrange` to select teams in decreasing number of home runs
`HR`.
iii) use `arrange` to display the teams in decreasing number of
`RBI`.
iv) use `group_by` to group the teams per league; and `summarise`
to compute the average `RBI` within each league. Use the pipe
`%>%` operator to string multiple functions.
### Code chunk
```{r}
# load the data set
mlb16.data <-
read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv")
str(mlb16.data) # check structure
head(mlb16.data) # show first six rows
# last R code line
```
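The same verbs can be tried offline first. The sketch below uses a small stand-in data frame named `toy` whose teams and numbers are invented; only the column names (`Tm`, `Lg`, `H`, `HR`, `RBI`) come from the exercise. With the real data, `toy` would be replaced by `mlb16.data` and `"T2"` by `"STL"`.

```r
library(dplyr)

# Hypothetical stand-in for mlb16.data: invented teams and numbers
toy <- data.frame(
  Tm  = c("T1", "T2", "T3", "T4"),
  Lg  = c("AL", "NL", "NL", "AL"),
  H   = c(1350, 1420, 1415, 1390),
  HR  = c(200, 180, 225, 150),
  RBI = c(700, 680, 750, 640)
)

# i) filter on a team code, on hits, and on league
one_team  <- toy %>% filter(Tm == "T2")   # real data: Tm == "STL"
many_hits <- toy %>% filter(H > 1400)
nl_teams  <- toy %>% filter(Lg == "NL")

# ii) and iii) order teams by decreasing HR and by decreasing RBI
by_hr  <- toy %>% arrange(desc(HR))
by_rbi <- toy %>% arrange(desc(RBI))

# iv) average RBI within each league
league_rbi <- toy %>%
  group_by(Lg) %>%
  summarise(mean_RBI = mean(RBI))
league_rbi
```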
In: Math
Suppose we roll two standard 6-sided dice.
(a) Compute the expected value of the sum of the rolls.
(b) Compute the variance of the sum of the rolls.
(c) If X represents the maximum value that appears in the two rolls, what is the expected value of X? What’s the probability of sum = 7?
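Because two fair dice have only 36 equally likely outcomes, all of these quantities can be checked by brute-force enumeration. The sketch below assumes fair, independent dice; the variable names are illustrative.

```r
# All 36 equally likely outcomes of two fair dice
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)

s <- rolls$d1 + rolls$d2          # sum of the two rolls
m <- pmax(rolls$d1, rolls$d2)     # maximum of the two rolls

e_sum   <- mean(s)                # (a) expected value of the sum
var_sum <- mean(s^2) - mean(s)^2  # (b) population variance of the sum
e_max   <- mean(m)                # (c) expected value of the maximum X
p_sum7  <- mean(s == 7)           # (c) probability that the sum is 7

c(e_sum = e_sum, var_sum = var_sum, e_max = e_max, p_sum7 = p_sum7)
```

Taking `mean()` over the enumerated outcomes works only because each of the 36 rows has the same probability, 1/36.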
In: Math
How do you assess data stewardship considerations related to data? And how are data-related issues identified, managed, and resolved?
In: Math
Air traffic controllers perform the vital function of regulating the traffic of passenger planes. Frequently, air traffic controllers work long hours with little sleep. Researchers wanted to test their ability to make basic decisions as they become increasingly sleep deprived. To test their abilities, a sample of 6 air traffic controllers is selected and given a decision-making skills test following 12-hour, 24-hour, and 48-hour sleep deprivation. Higher scores indicate better decision-making skills. The table lists the hypothetical results of this study.
Sleep Deprivation

12 Hours | 24 Hours | 48 Hours |
---|---|---|
24 | 18 | 17 |
19 | 23 | 21 |
35 | 23 | 23 |
28 | 21 | 14 |
23 | 15 | 17 |
22 | 22 | 15 |
(a) Complete the F-table. (Round your answers to two decimal places.)
Source of Variation | SS | df | MS | Fobt |
---|---|---|---|---|
Between groups | | | | |
Between persons | | | | |
Within groups (error) | | | | |
Total | | | | |
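The F-table entries can be computed from the score table with the usual one-way within-subjects partition: total variation splits into between-groups, between-persons, and error. The sketch below assumes each row of the table is one controller measured under all three conditions; the column labels `h12`, `h24`, `h48` are made up for the code.

```r
# Rows = the 6 controllers, columns = sleep-deprivation conditions
scores <- matrix(
  c(24, 18, 17,
    19, 23, 21,
    35, 23, 23,
    28, 21, 14,
    23, 15, 17,
    22, 22, 15),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("h12", "h24", "h48"))
)

n <- nrow(scores)   # persons
k <- ncol(scores)   # groups (conditions)
grand <- mean(scores)

ss_total <- sum((scores - grand)^2)
ss_bg    <- n * sum((colMeans(scores) - grand)^2)  # between groups
ss_bp    <- k * sum((rowMeans(scores) - grand)^2)  # between persons
ss_err   <- ss_total - ss_bg - ss_bp               # within groups (error)

df_bg  <- k - 1
df_bp  <- n - 1
df_err <- (k - 1) * (n - 1)

ms_bg  <- ss_bg / df_bg
ms_err <- ss_err / df_err
f_obt  <- ms_bg / ms_err

round(c(SS_bg = ss_bg, SS_bp = ss_bp, SS_err = ss_err, SS_total = ss_total,
        MS_bg = ms_bg, MS_err = ms_err, F_obt = f_obt), 2)
```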
In: Math
How much does a sleeping bag cost? Let's say you want a sleeping bag that should keep you warm in temperatures from 20°F to 45°F. A random sample of prices ($) for sleeping bags in this temperature range is given below. Assume that the population of x values has an approximately normal distribution.
35 | 110 | 65 | 90 | 90 | 35 | 30 | 23 | 100 | 110 |
105 | 95 | 105 | 60 | 110 | 120 | 95 | 90 | 60 | 70 |
(a) Use a calculator with mean and sample standard deviation keys to find the sample mean price x̄ and sample standard deviation s. (Round your answers to two decimal places.)
x̄ = | $ |
s = | $ |
(b) Using the given data as representative of the population of
prices of all summer sleeping bags, find a 90% confidence interval
for the mean price μ of all summer sleeping bags. (Round
your answers to two decimal places.)
lower limit | $ |
upper limit | $ |
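With σ unknown and an approximately normal population, a Student's t interval is a natural fit for part (b). The sketch below builds the interval by hand and cross-checks it against `t.test()`; the variable names are illustrative.

```r
price <- c(35, 110, 65, 90, 90, 35, 30, 23, 100, 110,
           105, 95, 105, 60, 110, 120, 95, 90, 60, 70)

n     <- length(price)
x_bar <- mean(price)   # sample mean
s     <- sd(price)     # sample standard deviation

# 90% interval: x_bar +/- t * s / sqrt(n), with n - 1 degrees of freedom
t_crit <- qt(0.95, df = n - 1)
ci90   <- x_bar + c(-1, 1) * t_crit * s / sqrt(n)

round(c(mean = x_bar, sd = s, lower = ci90[1], upper = ci90[2]), 2)

# Cross-check with the built-in one-sample t interval
t.test(price, conf.level = 0.90)$conf.int
```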
In: Math
The method of tree ring dating gave the following years A.D. for an archaeological excavation site. Assume that the population of x values has an approximately normal distribution.
1194 | 1292 | 1285 | 1292 | 1268 | 1316 | 1275 | 1317 | 1275 |
(b) Find a 90% confidence interval for the mean of all tree ring dates from this archaeological site. (Round your answers to the nearest whole number.)
lower limit | A.D. |
upper limit | A.D. |
In: Math
In a sample of 100 pigs from a large population the following gains in weight (kg) during a 50-day interval were recorded:
36 | 23 | 25 | 21 | 28 | 17 | 35 | 32 | 39 | 30 |
7 | 31 | 24 | 26 | 47 | 30 | 30 | 19 | 39 | 22 |
29 | 36 | 43 | 21 | 34 | 57 | 33 | 36 | 26 | 44 |
41 | 19 | 23 | 41 | 11 | 41 | 45 | 33 | 33 | 33 |
13 | 35 | 18 | 26 | 42 | 30 | 33 | 18 | 26 | 31 |
37 | 34 | 22 | 40 | 37 | 18 | 40 | 14 | 43 | 28 |
30 | 42 | 49 | 27 | 15 | 31 | 29 | 29 | 12 | 16 |
48 | 27 | 28 | 20 | 30 | 46 | 19 | 53 | 29 | 24 |
17 | 21 | 25 | 35 | 42 | 31 | 34 | 38 | 20 | 38 |
30 | 26 | 39 | 24 | 33 | 32 | 27 | 25 | 30 | 30 |
b. What's the probability of randomly selecting a pig that added at least 44kg to its weight during the test? How does this predicted number (predicted proportion/percent) compare with the actual number? *Hint: remember there are 100 total samples*
c. What's the probability that a pig would increase no less than 10kg and no more than 47kg?
d. Construct a 99% confidence interval for this data.
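One way to read "predicted" versus "actual" in parts b and c is to model the gains as normal with the sample mean and standard deviation, then compare the model probability with the observed proportion. That normal model is an assumption here, not something the question states. Part d is sketched as a t interval for the mean gain.

```r
gain <- c(
  36, 23, 25, 21, 28, 17, 35, 32, 39, 30,
   7, 31, 24, 26, 47, 30, 30, 19, 39, 22,
  29, 36, 43, 21, 34, 57, 33, 36, 26, 44,
  41, 19, 23, 41, 11, 41, 45, 33, 33, 33,
  13, 35, 18, 26, 42, 30, 33, 18, 26, 31,
  37, 34, 22, 40, 37, 18, 40, 14, 43, 28,
  30, 42, 49, 27, 15, 31, 29, 29, 12, 16,
  48, 27, 28, 20, 30, 46, 19, 53, 29, 24,
  17, 21, 25, 35, 42, 31, 34, 38, 20, 38,
  30, 26, 39, 24, 33, 32, 27, 25, 30, 30
)

m <- mean(gain)
s <- sd(gain)

# b) at least 44 kg: normal-model probability vs observed proportion
p_model_44 <- pnorm(44, mean = m, sd = s, lower.tail = FALSE)
p_obs_44   <- mean(gain >= 44)

# c) between 10 kg and 47 kg inclusive
p_model_10_47 <- pnorm(47, mean = m, sd = s) - pnorm(10, mean = m, sd = s)
p_obs_10_47   <- mean(gain >= 10 & gain <= 47)

# d) 99% confidence interval for the mean gain
ci99 <- t.test(gain, conf.level = 0.99)$conf.int

round(c(model_44 = p_model_44, observed_44 = p_obs_44,
        model_10_47 = p_model_10_47, observed_10_47 = p_obs_10_47), 3)
ci99
```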
In: Math
Year | Tornadoes | Census |
---|---|---|
1953 | 421 | 158956 |
1954 | 550 | 161884 |
1955 | 593 | 165069 |
1956 | 504 | 168088 |
1957 | 856 | 171187 |
1958 | 564 | 174149 |
1959 | 604 | 177135 |
1960 | 616 | 179979 |
1961 | 697 | 182992 |
1962 | 657 | 185771 |
1963 | 464 | 188483 |
1964 | 704 | 191141 |
1965 | 906 | 193526 |
1966 | 585 | 195576 |
1967 | 926 | 197457 |
1968 | 660 | 199399 |
1969 | 608 | 201385 |
1970 | 653 | 203984 |
1971 | 888 | 206827 |
1972 | 741 | 209284 |
1973 | 1102 | 211357 |
1974 | 947 | 213342 |
1975 | 920 | 215465 |
1976 | 835 | 217563 |
1977 | 852 | 219760 |
1978 | 788 | 222095 |
1979 | 852 | 224567 |
1980 | 866 | 227225 |
1981 | 783 | 229466 |
1982 | 1046 | 231664 |
1983 | 931 | 233792 |
1984 | 907 | 235825 |
1985 | 684 | 237924 |
1986 | 764 | 240133 |
1987 | 656 | 242289 |
1988 | 702 | 244499 |
1989 | 856 | 246819 |
1990 | 1133 | 249623 |
1991 | 1132 | 252981 |
1992 | 1298 | 256514 |
1993 | 1176 | 259919 |
1994 | 1082 | 263126 |
1995 | 1235 | 266278 |
1996 | 1173 | 269394 |
1997 | 1148 | 272647 |
1998 | 1449 | 275854 |
1999 | 1340 | 279040 |
2000 | 1075 | 282224 |
2001 | 1215 | 285318 |
2002 | 934 | 288369 |
2003 | 1374 | 290447 |
2004 | 1817 | 293191 |
2005 | 1265 | 295895 |
2006 | 1103 | 298754 |
2007 | 1096 | 301621 |
2008 | 1692 | 304059 |
2009 | 1156 | 308746 |
2010 | 1282 | 309347 |
2011 | 1691 | 311722 |
2012 | 938 | 314112 |
2013 | 907 | 316498 |
2014 | 888 | 318857 |
Is the number of tornadoes increasing? In the last homework, data on the number of tornadoes in the United States between 1953 and 2014 were analyzed to see if there was a linear trend over time. Some argue that it’s not the number of tornadoes increasing over time, but rather the probability of sighting them because there are more people living in the United States. Let’s investigate this by including the U.S. census count (in thousands) as an additional explanatory variable (data in EX11-24TWISTER.csv).
Fit one simple linear regression (SLR) model with year as the predictor and another with census count as the predictor. Write down the two fitted models. Are year and census count each significant?
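A sketch of the two fits with `lm()` follows. Since the file EX11-24TWISTER.csv is not available here and its column names are not given, the data frame `twister` is rebuilt from the table above with the assumed column names `Year`, `Tornadoes`, and `Census`.

```r
twister <- data.frame(
  Year = 1953:2014,
  Tornadoes = c(
    421, 550, 593, 504, 856, 564, 604, 616, 697, 657,
    464, 704, 906, 585, 926, 660, 608, 653, 888, 741,
    1102, 947, 920, 835, 852, 788, 852, 866, 783, 1046,
    931, 907, 684, 764, 656, 702, 856, 1133, 1132, 1298,
    1176, 1082, 1235, 1173, 1148, 1449, 1340, 1075, 1215, 934,
    1374, 1817, 1265, 1103, 1096, 1692, 1156, 1282, 1691, 938,
    907, 888),
  Census = c(   # U.S. census count, in thousands
    158956, 161884, 165069, 168088, 171187, 174149, 177135, 179979, 182992, 185771,
    188483, 191141, 193526, 195576, 197457, 199399, 201385, 203984, 206827, 209284,
    211357, 213342, 215465, 217563, 219760, 222095, 224567, 227225, 229466, 231664,
    233792, 235825, 237924, 240133, 242289, 244499, 246819, 249623, 252981, 256514,
    259919, 263126, 266278, 269394, 272647, 275854, 279040, 282224, 285318, 288369,
    290447, 293191, 295895, 298754, 301621, 304059, 308746, 309347, 311722, 314112,
    316498, 318857)
)

# Two separate simple linear regressions
fit_year   <- lm(Tornadoes ~ Year, data = twister)
fit_census <- lm(Tornadoes ~ Census, data = twister)

# Estimates, standard errors, t statistics, and p-values
summary(fit_year)$coefficients
summary(fit_census)$coefficients
```

The intercept and slope rows of each coefficient table give the fitted model, and the `Pr(>|t|)` column for the slope is the p-value used to judge significance of that predictor.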
In: Math
Consider a branching process in which the offspring distribution is Poisson with mean 1.3. Compute the (finite-time) extinction probabilities $u_n = P\{X_n = 0 \mid X_0 = 1\}$ for $n = 0, 1, \ldots, 5$. Also compute the probability of ultimate extinction $u_\infty$.
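Assuming the question describes a Galton-Watson branching process, the extinction probabilities satisfy $u_0 = 0$ and $u_n = G(u_{n-1})$, where $G(s) = e^{\lambda(s-1)}$ is the Poisson probability generating function, and $u_\infty$ is the smallest fixed point of $G$ in $[0, 1]$. A minimal numerical sketch, with illustrative names and an arbitrary tolerance:

```r
lambda <- 1.3

# Probability generating function of a Poisson(lambda) offspring count
pgf <- function(s) exp(lambda * (s - 1))

# Finite-time extinction probabilities u_0, ..., u_5
u <- numeric(6)
u[1] <- 0                        # u_0 = 0 because X_0 = 1
for (i in 2:6) u[i] <- pgf(u[i - 1])
names(u) <- paste0("u", 0:5)
u

# Ultimate extinction: iterate u <- G(u) from 0 until it stops changing
u_inf <- 0
repeat {
  nxt <- pgf(u_inf)
  if (abs(nxt - u_inf) < 1e-12) break
  u_inf <- nxt
}
u_inf
```

Starting the iteration at 0 matters: it converges upward to the smallest fixed point, whereas 1 is always a fixed point of $G$ but is not the extinction probability when the mean exceeds 1.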
In: Math
Construct the confidence interval for the population standard deviation for the given values. Round your answers to one decimal place.
n=20, s=4.2, and c=0.99
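For a normal population, the standard interval for σ comes from the chi-square distribution of $(n-1)s^2/\sigma^2$ with $n-1$ degrees of freedom. The sketch below assumes normality, which the question does not state explicitly; the variable names are illustrative.

```r
n    <- 20
s    <- 4.2
conf <- 0.99

alpha <- 1 - conf
df    <- n - 1

# The larger chi-square quantile gives the lower limit, and vice versa
lower <- sqrt(df * s^2 / qchisq(1 - alpha / 2, df))
upper <- sqrt(df * s^2 / qchisq(alpha / 2, df))

round(c(lower = lower, upper = upper), 1)
```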
In: Math
In the 1996 General Social Survey, for males age 30 and over, the following was true about respondents:
* 11% of those in the lowest income quartile were college graduates.
* 19% of those in the second income quartile were college graduates.
* 31% of those in the third income quartile were college graduates.
* 53% of those in the highest income quartile were college graduates.
Find P(Q1|G), the probability that a randomly selected college graduate falls in the lowest income quartile. Also find P(Q2|G), P(Q3|G), and P(Q4|G). Discuss how this distribution compares to the unconditional distribution P(Q1), P(Q2), P(Q3), P(Q4).
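This is a direct application of Bayes' rule, using the fact that each quartile holds 25% of respondents by definition. A short sketch with illustrative variable names:

```r
# Unconditional distribution: each quartile has probability 0.25
p_q <- c(Q1 = 0.25, Q2 = 0.25, Q3 = 0.25, Q4 = 0.25)

# P(G | Qi): share of college graduates within each quartile
p_g_given_q <- c(Q1 = 0.11, Q2 = 0.19, Q3 = 0.31, Q4 = 0.53)

# Law of total probability: overall share of graduates
p_g <- sum(p_g_given_q * p_q)

# Bayes' rule: P(Qi | G) = P(G | Qi) * P(Qi) / P(G)
p_q_given_g <- p_g_given_q * p_q / p_g

round(p_q_given_g, 3)
```

Comparing `p_q_given_g` with the flat 0.25 values in `p_q` shows how strongly graduates are shifted toward the upper quartiles.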
In: Math
Sales personnel for Skillings Distributors submit weekly reports listing the customer contacts made during the week. A sample of 85 weekly reports showed a sample mean of 17.5 customer contacts per week. The sample standard deviation was 5.7. Provide 90% and 95% confidence intervals for the population mean number of weekly customer contacts for the sales personnel.
90% confidence interval, to 2 decimals:
_____ , _____
95% confidence interval, to 2 decimals:
_____ , _____
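When only summary statistics are given, the interval can be built directly from the formula $\bar{x} \pm t \cdot s/\sqrt{n}$. The sketch below uses the t distribution with $n - 1 = 84$ degrees of freedom; a course that uses z for large samples would get slightly narrower limits. The helper `ci_mean` is a made-up name.

```r
n     <- 85
x_bar <- 17.5
s     <- 5.7

# Hypothetical helper: two-sided t interval for the mean at a given level
ci_mean <- function(level) {
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
  x_bar + c(-1, 1) * t_crit * s / sqrt(n)
}

ci90 <- ci_mean(0.90)
ci95 <- ci_mean(0.95)

round(rbind(`90%` = ci90, `95%` = ci95), 2)
```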
In: Math