Some types of cancer are much more common than others. Although genetic and environmental factors contributing to cancer often make the news, cancer may also arise from random mutations during routine stem cell divisions over the course of a lifetime. Researchers examined the relationship between the total number of stem cell divisions in the lifetime of a given tissue and the lifetime risk of cancer in that tissue. The table provides the data for 31 types of cancers for which this information is known in the U.S. population. Each risk value reflects the proportion of individuals in the population who get this particular cancer over their lifetime.
Cancer type | Lifetime risk | Stem cell divisions | Cancer type | Lifetime risk | Stem cell divisions |
---|---|---|---|---|---|
AM leukemia | 0.0041000 | 129,900,000,000 | Lung (smokers) | 0.0810000 | 9,272,000,000 |
Basal cell | 0.3000000 | 3,550,000,000,000 | Medulloblastoma | 0.0001100 | 272,000,000 |
CL leukemia | 0.0052000 | 129,900,000,000 | Melanoma | 0.0203000 | 763,800,000,000 |
Colorectal | 0.0480000 | 1,168,000,000,000 | Osteosarcoma | 0.0003500 | 29,260,000 |
FAP colorectal | 1.0000000 | 1,168,000,000,000 | Arms osteosarcoma | 0.0000400 | 4,550,000 |
Lynch colorectal | 0.5000000 | 1,168,000,000,000 | Head osteosarcoma | 0.0000302 | 6,020,000 |
Duodenum | 0.0003000 | 7,796,000,000 | Legs osteosarcoma | 0.0002200 | 11,130,000 |
FAP duodenum | 0.0350000 | 7,796,000,000 | Pelvis osteosarcoma | 0.0000300 | 3,150,000 |
Esophageal | 0.0019380 | 1,203,000,000 | Ovarian germ cell | 0.0004110 | 22,000,000 |
Gallbladder | 0.0028000 | 78,400,000 | Pancreatic ductal | 0.0135890 | 342,800,000,000 |
Glioblastoma | 0.0021900 | 270,000,000 | Pancreatic islet | 0.0001940 | 6,068,000,000 |
Head and neck | 0.0138000 | 31,860,000,000 | Small intestine | 0.0007000 | 292,200,000,000 |
HPV head and neck | 0.0793500 | 31,860,000,000 | Testicular | 0.0037000 | 3,348,000,000 |
Hepatocellular | 0.0071000 | 270,900,000,000 | Thyroid follicular | 0.0102600 | 585,000,000 |
HCV hepatocellular | 0.0710000 | 270,900,000,000 | Thyroid medullary | 0.0003240 | 58,500,000 |
Lung (nonsmokers) | 0.0045000 | 9,272,000,000 | | | |
(a) Using the software of your choice, make a scatterplot showing lifetime cancer risk as a function of total number of stem cell divisions.
Choose the best description of the form, direction and strength of the relationship. Would it be appropriate to compute the correlation r between these two variables?
a. The scatterplot is not linear and no apparent correlation is observed. Because of the nonlinear nature of the plot, it is not appropriate to compute the correlation r.
b. The scatterplot is not linear and no apparent correlation is observed. Because of the nonlinear nature of the plot, it is appropriate to compute the correlation r.
c. The scatterplot is linear, so it is appropriate to compute the correlation r.
d. The scatterplot is linear, so it is not appropriate to compute the correlation r.
(b) When a variable spreads over several orders of magnitude, transforming the data using a base‑10 logarithm function helps focus on these differences of magnitude. What are the minimum and maximum for each variable? Now obtain the base‑10 log of each value in the table. What are the minimums and maximums now?
(Enter your answers rounded to four decimal places, except for maximum for lifetime risk base 10, which should be rounded to five decimal places.)
minimum for lifetime risk base 10: __________
maximum for lifetime risk base 10: __________
minimum for stem cell divisions base 10: _________
maximum for stem cell divisions base 10: _________
(c) Using the software of your choice, make a scatterplot showing the log of lifetime cancer risk as a function of the log of total number of stem cell divisions. Describe the form, direction, and strength of this log–log relationship.
How strong is the relationship between the log of cancer risk and the log of stem cell division?
a. The log–log relationship of Lifetime risk and Stem cell divisions shown is negative, linear and strong.
b. The log–log relationship of Lifetime risk and Stem cell divisions shown is negative, nonlinear and weak.
c. The log–log relationship of Lifetime risk and Stem cell divisions shown is positive, linear and strong.
d. The log–log relationship of Lifetime risk and Stem cell divisions shown is positive, linear and weak.
Use software to obtain the value of r for the log‑transformed data. (Enter your answer rounded to four decimal places.)
r = ________
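a) Choose the best description of the form, direction and strength of the relationship. Would it be appropriate to compute the correlation r between these two variables?
The scatterplot is not linear and no apparent correlation is observed: most points crowd near the origin because both variables span several orders of magnitude. Because of the nonlinear nature of the plot, it is not appropriate to compute the correlation r.
Option - A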
b) When a variable spreads over several orders of magnitude, transforming the data using a base‑10 logarithm function helps focus on these differences of magnitude. What are the minimum and maximum for each variable? Now obtain the base‑10 log of each value in the table. What are the minimums and maximums now?
Before the transformation, lifetime risk ranges from 0.00003 (pelvis osteosarcoma) to 1.0 (FAP colorectal), and stem cell divisions range from 3,150,000 (pelvis osteosarcoma) to 3,550,000,000,000 (basal cell). After taking the base‑10 log of each value:
minimum for lifetime risk base 10: -4.5229
maximum for lifetime risk base 10: 0.00000
minimum for stem cell divisions base 10: 6.4983
maximum for stem cell divisions base 10: 12.5502
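The minimums and maximums asked for in part (b) can be checked with a short script. The following is a minimal sketch using only the Python standard library; the name `DATA` and the variable names are illustrative choices, and the values are copied by hand from the table above.

```python
import math

# (cancer type, lifetime risk, lifetime stem cell divisions), copied from the table
DATA = [
    ("AM leukemia", 0.0041, 129_900_000_000),
    ("Basal cell", 0.3, 3_550_000_000_000),
    ("CL leukemia", 0.0052, 129_900_000_000),
    ("Colorectal", 0.048, 1_168_000_000_000),
    ("FAP colorectal", 1.0, 1_168_000_000_000),
    ("Lynch colorectal", 0.5, 1_168_000_000_000),
    ("Duodenum", 0.0003, 7_796_000_000),
    ("FAP duodenum", 0.035, 7_796_000_000),
    ("Esophageal", 0.001938, 1_203_000_000),
    ("Gallbladder", 0.0028, 78_400_000),
    ("Glioblastoma", 0.00219, 270_000_000),
    ("Head and neck", 0.0138, 31_860_000_000),
    ("HPV head and neck", 0.07935, 31_860_000_000),
    ("Hepatocellular", 0.0071, 270_900_000_000),
    ("HCV hepatocellular", 0.071, 270_900_000_000),
    ("Lung (nonsmokers)", 0.0045, 9_272_000_000),
    ("Lung (smokers)", 0.081, 9_272_000_000),
    ("Medulloblastoma", 0.00011, 272_000_000),
    ("Melanoma", 0.0203, 763_800_000_000),
    ("Osteosarcoma", 0.00035, 29_260_000),
    ("Arms osteosarcoma", 0.00004, 4_550_000),
    ("Head osteosarcoma", 0.0000302, 6_020_000),
    ("Legs osteosarcoma", 0.00022, 11_130_000),
    ("Pelvis osteosarcoma", 0.00003, 3_150_000),
    ("Ovarian germ cell", 0.000411, 22_000_000),
    ("Pancreatic ductal", 0.013589, 342_800_000_000),
    ("Pancreatic islet", 0.000194, 6_068_000_000),
    ("Small intestine", 0.0007, 292_200_000_000),
    ("Testicular", 0.0037, 3_348_000_000),
    ("Thyroid follicular", 0.01026, 585_000_000),
    ("Thyroid medullary", 0.000324, 58_500_000),
]

risks = [risk for _, risk, _ in DATA]
divisions = [div for _, _, div in DATA]

# Base-10 log of every value in each column
log_risks = [math.log10(v) for v in risks]
log_divisions = [math.log10(v) for v in divisions]

print(f"raw risk:       min={min(risks)}, max={max(risks)}")
print(f"raw divisions:  min={min(divisions):,}, max={max(divisions):,}")
print(f"log10 risk:      min={min(log_risks):.4f}, max={max(log_risks):.5f}")
print(f"log10 divisions: min={min(log_divisions):.4f}, max={max(log_divisions):.4f}")
```

Because log10 is an increasing function, the cancer types holding the minimum and maximum are the same before and after the transformation; only the scale changes.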
c) How strong is the relationship between the log of cancer risk and the log of stem cell division?
The log–log relationship of Lifetime risk and Stem cell divisions shown is positive, linear and strong.
Option - C
Use software to obtain the value of r for the log‑transformed data.
r ≈ 0.804, consistent with a strong, positive linear relationship
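The correlation for the log‑transformed data can be recomputed from the table without any statistics package. The sketch below is one way to do it, assuming the table values as transcribed here; `pearson_r` and `PAIRS` are hypothetical names introduced for illustration, and the function applies the usual Pearson formula (sum of cross-deviations divided by the square root of the product of the sums of squared deviations).

```python
import math

# (lifetime risk, lifetime stem cell divisions) for the 31 cancer types in the table
PAIRS = [
    (0.0041, 129.9e9), (0.3, 3550e9), (0.0052, 129.9e9), (0.048, 1168e9),
    (1.0, 1168e9), (0.5, 1168e9), (0.0003, 7.796e9), (0.035, 7.796e9),
    (0.001938, 1.203e9), (0.0028, 78.4e6), (0.00219, 270e6), (0.0138, 31.86e9),
    (0.07935, 31.86e9), (0.0071, 270.9e9), (0.071, 270.9e9), (0.0045, 9.272e9),
    (0.081, 9.272e9), (0.00011, 272e6), (0.0203, 763.8e9), (0.00035, 29.26e6),
    (0.00004, 4.55e6), (0.0000302, 6.02e6), (0.00022, 11.13e6), (0.00003, 3.15e6),
    (0.000411, 22e6), (0.013589, 342.8e9), (0.000194, 6.068e9), (0.0007, 292.2e9),
    (0.0037, 3.348e9), (0.01026, 585e6), (0.000324, 58.5e6),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# x = log10(stem cell divisions), y = log10(lifetime risk)
log_div = [math.log10(div) for _, div in PAIRS]
log_risk = [math.log10(risk) for risk, _ in PAIRS]

r_log = pearson_r(log_div, log_risk)
print(f"r for the log-transformed data: {r_log:.4f}")
```

A value close to +1 supports the "positive, linear and strong" description chosen in part (c); a value near 0 would contradict it.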