In: Statistics and Probability
You are working on a research team studying body size of Erissinettes, a newly discovered alien race on the exoplanet Eris. The weirdly adorable Erissinettes have antennae that help the organisms interact with their environment and agile tails used to trap unsuspecting humans for drippy cuddles. Due to their novelty and pleasing resemblance to earth canines, interest in Erissinette biology is high and research dollars are flowing. Your job is to catalog the complex traits of antennae length and tail length and see if there is a correlation. The hope is to selectively breed smaller Erissinettes for transport and sale on Earth. This plan ignores the research suggesting Erissinette brain biology is much more similar to Earth felines; making them extremely intelligent organisms with egomania and manipulative tendencies.
Alien ID |
Alien antennae length (cm) |
Alien tail length (cm) |
A1 | 30 | 55 |
A2 | 23 | 45 |
A3 | 32 | 30 |
A4 | 13 | 33 |
A5 | 26 | 45 |
A6 | 36 | 57 |
A7 | 46 | 62 |
A8 | 42 | 44 |
A9 | 36 | 32 |
A10 | 53 | 49 |
A11 | 75 | 62 |
A12 | 16 | 29 |
A13 | 60 | 43 |
A14 | 22 | 45 |
A15 | 41 | 53 |
A16 | 15 | 37 |
A17 | 25 | 44 |
A18 | 26 | 31 |
A19 | 7 | 6 |
A20 | 5 | 23 |
What are the average, median, mode, variance, and standard deviation of antennae length and tail length? Graph antennae length by tail length in a scatterplot. Does there appear to be a correlation between the two traits? Is the correlation positive or negative? Can you quantify the correlation?
Figure how to visually represent the mathematical value of the correlation on your chart. (See lung function example in lecture for an idea of what you could produce). Write step-by-step directions on how to do this in excel. There may be more than one method and type of correlation that would be an acceptable answer to this question. Fully explain the mathematical basis of what you did and how you did it.
Descriptive statistics
In Excel we put the data first in two columns and then using different formula we get the required measures.
We use AVERAGE, MEDIAN, MODE, VAR, STDEV functions respectively to get the following output:
Next we go to Insert option and select the Scatter plot option to get the following output:
From the plot we can see that with increase in antennae length, there is increase in tail length too. So there exists a positive correlation.
To quantify it we use CORREL function in Excel.
We get the Correlation coefficient as 0.694.
Including a trend line in the plot we can visualise the correlation.
Clearly the slope of the trend line indicates that there is a positive correlation between the two variables.