Explain to me why and in what circumstances it would be necessary to use a proportion instead of a mean, and vice versa.
A sample mean is the average value of a sample, while the sample proportion is the fraction of the sample that shares a common characteristic. Both can be used to estimate the corresponding quantity in the population they are sampled from; they just tell us slightly different information.
The mean summarizes a quantity measured on each person, for example how strongly each one agrees with a topic on a numeric scale, while the proportion takes a value from 0 to 1 and says exactly what fraction of the sample agrees, relative to the whole sample of agreers and disagreers.
Suppose you’re a salesperson working at a car dealership. I’m your manager and I want to develop metrics for measuring your performance. Two metrics I may be interested in are the average price of the vehicles you sell and the share of the people you interact with to whom you are able to sell a car. In both cases, a common variable is the number of vehicles you sold, but the numbers I’m measuring are quite different.
Let’s say last month you sold 12 cars for an average price of $32,000 per car. In that month you interacted with 92 customers, meaning your conversion or success rate was about 13.0%. These are naïve metrics for a real organization, but they convey very different pieces of information. In both cases they are a first step for me to compare your performance from one month to the next, against yourself and against your peers.
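To make the arithmetic concrete, here is a small Python sketch using only the figures quoted above (12 cars, 92 customers, $32,000 average price). The variable names are my own, and the total revenue is simply what the stated average implies, not an extra data point.

```python
# Figures taken from the dealership example above.
cars_sold = 12
customers = 92
average_price = 32_000  # dollars per car

# Proportion: sales (successes) divided by customer interactions (trials).
conversion_rate = cars_sold / customers

# The average price times the number of cars gives the implied total revenue.
total_revenue = cars_sold * average_price

print(f"Conversion rate: {conversion_rate:.1%}")
print(f"Implied total revenue: ${total_revenue:,}")
```

The first number is a proportion (it must lie between 0 and 1), while the average price is a mean of dollar amounts and has no such bound.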
(In fact, after recording the data for two months, I could average the number of cars you sold each month.)
Formally, if you repeatedly measure a continuous variable $X$ and you want to estimate the mean effect of that variable, you’d compute its average as
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
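As a rough sketch of the formula above in Python: the function name `sample_mean` is my own choice, and the four prices are made-up illustrative values, not data from the dealership example.

```python
def sample_mean(values):
    """Return (1/n) times the sum of the n observations."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one observation")
    return sum(values) / n

# Hypothetical sale prices in dollars, invented purely for illustration.
prices = [28_000, 30_000, 34_000, 36_000]
print(sample_mean(prices))
```

Python's standard library offers `statistics.mean` for the same job; the hand-written version is only there to mirror the formula term by term.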
Suppose instead you’re repeatedly measuring a dichotomous variable, like coin flips or approval of a presidential candidate (or vehicles sold to prospective customers). Then you’d use the sample proportion to estimate the proportion of the event’s occurrence as
$$p = \frac{m \text{ successes}}{n \text{ trials}}, \quad 0 \leqslant m \leqslant n, \quad n > 0.$$
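The proportion formula can be sketched the same way. The function name `sample_proportion` and the 0/1 coding of outcomes are my own illustrative choices; the counts (12 sales out of 92 customers) come from the dealership example. The sketch also shows that if each trial is coded as 1 for a success and 0 for a failure, the sample proportion is just the sample mean of those codes.

```python
def sample_proportion(successes, trials):
    """Return m / n, enforcing n > 0 and 0 <= m <= n."""
    if trials <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("m must satisfy 0 <= m <= n")
    return successes / trials

# Code each customer as 1 (bought a car) or 0 (did not): 12 sales, 80 non-sales.
outcomes = [1] * 12 + [0] * 80

p_from_counts = sample_proportion(sum(outcomes), len(outcomes))
p_as_mean = sum(outcomes) / len(outcomes)  # mean of the 0/1 codes

print(f"{p_from_counts:.3f}", p_from_counts == p_as_mean)
```

So the two statistics are computed the same way; what differs is the kind of variable being averaged, which is why a proportion is always confined to the interval from 0 to 1.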
Reiterating, a sample mean is used to get a sense of the average value in a range of possible values that are generally continuous or quantitative. Proportions make more sense when dealing with a dichotomous or logical variable.