This week we are learning about probability distributions. If you were to explain what these are to a friend, how would you explain them? What is important to know about them?
Probability Distribution Prerequisites
To understand probability distributions, it is important to understand variables, random variables, and some notation.
Generally, statisticians use a capital letter to represent a random variable and a lower-case letter to represent one of its values. For example, P(X = x) refers to the probability that the random variable X takes the particular value x.
Probability Distributions
An example will make clear the relationship between random variables and probability distributions. Suppose you flip a coin two times. This simple statistical experiment can have four possible outcomes: HH, HT, TH, and TT. Now, let the variable X represent the number of heads that result from this experiment. The variable X can take on the values 0, 1, or 2. In this example, X is a random variable because its value is determined by the outcome of a statistical experiment.
A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence. Consider the coin flip experiment described above. The table below, which associates each outcome with its probability, is an example of a probability distribution.
Number of heads | Probability |
---|---|
0 | 0.25 |
1 | 0.50 |
2 | 0.25 |
The above table represents the probability distribution of the random variable X.
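One way to check the table is to enumerate the sample space and count. The sketch below assumes a fair coin, so all four outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two flips of a fair coin: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# The random variable X maps each outcome to its number of heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Outcomes are equally likely (fair-coin assumption), so
# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {
    x: Fraction(count, len(outcomes))
    for x, count in sorted(heads_counts.items())
}

for x, p in distribution.items():
    print(f"P(X = {x}) = {float(p):.2f}")
# P(X = 0) = 0.25
# P(X = 1) = 0.50
# P(X = 2) = 0.25
```

The probabilities sum to 1, as they must for any probability distribution.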
A probability distribution is defined in terms of an underlying sample space, which is the set of all possible outcomes of the random phenomenon being observed.
Probability distributions are generally divided into two classes. A discrete probability distribution applies when the set of possible outcomes is discrete, such as a coin toss or a roll of a die; it can be encoded by a list of the probabilities of the outcomes, known as a probability mass function. A continuous probability distribution applies when the outcomes can take on values in a continuous range (e.g. the real numbers), such as the temperature on a given day; it is typically described by a probability density function, and the probability of any individual outcome is 0. The normal distribution is a commonly encountered continuous probability distribution.
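To make the contrast concrete, here is a rough sketch using only the Python standard library. The fair six-sided die and the temperature model (normal with mean 20 and standard deviation 5) are assumptions chosen purely for illustration, not values from the discussion above.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case: probability mass function of a fair six-sided die (assumed fair).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Probabilities of discrete events are sums of individual masses.
p_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)

# Continuous case: a hypothetical temperature model, normal with
# mean 20 and standard deviation 5 (illustrative numbers only).
temperature = NormalDist(mu=20.0, sigma=5.0)

# Probability of an interval is the area under the density,
# computed here as a difference of cumulative distribution values.
p_between_15_and_25 = temperature.cdf(25.0) - temperature.cdf(15.0)

# The density at a point is a height, not a probability.
density_at_20 = temperature.pdf(20.0)

# A single point is an interval of zero width, so its probability is 0.
p_exactly_20 = temperature.cdf(20.0) - temperature.cdf(20.0)

print(f"P(die is even)     = {p_even}")
print(f"P(15 <= T <= 25)   = {p_between_15_and_25:.4f}")
print(f"density at T = 20  = {density_at_20:.4f}")
print(f"P(T = 20)          = {p_exactly_20}")
```

Under these assumptions the interval from 15 to 25 spans one standard deviation on either side of the mean, so its probability comes out at about 0.68, while the probability of the temperature being exactly 20 is 0 even though the density there is positive.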