Frequency Plots:
a) How do you draw a frequency plot given a data set?
b) What information about a data set do you get from a frequency
plot of that set?
c) What is the difference between theoretical and experimental
frequency plots?
i. When is it a safe bet to say that experimental frequency plots
will begin to look like theoretical frequency plots?
d) How does the frequency plot relate to the cumulative
distribution plot?
i. How do you draw a frequency plot using only a cumulative
distribution plot?
a). A frequency plot -
Let us first understand what exactly a frequency plot is.
A frequency plot is a graphical representation method for summarizing the distributional information of a given variable.
To make a frequency plot, we follow these steps.
1. Arrange the data array into ascending order.
2. Distribute the data points into class intervals (predefined ranges of data) that together cover the entire distribution. To do so, look at the lowest and the largest values in the data set. Define the lower limit of the first class interval so that it is equal to or just less than the lowest data point. Then choose a class width such that the entire distribution is spread across about 6-7 intervals.
Upper class limit = lower class limit + class width
Now that we know the class limits (upper and lower), and hence the class intervals, we can make a table of all the class intervals.
3. Count the number of values from the data set that lie in each class interval. This count is called the frequency of that class interval. We now have a frequency distribution table.
4. Plot this distribution as a bar graph without gaps between the bars (known as a histogram), and connect the midpoints of the tops of all the bars to make a continuous line.
This is called a frequency plot.
The frequency plot and the histogram have the same information except the frequency plot has lines connecting the frequency values whereas the histogram has bars at the frequency values.
In summary, the data set is divided into equal-sized intervals (or bins), and the number of data points falling in each bin is counted. The frequency plot then consists of:
Vertical axis = frequencies or relative frequencies
Horizontal axis = the data values (i.e., the midpoint of each interval)
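The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` and its parameters `lower`, `width` and `n_classes` are made up for this example, and the classes are treated as half-open ranges [lo, lo + width), which for whole-number data is the same as the classes 20-35, 36-51, ... used in part d). The data set is the one from part d).

```python
# Sample data from part d) of this answer.
data = [25, 22, 36, 38, 36, 38, 46, 45, 48, 46,
        55, 55, 52, 58, 55, 68, 67, 61, 72, 91]

def frequency_table(data, lower, width, n_classes):
    """Return (lower limit, upper limit, midpoint, frequency) for each class.

    Assumption: each class is the half-open range [lo, lo + width).
    """
    values = sorted(data)                      # step 1: ascending order
    table = []
    for k in range(n_classes):                 # step 2: build the class intervals
        lo = lower + k * width
        hi = lo + width
        count = sum(1 for v in values if lo <= v < hi)   # step 3: count values
        table.append((lo, hi, (lo + hi) / 2, count))
    return table

table = frequency_table(data, lower=20, width=16, n_classes=5)

# Step 4: the frequency plot joins the points (midpoint, frequency).
points = [(mid, freq) for _, _, mid, freq in table]
for lo, hi, mid, freq in table:
    print(f"{lo}-{hi - 1}: midpoint {mid}, frequency {freq}")
```

The list `points` holds the coordinates you would join with straight lines to draw the frequency plot; drawing bars of the same heights over each class gives the histogram.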
There can be 4 types of frequency plots:
b). A frequency plot gives several kinds of information about the data set, such as:
1. The shape of the plot suggests whether the distribution of the data set is approximately normal or not.
2. It gives a rough idea of how far the distribution departs from normal, e.g. its skewness (asymmetry about the mean), and of its spread (standard deviation).
3. It helps to spot outliers in the data set.
4. It gives information about the percentile distribution of the data set.
5. It shows the number of occurrences (frequency) of values within a given interval (or period of time).
6. It is also an estimate of the probability distribution of a continuous variable.
7. We can read off the number of data points, and hence the proportion of the data set, that falls in a given range.
8. It shows how densely the data are distributed across the range of values.
c). Difference between theoretical and experimental frequency plots
An experimental frequency plot is a frequency plot in which the data are obtained from an actual experiment, whereas a theoretical frequency plot is one in which the data are the theoretical output of an assumed behaviour (a mathematical model). The theoretical value is what we expect to happen, but it isn't always what actually happens.
For example, the theoretical relative frequency of rolling a 6 with an unbiased die is 1/6, while the experimental relative frequency can be any observed value from 0 to 1 (for instance, anywhere from 0/6 to 6/6 in six rolls).
In general, the experimental frequency of an event tends to get closer to the theoretical probability of the event as we perform more trials. That is, the experimental frequency converges towards the expected (theoretical) value as we increase the number of data points.
As a common rule of thumb, with about 30 or more data points the two plots usually begin to resemble each other, and the agreement keeps improving as more data are added.
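To see this convergence for the die example, here is a small simulation sketch. The function name `experimental_frequency` and the fixed `seed` are assumptions made for illustration only; the simulated die uses Python's pseudo-random generator, so the exact numbers printed depend on the seed.

```python
import random

def experimental_frequency(n_rolls, face=6, seed=0):
    """Roll a simulated fair die n_rolls times; return the relative frequency of `face`."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return hits / n_rolls

theoretical = 1 / 6                    # theoretical relative frequency of a 6

for n in (6, 30, 1000, 100000):
    observed = experimental_frequency(n)
    print(f"{n:>6} rolls: experimental {observed:.4f}, theoretical {theoretical:.4f}")
```

With only a handful of rolls the experimental value can be far from 1/6; as the number of rolls grows, it should settle close to the theoretical value.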
d). Before looking at how a frequency plot relates to a cumulative frequency plot, we first need to understand what exactly a cumulative frequency plot is.
A cumulative frequency plot is a graphical representation of the cumulative frequency. It shows the number, percentage, or proportion of observations that are less than or equal to particular values.
But what is cumulative frequency?
Cumulative frequency is the running total of the frequencies: the sum of all the frequencies up to and including the current point. It is most easily understood through a cumulative frequency table.
Now, to explain the difference between a frequency plot and a cumulative frequency plot, let us take an example.
Consider the following data set:
25, 22, 36, 38, 36, 38, 46, 45, 48, 46, 55, 55, 52, 58, 55, 68, 67, 61, 72, 91
Now refer to part a) of our solution, where we explained the steps to create the frequency table.
The frequency table and its histogram look like this:
Joining the midpoints of the bars of the histogram, we get the frequency plot.
From the frequency table we can now plot the cumulative frequency plot.
Class | Frequency | Cumulative frequency | Calculation
20-35 | 2 | 2 | = 0 + 2
36-51 | 8 | 10 | = 2 + 8
52-67 | 7 | 17 | = 10 + 7
68-83 | 2 | 19 | = 17 + 2
84-99 | 1 | 20 | = 19 + 1
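The running total in the table, and the reverse step asked about in d) i., can be sketched as follows. Because each cumulative value is the previous cumulative value plus the class frequency, subtracting successive cumulative values gives back the frequencies. This is a minimal sketch; the variable names are chosen only for illustration.

```python
from itertools import accumulate

# Class frequencies from the table above.
frequencies = [2, 8, 7, 2, 1]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Going back: the first frequency equals the first cumulative value,
# and every later frequency is the difference of neighbouring cumulative values.
recovered = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

print(cumulative)   # running totals, one per class
print(recovered)    # the original frequencies again
```

So, given only a cumulative plot, you can read the cumulative value at the end of each class, take the differences between neighbouring values, and plot those differences against the class midpoints to obtain the frequency plot.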
and the cumulative plot will be