You have just started at a company and they present you with historical data for the time in minutes that it takes to prepare a solar panel. The company has 3 locations and they give you the following sample data:
Minutes to Prepare a Solar Panel

| Site 1 | Site 2 | Site 3 |
|--------|--------|--------|
| 119.5 | 120.3 | 128.1 |
| 118.0 | 100.0 | 126.6 |
| 114.4 | 107.2 | 123.5 |
| 118.6 | 118.4 | 125.5 |
| 126.9 | 111.6 | 127.3 |
| 122.0 | 113.1 | 118.3 |
| 116.1 | 102.9 | 125.6 |
| 136.5 | 103.2 | 128.2 |
| 119.5 | 100.3 | 126.1 |
| 124.7 | 99.3 | 120.6 |
| 121.7 | 89.3 | 123.3 |
| 110.2 | 100.9 | 137.4 |
| 119.0 | 116.7 | 122.7 |
| 123.7 | 112.8 | 127.0 |
| 120.5 | 101.8 | 123.3 |
| 122.5 | 106.8 | 122.9 |
| 124.9 | 107.7 | 114.3 |
| 121.9 | 111.8 | 121.3 |
| 120.7 | 103.5 | 127.1 |
| 122.4 | 111.8 | 127.6 |
| 118.6 | 103.6 | 127.7 |
| 120.5 | 104.8 | 130.6 |
| 111.6 | 126.0 | 119.3 |
| 120.6 | 114.5 | 122.2 |
| 116.2 | 106.3 | 123.8 |
| 123.5 | 105.4 | 121.7 |
| 126.4 | 108.6 | 113.6 |
| 118.5 | 114.7 | 123.5 |
| 116.0 | 130.4 | 131.9 |
| 122.1 | 104.5 | 123.6 |
| 123.5 | 108.3 | 130.4 |
| 115.6 | 105.6 | 127.0 |
| 124.8 | 114.2 | 123.1 |
| 119.3 | 98.3 | 115.9 |
| 113.1 | 100.0 | 121.9 |
| 116.8 | 117.9 | 123.9 |
| 123.0 | 116.4 | 121.7 |
| 115.8 | 113.2 | 131.6 |
| 119.3 | 118.0 | 124.8 |
| 119.2 | 108.1 | 123.9 |
They are interested in characterizing the data at these three locations. The QC manager stated that their goal is 120 min or less per panel, because then they could (ideally) make 4 panels per 8-hour day (480 min / 120 min = 4).
Step 1: Put the data in Excel and find the mean and standard deviation of each dataset with worksheet formulas (for example, AVERAGE and STDEV.S, since this is sample data).
Also find the max and min values for each dataset (MAX and MIN).
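For readers who want to cross-check the spreadsheet, the same Step 1 summary can be sketched in Python. This is only an illustration, not the method used in the answer: the list names and the `summarize` helper are made up for this example, and the sample standard deviation (n - 1 denominator) is assumed because the data are described as a sample.

```python
import statistics

# Sample data transcribed from the table (minutes to prepare one panel).
site_1 = [119.5, 118.0, 114.4, 118.6, 126.9, 122.0, 116.1, 136.5, 119.5, 124.7,
          121.7, 110.2, 119.0, 123.7, 120.5, 122.5, 124.9, 121.9, 120.7, 122.4,
          118.6, 120.5, 111.6, 120.6, 116.2, 123.5, 126.4, 118.5, 116.0, 122.1,
          123.5, 115.6, 124.8, 119.3, 113.1, 116.8, 123.0, 115.8, 119.3, 119.2]
site_2 = [120.3, 100.0, 107.2, 118.4, 111.6, 113.1, 102.9, 103.2, 100.3, 99.3,
          89.3, 100.9, 116.7, 112.8, 101.8, 106.8, 107.7, 111.8, 103.5, 111.8,
          103.6, 104.8, 126.0, 114.5, 106.3, 105.4, 108.6, 114.7, 130.4, 104.5,
          108.3, 105.6, 114.2, 98.3, 100.0, 117.9, 116.4, 113.2, 118.0, 108.1]
site_3 = [128.1, 126.6, 123.5, 125.5, 127.3, 118.3, 125.6, 128.2, 126.1, 120.6,
          123.3, 137.4, 122.7, 127.0, 123.3, 122.9, 114.3, 121.3, 127.1, 127.6,
          127.7, 130.6, 119.3, 122.2, 123.8, 121.7, 113.6, 123.5, 131.9, 123.6,
          130.4, 127.0, 123.1, 115.9, 121.9, 123.9, 121.7, 131.6, 124.8, 123.9]


def summarize(values):
    """Return mean, sample standard deviation, min and max of one dataset."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
        "min": min(values),
        "max": max(values),
    }


summaries = {
    "Site 1": summarize(site_1),
    "Site 2": summarize(site_2),
    "Site 3": summarize(site_3),
}

for name, s in summaries.items():
    print(f"{name}: mean={s['mean']:.2f}, stdev={s['stdev']:.2f}, "
          f"min={s['min']}, max={s['max']}")
```

The means printed here can be compared directly with the 120-minute goal stated by the QC manager.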
Step 2: Create a histogram for each dataset - Site 1
The minimum of the Site 1 dataset is about 110 (110.2) and the maximum is about 136 (136.5).
Using these values, we create the bins for plotting the histogram in Excel.
We start the bins at 105 and increase by 3 at each step until the maximum is covered.
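The binning step can be sketched in Python as well. The sketch below assumes the bin limits run from 105 to 138 in steps of 3 (the last limit is an assumption, chosen so that the maximum of 136.5 is covered) and that each value is counted in the first bin whose limit is greater than or equal to it, which is how Excel's Histogram tool treats its bin values.

```python
from bisect import bisect_left

# Site 1 sample data transcribed from the table (minutes per panel).
site_1 = [119.5, 118.0, 114.4, 118.6, 126.9, 122.0, 116.1, 136.5, 119.5, 124.7,
          121.7, 110.2, 119.0, 123.7, 120.5, 122.5, 124.9, 121.9, 120.7, 122.4,
          118.6, 120.5, 111.6, 120.6, 116.2, 123.5, 126.4, 118.5, 116.0, 122.1,
          123.5, 115.6, 124.8, 119.3, 113.1, 116.8, 123.0, 115.8, 119.3, 119.2]

# Bin upper limits: start at 105 and step by 3 until the maximum is covered.
bins = list(range(105, 139, 3))  # 105, 108, ..., 138

# Count each value in the first bin whose upper limit is >= the value.
# This works here because the last limit (138) exceeds every value.
counts = [0] * len(bins)
for value in site_1:
    counts[bisect_left(bins, value)] += 1

# Text histogram: one '#' per observation in the bin.
for upper, count in zip(bins, counts):
    print(f"<= {upper:3d}: {'#' * count} ({count})")
```

If the data contained values above the last limit, an extra "More" bin would be needed, as in Excel's output.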
Next, go to Data -> Data Analysis -> Histogram.
Enter the Site 1 data as the Input Range and the bins as the Bin Range, and tick Chart Output to get the chart.
The output will be as follows.
The histogram shows one outlier in the data, which is the maximum value (136.5).
Apart from that point, the data look approximately normally distributed, as the bell-shaped histogram suggests.
The boxplot of the Site 1 data also flags this outlier.
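A boxplot usually marks a point as an outlier when it lies more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule to Site 1. It assumes the "exclusive" quartile convention, which is the default of Python's `statistics.quantiles`; a different quartile convention shifts the fences slightly and could change the verdict for borderline points.

```python
import statistics

# Site 1 sample data transcribed from the table (minutes per panel).
site_1 = [119.5, 118.0, 114.4, 118.6, 126.9, 122.0, 116.1, 136.5, 119.5, 124.7,
          121.7, 110.2, 119.0, 123.7, 120.5, 122.5, 124.9, 121.9, 120.7, 122.4,
          118.6, 120.5, 111.6, 120.6, 116.2, 123.5, 126.4, 118.5, 116.0, 122.1,
          123.5, 115.6, 124.8, 119.3, 113.1, 116.8, 123.0, 115.8, 119.3, 119.2]

# Quartiles with the default "exclusive" method (an assumption, see above).
q1, median, q3 = statistics.quantiles(site_1, n=4)
iqr = q3 - q1

# Standard boxplot fences: 1.5 * IQR beyond the first and third quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in site_1 if v < lower_fence or v > upper_fence]

print(f"Q1={q1:.3f}, median={median:.3f}, Q3={q3:.3f}, IQR={iqr:.3f}")
print(f"fences: [{lower_fence:.3f}, {upper_fence:.3f}]")
print("outliers:", outliers)
```

Under these assumptions the only flagged point is the maximum, 136.5, in line with the reading of the boxplot above.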
Histogram for Site 2
Following similar steps, we obtain the histogram.
From the histogram, we see that the data are not perfectly normally distributed, but they can be treated as approximately normal.
The box plot shows no outliers in the data; it shows a fairly symmetric distribution with a wide range (89.3 to 130.4).
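The "fairly symmetric, wide range" reading for Site 2 can be given a rough numerical check by comparing the mean with the median and computing a sample skewness. This is only a sketch: the skewness formula used is the adjusted sample form (the same form as Excel's SKEW function), and values near zero are taken as an informal sign of symmetry, not as a formal normality test.

```python
import statistics

# Site 2 sample data transcribed from the table (minutes per panel).
site_2 = [120.3, 100.0, 107.2, 118.4, 111.6, 113.1, 102.9, 103.2, 100.3, 99.3,
          89.3, 100.9, 116.7, 112.8, 101.8, 106.8, 107.7, 111.8, 103.5, 111.8,
          103.6, 104.8, 126.0, 114.5, 106.3, 105.4, 108.6, 114.7, 130.4, 104.5,
          108.3, 105.6, 114.2, 98.3, 100.0, 117.9, 116.4, 113.2, 118.0, 108.1]

n = len(site_2)
mean = statistics.mean(site_2)
median = statistics.median(site_2)
stdev = statistics.stdev(site_2)  # sample (n - 1) standard deviation

# Adjusted sample skewness: close to 0 for a symmetric sample,
# positive when the right tail is longer.
skewness = n / ((n - 1) * (n - 2)) * sum(((v - mean) / stdev) ** 3 for v in site_2)

# Spread between the smallest and largest observation.
data_range = max(site_2) - min(site_2)

print(f"mean={mean:.2f}, median={median:.2f}")
print(f"skewness={skewness:.2f}")
print(f"range={data_range:.1f} min")
```

A mean close to the median together with a skewness of small magnitude supports treating the data as roughly symmetric.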
Histogram for Site 3
Following similar steps, we obtain the histogram.
From the histogram, the data appear approximately normally distributed, but the box plot shows outliers at both extremes of the data.