A wholesale distributor operating in different regions of Portugal has information on the annual spending on several items in its stores across different regions and channels. The data (Wholesale Customer.csv) consists of 440 large retailers' annual spending on 6 different varieties of products in 3 different regions (Lisbon, Oporto, Other) and across 2 sales channels (Hotel/Restaurant/Café, abbreviated HoReCa, and Retail).
1.1. Use methods of descriptive statistics to summarize the data.
Which Region and which Channel seems to spend more?
Which Region and which Channel seems to spend less?
1.2. Six different varieties of items are considered.
Do all varieties show similar behaviour across Region and Channel?
1.3. On the basis of the descriptive measure of variability, which item shows the most inconsistent behaviour?
Which item shows the least inconsistent behaviour?
1.4. Are there any outliers in the data?
1.5. On the basis of this report, what are the recommendations?
I am unable to attach the file or paste the data. Please also send the Python commands for this answer.
1.1
In Channel 1, average spending is highest on Fresh items and lowest on Detergents_Paper.
In Channel 2, average spending is highest on Grocery items and lowest on Frozen items.
In Region 1, average spending is highest on Fresh and lowest on Delicassen items.
In Region 2, average spending is highest on Fresh and lowest on Delicassen items.
In Region 3, average spending is highest on Fresh and lowest on Delicassen items.
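Since Python commands were requested, here is a minimal pandas sketch of how these group averages could be computed. The rows below are made up for illustration only, and the column names (Channel, Region, Fresh, Grocery, Frozen, Detergents_Paper, Delicassen) are assumptions about how the CSV is laid out. With the real file, replace the hand-built frame with pd.read_csv("Wholesale Customer.csv"). The sketch also totals spending per Channel and Region, which is one way to answer "which spends more".

```python
import pandas as pd

# Made-up rows for illustration only. With the real file, use instead:
#   df = pd.read_csv("Wholesale Customer.csv")
# Column names are assumed, not confirmed by the question.
df = pd.DataFrame({
    "Channel": [1, 1, 2, 2, 1, 2],
    "Region": [1, 2, 3, 3, 3, 1],
    "Fresh": [12000, 7000, 6000, 9000, 15000, 3000],
    "Grocery": [4000, 3000, 16000, 14000, 5000, 12000],
    "Frozen": [3000, 2500, 1500, 1200, 4000, 900],
    "Detergents_Paper": [800, 600, 7000, 6500, 900, 5000],
    "Delicassen": [1000, 700, 1600, 1400, 1200, 800],
})
items = [c for c in df.columns if c not in ("Channel", "Region")]

# Total spending per retailer, then summed within each group.
df["Total"] = df[items].sum(axis=1)
total_by_channel = df.groupby("Channel")["Total"].sum()
total_by_region = df.groupby("Region")["Total"].sum()

# Mean spending per item within each Channel; idxmax/idxmin along the
# columns name the item with the highest and lowest average.
mean_by_channel = df.groupby("Channel")[items].mean()
top_item = mean_by_channel.idxmax(axis=1)
bottom_item = mean_by_channel.idxmin(axis=1)

print(total_by_channel)
print(total_by_region)
print(pd.DataFrame({"highest": top_item, "lowest": bottom_item}))
```

The same two lines with "Region" in place of "Channel" give the per-Region item averages.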
1.2
To see how each item behaves across Channel and Region, use a bar plot. The plot shows that the items do not behave alike: spending differs by Channel and by Region.
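One possible way to draw such a bar plot is sketched below with pandas and matplotlib. The data are again made-up rows with assumed column names, so the resulting chart only illustrates the mechanics and says nothing about the real file. The "Agg" backend is selected so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "Channel": [1, 1, 2, 2, 1, 2],
    "Region": [1, 2, 3, 3, 3, 1],
    "Fresh": [12000, 7000, 6000, 9000, 15000, 3000],
    "Grocery": [4000, 3000, 16000, 14000, 5000, 12000],
    "Frozen": [3000, 2500, 1500, 1200, 4000, 900],
    "Detergents_Paper": [800, 600, 7000, 6500, 900, 5000],
    "Delicassen": [1000, 700, 1600, 1400, 1200, 800],
})
items = [c for c in df.columns if c not in ("Channel", "Region")]

# One row per (Region, Channel) pair, one column per item.
mean_by_group = df.groupby(["Region", "Channel"])[items].mean()

# Grouped bars: each (Region, Channel) pair gets one bar per item.
ax = mean_by_group.plot(kind="bar", figsize=(8, 4))
ax.set_xlabel("(Region, Channel)")
ax.set_ylabel("Mean annual spending")
ax.set_title("Mean spending per item by Region and Channel")
plt.tight_layout()
# To keep the chart, call plt.savefig(...) with a file name of your choice.
```

If all items behaved alike, the bars would keep the same relative heights in every group; bars that swap order between groups indicate different behaviour.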
1.3
Fresh has the highest standard deviation, so it is the most inconsistent item.
Delicassen has the smallest standard deviation, so it is the most consistent item.
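A short sketch of how the standard deviations could be computed and ranked is given below. Because standard deviation grows with the scale of the item, the sketch also computes the coefficient of variation (standard deviation divided by mean) as a scale-free cross-check; that extra measure is a suggestion and not part of the answer above. The three columns and their values are made up for illustration.

```python
import pandas as pd

# Made-up values for illustration only; column names are assumed.
df = pd.DataFrame({
    "Fresh": [2000, 12000, 22000],
    "Grocery": [5000, 8000, 11000],
    "Delicassen": [500, 1500, 2500],
})

# Sample standard deviation per item (pandas uses ddof=1 by default).
std = df.std()
most_inconsistent = std.idxmax()
least_inconsistent = std.idxmin()

# Coefficient of variation: spread relative to the item's own mean.
cv = std / df.mean()

print(pd.DataFrame({"std": std, "cv": cv}).sort_values("std", ascending=False))
print("Highest std:", most_inconsistent, "| lowest std:", least_inconsistent)
print("Highest CV:", cv.idxmax(), "| lowest CV:", cv.idxmin())
```

In this toy frame the item with the lowest standard deviation is not the one with the lowest coefficient of variation, so it is worth checking both on the real data before calling an item consistent.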
1.4
Use a boxplot to check for outliers:
The black points drawn beyond the whiskers of the boxplot are the outliers.
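The points a boxplot draws beyond its whiskers can also be counted directly. The sketch below applies the usual 1.5 × IQR rule, which is the default whisker rule in matplotlib's boxplot, to made-up values with assumed column names. On the real data, df[items].boxplot() would draw the corresponding chart.

```python
import pandas as pd

# Made-up values for illustration only; column names are assumed.
df = pd.DataFrame({
    "Fresh": [3000, 5000, 6000, 7000, 8000, 9000, 11000, 60000],
    "Frozen": [1000, 1200, 1500, 1800, 2000, 2200, 2500, 2800],
})

# Quartiles and interquartile range per column.
q1 = df.quantile(0.25)
q3 = df.quantile(0.75)
iqr = q3 - q1

# Fences of the 1.5 * IQR rule; values outside them are flagged.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
is_outlier = (df < lower) | (df > upper)

# Number of flagged values per item, and the flagged Fresh values.
outlier_counts = is_outlier.sum()
fresh_outliers = df.loc[is_outlier["Fresh"], "Fresh"].tolist()

print(outlier_counts)
print("Fresh outliers:", fresh_outliers)
```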