Question

Use pd.crosstab() to count the number of regions of each cover type there are for each of the 40 soil types. Pass this function the Cover_Type column as its first argument and the Soil_Type column as the second argument. Store the results in a DataFrame named ct_by_st and then display this DataFrame.
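As a minimal sketch of what this call produces, the snippet below runs pd.crosstab() on only the Soil_Type and Cover_Type values from the sample rows shown in this question. The names sample and ct_sample are hypothetical stand-ins for fc and ct_by_st; the full dataset would give a much larger table.

```python
import pandas as pd

# Soil_Type and Cover_Type values copied from the sample rows in the question;
# the real fc DataFrame would contain many more rows.
sample = pd.DataFrame({
    "Soil_Type":  [29, 29, 12, 6, 3, 3, 3, 31, 32, 32, 32, 40, 38, 40, 23],
    "Cover_Type": [5, 5, 2, 4, 4, 4, 4, 5, 5, 5, 5, 7, 7, 7, 2],
})

# The first argument becomes the row index, the second the column index,
# and each cell counts the rows that have that combination.
ct_sample = pd.crosstab(sample["Cover_Type"], sample["Soil_Type"])
print(ct_sample)
```

Because Cover_Type is passed first, each row of the result corresponds to one cover type and each column to one soil type, which is the layout the later steps rely on.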

soil = np.unique(fc['Soil_Type'])

palette = ['orchid', 'lightcoral', 'orange', 'gold', 'lightgreen', 'deepskyblue', 'cornflowerblue']

Perform the following steps in a single cell:

1. Start by converting the count information into proportions. Create a DataFrame named ct_by_st_props by dividing ct_by_st by the column sums of ct_by_st. The column sums can be calculated using np.sum() or the DataFrame sum() method.

2. We will be creating a stacked bar chart, so we need to know where the bottom of each bar should be located. This can be calculated as follows: bb = np.cumsum(ct_by_st_props) - ct_by_st_props

3. Create a Matplotlib figure, setting the figure size to [8, 4].

4. Loop over the rows of ct_by_st_props. Each time this loop executes, add a bar chart to the figure according to the following specifications.

• The height of the bars should be determined by the current row of ct_by_st_props.

• The bottom position of each bar should be determined by the current row of bb.

• Each bar should have a black border, and a fill color determined by the current value of palette.

• The label for the legend should be set to the value of Cover_Type associated with the current row.

5. Set the labels for the x and y axes to be "Soil_Type" and "Cover_Type". Set the title to be "Distribution of Cover Type by Soil Type".

6. Add a legend to the plot. Set the bbox_to_anchor parameter to place the legend to the right of the plot, near the top.

7. Display the figure using plt.show().
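The arithmetic behind steps 1 and 2 can be checked on a tiny table before plotting. The sketch below uses made-up counts (three cover types, two soil types) and hypothetical names counts, props, and bottoms. It also uses the DataFrame cumsum() method in place of np.cumsum().

```python
import numpy as np
import pandas as pd

# Hypothetical counts: 3 cover types (rows) by 2 soil types (columns).
counts = pd.DataFrame(
    {"soil_1": [2, 6, 2], "soil_2": [5, 0, 5]},
    index=pd.Index([1, 2, 3], name="Cover_Type"),
)

# Step 1: divide each column by its total so every column sums to 1.
props = counts / counts.sum()

# Step 2: the bottom of each segment is the combined height of the segments
# stacked beneath it, i.e. the running column total minus the segment itself.
bottoms = props.cumsum() - props

print(props)
print(bottoms)
```

For soil_1 the proportions are 0.2, 0.6, 0.2 and the bottoms are 0.0, 0.2, 0.8, so the last segment ends exactly at 1.0. Every stacked bar built this way has total height 1.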

Elevation  Aspect  Slope  Hori Hydrology  Vertical  Hori Roadways  Hill_9am  Hill_Noon  Hill_3pm  Hori Points  Wilderness_Area  Soil_Type  Cover_Type
2596       51      3      258             0         510            221       232        148       6279         Rawah            29         5
2590       56      2      212             -6        390            220       235        151       6225         Rawah            29         5
2804       139     9      268             65        3180           234       238        135       6121         Rawah            12         2
2327       188     15     339             144       1256           220       250        159       1101         Cache la Poudre  6          4
2298       129     21     255             115       1326           249       222        90        999          Cache la Poudre  3          4
2289       133     21     234             106       1345           248       225        95        973          Cache la Poudre  3          4
2274       142     23     201             111       1383           246       227        96        924          Cache la Poudre  3          4

2850       359     12     30              4         1585           202       218        153       1187         Comanche Peak    31         5
2888       311     14     95              9         1774           180       229        188       1418         Comanche Peak    32         5
2903       0       5      134             19        1865           212       230        156       1463         Comanche Peak    32         5
2902       7       8      170             11        1892           211       225        151       1480         Comanche Peak    32         5

3598       20      15     342             61        1848           208       207        133       1673         Neota            40         7
3318       96      12     95              -5        1224           239       222        111       1411         Neota            38         7
3433       342     14     551             204       1044           189       217        166       1442         Neota            40         7
3218       49      18     0               0         1822           225       197        100       1673         Neota            23         2

Solutions

Expert Solution

ANSWER:


I have provided properly commented and indented code so you can copy it easily and check the indentation.
I have also provided an image of the output so you can cross-check your own result.
Have a nice and healthy day!!

CODE

# import important modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

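# load the forest cover data (file name as used in this answer; point it at your copy of the dataset)
fc = pd.read_csv("ct_by_st.csv")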
# using pd.crosstab with input args as cover_type and soil type
ct_by_st = pd.crosstab(fc['Cover_Type'],fc['Soil_Type'])
# displaying dataframe
print("ct_by_st Dataframe is:")
print(ct_by_st)

palette = ['orchid', 'lightcoral', 'orange', 'gold', 'lightgreen', 'deepskyblue', 'cornflowerblue']


# 1. defining ct_by_st_props dataframe
ct_by_st_props = ct_by_st/ct_by_st.sum()

# 2. bottom of each bar
bb = np.cumsum(ct_by_st_props) - ct_by_st_props

# 3. create Matplotlib figure, setting the figure size to [8, 4].
fig = plt.figure(figsize = (8,4))

# 4. loop over the rows of ct_by_st_props
for i in range(len(ct_by_st_props)):
    # fetching row with respect to index
    row_ct = ct_by_st_props.iloc[i].values
    row_bb = bb.iloc[i].values
    Cover_Type = ct_by_st_props.index[i]
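    # draw this cover type's bars: fill color from palette, black border,
    # and the cover type itself as the legend label
    plt.bar(list(range(len(row_ct))), row_ct, bottom=row_bb,
            color=palette[i], edgecolor='black', label=Cover_Type)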

# 5. labeling plot
# setting xticks value
plt.xticks(list(range(len(ct_by_st_props.columns))), ct_by_st_props.columns)
# other labeling
plt.xlabel("Soil_Type")
plt.ylabel("Cover_Type")
plt.title("Distribution of Cover Type by Soil Type")

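# 6. show legend to the right of the plot, near the top
# (loc names the legend corner that is pinned to the bbox_to_anchor point)
plt.legend(bbox_to_anchor=[1, 1], loc='upper left')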

# 7. show plot
plt.show()
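For comparison only, and not the loop the assignment asks for, pandas can draw the same kind of stacked chart in one call through DataFrame.plot with stacked=True. The sketch below uses a made-up three-by-two proportions table and a shortened palette. The Agg backend line is an assumption so the snippet runs without a display; leave it out in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

palette = ['orchid', 'lightcoral', 'orange']

# Hypothetical proportions: rows are cover types, columns are soil types.
props = pd.DataFrame(
    {1: [0.2, 0.6, 0.2], 2: [0.5, 0.1, 0.4]},
    index=pd.Index([1, 2, 3], name="Cover_Type"),
)

# Transpose so each soil type is one bar and each cover type one stacked segment.
ax = props.T.plot(kind="bar", stacked=True, color=palette,
                  edgecolor="black", figsize=(8, 4))
ax.set_xlabel("Soil_Type")
ax.set_ylabel("Cover_Type")
ax.set_title("Distribution of Cover Type by Soil Type")
# Pin the legend's upper-left corner to the axes' upper-right corner.
ax.legend(title="Cover_Type", loc="upper left", bbox_to_anchor=(1, 1))
plt.tight_layout()
```

Here pandas computes the bar bottoms internally, so this can serve as a visual cross-check of the manual bb calculation.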

OUTPUT IMAGE

