Question

We ran across this one in the OECD healthcare data. The country names had numbers appended, which served as footnotes in the original spreadsheet but looked dumb when we used them as index labels. The question is how to eliminate them. A short version of the country names is

`names = ['Australia 1', 'Canada 2', 'Chile 3', 'United States 1']`

Do each of these in a separate code cell:

1. Set `us = names[-1]` and call the `rsplit()` method on `us`. What do you get?
2. Consult the documentation for `rsplit` to split `us` into two pieces, the country name and the number 1. How would you extract just the country name?
3. Use a loop to strip the numbers from all of the elements of `names`.
4. Use a list comprehension to strip the numbers from all of the elements of `names`.

Solutions

Expert Solution

Python Code:

# Define the dataset
names = ['Australia 1', 'Canada 2', 'Chile 3', 'United States 1']

1. us = names[-1]

This sets us to the last element of names: 'United States 1'

Calling us.rsplit() with no arguments splits on whitespace and returns ['United', 'States', '1']

2. According to the documentation, rsplit takes two optional arguments: the separator, and the maximum number of splits to perform, counted from the right end of the string.

rsplit(sep=None, maxsplit=-1)

To get the country and the number as two pieces, set sep=" " and maxsplit=1, i.e. us.rsplit(" ", 1), which returns ['United States', '1']

To extract just the country name, take the first element: us.rsplit(" ", 1)[0], which gives 'United States'
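As a small illustration of why `rsplit` is used here rather than `split`, the sketch below runs both with `maxsplit=1` on the same string. The variable names `from_left` and `from_right` are only for this example and are not part of the original exercise.

```python
us = 'United States 1'

# split() counts its splits from the left, so the country name is broken apart.
from_left = us.split(" ", 1)

# rsplit() counts its splits from the right, so only the trailing number is separated.
from_right = us.rsplit(" ", 1)

print(from_left)   # ['United', 'States 1']
print(from_right)  # ['United States', '1']
```

For single-word names such as 'Chile 3' the two methods give the same result; the difference only shows up when the country name itself contains a space.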

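3. Loop over names and keep element [0] of each split, which is the country name (element [1] would be the number):

names_clean = []
for country in names:
    names_clean.append(country.rsplit(" ", 1)[0])

output: ['Australia', 'Canada', 'Chile', 'United States']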

4. [country.rsplit(" ", 1)[0] for country in names]

This returns the same list: ['Australia', 'Canada', 'Chile', 'United States']
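The exercise assumes every name ends in a footnote number. If some labels had no number, splitting on the last space would cut a multi-word name such as 'United States' in the wrong place. One possible guard is sketched below; the helper `strip_footnote` and the list `mixed` are hypothetical and are not part of the original question or data.

```python
def strip_footnote(name):
    """Return name without a trailing footnote number, if it has one."""
    parts = name.rsplit(" ", 1)
    # Drop the last piece only when the split produced two pieces
    # and the second one consists purely of digits.
    if len(parts) == 2 and parts[1].isdigit():
        return parts[0]
    return name

# Hypothetical mix of labels with and without footnote numbers.
mixed = ['Australia 1', 'Canada', 'United States', 'United States 1']
cleaned = [strip_footnote(n) for n in mixed]
print(cleaned)  # ['Australia', 'Canada', 'United States', 'United States']
```

For the four names given in the question, this helper produces the same result as the plain rsplit(" ", 1)[0] approach.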

Complete code:

# Define the dataset
names = ['Australia 1', 'Canada 2', 'Chile 3', 'United States 1']

# 1. Set `us = names[-1]` and call the `rsplit()` method on `us`. What do you get?
us = names[-1]
us.rsplit()            # ['United', 'States', '1']

# 2. Consult the documentation for `rsplit` to split `us` into two pieces, the country name and the number 1.
# How would you extract just the country name?

# Split `us` into two pieces, splitting once from the right
us.rsplit(" ", 1)      # ['United States', '1']

# Extract just the country name
us.rsplit(" ", 1)[0]   # 'United States'

# 3. Use a loop to strip the numbers from all of the elements of `names`.
names_clean = []
for country in names:
    # Keep element [0] (the country name); element [1] is the number
    names_clean.append(country.rsplit(" ", 1)[0])

# 4. Use a list comprehension to strip the numbers from all of the elements of `names`.
names_LC = [country.rsplit(" ", 1)[0] for country in names]

