Question

Lesson Assignment

There's not much content to this lesson other than to create the table of words and counts (which will be a list of tuples). The words are already parsed out for you (same as the previous lesson).

Build the following three functions:

def clean(words):

  • normalizes the words so that letter case is ignored
  • returns a list of 'cleaned' words

def build_table(words):

  • builds a dictionary of counts
  • returns a Python dictionary or collections.Counter type
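As a rough sketch of the collections.Counter option named in the last bullet, the counting can be handed to Counter directly. The name build_table_counter is made up here to keep it apart from the dictionary version; the assignment itself only asks for build_table.

```python
from collections import Counter

def build_table_counter(words):
    # Hypothetical variant name. Counter is a dict subclass that
    # tallies each hashable item it is given.
    return Counter(words)

table = build_table_counter(['a', 'b', 'a'])
print(table['a'])  # 2
print(table['z'])  # 0 (a missing key counts as zero instead of raising KeyError)
```

Because Counter is a dict subclass, code that expects a plain dictionary of counts can usually take it unchanged.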

def top_n(table, n):

  • returns the n most frequent words (keys) in table
  • the return type is a list of tuples
  • the tuple's first value is the word; the second value is the count
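One possible way to meet the top_n contract described in the bullets above, shown only as a sketch, is heapq.nlargest from the standard library. The name top_n_heap is hypothetical, and sorting the items and slicing the first n works just as well.

```python
import heapq

def top_n_heap(table, n):
    # Hypothetical variant name. nlargest returns the n items with the
    # largest key value, ordered from largest to smallest.
    return heapq.nlargest(n, table.items(), key=lambda item: item[1])

result = top_n_heap({'a': 3, 'b': 1, 'c': 2}, 2)
print(result)  # [('a', 3), ('c', 2)]
```

Each element of the result is a (word, count) tuple, matching the return type the assignment asks for.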

Notes:

The function top_n does not have to worry about the order of words that have the same count. Guaranteeing that order is called stable sorting: items that compare equal always come out of the sort in the same relative order they went in (more discussion in the extra credit). You can use collections.Counter to help you with this lesson, but do NOT rely on it to return tied words in a stable order (since Python 3.7, most_common() lists ties in first-encountered order; earlier versions make no such promise).
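To make the idea of stable sorting concrete, here is a small sketch with made-up sample data. The second sort, which breaks ties alphabetically, is an optional extra and not something the assignment requires.

```python
# Made-up sample table; three words share the count 2
table = {'pear': 2, 'apple': 2, 'fig': 1, 'kiwi': 2}

# sorted() is guaranteed stable, even with reverse=True: items with equal
# counts stay in input order (dict insertion order on Python 3.7+)
by_count = sorted(table.items(), key=lambda item: item[1], reverse=True)
print(by_count)
# [('pear', 2), ('apple', 2), ('kiwi', 2), ('fig', 1)]

# For a fixed tie order regardless of input order, one option is a
# compound key: count descending, then word ascending
deterministic = sorted(table.items(), key=lambda item: (-item[1], item[0]))
print(deterministic)
# [('apple', 2), ('kiwi', 2), ('pear', 2), ('fig', 1)]
```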

Be sure to test your pipeline on multiple texts. Each 'run' should not affect others:

v1 = list(pipeline(['a','b','c'], 5))

v2 = list(pipeline(['a','b','c'], 5))

print(v1 == v2)
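One common way for runs to affect each other is state that outlives a single call. The sketch below contrasts a deliberately buggy counter that shares a mutable default argument with one that builds a fresh dictionary per call. Both function names are hypothetical and exist only for this illustration.

```python
def build_table_leaky(words, counts={}):
    # BUG (deliberate): the default dict is created once and shared by
    # every call, so counts pile up across runs
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return dict(counts)  # copy, so earlier results are not mutated later

def build_table_fresh(words):
    # A new dict per call keeps each run independent
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts

leaky_1 = build_table_leaky(['a', 'b', 'c'])
leaky_2 = build_table_leaky(['a', 'b', 'c'])
print(leaky_1 == leaky_2)  # False: the second run sees counts of 2

fresh_1 = build_table_fresh(['a', 'b', 'c'])
fresh_2 = build_table_fresh(['a', 'b', 'c'])
print(fresh_1 == fresh_2)  # True
```

Module-level dictionaries reused between calls cause the same problem, which is what the v1 == v2 check is meant to catch.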

Solutions

Expert Solution

Below is the code to copy, with comments explaining each step:
#CODE STARTS HERE----------------
def clean(words):
    # Lowercase every word so that letter case is ignored
    return [word.lower() for word in words]

def build_table(words):
    # Count occurrences of each word in a plain dictionary
    counts = {}
    for word in words:
        # Start at 0 for a new word, then add 1 for this occurrence
        counts[word] = counts.get(word, 0) + 1
    return counts

def top_n(table, n):
    # Sort (word, count) pairs by count, highest first. sorted() is stable,
    # so words with equal counts keep the order they have in the table
    ranked = sorted(table.items(), key=lambda item: item[1], reverse=True)
    # Keep at most n pairs; a negative n yields an empty list
    return ranked[:max(n, 0)]

def pipeline(words, n):
    # Helper (not one of the three required functions) that chains them
    # so the test case given in the question can be run
    cleaned = clean(words)
    table = build_table(cleaned)
    return top_n(table, n)  # list of (word, count) tuples

v1 = list(pipeline(['a', 'b', 'c', 'A'], 2))  # extra 'A' exercises case folding
v2 = list(pipeline(['a', 'b', 'c'], 2))
print(v1)  # [('a', 2), ('b', 1)]
print(v2)  # [('a', 1), ('b', 1)]
print(v1 == v2)  # False here, because the two inputs differ
#CODE ENDS HERE------------------
