Question

In: Computer Science

Transcribing Anonymous SEC Tips Java or Python    * The function is expected to return a STRING_ARRAY....

Transcribing Anonymous SEC Tips

Java or Python

   * The function is expected to return a STRING_ARRAY.

     * The function accepts following parameters:

     *  1. STRING_ARRAY inputNames

     *  2. STRING_ARRAY secRecords

     */

Problem Statement

Introduction

Imagine you are helping the Security Exchange Commission (SEC) respond to anonymous tips. One of the biggest problems the team faces is handling the transcription of the companies reported by the callers. You've noticed that sometimes the company name is misheard by the person taking the call, sometimes it is simply mistyped, and sometimes both. These problems make it more difficult to search the SEC records to identify the company.

You have access to the list of transcribed company names and the database of SEC records. We need a way to effectively translate company names based on their transcriptions so we can narrow our search results to the one company we are interested in.

Input

You will receive a string array representing the list of transcribed company names.

Each string in the array takes the following form:

  • The string will contain a company name comprising a set of words separated by spaces.
  • You can assume company names will only use alphabetic characters -- no numbers, no punctuation.

You will also receive a string array representing the database of SEC records.

  • Each string will be of the form ; (company name string and company EIN string separated by a semicolon). An EIN is a federal tax identification number used to uniquely identify each company.
  • As above, the company name half of the string will contain a set of words separated by spaces.
  • The EIN half of the string comprises 2 integers, a dash, and 7 more integers, in that order. Example, "12-3456789".
  • There will be no semicolons anywhere in the company name or EIN strings.

You may also make the following assumptions about the structure:

  • There will be at most 1000 companies in the SEC database.
  • There will be at most 50 company names in the input.
  • No company names or EINs will be repeated in either the input or the database.

Output

For each transcribed company name in the input string array, you want to match that to a company name (first part of a string) in the SEC database. The second part of the string in the SEC database will represent the company's EIN. Your output should also be a string array, this time representing the EINs mapped to the names in the input string array. You may assume that every input name will match a name in the SEC records.

Responding to Calls

The Basics

Let's start with the first step: making sure that if the name is transcribed perfectly, we match that company's record in the database right away. This will give you an idea of how to match company names in our system and what the output array should be. This will also show you how the input is structured if you desire to make your own custom inputs. The input comes in the form of two string arrays, where the first line represents the length of the array. An example is below.

Input

3

Pear Computers

Construct An Ursus

Planetary Technologies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]

Your code should pass test cases 0, 1, and 2 after solving this step.

Misspellings

The second thing we want to look for are basic misspellings due to the transcriber hearing the company name correctly but missing a keystroke or pressing the wrong key instead. Think "Harveys Steakhouse" turns into "Harfeys Sreakhouse" or "Sugar and Sugar" turns into "Sugra and Sugar". In the first example, the transcriber missed the "v" key and hit "f" instead, and missed "t" and hit "r" instead. In the second, the transcriber accidentally typed "r" before "a". You should pass test cases 3 through 8 after solving this problem. Hint: looking up the phrase "string edit distance" in a search engine should be of some help to you here.

Input

3

Pewar Computers

Consuct A Ursuus

Planteray Techniligies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]

Metaphones

The last and trickiest instance of transcription comes in the form of arbitrary misspellings resulting from the transcriber either hearing the name correctly and using a different spelling than the one in our database, or mishearing the name in some form. Think "Ashley Antiques" vs. "Ashlee Antiques" vs. "Ashleigh Antiques" or "Rate My Reading" turns into "Great My Treating". This is a purposefully very open-ended and tricky problem, and you are not expected to get all cases. One example is viewable and most are purposefully hidden - try to be creative with your solution, as there are multiple ways you could solve this piece! Test cases 9 through 16 are the ones that relate to this part of the problem; as before, an example is below.

Input

3

Pare Computers

Conduct An Ersis

Palintary Technawlogies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]

Solutions

Expert Solution

##IF YOU ARE SATISFIED WITH THE ANSWER, KINDLY LEAVE A LIKE, OR IF YOU THINK THERE IS SOME ERROR, LEAVE A COMMENT

CODE: PYTHON

from jellyfish import soundex
import editdistance
def my_func(inputNames,secRecords):
result = []
for i in inputNames:
for key in secRecords.keys():
if i == key or editdistance.eval(i,key) <= 2:
result.append(secRecords[key])
break
elif editdistance.eval(soundex(i),soundex(key)) <= 2:
result.append(secRecords[key])
break
return result


inputNames = []
for _ in range(int(input())):
inputNames.append(input())
secRecords = {}
for _ in range(int(input())):
value = input().split(";")
secRecords[value[0]] = value[-1]
print(my_fun(inputNames,secRecords))

OUTPUT:

Misspellings:

Metaphones:


Related Solutions

PYTHON PLS 1) Create a function search_by_pos. This function only has one return statement. This function...
PYTHON PLS 1) Create a function search_by_pos. This function only has one return statement. This function returns a set statement that finds out the same position and same or higher skill number. This function searches the dictionary and returns the same position and same or higher skill level. The function output the set statements that include the position only. For example input : dict = {'Fiora': {'Top': 1, 'Mid': 4, 'Bottom': 3},'Olaf': {'Top': 3, 'Mid': 2, 'Support': 4},'Yasuo': {'Mid': 2,...
Python: I am currently a function that will return a list of lists, with the frequency...
Python: I am currently a function that will return a list of lists, with the frequency and the zip code. However, I am having a difficult time organizing the list in decreasing order of frequency. We are asked to use sort () and sorted () instead of lambda. The function will find 5 digit zip codes, and when it sees a 9-digit zip code, the function needs to truncate the last 4 digits. The following is an example of the...
Write a python function that will return the total length of line that passes through any...
Write a python function that will return the total length of line that passes through any number of provided points ( (x,y) ). The points should be passed as individual tuples or lists. The function should also have a parameter (True or False) to indicate whether the line should start from the origin, and that parameter should default to False. If True, the returned value should include the distance from the origin to the first point, otherwise start adding distances...
Write a Python function with prototype “def anagramdictionary(wordlist):” that will return an “anagram dictionary” of the...
Write a Python function with prototype “def anagramdictionary(wordlist):” that will return an “anagram dictionary” of the given wordlist  An anagram dictionary has each word with the letters sorted alphabetically creating a “key”.
Please I need this to be done in Java, can I have it answered by anonymous...
Please I need this to be done in Java, can I have it answered by anonymous who answered my last question regarding java? Thank you You will need to implement a specific algorithm implementation to match the class definition AND implement a unit test using JUnit that conforms the specific naming convention. Algorithm: Inputs: integers m and n , assumes m and n are >= 1 Order is not important for this algorithm Local variables integers: remainder Initialize: No initialization...
what is the sec and their function s for investors?
what is the sec and their function s for investors?
Create four anonymous functions to represent the function 6e3cosx2, which is composed of
Create four anonymous functions to represent the function 6e3cosx2, which is composed of the functions h(z) = 6ez, g(y) = 3 cos y, and f(x) = x2. Use the anonymous functions to plot 6e3cosx2 over the range 0 ≤ x ≤ 4.
complete in python The function sumEven should return the sum of only the even numbers contained...
complete in python The function sumEven should return the sum of only the even numbers contained in the list, lst. Example list_of_nums = [1, 5, 4, 8, 5, 3, 2] x = sum_evens(list_of_nums) print(x) #prints 14 Start your code with def evens(lst):
Please convert this code written in Python to Java: import string import random #function to add...
Please convert this code written in Python to Java: import string import random #function to add letters def add_letters(number,phrase):    #variable to store encoded word    encode = ""       #for each letter in phrase    for s in phrase:        #adding each letter to encode        encode = encode + s        for i in range(number):            #adding specified number of random letters adding to encode            encode = encode +...
In Python write a function with prototype “def dictsort(d):” which will return a list of key-value...
In Python write a function with prototype “def dictsort(d):” which will return a list of key-value pairs of the dictionary as tuples (key, value), reverse sorted by value (highest first) and where multiple keys with the same value appear alphabetically (lowest first).
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT