In Python,
Modify your mapper to count the number of occurrences of each character (including punctuation marks) in the file.
Practice the given tasks in a Jupyter notebook first before running them on AWS. If your program fails, check the stderr log file for information about the error.
import sys
sys.path.append('.')
for line in sys.stdin:
    line = line.strip()  # trim spaces from beginning and end
    keys = line.split()  # split line by space
    for key in keys:
        value = 1
        print("%s\t%d" % (key, value))  # for each word generate 'word TAB 1' line
SOURCE CODE:
*Please follow the comments to better understand the code.
**The indentation matters in Python, so keep it exactly as shown.
You can simply copy-paste the code below into a Jupyter notebook.
import sys
sys.path.append('.')

# Start with an empty dictionary for counting
count_dict = {}
for line in sys.stdin:
    line = line.strip()  # trim spaces from beginning and end
    keys = line.split()  # split line by whitespace (whitespace itself is not counted)
    for key in keys:  # loop through each word
        for ch in key:  # loop through each character in the word
            if ch in count_dict:  # character already seen: increment its count
                count_dict[ch] += 1
            else:  # otherwise, add the character to the dictionary with a count of 1
                count_dict[ch] = 1

# After all input has been read, print one 'char TAB count' line per character
for ch, count in count_dict.items():
    print('%s\t%d' % (ch, count))
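The code above adds up the counts inside the mapper itself. If your job also has a reducer that sums the values for each key, the mapper can instead stay closer to the original word mapper and emit one 'char TAB 1' line per character. The following is only a minimal sketch of that variant; the function name map_chars is made up for illustration, and it assumes (as above) that whitespace characters should not be counted.

```python
def map_chars(lines):
    """Yield one 'char<TAB>1' record for every non-whitespace character.

    `lines` can be any iterable of strings, for example sys.stdin.
    """
    for line in lines:
        for ch in line:
            if not ch.isspace():  # skip spaces, tabs and newlines
                yield "%s\t%d" % (ch, 1)


# Small demonstration on an in-memory list instead of sys.stdin
for record in map_chars(["hi !\n"]):
    print(record)
```

In the actual mapper script you would pass sys.stdin instead of the list, for example `for record in map_chars(sys.stdin): print(record)`.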
===============================
SAMPLE INPUT:
hello , hi ! good ! morning!
bye!!
how are you?
I'am fine.!
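Since sys.stdin is awkward to use inside a Jupyter notebook, one way to try the counting logic on the sample input is to wrap the text in io.StringIO, which behaves like a file. This is a sketch under that assumption; count_chars and SAMPLE_INPUT are illustrative names, and collections.Counter is used here in place of the hand-written dictionary.

```python
import io
from collections import Counter

# The sample input from above, held in memory instead of a file
SAMPLE_INPUT = """hello , hi ! good ! morning!
bye!!
how are you?
I'am fine.!
"""


def count_chars(stream):
    """Count every non-whitespace character in a file-like object."""
    counts = Counter()
    for line in stream:
        for word in line.split():  # split() drops all whitespace
            counts.update(word)    # a string updates the count of each character
    return counts


counts = count_chars(io.StringIO(SAMPLE_INPUT))
for ch, count in counts.items():
    print('%s\t%d' % (ch, count))
```

For this sample, both '!' and 'o' are counted 6 times, and the counts add up to 47 non-whitespace characters.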